aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2024-05-07memcg: dynamically allocate lruvec_statsShakeel Butt1-56/+6
To decouple the dependency of lruvec_stats on NR_VM_NODE_STAT_ITEMS, we need to dynamically allocate lruvec_stats in the mem_cgroup_per_node structure. Also move the definition of lruvec_stats_percpu and lruvec_stats and related functions to the memcontrol.c to facilitate later patches. No functional changes in the patch. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Shakeel Butt <[email protected]> Reviewed-by: Yosry Ahmed <[email protected]> Reviewed-by: T.J. Mercier <[email protected]> Reviewed-by: Roman Gushchin <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-05-07Merge tag 'kvm-riscv-6.10-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini5-18/+70
KVM/riscv changes for 6.10 - Support guest breakpoints using ebreak - Introduce per-VCPU mp_state_lock and reset_cntx_lock - Virtualize SBI PMU snapshot and counter overflow interrupts - New selftests for SBI PMU and Guest ebreak
2024-05-08KVM: PPC: Fix documentation for ppc mmu capsJoel Stanley1-2/+2
The documentation mentions KVM_CAP_PPC_RADIX_MMU, but the defines in the kvm headers spell it KVM_CAP_PPC_MMU_RADIX. Similarly with KVM_CAP_PPC_MMU_HASH_V3. Fixes: c92701322711 ("KVM: PPC: Book3S HV: Add userspace interfaces for POWER9 MMU") Signed-off-by: Joel Stanley <[email protected]> Acked-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
2024-05-07HID: do not assume HAT Switch logical max < 8Benjamin Tissoires1-3/+3
Turns out that the code can handle a greater range, but the data stored can not. This is problematic on the Raptor Mach 2 joystick which logical max is 239. The kernel interprets it as `-15` and thus ignores the Hat Switch handling. Link: https://gitlab.freedesktop.org/libevdev/udev-hid-bpf/-/issues/17 Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Peter Hutterer <[email protected]> Signed-off-by: Benjamin Tissoires <[email protected]>
2024-05-07md: Revert "md: Fix overflow in is_mddev_idle"Li Nan1-1/+1
This reverts commit 3f9f231236ce7e48780d8a4f1f8cb9fae2df1e4e. Using 64bit for 'sync_io' is unnecessary from the gendisk side. This overflow will not cause any functional impact, except for a UBSAN warning. Solving this overflow requires introducing additional calculations and checks which are not necessary. So just keep using 32bit for 'sync_io'. Signed-off-by: Li Nan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-05-07block: add a blk_alloc_discard_bio helperChristoph Hellwig1-0/+3
Factor out a helper from __blkdev_issue_discard that chews off as much as possible from a discard range and allocates a bio for it. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-05-07block: add a bio_chain_and_submit helperChristoph Hellwig1-0/+1
This is basically blk_next_bio just with the bio allocation moved to the caller to allow for more flexible bio handling in the caller. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-05-07bug: Improve commentThorsten Blum1-1/+1
Add parentheses to WARN_ON_ONCE() for consistency. Signed-off-by: Thorsten Blum <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2024-05-07ACPI/NUMA: Remove architecture dependent remainingsRobert Richter1-5/+0
With the removal of the Itanium architecture [1] the last architecture dependent functions: acpi_numa_slit_init(), acpi_numa_memory_affinity_init() were removed. Remove its remainings in the header files too and make them static. [1] commit cf8e8658100d ("arch: Remove Itanium (IA-64) architecture") Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Alison Schofield <[email protected]> Signed-off-by: Robert Richter <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-05-07x86/numa: Fix SRAT lookup of CFMWS ranges with numa_fill_memblks()Robert Richter1-6/+1
For configurations that have the kconfig option NUMA_KEEP_MEMINFO disabled, numa_fill_memblks() only returns with NUMA_NO_MEMBLK (-1). SRAT lookup fails then because an existing SRAT memory range cannot be found for a CFMWS address range. This causes the addition of a duplicate numa_memblk with a different node id and a subsequent page fault and kernel crash during boot. Fix this by making numa_fill_memblks() always available regardless of NUMA_KEEP_MEMINFO. As Dan suggested, the fix is implemented to remove numa_fill_memblks() from sparsemem.h and alos using __weak for the function. Note that the issue was initially introduced with [1]. But since phys_to_target_node() was originally used that returned the valid node 0, an additional numa_memblk was not added. Though, the node id was wrong too, a message is seen then in the logs: kernel/numa.c: pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n", [1] commit fd49f99c1809 ("ACPI: NUMA: Add a node and memblk for each CFMWS not in SRAT") Suggested-by: Dan Williams <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Fixes: 8f1004679987 ("ACPI/NUMA: Apply SRAT proximity domain to entire CFMWS window") Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Alison Schofield <[email protected]> Reviewed-by: Dan Williams <[email protected]> Signed-off-by: Robert Richter <[email protected]> Acked-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-05-07page_pool: don't use driver-set flags field directlyAlexander Lobakin1-3/+10
page_pool::p is driver-defined params, copied directly from the structure passed to page_pool_create(). The structure isn't meant to be modified by the Page Pool core code and this even might look confusing[0][1]. In order to be able to alter some flags, let's define our own, internal fields the same way as the already existing one (::has_init_callback). They are defined as bits in the driver-set params, leave them so here as well, to not waste byte-per-bit or so. Almost 30 bits are still free for future extensions. We could've defined only new flags here or only the ones we may need to alter, but checking some flags in one place while others in another doesn't sound convenient or intuitive. ::flags passed by the driver can now go to the "slow" PP params. Suggested-by: Jakub Kicinski <[email protected]> Link[0]: https://lore.kernel.org/netdev/[email protected] Suggested-by: Alexander Duyck <[email protected]> Link[1]: https://lore.kernel.org/netdev/CAKgT0UfZCGnWgOH96E4GV3ZP6LLbROHM7SHE8NKwq+exX+Gk_Q@mail.gmail.com Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2024-05-07page_pool: make sure frag API fields don't span between cachelinesAlexander Lobakin1-1/+11
After commit 5027ec19f104 ("net: page_pool: split the page_pool_params into fast and slow") that made &page_pool contain only "hot" params at the start, cacheline boundary chops frag API fields group in the middle again. To not bother with this each time fast params get expanded or shrunk, let's just align them to `4 * sizeof(long)`, the closest upper pow-2 to their actual size (2 longs + 1 int). This ensures 16-byte alignment for the 32-bit architectures and 32-byte alignment for the 64-bit ones, excluding unnecessary false-sharing. ::page_state_hold_cnt is used quite intensively on hotpath no matter if frag API is used, so move it to the newly created hole in the first cacheline. Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2024-05-07dma: avoid redundant calls for sync operationsAlexander Lobakin3-5/+64
Quite often, devices do not need dma_sync operations on x86_64 at least. Indeed, when dev_is_dma_coherent(dev) is true and dev_use_swiotlb(dev) is false, iommu_dma_sync_single_for_cpu() and friends do nothing. However, indirectly calling them when CONFIG_RETPOLINE=y consumes about 10% of cycles on a cpu receiving packets from softirq at ~100Gbit rate. Even if/when CONFIG_RETPOLINE is not set, there is a cost of about 3%. Add dev->need_dma_sync boolean and turn it off during the device initialization (dma_set_mask()) depending on the setup: dev_is_dma_coherent() for the direct DMA, !(sync_single_for_device || sync_single_for_cpu) or the new dma_map_ops flag, %DMA_F_CAN_SKIP_SYNC, advertised for non-NULL DMA ops. Then later, if/when swiotlb is used for the first time, the flag is reset back to on, from swiotlb_tbl_map_single(). On iavf, the UDP trafficgen with XDP_DROP in skb mode test shows +3-5% increase for direct DMA. Suggested-by: Christoph Hellwig <[email protected]> # direct DMA shortcut Co-developed-by: Eric Dumazet <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2024-05-07dma: compile-out DMA sync op calls when not usedAlexander Lobakin1-29/+33
Some platforms do have DMA, but DMA there is always direct and coherent. Currently, even on such platforms DMA sync operations are compiled and called. Add a new hidden Kconfig symbol, DMA_NEED_SYNC, and set it only when either sync operations are needed or there is DMA ops or swiotlb or DMA debug is enabled. Compile global dma_sync_*() and dma_need_sync() only when it's set, otherwise provide empty inline stubs. The change allows for future optimizations of DMA sync calls depending on runtime conditions. Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2024-05-07iommu/dma: fix zeroing of bounce buffer padding used by untrusted devicesMichael Kelley1-0/+5
iommu_dma_map_page() allocates swiotlb memory as a bounce buffer when an untrusted device wants to map only part of the memory in an granule. The goal is to disallow the untrusted device having DMA access to unrelated kernel data that may be sharing the granule. To meet this goal, the bounce buffer itself is zeroed, and any additional swiotlb memory up to alloc_size after the bounce buffer end (i.e., "post-padding") is also zeroed. However, as of commit 901c7280ca0d ("Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE"""), swiotlb_tbl_map_single() always initializes the contents of the bounce buffer to the original memory. Zeroing the bounce buffer is redundant and probably wrong per the discussion in that commit. Only the post-padding needs to be zeroed. Also, when the DMA min_align_mask is non-zero, the allocated bounce buffer space may not start on a granule boundary. The swiotlb memory from the granule boundary to the start of the allocated bounce buffer might belong to some unrelated bounce buffer. So as described in the "second issue" in [1], it can't be zeroed to protect against untrusted devices. But as of commit af133562d5af ("swiotlb: extend buffer pre-padding to alloc_align_mask if necessary"), swiotlb_tbl_map_single() allocates pre-padding slots when necessary to meet min_align_mask requirements, making it possible to zero the pre-padding area as well. Finally, iommu_dma_map_page() uses the swiotlb for untrusted devices and also for certain kmalloc() memory. Current code does the zeroing for both cases, but it is needed only for the untrusted device case. Fix all of this by updating iommu_dma_map_page() to zero both the pre-padding and post-padding areas, but not the actual bounce buffer. Do this only in the case where the bounce buffer is used because of an untrusted device. [1] https://lore.kernel.org/all/[email protected]/ Signed-off-by: Michael Kelley <[email protected]> Reviewed-by: Petr Tesarik <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2024-05-07swiotlb: remove alloc_size argument to swiotlb_tbl_map_single()Michael Kelley1-1/+1
Currently swiotlb_tbl_map_single() takes alloc_align_mask and alloc_size arguments to specify an swiotlb allocation that is larger than mapping_size. This larger allocation is used solely by iommu_dma_map_single() to handle untrusted devices that should not have DMA visibility to memory pages that are partially used for unrelated kernel data. Having two arguments to specify the allocation is redundant. While alloc_align_mask naturally specifies the alignment of the starting address of the allocation, it can also implicitly specify the size by rounding up the mapping_size to that alignment. Additionally, the current approach has an edge case bug. iommu_dma_map_page() already does the rounding up to compute the alloc_size argument. But swiotlb_tbl_map_single() then calculates the alignment offset based on the DMA min_align_mask, and adds that offset to alloc_size. If the offset is non-zero, the addition may result in a value that is larger than the max the swiotlb can allocate. If the rounding up is done _after_ the alignment offset is added to the mapping_size (and the original mapping_size conforms to the value returned by swiotlb_max_mapping_size), then the max that the swiotlb can allocate will not be exceeded. In view of these issues, simplify the swiotlb_tbl_map_single() interface by removing the alloc_size argument. Most call sites pass the same value for mapping_size and alloc_size, and they pass alloc_align_mask as zero. Just remove the redundant argument from these callers, as they will see no functional change. For iommu_dma_map_page() also remove the alloc_size argument, and have swiotlb_tbl_map_single() compute the alloc_size by rounding up mapping_size after adding the offset based on min_align_mask. This has the side effect of fixing the edge case bug but with no other functional change. Also add a sanity test on the alloc_align_mask. While IOMMU code currently ensures the granule is not larger than PAGE_SIZE, if that guarantee were to be removed in the future, the downstream effect on the swiotlb might go unnoticed until strange allocation failures occurred. Tested on an ARM64 system with 16K page size and some kernel test-only hackery to allow modifying the DMA min_align_mask and the granule size that becomes the alloc_align_mask. Tested these combinations with a variety of original memory addresses and sizes, including those that reproduce the edge case bug: * 4K granule and 0 min_align_mask * 4K granule and 0xFFF min_align_mask (4K - 1) * 16K granule and 0xFFF min_align_mask * 64K granule and 0xFFF min_align_mask * 64K granule and 0x3FFF min_align_mask (16K - 1) With the changes, all combinations pass. Signed-off-by: Michael Kelley <[email protected]> Reviewed-by: Petr Tesarik <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2024-05-07Merge tag 'samsung-dt64-6.10-2' of ↵Arnd Bergmann1-0/+116
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into soc/dt Samsung DTS ARM64 changes for v6.10, part two Few changes exclusively for Google GS101: 1. Add HSI0 and HSI2 clock controllers (CMUs). 2. Add USB 3.1 Dual Role Device (DRD) support. 3. Add UFS (Universal Flash Storage) support. 4. Document bus clocks in pin controllers necessary for accessing registers. * tag 'samsung-dt64-6.10-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: arm64: dts: exynos: gs101: specify empty clocks for remaining pinctrl arm64: dts: exynos: gs101: specify bus clock for pinctrl_hsi2 arm64: dts: exynos: gs101: specify bus clock for pinctrl_peric[01] arm64: dts: exynos: gs101: specify bus clock for pinctrl (far) alive arm64: dts: exynos: gs101: enable ufs, phy on oriole & define ufs regulator arm64: dts: exynos: gs101: Add ufs and ufs-phy dt nodes arm64: dts: exynos: gs101: Add the hsi2 sysreg node dt-bindings: soc: google: exynos-sysreg: add dedicated hsi2 sysreg compatible arm64: dts: exynos: gs101-oriole: enable USB on this board arm64: dts: exynos: gs101: add USB & USB-phy nodes arm64: dts: exynos: gs101: enable cmu-hsi2 clock controller arm64: dts: exynos: gs101: enable cmu-hsi0 clock controller dt-bindings: clock: google,gs101-clock: add HSI2 clock management unit dt-bindings: clock: google,gs101-clock: add HSI0 clock management unit Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]>
2024-05-07printk: cleanup deprecated uses of strncpy/strcpyJustin Stitt1-1/+1
Cleanup some deprecated uses of strncpy() and strcpy() [1]. There doesn't seem to be any bugs with the current code but the readability of this code could benefit from a quick makeover while removing some deprecated stuff as a benefit. The most interesting replacement made in this patch involves concatenating "ttyS" with a digit-led user-supplied string. Instead of doing two distinct string copies with carefully managed offsets and lengths, let's use the more robust and self-explanatory scnprintf(). scnprintf will 1) respect the bounds of @buf, 2) null-terminate @buf, 3) do the concatenation. This allows us to drop the manual NUL-byte assignment. Also, since isdigit() is used about a dozen lines after the open-coded version we'll replace it for uniformity's sake. All the strcpy() --> strscpy() replacements are trivial as the source strings are literals and much smaller than the destination size. No behavioral change here. Use the new 2-argument version of strscpy() introduced in Commit e6584c3964f2f ("string: Allow 2-argument strscpy()"). However, to make this work fully (since the size must be known at compile time), also update the extern-qualified declaration to have the proper size information. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://github.com/KSPP/linux/issues/90 [2] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [3] Cc: [email protected] Signed-off-by: Justin Stitt <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Link: https://lore.kernel.org/r/20240429-strncpy-kernel-printk-printk-c-v1-1-4da7926d7b69@google.com [[email protected]: Removed obsolete brackets and added empty lines.] Signed-off-by: Petr Mladek <[email protected]>
2024-05-07gpiolib: Discourage to use formatting strings in line namesAndy Shevchenko1-3/+1
Currently the documentation for line names allows to use %u inside the alternative name. This is broken in character device approach from day 1 and being in use solely in sysfs. Character device interface has a line number as a part of its address, so the users better rely on it. Hence remove the misleading documentation. On top of that, there are no in-kernel users (out of 6, if I'm correct) for such names and moreover if one exists it won't help in distinguishing lines with the same naming as '%u' will also be in them and we will get a warning in gpiochip_set_desc_names() for such cases. Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Reviewed-by: Kent Gibson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bartosz Golaszewski <[email protected]>
2024-05-07Merge tag 'intel-gpio-v6.10-1' of ↵Bartosz Golaszewski1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-next intel-gpio for v6.10-1 * New driver for vGPIO controller on Intel Granite Rapids-D * Update ACPI GPIO library to unify the IRQ code path * Better GPIO IRQ line labeling for ACPI * Switched Intel SCH driver to use "mapped" I/O accessors The following is an automated git shortlog grouped by driver: Add Intel Granite Rapids-D vGPIO driver: - Add Intel Granite Rapids-D vGPIO driver crystalcove: - Use -ENOTSUPP consistently gpiolib: - acpi: Set label for IRQ only lines - acpi: Add fwnode name to the GPIO interrupt label - acpi: Pass con_id instead of property into acpi_dev_gpio_irq_get_by() - acpi: Move acpi_can_fallback_to_crs() out of __acpi_find_gpio() - acpi: Simplify error handling in __acpi_find_gpio() - acpi: Extract __acpi_find_gpio() helper - acpi: Check for errors first in acpi_find_gpio() - acpi: Remove never true check in acpi_get_gpiod_by_index() sch: - Utilise temporary variable for struct device - Switch to memory mapped IO accessors wcove: - Use -ENOTSUPP consistently
2024-05-06Merge tag 'ipsec-next-2024-05-03' of ↵Jakub Kicinski4-1/+10
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2024-05-03 1) Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support. This was defined by an early version of an IETF draft that did not make it to a standard. 2) Introduce direction attribute for xfrm states. xfrm states have a direction, a stsate can be used either for input or output packet processing. Add a direction to xfrm states to make it clear for what a xfrm state is used. * tag 'ipsec-next-2024-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: Restrict SA direction attribute to specific netlink message types xfrm: Add dir validation to "in" data path lookup xfrm: Add dir validation to "out" data path lookup xfrm: Add Direction to the SA in or out udpencap: Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07regmap: Reorder fields in 'struct regmap_config' to save some memoryChristophe JAILLET1-31/+31
On x86_64 and allmodconfig, this shrinks the size of 'struct regmap_config' from 328 to 312 bytes. This is usually a win, because this structure is used as a static global variable. When moving the kerneldoc fields, I've tried to keep the layout as consistent as possible, which is not really easy! Before: /* size: 328, cachelines: 6, members: 55 */ /* sum members: 296, holes: 6, sum holes: 25 */ /* padding: 7 */ /* last cacheline: 8 bytes */ After: /* size: 312, cachelines: 5, members: 55 */ /* sum members: 296, holes: 5, sum holes: 16 */ /* last cacheline: 56 bytes */ For the records, this is also widely used: $git grep static.*regmap_config | wc -l 1327 Signed-off-by: Christophe JAILLET <[email protected]> Link: https://lore.kernel.org/r/5e039cd8fe415dd7ab3169948c08a5311db9fb9a.1715024007.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <[email protected]>
2024-05-07gtp: add IPv6 supportPablo Neira Ayuso2-0/+5
Add new iflink attributes to configure in-kernel UDP listener socket address: IFLA_GTP_LOCAL and IFLA_GTP_LOCAL6. If none of these attributes are specified, default is still to IPv4 INADDR_ANY for backward compatibility. Add new attributes to set up family and IPv6 address of GTP tunnels: GTPA_FAMILY, GTPA_PEER_ADDR6 and GTPA_MS_ADDR6. If no GTPA_FAMILY is specified, AF_INET is assumed for backward compatibility. setsockopt IPV6_ADDRFORM allows to downgrade socket from IPv6 to IPv4 after socket is bound. Assumption is that socket listener that is attached to the gtp device needs to be either IPv4 or IPv6. Therefore, GTP socket listener does not allow for IPv4-mapped-IPv6 listener. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-07gtp: properly parse extension headersPablo Neira Ayuso1-0/+5
Currently GTP packets are dropped if the next extension field is set to non-zero value, but this are valid GTP packets. TS 29.281 provides a longer header format, which is defined as struct gtp1_header_long. Such long header format is used if any of the S, PN, E flags is set. This long header is 4 bytes longer than struct gtp1_header, plus variable length (optional) extension headers. The next extension header field is zero is no extension header is provided. The extension header is composed of a length field which includes total number of 4 byte words including the extension header itself (1 byte), payload (variable length) and next type (1 byte). The extension header size and its payload is aligned to 4 bytes. A GTP packet might come with a chain extensions headers, which makes it slightly cumbersome to parse because the extension next header field comes at the end of the extension header, and there is a need to check if this field becomes zero to stop the extension header parser. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-06Reapply "drm/qxl: simplify qxl_fence_wait"Linus Torvalds1-7/+0
This reverts commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea. Stephen Rostedt reports: "I went to run my tests on my VMs and the tests hung on boot up. Unfortunately, the most I ever got out was: [ 93.607888] Testing event system initcall: OK [ 93.667730] Running tests on all trace events: [ 93.669757] Testing all events: OK [ 95.631064] ------------[ cut here ]------------ Timed out after 60 seconds" and further debugging points to a possible circular locking dependency between the console_owner locking and the worker pool locking. Reverting the commit allows Steve's VM to boot to completion again. [ This may obviously result in the "[TTM] Buffer eviction failed" messages again, which was the reason for that original revert. But at this point this seems preferable to a non-booting system... ] Reported-and-bisected-by: Steven Rostedt <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Acked-by: Maxime Ripard <[email protected]> Cc: Alex Constantino <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Timo Lindfors <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Gerd Hoffmann <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Daniel Vetter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2024-05-06kunit: Print last test location on faultMickaël Salaün1-3/+21
This helps identify the location of test faults with opportunistic calls to _KUNIT_SAVE_LOC(). This can be useful while writing tests or debugging them. It is possible to call KUNIT_SUCCESS() to explicit save last location. Cc: Brendan Higgins <[email protected]> Cc: David Gow <[email protected]> Cc: Rae Moar <[email protected]> Cc: Shuah Khan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Mickaël Salaün <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Shuah Khan <[email protected]>
2024-05-06kunit: Handle test faultsMickaël Salaün1-3/+0
Previously, when a kernel test thread crashed (e.g. NULL pointer dereference, general protection fault), the KUnit test hanged for 30 seconds and exited with a timeout error. Fix this issue by waiting on task_struct->vfork_done instead of the custom kunit_try_catch.try_completion, and track the execution state by initially setting try_result with -EINTR and only setting it to 0 if the test passed. Fix kunit_generic_run_threadfn_adapter() signature by returning 0 instead of calling kthread_complete_and_exit(). Because thread's exit code is never checked, always set it to 0 to make it clear. To make this explicit, export kthread_exit() for KUnit tests built as module. Fix the -EINTR error message, which couldn't be reached until now. This is tested with a following patch. Cc: Brendan Higgins <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Shuah Khan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: David Gow <[email protected]> Tested-by: Rae Moar <[email protected]> Signed-off-by: Mickaël Salaün <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Shuah Khan <[email protected]>
2024-05-06Merge tag 'slab-for-6.9-rc7-fixes' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: - Fix for cleanup infrastructure (Dan Carpenter) This makes the __free(kfree) cleanup hooks not crash on error pointers. - SLUB fix for freepointer checking (Nicolas Bouchinet) This fixes a recently introduced bug that manifests when init_on_free, CONFIG_SLAB_FREELIST_HARDENED and consistency checks (slub_debug=F) are all enabled, and results in false-positive freepointer corrupt reports for caches that store freepointer outside of the object area. * tag 'slab-for-6.9-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slab: make __free(kfree) accept error pointers mm/slub: avoid zeroing outside-object freepointer for single free
2024-05-06NFS/knfsd: Remove the invalid NFS error 'NFSERR_OPNOTSUPP'Trond Myklebust2-3/+0
NFSERR_OPNOTSUPP is not described by any RFC, and should not be used. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-05-06printk: Change type of CONFIG_BASE_SMALL to boolYoann Congal3-4/+4
CONFIG_BASE_SMALL is currently a type int but is only used as a boolean. So, change its type to bool and adapt all usages: CONFIG_BASE_SMALL == 0 becomes !IS_ENABLED(CONFIG_BASE_SMALL) and CONFIG_BASE_SMALL != 0 becomes IS_ENABLED(CONFIG_BASE_SMALL). Reviewed-by: Petr Mladek <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Masahiro Yamada <[email protected]> Signed-off-by: Yoann Congal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]>
2024-05-06NFSD: add listener-{set,get} netlink commandLorenzo Bianconi1-0/+17
Introduce write_ports netlink command. For listener-set, userspace is expected to provide a NFS listeners list it wants enabled. All other sockets will be closed. Reviewed-by: Jeff Layton <[email protected]> Co-developed-by: Jeff Layton <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-05-06SUNRPC: add a new svc_find_listener helperJeff Layton1-0/+2
svc_find_listener will return the transport instance pointer for the endpoint accepting connections/peer traffic from the specified transport class and matching sockaddr. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-05-06SUNRPC: introduce svc_xprt_create_from_sa utility routineLorenzo Bianconi1-0/+3
Add svc_xprt_create_from_sa utility routine and refactor svc_xprt_create() codebase in order to introduce the capability to create a svc port from socket address. Reviewed-by: Jeff Layton <[email protected]> Tested-by: Jeff Layton <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-05-06NFSD: add write_version to netlink commandLorenzo Bianconi1-0/+18
Introduce write_version netlink command through a "declarative" interface. This patch introduces a change in behavior since for version-set userspace is expected to provide a NFS major/minor version list it wants to enable while all the other ones will be disabled. (procfs write_version command implements imperative interface where the admin writes +3/-3 to enable/disable a single version. Reviewed-by: Jeff Layton <[email protected]> Tested-by: Jeff Layton <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-05-06NFSD: convert write_threads to netlink commandLorenzo Bianconi1-0/+12
Introduce write_threads netlink command similar to the one available through the procfs. Tested-by: Jeff Layton <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Co-developed-by: Jeff Layton <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-05-06nfsd: trivial GET_DIR_DELEGATION supportJeff Layton1-0/+6
This adds basic infrastructure for handing GET_DIR_DELEGATION calls from clients, including the decoders and encoders. For now, it always just returns NFS4_OK + GDD4_UNAVAIL. Eventually clients may start sending this operation, and it's better if we can return GDD4_UNAVAIL instead of having to abort the whole compound. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-05-06powerpc/dexcr: Add DEXCR prctl interfaceBenjamin Gray1-0/+16
Now that we track a DEXCR on a per-task basis, individual tasks are free to configure it as they like. The interface is a pair of getter/setter prctl's that work on a single aspect at a time (multiple aspects at once is more difficult if there are different rules applied for each aspect, now or in future). The getter shows the current state of the process config, and the setter allows setting/clearing the aspect. Signed-off-by: Benjamin Gray <[email protected]> [mpe: Account for PR_RISCV_SET_ICACHE_FLUSH_CTX, shrink some longs lines] Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
2024-05-06Merge back thermal cotntrol material for v6.10.Rafael J. Wysocki2-107/+28
2024-05-06alpha: drop pre-EV56 supportArnd Bergmann2-16/+4
All EV4 machines are already gone, and the remaining EV5 based machines all support the slightly more modern EV56 generation as well. Debian only supports EV56 and later. Drop both of these and build kernels optimized for EV56 and higher when the "generic" options is selected, tuning for an out-of-order EV6 pipeline, same as Debian userspace. Since this was the only supported architecture without 8-bit and 16-bit stores, common kernel code no longer has to worry about aligning struct members, and existing workarounds from the block and tty layers can be removed. The alpha memory management code no longer needs an abstraction for the differences between EV4 and EV5+. Link: https://lists.debian.org/debian-alpha/2023/05/msg00009.html Acked-by: Paul E. McKenney <[email protected]> Acked-by: Matt Turner <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2024-05-06net: create tcp_gro_header_pull helper functionFelix Fietkau1-1/+3
Pull the code out of tcp_gro_receive in order to access the tcp header from tcp4/6_gro_receive. Acked-by: Paolo Abeni <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Felix Fietkau <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-05-06net: create tcp_gro_lookup helper functionFelix Fietkau1-0/+1
This pulls the flow port matching out of tcp_gro_receive, so that it can be reused for the next change, which adds the TCP fraglist GRO heuristic. Acked-by: Paolo Abeni <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Felix Fietkau <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-05-06net: move skb_gro_receive_list from udp to coreFelix Fietkau1-0/+1
This helper function will be used for TCP fraglist GRO support Acked-by: Paolo Abeni <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Felix Fietkau <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-05-06netfilter: conntrack: fix ct-state for ICMPv6 Multicast Router DiscoveryLinus Lüssing1-0/+1
So far Multicast Router Advertisements and Multicast Router Solicitations from the Multicast Router Discovery protocol (RFC4286) would be marked as INVALID for IPv6, even if they are in fact intact and adhering to RFC4286. This broke MRA reception and by that multicast reception on IPv6 multicast routers in a Proxmox managed setup, where Proxmox would install a rule like "-m conntrack --ctstate INVALID -j DROP" at the top of the FORWARD chain with br-nf-call-ip6tables enabled by default. Similar to as it's done for MLDv1, MLDv2 and IPv6 Neighbor Discovery already, fix this issue by excluding MRD from connection tracking handling as MRD always uses predefined multicast destinations for its messages, too. This changes the ct-state for ICMPv6 MRD messages from INVALID to UNTRACKED. This issue was found and fixed with the help of the mrdisc tool (https://github.com/troglobit/mrdisc). Signed-off-by: Linus Lüssing <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-06firewire: core: remove flag and width from u64 formats of tracepoints eventsTakashi Sakamoto1-3/+3
The pointer to fw_packet structure is passed to ring buffer of tracepoints framework as the value of u64 type. '0x%016llx' is used for the print format of value, while the flag and width are useless in the case. This commit removes them. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-05-06firewire: core: fix type of timestamp for async_inbound_template tracepoints ↵Takashi Sakamoto1-1/+1
events The type of time stamp should be u16, instead of u8. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-05-06firewire: core: add tracepoint event for handling bus resetTakashi Sakamoto1-1/+27
The core function expects hardware drivers to call fw_core_handle_bus_reset() when changing bus topology. The 1394 OHCI driver calls it when handling selfID event as a result of any bus-reset. This commit adds a tracepoints event for it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-05-06firewire: core: add tracepoints events for initiating bus resetTakashi Sakamoto1-0/+33
At a commit 673249124304 ("firewire: core: option to log bus reset initiation"), some kernel log messages were added to trace initiation of bus reset. The kernel log messages are really helpful, while nowadays it is not preferable just for debugging purpose. For the purpose, Linux kernel tracepoints is more preferable. This commit adds some alternative tracepoints events. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-05-06firewire: core: add tracepoints event for asynchronous inbound phy packetTakashi Sakamoto1-0/+30
At the former commit, a pair of tracepoints events is added to trace asynchronous outbound phy packet. This commit adds a tracepoints event to trace inbound phy packet. It includes transaction status as well as the content of phy packet. This is an example for Remote Reply Packet as a response to Remote Access Packet sent by lsfirewirephy command in linux-firewire-utils: async_phy_inbound: \ packet=0xffff955fc02b4e10 generation=1 status=1 timestamp=0x0619 \ first_quadlet=0x001c8208 second_quadlet=0xffe37df7 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-05-06firewire: core/cdev: add tracepoints events for asynchronous phy packetTakashi Sakamoto1-0/+48
In IEEE 1394 bus, the type of asynchronous packet without any offset to node address space is called as phy packet. The destination of packet is IEEE 1394 phy itself. This type of packet is used for several purposes, mainly for selfID at the state of bus reset, to force selection of root node, and to adjust gap count. This commit adds tracepoints events for the type of asynchronous outbound packet. Like asynchronous outbound transaction packets, a pair of events are added to trace initiation and completion of transmission. In the case that the phy packet is sent by kernel API, the match between the initiation and completion is not so easy, since the data of 'struct fw_packet' is allocated statically. In the case that it is sent by userspace applications via cdev, the match is easy, since the data is allocated per each. This example is for Remote Access Packet by lsfirewirephy command in linux-firewire-utils: async_phy_outbound_initiate: \ packet=0xffff89fb34e42e78 generation=1 first_quadlet=0x00148200 \ second_quadlet=0xffeb7dff async_phy_outbound_complete: \ packet=0xffff89fb34e42e78 generation=1 status=1 timestamp=0x0619 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-05-06firewire: core: add tracepoints events for asynchronous outbound responseTakashi Sakamoto1-0/+24
In a view of core transaction service, the asynchronous outbound response consists of two stages; initiation and completion. This commit adds a pair of events for the asynchronous outbound response. The following example is for asynchronous write quadlet request as IEC 61883-1 FCP response to node 0xffc1. async_response_outbound_initiate: \ transaction=0xffff89fa08cf16c0 generation=4 scode=2 dst_id=0xffc1 \ tlabel=25 tcode=2 src_id=0xffc0 rcode=0 \ header={0xffc16420,0xffc00000,0x0,0x0} data={} async_response_outbound_complete: \ transaction=0xffff89fa08cf16c0 generation=4 scode=2 status=1 \ timestamp=0x0000 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>