blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2024-07-18	kselftest/alsa: Use card name rather than number in test names	Mark Brown	2	-58/+104
	Currently for the PCM and mixer tests we report test names which identify the card being tested with the card number. This ensures we have unique names but since card numbers are dynamically assigned at runtime the names we end up with will often not be stable on systems with multiple cards especially where those cards are provided by separate modules loeaded at runtime. This makes it difficult for automated systems and UIs to relate test results between runs on affected platforms. Address this by replacing our use of card numbers with card names which are more likely to be stable across runs. We use the card ID since it is guaranteed to be unique by default, unlike the long name. There is still some vulnerability to ordering issues if multiple cards with the same base ID are present in the system but have separate dependencies but not all drivers put distinguishing information in their long names. Signed-off-by: Mark Brown <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2024-07-18	ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360	Seunghun Han	1	-0/+1
	Samsung Galaxy Book Pro 360 (13" 2022 NT935QDB-KC71S) with codec SSID 144d:c1a4 requires the same workaround to enable the speaker amp as other Samsung models with the ALC298 codec. Signed-off-by: Seunghun Han <[email protected]> Cc: <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2024-07-18	Kconfig: reduce the amount of power sequencing noise	Bartosz Golaszewski	4	-8/+6
	Kconfig will ask the user twice about power sequencing: once for the QCom WCN power sequencing driver and then again for the PCI power control driver using it. Let's automate the selection of PCI_PWRCTL by introducing a new hidden symbol: HAVE_PWRCTL which should be selected by all platforms that have the need to include PCI power control code (right now: only ARCH_QCOM). The pwrseq-based PCI pwrctl driver itself will then be selected by the drivers binding to devices that may require external handling of the power-up sequence (currently: ath11k and ath12k) based on the value of HAVE_PWRCTL. Make all PCI pwrctl Kconfig symbols hidden so that no questions are asked during configuration. Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code") Reported-by: Linus Torvalds <[email protected]> Closes: https://lore.kernel.org/lkml/CAHk-=wjWc5dzcj2O1tEgNHY1rnQW63JwtuZi_vAZPqy6wqpoUQ@mail.gmail.com/ Acked-by: Jeff Johnson <[email protected]> # drivers/net/wireless/ath Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bartosz Golaszewski <[email protected]>
2024-07-18	thermal: core: Allow thermal zones to tell the core to ignore them	Rafael J. Wysocki	4	-29/+37
	The iwlwifi wireless driver registers a thermal zone that is only needed when the network interface handled by it is up and it wants that thermal zone to be effectively ignored by the core otherwise. Before commit a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid") that could be achieved by returning an error code from the thermal zone's .get_temp() callback because the core did not really handle errors returned by it almost at all. However, commit a8a261774466 made the core attempt to recover from the situation in which the temperature of a thermal zone cannot be determined due to errors returned by its .get_temp() and is always invalid from the core's perspective. That was done because there are thermal zones in which .get_temp() returns errors to start with due to some difficulties related to the initialization ordering, but then it will start to produce valid temperature values at one point. Unfortunately, the simple approach taken by commit a8a261774466, which is to poll the thermal zone periodically until its .get_temp() callback starts to return valid temperature values, is at odds with the special thermal zone in iwlwifi in which .get_temp() may always return an error because its network interface may always be down. If that happens, every attempt to invoke the thermal zone's .get_temp() callback resulting in an error causes the thermal core to print a dev_warn() message to the kernel log which is super-noisy. To address this problem, make the core handle the case in which .get_temp() returns 0, but the temperature value returned by it is not actually valid, in a special way. Namely, make the core completely ignore the invalid temperature value coming from .get_temp() in that case, which requires folding in update_temperature() into its caller and a few related changes. On the iwlwifi side, modify iwl_mvm_tzone_get_temp() to return 0 and put THERMAL_TEMP_INVALID into the temperature return memory location instead of returning an error when the firmware is not running or it is not of the right type. Also, to clearly separate the handling of invalid temperature values from the thermal zone initialization, introduce a special THERMAL_TEMP_INIT value specifically for the latter purpose. Fixes: a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid") Closes: https://lore.kernel.org/linux-pm/[email protected]/ Reported-by: Eric Biggers <[email protected]> Reported-by: Stefan Lippers-Hollmann <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=201761 Tested-by: Oleksandr Natalenko <[email protected]> Tested-by: Stefan Lippers-Hollmann <[email protected]> Cc: 6.10+ <[email protected]> # 6.10+ Signed-off-by: Rafael J. Wysocki <[email protected]> Link: https://patch.msgid.link/[email protected] [ rjw: Rebased on top of the current mainline ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-07-18	Merge tag 'nf-24-07-17' of ↵	Paolo Abeni	6	-13/+111
	git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Call nf_expect_get_id() to delete expectation by ID. By trial and error it is possible to leak the LSB of the expectation address on x86_64. This bug is a leftover when converting the existing code to use nf_expect_get_id(). 2) Incorrect initialization in pipapo set backend leads to packet mismatches. From Florian Westphal. 3) Extend netfilter's selftests to cover for the pipapo set backend, also from Florian. 4) Fix sparse warning in IPVS when adding service, from Chen Hanxiao. netfilter pull request 24-07-17 * tag 'nf-24-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: ipvs: properly dereference pe in ip_vs_add_service selftests: netfilter: add test case for recent mismatch bug netfilter: nf_set_pipapo: fix initial map fill netfilter: ctnetlink: use helper function to calculate expect ID ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	Merge branch 'net-dsa-fix-chip-wide-frame-size-config-in-some-drivers'	Paolo Abeni	2	-1/+5
	Martin Willi says: ==================== net: dsa: Fix chip-wide frame size config in some drivers Some DSA chips support a chip-wide frame size configurations, only. Some drivers adjust that chip-wide setting for user port changes, overriding the frame size requirements on the CPU port that includes tagger overhead. Fix the mv88e6xxx and b53 drivers and align them to the behavior of other drivers. v2: - Skip chip-wide config for non-CPU ports instead of finding the maximim MTU over all ports - Add a patch fixing the b53 driver as well v1: https://lore.kernel.org/netdev/[email protected]/ ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	net: dsa: b53: Limit chip-wide jumbo frame config to CPU ports	Martin Willi	1	-0/+3
	Broadcom switches supported by the b53 driver use a chip-wide jumbo frame configuration. In the commit referenced with the Fixes tag, the setting is applied just for the last port changing its MTU. While configuring CPU ports accounts for tagger overhead, user ports do not. When setting the MTU for a user port, the chip-wide setting is reduced to not include the tagger overhead, resulting in an potentially insufficient chip-wide maximum frame size for the CPU port. As, by design, the CPU port MTU is adjusted for any user port change, apply the chip-wide setting only for CPU ports. This aligns the driver to the behavior of other switch drivers. Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support") Suggested-by: Vladimir Oltean <[email protected]> Signed-off-by: Martin Willi <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	net: dsa: mv88e6xxx: Limit chip-wide frame size config to CPU ports	Martin Willi	1	-1/+2
	Marvell chips not supporting per-port jumbo frame size configurations use a chip-wide frame size configuration. In the commit referenced with the Fixes tag, the setting is applied just for the last port changing its MTU. While configuring CPU ports accounts for tagger overhead, user ports do not. When setting the MTU for a user port, the chip-wide setting is reduced to not include the tagger overhead, resulting in an potentially insufficient maximum frame size for the CPU port. Specifically, sending full-size frames from the CPU port on a MV88E6097 having a user port MTU of 1500 bytes results in dropped frames. As, by design, the CPU port MTU is adjusted for any user port change, apply the chip-wide setting only for CPU ports. Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU") Suggested-by: Vladimir Oltean <[email protected]> Signed-off-by: Martin Willi <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	net: airoha: Fix NULL pointer dereference in airoha_qdma_cleanup_rx_queue()	Lorenzo Bianconi	1	-2/+1
	Move page_pool_get_dma_dir() inside the while loop of airoha_qdma_cleanup_rx_queue routine in order to avoid possible NULL pointer dereference if airoha_qdma_init_rx_queue() fails before properly allocating the page_pool pointer. Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/7330a41bba720c33abc039955f6172457a3a34f0.1721205981.git.lorenzo@kernel.org Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	net: wwan: t7xx: add support for Dell DW5933e	Jack Wu	1	-0/+1
	add support for Dell DW5933e (0x14c0, 0x4d75) Signed-off-by: Jack Wu <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	Merge branch 'ipv4-fix-incorrect-tos-in-route-get-reply'	Paolo Abeni	4	-20/+22
	Ido Schimmel says: ==================== ipv4: Fix incorrect TOS in route get reply Two small fixes for incorrect TOS in route get reply. See more details in the commit messages. No regressions in FIB tests: # ./fib_tests.sh [...] Tests passed: 218 Tests failed: 0 ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	ipv4: Fix incorrect TOS in fibmatch route get reply	Ido Schimmel	2	-13/+13
	The TOS value that is returned to user space in the route get reply is the one with which the lookup was performed ('fl4->flowi4_tos'). This is fine when the matched route is configured with a TOS as it would not match if its TOS value did not match the one with which the lookup was performed. However, matching on TOS is only performed when the route's TOS is not zero. It is therefore possible to have the kernel incorrectly return a non-zero TOS: # ip link add name dummy1 up type dummy # ip address add 192.0.2.1/24 dev dummy1 # ip route get fibmatch 192.0.2.2 tos 0xfc 192.0.2.0/24 tos 0x1c dev dummy1 proto kernel scope link src 192.0.2.1 Fix by instead returning the DSCP field from the FIB result structure which was populated during the route lookup. Output after the patch: # ip link add name dummy1 up type dummy # ip address add 192.0.2.1/24 dev dummy1 # ip route get fibmatch 192.0.2.2 tos 0xfc 192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1 Extend the existing selftests to not only verify that the correct route is returned, but that it is also returned with correct "tos" value (or without it). Fixes: b61798130f1b ("net: ipv4: RTM_GETROUTE: return matched fib result when requested") Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Guillaume Nault <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	ipv4: Fix incorrect TOS in route get reply	Ido Schimmel	3	-7/+9
	The TOS value that is returned to user space in the route get reply is the one with which the lookup was performed ('fl4->flowi4_tos'). This is fine when the matched route is configured with a TOS as it would not match if its TOS value did not match the one with which the lookup was performed. However, matching on TOS is only performed when the route's TOS is not zero. It is therefore possible to have the kernel incorrectly return a non-zero TOS: # ip link add name dummy1 up type dummy # ip address add 192.0.2.1/24 dev dummy1 # ip route get 192.0.2.2 tos 0xfc 192.0.2.2 tos 0x1c dev dummy1 src 192.0.2.1 uid 0 cache Fix by adding a DSCP field to the FIB result structure (inside an existing 4 bytes hole), populating it in the route lookup and using it when filling the route get reply. Output after the patch: # ip link add name dummy1 up type dummy # ip address add 192.0.2.1/24 dev dummy1 # ip route get 192.0.2.2 tos 0xfc 192.0.2.2 dev dummy1 src 192.0.2.1 uid 0 cache Fixes: 1a00fee4ffb2 ("ipv4: Remove rt_key_{src,dst,tos} from struct rtable.") Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Guillaume Nault <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	net: flow_dissector: use DEBUG_NET_WARN_ON_ONCE	Pablo Neira Ayuso	1	-1/+1
	The following splat is easy to reproduce upstream as well as in -stable kernels. Florian Westphal provided the following commit: d1dab4f71d37 ("net: add and use __skb_get_hash_symmetric_net") but this complementary fix has been also suggested by Willem de Bruijn and it can be easily backported to -stable kernel which consists in using DEBUG_NET_WARN_ON_ONCE instead to silence the following splat given __skb_get_hash() is used by the nftables tracing infrastructure to to identify packets in traces. [69133.561393] ------------[ cut here ]------------ [69133.561404] WARNING: CPU: 0 PID: 43576 at net/core/flow_dissector.c:1104 __skb_flow_dissect+0x134f/ [...] [69133.561944] CPU: 0 PID: 43576 Comm: socat Not tainted 6.10.0-rc7+ #379 [69133.561959] RIP: 0010:__skb_flow_dissect+0x134f/0x2ad0 [69133.561970] Code: 83 f9 04 0f 84 b3 00 00 00 45 85 c9 0f 84 aa 00 00 00 41 83 f9 02 0f 84 81 fc ff ff 44 0f b7 b4 24 80 00 00 00 e9 8b f9 ff ff <0f> 0b e9 20 f3 ff ff 41 f6 c6 20 0f 84 e4 ef ff ff 48 8d 7b 12 e8 [69133.561979] RSP: 0018:ffffc90000006fc0 EFLAGS: 00010246 [69133.561988] RAX: 0000000000000000 RBX: ffffffff82f33e20 RCX: ffffffff81ab7e19 [69133.561994] RDX: dffffc0000000000 RSI: ffffc90000007388 RDI: ffff888103a1b418 [69133.562001] RBP: ffffc90000007310 R08: 0000000000000000 R09: 0000000000000000 [69133.562007] R10: ffffc90000007388 R11: ffffffff810cface R12: ffff888103a1b400 [69133.562013] R13: 0000000000000000 R14: ffffffff82f33e2a R15: ffffffff82f33e28 [69133.562020] FS: 00007f40f7131740(0000) GS:ffff888390800000(0000) knlGS:0000000000000000 [69133.562027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [69133.562033] CR2: 00007f40f7346ee0 CR3: 000000015d200001 CR4: 00000000001706f0 [69133.562040] Call Trace: [69133.562044] <IRQ> [69133.562049] ? __warn+0x9f/0x1a0 [ 1211.841384] ? __skb_flow_dissect+0x107e/0x2860 [...] [ 1211.841496] ? bpf_flow_dissect+0x160/0x160 [ 1211.841753] __skb_get_hash+0x97/0x280 [ 1211.841765] ? __skb_get_hash_symmetric+0x230/0x230 [ 1211.841776] ? mod_find+0xbf/0xe0 [ 1211.841786] ? get_stack_info_noinstr+0x12/0xe0 [ 1211.841798] ? bpf_ksym_find+0x56/0xe0 [ 1211.841807] ? __rcu_read_unlock+0x2a/0x70 [ 1211.841819] nft_trace_init+0x1b9/0x1c0 [nf_tables] [ 1211.841895] ? nft_trace_notify+0x830/0x830 [nf_tables] [ 1211.841964] ? get_stack_info+0x2b/0x80 [ 1211.841975] ? nft_do_chain_arp+0x80/0x80 [nf_tables] [ 1211.842044] nft_do_chain+0x79c/0x850 [nf_tables] Fixes: 9b52e3f267a6 ("flow_dissector: handle no-skb use case") Suggested-by: Willem de Bruijn <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-18	nsfs: use cleanup guard	Christian Brauner	1	-3/+3
	Ensure that rcu read lock is given up before returning. Link: https://lore.kernel.org/r/20240716-elixier-fliesen-1ab342151a61@brauner Fixes: ca567df74a28 ("nsfs: add pid translation ioctls") Reported-by: [email protected] Signed-off-by: Christian Brauner <[email protected]>
2024-07-18	fs/adfs: add MODULE_DESCRIPTION	Jeff Johnson	1	-0/+1
	Fix the 'make W=1' issue: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/adfs/adfs.o Signed-off-by: Jeff Johnson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2024-07-18	misc: Kconfig: exclude mrvl-cn10k-dpi compilation for 32-bit systems	Vamsi Attunuru	1	-1/+1
	Upon adding CONFIG_ARCH_THUNDER & CONFIG_COMPILE_TEST dependency, compilation errors arise on 32-bit ARM with writeq() & readq() calls which are used for accessing 64-bit values. Since DPI hardware only works with 64-bit register accesses, using CONFIG_64BIT dependency to skip compilation on 32-bit systems. Fixes: a5e43e2d202d ("misc: Kconfig: add a new dependency for MARVELL_CN10K_DPI") Signed-off-by: Vamsi Attunuru <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Tested-by: Jeff Johnson <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-07-18	landlock: Various documentation improvements	Günther Noack	3	-22/+24
	* Fix some typos, incomplete or confusing phrases. * Split paragraphs where appropriate. * List the same error code multiple times, if it has multiple possible causes. * Bring wording closer to the man page wording, which has undergone more thorough review (esp. for LANDLOCK_ACCESS_FS_WRITE_FILE). * Small semantic clarifications * Call the ephemeral port range "ephemeral" * Clarify reasons for EFAULT in landlock_add_rule() * Clarify @rule_type doc for landlock_add_rule() This is a collection of small fixes which I collected when preparing the corresponding man pages [1]. Cc: Alejandro Colomar <[email protected]> Cc: Konstantin Meskhidze <[email protected]> Link: https://lore.kernel.org/r/[email protected] [1] Signed-off-by: Günther Noack <[email protected]> Link: https://lore.kernel.org/r/[email protected] [mic: Add label to link, fix formatting spotted by make htmldocs, synchronize userspace-api documentation's date] Signed-off-by: Mickaël Salaün <[email protected]>
2024-07-17	driver core: auxiliary bus: Fix documentation of auxiliary_device	Shay Drory	1	-3/+4
	Fix the documentation of the below field of struct auxiliary_device include/linux/auxiliary_bus.h:150: warning: Function parameter or struct member 'sysfs' not described in 'auxiliary_device' include/linux/auxiliary_bus.h:150: warning: Excess struct member 'irqs' description in 'auxiliary_device' include/linux/auxiliary_bus.h:150: warning: Excess struct member 'lock' description in 'auxiliary_device' include/linux/auxiliary_bus.h:150: warning: Excess struct member 'irq_dir_exists' description in 'auxiliary_device' Fixes: a808878308a8 ("driver core: auxiliary bus: show auxiliary device IRQs") Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Shay Drory <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-17	net: airoha: fix error branch in airoha_dev_xmit and airoha_set_gdm_ports	Lorenzo Bianconi	1	-4/+6
	Fix error case management in airoha_dev_xmit routine since we need to DMA unmap pending buffers starting from q->head. Moreover fix a typo in error case branch in airoha_set_gdm_ports routine. Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/b628871bc8ae4861b5e2ab4db90aaf373cbb7cee.1721203880.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-17	gve: Fix XDP TX completion handling when counters overflow	Joshua Washington	1	-2/+3
	In gve_clean_xdp_done, the driver processes the TX completions based on a 32-bit NIC counter and a 32-bit completion counter stored in the tx queue. Fix the for loop so that the counter wraparound is handled correctly. Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format") Signed-off-by: Joshua Washington <[email protected]> Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-18	Merge branch 'topic/ppc-kvm' into next	Michael Ellerman	14	-8/+189
	Merge the powerpc KVM topic branch.
2024-07-17	ia64: scrub ia64 from poison.h	Alexey Dobriyan	1	-6/+0
	Link: https://lkml.kernel.org/r/c72e5467-06a8-4739-ae6a-7c84c96cad77@p183 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	watchdog/perf: properly initialize the turbo mode timestamp and rearm counter	Thomas Gleixner	1	-3/+8
	For systems on which the performance counter can expire early due to turbo modes the watchdog handler has a safety net in place which validates that since the last watchdog event there has at least 4/5th of the watchdog period elapsed. This works reliably only after the first watchdog event because the per CPU variable which holds the timestamp of the last event is never initialized. So a first spurious event will validate against a timestamp of 0 which results in a delta which is likely to be way over the 4/5 threshold of the period. As this might happen before the first watchdog hrtimer event increments the watchdog counter, this can lead to false positives. Fix this by initializing the timestamp before enabling the hardware event. Reset the rearm counter as well, as that might be non zero after the watchdog was disabled and reenabled. Link: https://lkml.kernel.org/r/87frsfu15a.ffs@tglx Fixes: 7edaeb6841df ("kernel/watchdog: Prevent false positives with turbo modes") Signed-off-by: Thomas Gleixner <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/mglru: fix ineffective protection calculation	Yu Zhao	1	-44/+38
	mem_cgroup_calculate_protection() is not stateless and should only be used as part of a top-down tree traversal. shrink_one() traverses the per-node memcg LRU instead of the root_mem_cgroup tree, and therefore it should not call mem_cgroup_calculate_protection(). The existing misuse in shrink_one() can cause ineffective protection of sub-trees that are grandchildren of root_mem_cgroup. Fix it by reusing lru_gen_age_node(), which already traverses the root_mem_cgroup tree, to calculate the protection. Previously lru_gen_age_node() opportunistically skips the first pass, i.e., when scan_control->priority is DEF_PRIORITY. On the second pass, lruvec_is_sizable() uses appropriate scan_control->priority, set by set_initial_priority() from lru_gen_shrink_node(), to decide whether a memcg is too small to reclaim from. Now lru_gen_age_node() unconditionally traverses the root_mem_cgroup tree. So it should call set_initial_priority() upfront, to make sure lruvec_is_sizable() uses appropriate scan_control->priority on the first pass. Otherwise, lruvec_is_reclaimable() can return false negatives and result in premature OOM kills when min_ttl_ms is used. Link: https://lkml.kernel.org/r/[email protected] Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <[email protected]> Reported-by: T.J. Mercier <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/zswap: fix a white space issue	Dan Carpenter	1	-1/+1
	We accidentally deleted a tab in commit f84152e9efc5 ("mm/zswap: use only one pool in zswap"). Add it back. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Chengming Zhou <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio	Miaohe Lin	1	-0/+3
	A kernel crash was observed when migrating hugetlb folio: BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 0 P4D 0 Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 3435 Comm: bash Not tainted 6.10.0-rc6-00450-g8578ca01f21f #66 RIP: 0010:__folio_undo_large_rmappable+0x70/0xb0 RSP: 0018:ffffb165c98a7b38 EFLAGS: 00000097 RAX: fffffbbc44528090 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffffa30e000a2800 RSI: 0000000000000246 RDI: ffffa3153ffffcc0 RBP: fffffbbc44528000 R08: 0000000000002371 R09: ffffffffbe4e5868 R10: 0000000000000001 R11: 0000000000000001 R12: ffffa3153ffffcc0 R13: fffffbbc44468000 R14: 0000000000000001 R15: 0000000000000001 FS: 00007f5b3a716740(0000) GS:ffffa3151fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000010959a000 CR4: 00000000000006f0 Call Trace: <TASK> __folio_migrate_mapping+0x59e/0x950 __migrate_folio.constprop.0+0x5f/0x120 move_to_new_folio+0xfd/0x250 migrate_pages+0x383/0xd70 soft_offline_page+0x2ab/0x7f0 soft_offline_page_store+0x52/0x90 kernfs_fop_write_iter+0x12c/0x1d0 vfs_write+0x380/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xb9/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f5b3a514887 RSP: 002b:00007ffe138fce68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f5b3a514887 RDX: 000000000000000c RSI: 0000556ab809ee10 RDI: 0000000000000001 RBP: 0000556ab809ee10 R08: 00007f5b3a5d1460 R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c R13: 00007f5b3a61b780 R14: 00007f5b3a617600 R15: 00007f5b3a616a00 It's because hugetlb folio is passed to __folio_undo_large_rmappable() unexpectedly. large_rmappable flag is imperceptibly set to hugetlb folio since commit f6a8dd98a2ce ("hugetlb: convert alloc_buddy_hugetlb_folio to use a folio"). Then commit be9581ea8c05 ("mm: fix crashes from deferred split racing folio migration") makes folio_migrate_mapping() call folio_undo_large_rmappable() triggering the bug. Fix this issue by clearing large_rmappable flag for hugetlb folios. They don't need that flag set anyway. Link: https://lkml.kernel.org/r/[email protected] Fixes: f6a8dd98a2ce ("hugetlb: convert alloc_buddy_hugetlb_folio to use a folio") Fixes: be9581ea8c05 ("mm: fix crashes from deferred split racing folio migration") Signed-off-by: Miaohe Lin <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Muchun Song <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/hugetlb: fix possible recursive locking detected warning	Miaohe Lin	2	-1/+2
	When tries to demote 1G hugetlb folios, a lockdep warning is observed: ============================================ WARNING: possible recursive locking detected 6.10.0-rc6-00452-ga4d0275fa660-dirty #79 Not tainted -------------------------------------------- bash/710 is trying to acquire lock: ffffffff8f0a7850 (&h->resize_lock){+.+.}-{3:3}, at: demote_store+0x244/0x460 but task is already holding lock: ffffffff8f0a6f48 (&h->resize_lock){+.+.}-{3:3}, at: demote_store+0xae/0x460 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&h->resize_lock); lock(&h->resize_lock); * DEADLOCK * May be due to missing lock nesting notation 4 locks held by bash/710: #0: ffff8f118439c3f0 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x64/0xe0 #1: ffff8f11893b9e88 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf8/0x1d0 #2: ffff8f1183dc4428 (kn->active#98){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x100/0x1d0 #3: ffffffff8f0a6f48 (&h->resize_lock){+.+.}-{3:3}, at: demote_store+0xae/0x460 stack backtrace: CPU: 3 PID: 710 Comm: bash Not tainted 6.10.0-rc6-00452-ga4d0275fa660-dirty #79 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x68/0xa0 __lock_acquire+0x10f2/0x1ca0 lock_acquire+0xbe/0x2d0 __mutex_lock+0x6d/0x400 demote_store+0x244/0x460 kernfs_fop_write_iter+0x12c/0x1d0 vfs_write+0x380/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xb9/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fa61db14887 RSP: 002b:00007ffc56c48358 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fa61db14887 RDX: 0000000000000002 RSI: 000055a030050220 RDI: 0000000000000001 RBP: 000055a030050220 R08: 00007fa61dbd1460 R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 R13: 00007fa61dc1b780 R14: 00007fa61dc17600 R15: 00007fa61dc16a00 </TASK> Lockdep considers this an AA deadlock because the different resize_lock mutexes reside in the same lockdep class, but this is a false positive. Place them in distinct classes to avoid these warnings. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8531fc6f52f5 ("hugetlb: add hugetlb demote page support") Signed-off-by: Miaohe Lin <[email protected]> Acked-by: Muchun Song <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/gup: clear the LRU flag of a page before adding to LRU batch	yangge	1	-12/+31
	If a large number of CMA memory are configured in system (for example, the CMA memory accounts for 50% of the system memory), starting a virtual virtual machine with device passthrough, it will call pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally if a page is present and in CMA area, pin_user_pages_remote() will migrate the page from CMA area to non-CMA area because of FOLL_LONGTERM flag. But the current code will cause the migration failure due to unexpected page refcounts, and eventually cause the virtual machine fail to start. If a page is added in LRU batch, its refcount increases one, remove the page from LRU batch decreases one. Page migration requires the page is not referenced by others except page mapping. Before migrating a page, we should try to drain the page from LRU batch in case the page is in it, however, folio_test_lru() is not sufficient to tell whether the page is in LRU batch or not, if the page is in LRU batch, the migration will fail. To solve the problem above, we modify the logic of adding to LRU batch. Before adding a page to LRU batch, we clear the LRU flag of the page so that we can check whether the page is in LRU batch by folio_test_lru(page). It's quite valuable, because likely we don't want to blindly drain the LRU batch simply because there is some unexpected reference on a page, as described above. This change makes the LRU flag of a page invisible for longer, which may impact some programs. For example, as long as a page is on a LRU batch, we cannot isolate it, and we cannot check if it's an LRU page. Further, a page can now only be on exactly one LRU batch. This doesn't seem to matter much, because a new page is allocated from buddy and added to the lru batch, or be isolated, it's LRU flag may also be invisible for a long time. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region") Signed-off-by: yangge <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Baolin Wang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Barry Song <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/numa_balancing: teach mpol_to_str about the balancing mode	Tvrtko Ursulin	1	-4/+14
	Since balancing mode was added in bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes"), it was possible to set this mode but it wouldn't be shown in /proc/<pid>/numa_maps since there was no support for it in the mpol_to_str() helper. Furthermore, because the balancing mode sets the MPOL_F_MORON flag, it would be displayed as 'default' due a workaround introduced a few years earlier in 8790c71a18e5 ("mm/mempolicy.c: fix mempolicy printing in numa_maps"). To tidy this up we implement two changes: Replace the MPOL_F_MORON check by pointer comparison against the preferred_node_policy array. By doing this we generalise the current special casing and replace the incorrect 'default' with the correct 'bind' for the mode. Secondly, we add a string representation and corresponding handling for the MPOL_F_NUMA_BALANCING flag. With the two changes together we start showing the balancing flag when it is set and therefore complete the fix. Representation format chosen is to separate multiple flags with vertical bars, following what existed long time ago in kernel 2.6.25. But as between then and now there wasn't a way to display multiple flags, this patch does not change the format in practice. Some /proc/<pid>/numa_maps output examples: 555559580000 bind=balancing:0-1,3 file=... 555585800000 bind=balancing\|static:0,2 file=... 555635240000 prefer=relative:0 file= Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes") References: 8790c71a18e5 ("mm/mempolicy.c: fix mempolicy printing in numa_maps") Reviewed-by: "Huang, Ying" <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michal Hocko <[email protected]> Cc: David Rientjes <[email protected]> Cc: <[email protected]> [5.12+] Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm: memcg1: convert charge move flags to unsigned long long	Roman Gushchin	1	-2/+2
	Currently MOVE_ANON and MOVE_FILE flags are defined as integers and it leads to the following Smatch static checker warning: mm/memcontrol-v1.c:609 mem_cgroup_move_charge_write() warn: was expecting a 64 bit value instead of '~(1 \| 2)' Fix this be redefining them as unsigned long long. Even though the issue allows to set high 32 bits of mc.flags to an arbitrary number, these bits are never used, so it doesn't have any significant consequences. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Roman Gushchin <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting	Suren Baghdasaryan	1	-2/+3
	pgalloc_tag_sub() might call page_ext_put() using a page different from the one used in page_ext_get() call. This does not pose an issue since page_ext_put() ignores this parameter as long as it's non-NULL but technically this is wrong. Fix it by storing the original page used in page_ext_get() and passing it to page_ext_put(). Link: https://lkml.kernel.org/r/[email protected] Fixes: be25d1d4e822 ("mm: create new codetag references during page splitting") Signed-off-by: Suren Baghdasaryan <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: Sourav Panda <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	lib: reuse page_ext_data() to obtain codetag_ref	Suren Baghdasaryan	1	-1/+1
	codetag_ref_from_page_ext() reimplements the same calculation as page_ext_data(). Reuse existing function instead. Link: https://lkml.kernel.org/r/[email protected] Fixes: dcfe378c81f7 ("lib: introduce support for page allocation tagging") Signed-off-by: Suren Baghdasaryan <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: Sourav Panda <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	lib: add missing newline character in the warning message	Suren Baghdasaryan	1	-1/+1
	Link: https://lkml.kernel.org/r/[email protected] Fixes: 22d407b164ff ("lib: add allocation tagging support for memory allocation profiling") Signed-off-by: Suren Baghdasaryan <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: Sourav Panda <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/mglru: fix overshooting shrinker memory	Yu Zhao	1	-2/+8
	set_initial_priority() tries to jump-start global reclaim by estimating the priority based on cold/hot LRU pages. The estimation does not account for shrinker objects, and it cannot do so because their sizes can be in different units other than page. If shrinker objects are the majority, e.g., on TrueNAS SCALE 24.04.0 where ZFS ARC can use almost all system memory, set_initial_priority() can vastly underestimate how much memory ARC shrinker can evict and assign extreme low values to scan_control->priority, resulting in overshoots of shrinker objects. To reproduce the problem, using TrueNAS SCALE 24.04.0 with 32GB DRAM, a test ZFS pool and the following commands: fio --name=mglru.file --numjobs=36 --ioengine=io_uring \ --directory=/root/test-zfs-pool/ --size=1024m --buffered=1 \ --rw=randread --random_distribution=random \ --time_based --runtime=1h & for ((i = 0; i < 20; i++)) do sleep 120 fio --name=mglru.anon --numjobs=16 --ioengine=mmap \ --filename=/dev/zero --size=1024m --fadvise_hint=0 \ --rw=randrw --random_distribution=random \ --time_based --runtime=1m done To fix the problem: 1. Cap scan_control->priority at or above DEF_PRIORITY/2, to prevent the jump-start from being overly aggressive. 2. Account for the progress from mm_account_reclaimed_pages(), to prevent kswapd_shrink_node() from raising the priority unnecessarily. Link: https://lkml.kernel.org/r/[email protected] Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <[email protected]> Reported-by: Alexander Motin <[email protected]> Cc: Wei Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/mglru: fix div-by-zero in vmpressure_calc_level()	Yu Zhao	1	-1/+0
	evict_folios() uses a second pass to reclaim folios that have gone through page writeback and become clean before it finishes the first pass, since folio_rotate_reclaimable() cannot handle those folios due to the isolation. The second pass tries to avoid potential double counting by deducting scan_control->nr_scanned. However, this can result in underflow of nr_scanned, under a condition where shrink_folio_list() does not increment nr_scanned, i.e., when folio_trylock() fails. The underflow can cause the divisor, i.e., scale=scanned+reclaimed in vmpressure_calc_level(), to become zero, resulting in the following crash: [exception RIP: vmpressure_work_fn+101] process_one_work at ffffffffa3313f2b Since scan_control->nr_scanned has no established semantics, the potential double counting has minimal risks. Therefore, fix the problem by not deducting scan_control->nr_scanned in evict_folios(). Link: https://lkml.kernel.org/r/[email protected] Fixes: 359a5e1416ca ("mm: multi-gen LRU: retry folios written back while isolated") Reported-by: Wei Xu <[email protected]> Signed-off-by: Yu Zhao <[email protected]> Cc: Alexander Motin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm/kmemleak: replace strncpy() with strscpy()	Kees Cook	1	-3/+3
	Replace the depreciated[1] strncpy() calls with strscpy(). Uses of object->comm do not depend on the padding side-effect. Link: https://github.com/KSPP/linux/issues/90 [1] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC	Vlastimil Babka	4	-11/+7
	This mostly reverts commit af3b854492f3 ("mm/page_alloc.c: allow error injection"). The commit made should_fail_alloc_page() a noinline function that's always called from the page allocation hotpath, even if it's empty because CONFIG_FAIL_PAGE_ALLOC is not enabled, and there is no option to disable it and prevent the associated function call overhead. As with the preceding patch "mm, slab: put should_failslab back behind CONFIG_SHOULD_FAILSLAB" and for the same reasons, put the should_fail_alloc_page() back behind the config option. When enabled, the ALLOW_ERROR_INJECTION and BTF_ID records are preserved so it's not a complete revert. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Cc: Akinobu Mita <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: David Rientjes <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Hao Luo <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Song Liu <[email protected]> Cc: Stanislav Fomichev <[email protected]> Cc: Yonghong Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB	Vlastimil Babka	4	-17/+12
	Patch series "revert unconditional slab and page allocator fault injection calls". These two patches largely revert commits that added function call overhead into slab and page allocation hotpaths and that cannot be currently disabled even though related CONFIG_ options do exist. A much more involved solution that can keep the callsites always existing but hidden behind a static key if unused, is possible [1] and can be pursued by anyone who believes it's necessary. Meanwhile the fact the should_failslab() error injection is already not functional on kernels built with current gcc without anyone noticing [2], and lukewarm response to [1] suggests the need is not there. I believe it will be more fair to have the state after this series as a baseline for possible further optimisation, instead of the unconditional overhead. For example a possible compromise for anyone who's fine with an empty function call overhead but not the full CONFIG_FAILSLAB / CONFIG_FAIL_PAGE_ALLOC overhead is to reuse patch 1 from [1] but insert a static key check only inside should_failslab() and should_fail_alloc_page() before performing the more expensive checks. [1] https://lore.kernel.org/all/[email protected]/#t [2] https://github.com/bpftrace/bpftrace/issues/3258 This patch (of 2): This mostly reverts commit 4f6923fbb352 ("mm: make should_failslab always available for fault injection"). The commit made should_failslab() a noinline function that's always called from the slab allocation hotpath, even if it's empty because CONFIG_SHOULD_FAILSLAB is not enabled, and there is no option to disable that call. This is visible in profiles and the function call overhead can be noticeable especially with cpu mitigations. Meanwhile the bpftrace program example in the commit silently does not work without CONFIG_SHOULD_FAILSLAB anyway with a recent gcc, because the empty function gets a .constprop clone that is actually being called (uselessly) from the slab hotpath, while the error injection is hooked to the original function that's not being called at all [1]. Thus put the whole should_failslab() function back behind CONFIG_SHOULD_FAILSLAB. It's not a complete revert of 4f6923fbb352 - the int return type that returns -ENOMEM on failure is preserved, as well ALLOW_ERROR_INJECTION annotation. The BTF_ID() record that was meanwhile added is also guarded by CONFIG_SHOULD_FAILSLAB. [1] https://github.com/bpftrace/bpftrace/issues/3258 Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vlastimil Babka <[email protected]> Cc: Akinobu Mita <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: David Rientjes <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Hao Luo <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Song Liu <[email protected]> Cc: Stanislav Fomichev <[email protected]> Cc: Yonghong Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	mm: ignore data-race in __swap_writepage	Pei Li	1	-1/+6
	Syzbot reported a possible data race: BUG: KCSAN: data-race in __swap_writepage / scan_swap_map_slots read-write to 0xffff888102fca610 of 8 bytes by task 7106 on cpu 1. read to 0xffff888102fca610 of 8 bytes by task 7080 on cpu 0. While we are in __swap_writepage to read sis->flags, scan_swap_map_slots is trying to update it with SWP_SCANNING. value changed: 0x0000000000008083 -> 0x0000000000004083. While this can be updated non-atomicially, this won't affect SWP_SYNCHRONOUS_IO, so we consider this data-race safe. This is possibly introduced by commit 3222d8c2a7f8 ("block: remove ->rw_page"), where this if branch is introduced. Link: https://lkml.kernel.org/r/[email protected] Fixes: 3222d8c2a7f8 ("block: remove ->rw_page") Signed-off-by: Pei Li <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=da25887cc13da6bf3b8c Cc: Dan Williams <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address ↵	Donet Tom	1	-5/+6
	than mmap_min_addr generic_hugetlb_get_unmapped_area() was returning an address less than mmap_min_addr if the mmap argument addr, after alignment, was less than mmap_min_addr, causing mmap to fail. This is because current generic_hugetlb_get_unmapped_area() code does not take into account mmap_min_addr. This patch ensures that generic_hugetlb_get_unmapped_area() always returns an address that is greater than mmap_min_addr. Additionally, similar to generic_get_unmapped_area(), vm_end_gap() checks are included to maintain stack gap. How to reproduce ================ #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <unistd.h> #define HUGEPAGE_SIZE (16 * 1024 * 1024) int main() { void addr = mmap((void )-1, HUGEPAGE_SIZE, PROT_READ \| PROT_WRITE, MAP_SHARED \| MAP_ANONYMOUS \| MAP_HUGETLB, -1, 0); if (addr == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); } snprintf((char )addr, HUGEPAGE_SIZE, "Hello, Huge Pages!"); printf("%s\n", (char )addr); if (munmap(addr, HUGEPAGE_SIZE) == -1) { perror("munmap"); exit(EXIT_FAILURE); } return 0; } Result without fix ================== # cat /proc/meminfo \|grep -i HugePages_Free HugePages_Free: 20 # ./test mmap: Permission denied # Result with fix =============== # cat /proc/meminfo \|grep -i HugePages_Free HugePages_Free: 20 # ./test Hello, Huge Pages! # Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Donet Tom <[email protected]> Reported-by Pavithra Prakash <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Cc: Muchun Song <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Ritesh Harjani (IBM) <[email protected]> Cc: Tony Battersby <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-17	Merge tag 'media/v6.11-1' of ↵	Linus Torvalds	405	-10210/+21983
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - New sensor drivers: gc05a2, gc08a3 and imx283 - New serializer/deserializer drivers: max96714 and max96717 - New JPEG encoder driver: e5010 - Support for Raspberry Pi PiSP Backend (BE) ISP driver - Old documentation for av7110 driver removed, as a new version was added as Documentation/userspace-api/media/dvb/legacy.rst - atompisp: Linux firmwares are now available, so drop firmware-related task from TODO and update firmware logic - The imx258 driver has gained several improvements - wave5 driver has gained support for HEVC decoding - em28xx gained support for MyGica UTV3 - av7110 budget-patch driver removed - Lots of other cleanups, improvements and fixes tag 'media/v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (301 commits) media: raspberrypi: Switch to remove_new media: uapi: pisp_be_config: Add extra config fields media: uapi: pisp_be_config: Re-sort pisp_be_tiles_config media: uapi: pisp_common: Capitalize all macros media: uapi: pisp_common: Add 32 bpp format test media: uapi: pisp_be_config: Drop BIT() from uAPI media: stm32: dcmipp: correct error handling in dcmipp_create_subdevs media: atomisp: Fix spelling mistakes in sh_css_sp.c media: atomisp: Fix spelling mistake in ia_css_debug.c media: atomisp: Fix spelling mistake in hmm_bo.c media: atomisp: Fix spelling mistake in ia_css_eed1_8.host.c media: atomisp: Fix spelling mistake in sh_css_internal.h media: atomisp: Fix spelling mistake "pipline" -> "pipeline" media: atomisp: Remove unused GPIO related defines and APIs media: atomisp: Replace COMPILATION_ERROR_IF() by static_assert() media: atomisp: Clean up unused macros from math_support.h media: atomisp: csi2-bridge: Add DMI quirk for OV5693 on Xiaomi Mipad2 media: atomisp: Update TODO media: atomisp: Prefix firmware paths with "intel/ipu/" media: atomisp: Remove firmware_name module parameter ...
2024-07-17	Merge tag 'devicetree-for-6.11' of ↵	Linus Torvalds	97	-1301/+2751
	git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: "DT Bindings: - Convert and add a bunch of IBM FSI related bindings - Add a new schema listing legacy compatibles which will (probably) never be documented. This will silence various checks warning about them. - Add bindings for Sierra Wireless mangOH Green SPI IoT interface, new Arm 2024 Cortex and Neoverse CPUs, QCom sc8180x PDC, QCom SDX75 GPI DMA, imx8mp/imx8qxp fsl,irqsteer, and Renesas RZ/G2UL CRU and CSI-2 blocks - Convert Spreadtrum sprd-timer, FSL cpm_qe, FSL fsl,ls-scfg-msi, FSL q(b)man-, FSL qoriq-mc, and img,pdc-wdt bindings to DT schema - Drop obsolete stericsson,abx500.txt DT core: - Update dtc to upstream version v1.7.0-93-g1df7b047fe43 - Add support to run DT validation on DTs with applied overlays - Add helper for creating boolean properties in dynamic nodes and use that for dynamic PCI nodes - Clean-up early parsing of '#{address,size}-cells'" tag 'devicetree-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (39 commits) dt-bindings: timer: sprd-timer: convert to YAML dt-bindings: incomplete-devices: document devices without bindings dt-bindings: trivial-devices: document the Sierra Wireless mangOH Green SPI IoT interface scripts/dtc: Update to upstream version v1.7.0-93-g1df7b047fe43 dt-bindings: soc: fsl: Add fsl,ls1028a-reset for reset syscon node dt-bindings: soc: fsl: cpm_qe: convert to yaml format dt-bindings: i2c: i2c-fsi: Convert to json-schema dt-bindings: fsi: Document the FSI Hub Controller dt-bindings: fsi: Document the AST2700 FSI controller dt-bindings: fsi: ast2600-fsi-master: Convert to json-schema dt-bindings: fsi: ibm,i2cr-fsi-master: Reference common FSI controller dt-bindings: fsi: Document the FSI controller common properties dt-bindings: fsi: Document the IBM SBEFIFO engine dt-bindings: fsi: p9-occ: Convert to json-schema dt-bindings: fsi: Document the IBM SCOM engine dt-bindings: fsi: fsi2spi: Document SPI controller child nodes dt-bindings: interrupt-controller: convert fsl,ls-scfg-msi to yaml dt-bindings: soc: fsl: Convert q(b)man-* to yaml format dt-bindings: misc: fsl,qoriq-mc: convert to yaml format dt-bindings: drop stale Anson Huang from maintainers ...
2024-07-17	Merge tag 'for-6.11-rc1' of ↵	Linus Torvalds	1	-0/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/pateldipen1984/linux Pull hardware timestamp update from Dipen Patel: - Add module description in hte test to silence modpost warnings * tag 'for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pateldipen1984/linux: hte: tegra-194: add missing MODULE_DESCRIPTION() macro
2024-07-17	Merge tag 'leds-next-6.11' of ↵	Linus Torvalds	58	-1864/+2728
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds Pull LED updates from Lee Jones: "Core Frameworks: - New trigger for Input Events - New led_mc_set_brightness() call to adapt colour/brightness for mutli-colour LEDs - New lled_mc_trigger_event() call to call the above based on given trigger conditions - New led_get_color_name() call, a wrapper around the existing led_colors[] array - A new flag to avoid automatic renaming of LED devices New Drivers: - Silergy SY7802 Flash LED Controller - Texas Instruments LP5569 LED Controller - ChromeOS EC LED Controller New Device Support: - KTD202{6,7} support for Kinetic KTD2026/7 LEDs Fix-ups: - Replace ACPI/DT firmware helpers with agnostic variants - Make use of resource managed devm_* API calls - Device Tree binding adaptions/conversions/creation - Constify/staticise applicable data structures - Trivial; spelling, whitespace, coding-style adaptions - Drop i2c_device_id::driver_data where the value is unused - Utilise centrally provided helpers and macros to aid simplicity and avoid duplication - Use generic platform device properties instead of OF/ACPI specific ones - Consolidate/de-duplicate various functionality - Remove superfluous/duplicated/unused sections - Make use of the new _scoped() guard APIs - Improve/simplify error handling Bug Fixes: - Flush pending brightness changes before activating the trigger - Repair incorrect device naming preventing matches - Prevent memory leaks by correctly free resources during error handling routines - Repair locking issue causing circular dependency splats and lock-ups - Unregister sysfs entries before deactivating triggers to prevent use-after issues - Supply a bunch of MODULE_DESCRIPTIONs to silence modpost warnings - Use correct return codes expected by the callers - Omit set_brightness() error message for a LEDs that support only HW triggers" tag 'leds-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (65 commits) leds: leds-lp5569: Enable chip after chip configuration leds: leds-lp5569: Better handle enabling clock internal setting leds: leds-lp5569: Fix typo in driver name leds: flash: leds-qcom-flash: Test the correct variable in init leds: leds-lp55xx: Convert mutex lock/unlock to guard API leds: leds-lp5523: Convert to sysfs_emit API leds: leds-lp5569: Convert to sysfs_emit API Revert "leds: led-core: Fix refcount leak in of_led_get()" leds: leds-lp5569: Add support for Texas Instruments LP5569 leds: leds-lp55xx: Drop deprecated defines leds: leds-lp55xx: Support ENGINE program up to 128 bytes leds: leds-lp55xx: Generalize sysfs master_fader leds: leds-lp55xx: Generalize sysfs engine_leds leds: leds-lp55xx: Generalize sysfs engine_load and engine_mode leds: leds-lp55xx: Generalize stop_engine function leds: leds-lp55xx: Generalize turn_off_channels function leds: leds-lp55xx: Generalize set_led_current function leds: leds-lp55xx: Generalize multicolor_brightness function leds: leds-lp55xx: Generalize led_brightness function leds: leds-lp55xx: Generalize firmware_loaded function ...
2024-07-17	Merge tag 'backlight-next-6.11' of ↵	Linus Torvalds	30	-64/+550
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight Pull backlight updates from Lee Jones: "New Drivers: - Texas Instruments LM3509 Backlight Driver Fix-ups: - Device Tree binding adaptions/conversions/creation - Drop i2c_device_id::driver_data where the value is unused - Make use of the new _scoped() guard APIs - Decouple from fbdev by providing Backlight with its own BACKLIGHT_POWER_ constrains Bug Fixes: - Correctly assess return values (NULL vs IS_ERR()) - Supply a bunch of MODULE_DESCRIPTIONs to silence modpost warnings" * tag 'backlight-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: (23 commits) backlight: sky81452-backlight: Use backlight power constants backlight: rave-sp-backlight: Use backlight power constants backlight: pwm-backlight: Use backlight power constants backlight: pcf50633-backlight: Use backlight power constants backlight: pandora-backlight: Use backlight power constants backlight: mp3309c: Use backlight power constants backlight: lm3533-backlight: Use backlight power constants backlight: led-backlight: Use backlight power constants backlight: ktd253-backlight: Use backlight power constants backlight: kb3886-bl: Use backlight power constants backlight: journada_bl: Use backlight power constants backlight: ipaq-micro-backlight: Use backlight power constants backlight: gpio-backlight: Use backlight power constants backlight: corgi-lcd: Use backlight power constants backlight: ams369fb06: Use backlight power constants backlight: aat2870-backlight: Use blacklight power constants backlight: Add BACKLIGHT_POWER_ constants for power states backlight: lm3509_bl: Fix early returns in for_each_child_of_node() backlight: Drop explicit initialization of struct i2c_device_id::driver_data to 0 backlight: Add missing MODULE_DESCRIPTION() macros ...
2024-07-17	Merge tag 'mfd-next-6.11' of ↵	Linus Torvalds	143	-1182/+6718
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "New Drivers: - ROHM BD96801 Power Management IC - Cirrus Logic CS40L50 Haptic Driver with Waveform Memory - Marvell 88PM886 Power Management IC New Device Support: - Keyboard Backlight to ChromeOS Embedded Controller - LEDs to ChromeOS Embedded Controller - Charge Control to ChromeOS Embedded Controller - HW Monitoring Service to ChromeOS Embedded Controller - AUXADCs to MediaTek MT635{7,8,9} Power Management ICs New Functionality: - Allow Syscon consumers to supply their own Regmaps on registration Fix-ups: - Constify/staticise applicable data structures - Remove superfluous/duplicated/unused sections - Device Tree binding adaptions/conversions/creation - Trivial; spelling, whitespace, coding-style adaptions - Utilise centrally provided helpers and macros to aid simplicity/duplication - Drop i2c_device_id::driver_data where the value is unused - Replace ACPI/DT firmware helpers with agnostic variants - Move over to GPIOD (descriptor-based) APIs - Annotate a bunch of __counted_by() cases - Straighten out some includes Bug Fixes: - Ensure potentially asserted recent lines are deasserted during initialisation - Avoid "<module>.ko is added to multiple modules" warnings - Supply a bunch of MODULE_DESCRIPTIONs to silence modpost warnings - Fix Wvoid-pointer-to-enum-cast warnings" * tag 'mfd-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (87 commits) mfd: timberdale: Attach device properties to TSC2007 board info mfd: tmio: Move header to platform_data mfd: tmio: Sanitize comments mfd: tmio: Update include files mmc: tmio/sdhi: Fix includes mfd: tmio: Remove obsolete io accessors mfd: tmio: Remove obsolete platform_data watchdog: bd96801_wdt: Add missing include for FIELD_*() dt-bindings: mfd: syscon: Add APM poweroff mailbox dt-bindings: mfd: syscon: Split and enforce documenting MFD children dt-bindings: mfd: rk817: Merge support for RK809 dt-bindings: mfd: rk817: Fixup clocks and reference dai-common dt-bindings: mfd: syscon: Add TI's opp table compatible mfd: omap-usb-tll: Use struct_size to allocate tll dt-bindings: mfd: Explain lack of child dependency in simple-mfd dt-bindings: mfd: Dual licensing for st,stpmic1 bindings mfd: omap-usb-tll: Annotate struct usbtll_omap with __counted_by mfd: tps6594-core: Remove unneeded semicolon in tps6594_check_crc_mode() mfd: lm3533: Move to new GPIO descriptor-based APIs mfd: tps65912: Use devm helper functions to simplify probe ...
2024-07-17	Merge tag 'for-linus-2024071601' of ↵	Linus Torvalds	114	-1579/+6443
	git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Benjamin Tissoires: - rewrite of the HID-BPF internal implementation to use bpf struct_ops instead of a tracing endpoint (Benjamin Tissoires) - add two new HID-BPF hooks to be able to intercept userspace calls targeting a HID device and filtering them (Benjamin Tissoires) - add support for various new devices through HID-BPF filters (Benjamin Tissoires) - add support for the magic keyboard backlight (Orlando Chamberlain) - add the missing MODULE_DESCRIPTION() macros in HID drivers (Jeff Johnson) - use of kvzalloc in case memory gets too fragmented (Hailong Liu) - retrieve the device firmware node in the child HID device (Danny Kaehn) - some hid-uclogic improvements (José Expósito) - some more typos, trivial fixes, kernel doctext and unused functions cleanups * tag 'for-linus-2024071601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (60 commits) HID: hid-steam: Fix typo in goto label HID: mcp2221: Remove unnecessary semicolon HID: Fix spelling mistakes "Kensigton" -> "Kensington" HID: add more missing MODULE_DESCRIPTION() macros HID: samples: fix the 2 struct_ops definitions HID: fix for amples in for-6.11/bpf HID: apple: Add support for magic keyboard backlight on T2 Macs HID: bpf: Thrustmaster TCA Yoke Boeing joystick fix HID: bpf: Add Huion Dial 2 bpf fixup HID: bpf: Add support for the XP-PEN Deco Mini 4 HID: bpf: move the BIT() macro to hid_bpf_helpers.h HID: bpf: add a driver for the Huion Inspiroy 2S (H641P) HID: bpf: Add a HID report composition helper macros HID: bpf: doc fixes for hid_hw_request() hooks HID: bpf: doc fixes for hid_hw_request() hooks HID: bpf: fix gcc warning and unify __u64 into u64 selftests/hid: ensure CKI can compile our new tests on old kernels selftests/hid: add an infinite loop test for hid_bpf_try_input_report selftests/hid: add another test for injecting an event from an event hook HID: bpf: allow hid_device_event hooks to inject input reports on self ...
2024-07-17	Merge tag 'for-linus-6.11-1' of https://github.com/cminyard/linux-ipmi	Linus Torvalds	4	-9/+11
	Pull IPMI updates from Corey Minyard: "Some cleanups for device changes coming, and some range checks on data coming from a host to a BMC" * tag 'for-linus-6.11-1' of https://github.com/cminyard/linux-ipmi: ipmi: Drop explicit initialization of struct i2c_device_id::driver_data to 0 ipmi: ssif_bmc: prevent integer overflow on 32bit systems
2024-07-17	Merge tag 'platform-drivers-x86-v6.11-1' of ↵	Linus Torvalds	63	-524/+2301
	git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver updates from Ilpo Järvinen: - amd/pmf: Report system state changes using existing input events - asus-wmi: Zenbook 2023 camera LED disable support and fix TUF laptop keyboard RGB LED sysfs interface - dell-pc: Fan modes / platform profile support - hp-wmi: Fix platform profile switching on Omen/Victus laptops - intel/ISST: Use only TPMI interface when TPMI and legacy interfaces are available - intel/pmc: LTR restore support to pair with LTR ignore - intel/tpmi: Performance Limit Reasons (PLR) and APIC <-> Punit CPU numbering mapping support - WMI: driver override support and docs improvements - lenovo-yoga-c630: Support for EC (platform/arm64) - platform/arm64: Fix build with COMPILE_TEST (broke after addition of C630) - tools: Intel Speed Select Turbo Ratio Limit fix - Miscellaneous cleanups / refactoring / improvements * tag 'platform-drivers-x86-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (65 commits) platform/x86: asus-wmi: fix TUF laptop RGB variant platform/x86/intel/tpmi/plr: Fix output in plr_print_bits() Docs/admin-guide: Remove pmf leftover reference from the index platform/x86: ideapad-laptop: use cleanup.h platform/x86: hp-wmi: Fix implementation of the platform_profile_omen_get function platform: arm64: EC_LENOVO_YOGA_C630 should depend on ARCH_QCOM platform: arm64: EC_ACER_ASPIRE1 should depend on ARCH_QCOM platform/x86/amd/pmf: Remove update system state document platform/x86/amd/pmf: Use existing input event codes to update system states platform/x86: hp-wmi: Fix platform profile option switch bug on Omen and Victus laptops platform/x86:intel/pmc: Add support to undo ltr_ignore platform/x86:intel/pmc: Use the Elvis operator platform/x86:intel/pmc: Use DEFINE_SHOW_STORE_ATTRIBUTE macro platform/x86:intel/pmc: Remove unneeded min_t check platform/x86:intel/pmc: Add support to show ltr_ignore value platform/x86:intel/pmc: Move pmc assignment closer to first usage platform/x86:intel/pmc: Convert index variables to be unsigned platform/x86:intel/pmc: Simplify mutex usage with cleanup helpers platform/x86:intel/pmc: Use the return value of pmc_core_send_msg tools/power/x86/intel-speed-select: v1.20 release ...