blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2024-07-04	net: dsa: microchip: lan937x: Add error handling in lan937x_setup	Oleksij Rempel	1	-10/+17
	Introduce error handling for lan937x_cfg function calls in lan937x_setup. This change ensures that if any lan937x_cfg or ksz_rmw32 calls fails, the function will return the appropriate error code. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-04	l2tp: Remove duplicate included header file trace.h	Thorsten Blum	1	-1/+0
	Remove duplicate included header file trace.h and the following warning reported by make includecheck: trace.h is included more than once Compile-tested only. Signed-off-by: Thorsten Blum <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-04	tcp: Don't flag tcp_sk(sk)->rx_opt.saw_unknown for TCP AO.	Kuniyuki Iwashima	1	-0/+7
	When we process segments with TCP AO, we don't check it in tcp_parse_options(). Thus, opt_rx->saw_unknown is set to 1, which unconditionally triggers the BPF TCP option parser. Let's avoid the unnecessary BPF invocation. Fixes: 0a3a809089eb ("net/tcp: Verify inbound TCP-AO signed segments") Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Dmitry Safonov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-03	Merge branch 'fix-oom-and-order-check-in-msg_zerocopy-selftest'	Jakub Kicinski	1	-2/+12
	Zijian Zhang says: ==================== fix OOM and order check in msg_zerocopy selftest In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg on a socket with MSG_ZEROCOPY flag, and it will recv the notifications until the socket is not writable. Typically, it will start the receiving process after around 30+ sendmsgs. However, as the introduction of commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is always writable and does not get any chance to run recv notifications. The selftest always exits with OUT_OF_MEMORY because the memory used by opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a different value to trigger OOM on older kernels too. Thus, we introduce "cfg_notification_limit" to force sender to receive notifications after some number of sendmsgs. And, we find that when lock debugging is on, notifications may not come in order. Thus, we have order checking outputs managed by cfg_verbose, to avoid too many outputs in this case. ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	selftests: make order checking verbose in msg_zerocopy selftest	Zijian Zhang	1	-1/+1
	We find that when lock debugging is on, notifications may not come in order. Thus, we have order checking outputs managed by cfg_verbose, to avoid too many outputs in this case. Fixes: 07b65c5b31ce ("test: add msg_zerocopy test") Signed-off-by: Zijian Zhang <[email protected]> Signed-off-by: Xiaochun Lu <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	selftests: fix OOM in msg_zerocopy selftest	Zijian Zhang	1	-1/+11
	In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg on a socket with MSG_ZEROCOPY flag, and it will recv the notifications until the socket is not writable. Typically, it will start the receiving process after around 30+ sendmsgs. However, as the introduction of commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is always writable and does not get any chance to run recv notifications. The selftest always exits with OUT_OF_MEMORY because the memory used by opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a different value to trigger OOM on older kernels too. Thus, we introduce "cfg_notification_limit" to force sender to receive notifications after some number of sendmsgs. Fixes: 07b65c5b31ce ("test: add msg_zerocopy test") Signed-off-by: Zijian Zhang <[email protected]> Signed-off-by: Xiaochun Lu <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	Merge branch 'intel-wired-lan-driver-updates-2024-06-25-ice'	Jakub Kicinski	3	-31/+111
	Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-06-25 (ice) This series contains updates to ice driver only. Milena adds disabling of extts events when PTP is disabled. Jake prevents possible NULL pointer by checking that timestamps are ready before processing extts events and adds checks for unsupported PTP pin configuration. Petr Oros replaces _test_bit() with the correct test_bit() macro. v1: https://lore.kernel.org/netdev/[email protected]/ ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	ice: use proper macro for testing bit	Petr Oros	1	-1/+1
	Do not use _test_bit() macro for testing bit. The proper macro for this is one without underline. _test_bit() is what test_bit() was prior to const-optimization. It directly calls arch_test_bit(), i.e. the arch-specific implementation (or the generic one). It's strictly _internal_ and shouldn't be used anywhere outside the actual test_bit() macro. test_bit() is a wrapper which checks whether the bitmap and the bit number are compile-time constants and if so, it calls the optimized function which evaluates this call to a compile-time constant as well. If either of them is not a compile-time constant, it just calls _test_bit(). test_bit() is the actual function to use anywhere in the kernel. IOW, calling _test_bit() avoids potential compile-time optimizations. The sensors is not a compile-time constant, thus most probably there are no object code changes before and after the patch. But anyway, we shouldn't call internal wrappers instead of the actual API. Fixes: 4da71a77fc3b ("ice: read internal temperature sensor") Acked-by: Ivan Vecera <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Signed-off-by: Petr Oros <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	ice: Reject pin requests with unsupported flags	Jacob Keller	2	-16/+23
	The driver receives requests for configuring pins via the .enable callback of the PTP clock object. These requests come into the driver with flags which modify the requested behavior from userspace. Current implementation in ice does not reject flags that it doesn't support. This causes the driver to incorrectly apply requests with such flags as PTP_PEROUT_DUTY_CYCLE, or any future flags added by the kernel which it is not yet aware of. Fix this by properly validating flags in both ice_ptp_cfg_perout and ice_ptp_cfg_extts. Ensure that we check by bit-wise negating supported flags rather than just checking and rejecting known un-supported flags. This is preferable, as it ensures better compatibility with future kernels. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Karol Kolacinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	ice: Don't process extts if PTP is disabled	Jacob Keller	1	-0/+4
	The ice_ptp_extts_event() function can race with ice_ptp_release() and result in a NULL pointer dereference which leads to a kernel panic. Panic occurs because the ice_ptp_extts_event() function calls ptp_clock_event() with a NULL pointer. The ice driver has already released the PTP clock by the time the interrupt for the next external timestamp event occurs. To fix this, modify the ice_ptp_extts_event() function to check the PTP state and bail early if PTP is not ready. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Karol Kolacinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	ice: Fix improper extts handling	Milena Olech	2	-22/+91
	Extts events are disabled and enabled by the application ts2phc. However, in case where the driver is removed when the application is running, a specific extts event remains enabled and can cause a kernel crash. As a side effect, when the driver is reloaded and application is started again, remaining extts event for the channel from a previous run will keep firing and the message "extts on unexpected channel" might be printed to the user. To avoid that, extts events shall be disabled when PTP is released. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Reviewed-by: Przemek Kitszel <[email protected]> Co-developed-by: Jacob Keller <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Milena Olech <[email protected]> Signed-off-by: Karol Kolacinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	selftest: af_unix: Add test case for backtrack after finalising SCC.	Kuniyuki Iwashima	1	-2/+23
	syzkaller reported a KMSAN splat in __unix_walk_scc() while backtracking edge_stack after finalising SCC. Let's add a test case exercising the path. Signed-off-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: Shigeru Yoshida <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	af_unix: Fix uninit-value in __unix_walk_scc()	Shigeru Yoshida	1	-4/+5
	KMSAN reported uninit-value access in __unix_walk_scc() [1]. In the list_for_each_entry_reverse() loop, when the vertex's index equals it's scc_index, the loop uses the variable vertex as a temporary variable that points to a vertex in scc. And when the loop is finished, the variable vertex points to the list head, in this case scc, which is a local variable on the stack (more precisely, it's not even scc and might underflow the call stack of __unix_walk_scc(): container_of(&scc, struct unix_vertex, scc_entry)). However, the variable vertex is used under the label prev_vertex. So if the edge_stack is not empty and the function jumps to the prev_vertex label, the function will access invalid data on the stack. This causes the uninit-value access issue. Fix this by introducing a new temporary variable for the loop. [1] BUG: KMSAN: uninit-value in __unix_walk_scc net/unix/garbage.c:478 [inline] BUG: KMSAN: uninit-value in unix_walk_scc net/unix/garbage.c:526 [inline] BUG: KMSAN: uninit-value in __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 __unix_walk_scc net/unix/garbage.c:478 [inline] unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Uninit was stored to memory at: unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2adf/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Local variable entries created at: ref_tracker_free+0x48/0xf30 lib/ref_tracker.c:222 netdev_tracker_free include/linux/netdevice.h:4058 [inline] netdev_put include/linux/netdevice.h:4075 [inline] dev_put include/linux/netdevice.h:4101 [inline] update_gid_event_work_handler+0xaa/0x1b0 drivers/infiniband/core/roce_gid_mgmt.c:813 CPU: 1 PID: 12763 Comm: kworker/u8:31 Not tainted 6.10.0-rc4-00217-g35bb670d65fc #32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Workqueue: events_unbound __unix_gc Fixes: 3484f063172d ("af_unix: Detect Strongly Connected Components.") Reported-by: syzkaller <[email protected]> Signed-off-by: Shigeru Yoshida <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	bonding: Fix out-of-bounds read in bond_option_arp_ip_targets_set()	Sam Sun	1	-3/+3
	In function bond_option_arp_ip_targets_set(), if newval->string is an empty string, newval->string+1 will point to the byte after the string, causing an out-of-bound read. BUG: KASAN: slab-out-of-bounds in strlen+0x7d/0xa0 lib/string.c:418 Read of size 1 at addr ffff8881119c4781 by task syz-executor665/8107 CPU: 1 PID: 8107 Comm: syz-executor665 Not tainted 6.7.0-rc7 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc1/0x5e0 mm/kasan/report.c:475 kasan_report+0xbe/0xf0 mm/kasan/report.c:588 strlen+0x7d/0xa0 lib/string.c:418 __fortify_strlen include/linux/fortify-string.h:210 [inline] in4_pton+0xa3/0x3f0 net/core/utils.c:130 bond_option_arp_ip_targets_set+0xc2/0x910 drivers/net/bonding/bond_options.c:1201 __bond_opt_set+0x2a4/0x1030 drivers/net/bonding/bond_options.c:767 __bond_opt_set_notify+0x48/0x150 drivers/net/bonding/bond_options.c:792 bond_opt_tryset_rtnl+0xda/0x160 drivers/net/bonding/bond_options.c:817 bonding_sysfs_store_option+0xa1/0x120 drivers/net/bonding/bond_sysfs.c:156 dev_attr_store+0x54/0x80 drivers/base/core.c:2366 sysfs_kf_write+0x114/0x170 fs/sysfs/file.c:136 kernfs_fop_write_iter+0x337/0x500 fs/kernfs/file.c:334 call_write_iter include/linux/fs.h:2020 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x96a/0xd80 fs/read_write.c:584 ksys_write+0x122/0x250 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b ---[ end trace ]--- Fix it by adding a check of string length before using it. Fixes: f9de11a16594 ("bonding: add ip checks when store ip target") Signed-off-by: Yue Sun <[email protected]> Signed-off-by: Simon Horman <[email protected]> Acked-by: Jay Vosburgh <[email protected]> Reviewed-by: Hangbin Liu <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	Merge branch 'selftests-openvswitch-address-some-flakes-in-the-ci-environment'	Jakub Kicinski	2	-8/+16
	Aaron Conole says: ==================== selftests: openvswitch: Address some flakes in the CI environment These patches aim to make using the openvswitch testsuite more reliable. These should address the major sources of flakiness in the openvswitch test suite allowing the CI infrastructure to exercise the openvswitch module for patch series. There should be no change for users who simply run the tests (except that patch 3/3 does make some of the debugging a bit easier by making some output more verbose). ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	selftests: openvswitch: Be more verbose with selftest debugging.	Aaron Conole	1	-3/+6
	The openvswitch selftest is difficult to debug for anyone that isn't directly familiar with the openvswitch module and the specifics of the test cases. Many times when something fails, the debug log will be sparsely populated and it takes some time to understand where a failure occured. Increase the amount of details logged to the debug log by trapping all 'info' logs, and all 'ovs_sbx' commands. Signed-off-by: Aaron Conole <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	selftests: openvswitch: Attempt to autoload module.	Aaron Conole	1	-5/+9
	Previously, the openvswitch.sh test suites would not attempt to autoload the openvswitch module. The idea was that a user who is manually running tests might not even have the OVS module loaded or configured for their own development. However, if the kernel module is configured, and the module can be autoloaded then we should just attempt to load it and run the tests. This is especially true in the CI environments, where the CI tests should be able to rely on auto loading to get the test suite running. Signed-off-by: Aaron Conole <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	selftests: openvswitch: Bump timeout to 15 minutes.	Aaron Conole	1	-0/+1
	We found that since some tests rely on the TCP SYN timeouts to cause flow misses, the default test suite timeout of 45 seconds is quick to be exceeded. Bump the timeout to 15 minutes. Signed-off-by: Aaron Conole <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	net: ethtool: fix compat with old RSS context API	Jakub Kicinski	1	-2/+2
	Device driver gets access to rxfh_dev, while rxfh is just a local copy of user space params. We need to check what RSS context ID driver assigned in rxfh_dev, not rxfh. Using rxfh leads to trying to store all contexts at index 0xffffffff. From the user perspective it leads to "driver chose duplicate ID" warnings when second context is added and inability to access any contexts even tho they were successfully created - xa_load() for the actual context ID will return NULL, and syscall will return -ENOENT. Looks like a rebasing mistake, since rxfh_dev was added relatively recently by commit fb6e30a72539 ("net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops"). Fixes: eac9122f0c41 ("net: ethtool: record custom RSS contexts in the XArray") Reviewed-by: Edward Cree <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	net: rswitch: Avoid use-after-free in rswitch_poll()	Radu Rendec	1	-2/+2
	The use-after-free is actually in rswitch_tx_free(), which is inlined in rswitch_poll(). Since `skb` and `gq->skbs[gq->dirty]` are in fact the same pointer, the skb is first freed using dev_kfree_skb_any(), then the value in skb->len is used to update the interface statistics. Let's move around the instructions to use skb->len before the skb is freed. This bug is trivial to reproduce using KFENCE. It will trigger a splat every few packets. A simple ARP request or ICMP echo request is enough. Fixes: 271e015b9153 ("net: rswitch: Add unmap_addrs instead of dma address in each desc") Signed-off-by: Radu Rendec <[email protected]> Reviewed-by: Yoshihiro Shimoda <[email protected]> Reviewed-by: Niklas Söderlund <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	selftests: drv-net: rss_ctx: allow more noise on default context	Jakub Kicinski	1	-6/+10
	As predicted by David running the test on a machine with a single interface is a bit unreliable. We try to send 20k packets with iperf and expect fewer than 10k packets on the default context. The test isn't very quick, iperf will usually send 100k packets by the time we stop it. So we're off by 5x on the number of iperf packets but still expect default context to only get the hardcoded 10k. The intent is to make sure we get noticeably less traffic on the default context. Use half of the resulting iperf traffic instead of the hard coded 10k. Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-03	tools: ynl: use ident name for Family, too.	Paolo Abeni	1	-26/+26
	This allow consistent naming convention between Family and others element's name. Signed-off-by: Paolo Abeni <[email protected]> Link: https://patch.msgid.link/9bbcab3094970b371bd47aa18481ae6ca5a93687.1719930479.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-04	netfilter: nf_tables: unconditionally flush pending work before notifier	Florian Westphal	1	-2/+1
	syzbot reports: KASAN: slab-uaf in nft_ctx_update include/net/netfilter/nf_tables.h:1831 KASAN: slab-uaf in nft_commit_release net/netfilter/nf_tables_api.c:9530 KASAN: slab-uaf int nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597 Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45 [..] Workqueue: events nf_tables_trans_destroy_work Call Trace: nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline] nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline] nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597 Problem is that the notifier does a conditional flush, but its possible that the table-to-be-removed is still referenced by transactions being processed by the worker, so we need to flush unconditionally. We could make the flush_work depend on whether we found a table to delete in nf-next to avoid the flush for most cases. AFAICS this problem is only exposed in nf-next, with commit e169285f8c56 ("netfilter: nf_tables: do not store nft_ctx in transaction objects"), with this commit applied there is an unconditional fetch of table->family which is whats triggering the above splat. Fixes: 2c9f0293280e ("netfilter: nf_tables: flush pending destroy work before netlink notifier") Reported-and-tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-07-03	Merge tag 'trace-v6.10-rc6' of ↵	Linus Torvalds	2	-1/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "Fix ioctl conflict with memmapped ring buffer ioctl It was reported that the ioctl() number used to update the ring buffer memory mapping conflicted with the TCGETS ioctl causing strace to report: $ strace -e ioctl stty ioctl(0, TCGETS or TRACE_MMAP_IOCTL_GET_READER, {c_iflag=ICRNL\|IXON, c_oflag=NL0\|CR0\|TAB0\|BS0\|VT0\|FF0\|OPOST\|ONLCR, c_cflag=B38400\|CS8\|CREAD, c_lflag=ISIG\|ICANON\|ECHO\|ECHOE\|ECHOK\|IEXTEN\|ECHOCTL\|ECHOKE, ...}) = 0 Since this ioctl hasn't been in a full release yet, change it from "T", 0x1 to "R" 0x20, and also reserve 0x20-0x2F for future ioctl commands, as some more are being worked on for the future" * tag 'trace-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Have memmapped ring buffer use ioctl of "R" range 0x20-2F
2024-07-03	tracing: Have memmapped ring buffer use ioctl of "R" range 0x20-2F	Steven Rostedt (Google)	2	-1/+2
	To prevent conflicts with other ioctl numbers to allow strace to have an idea of what is happening, add the range of ioctls for the trace buffer mapping from _IO("T", 0x1) to the range of "R" 0x20 - 0x2F. Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/ Cc: Jonathan Corbet <[email protected]> Fixes: cf9f0f7c4c5bb ("tracing: Allow user-space mapping of the ring-buffer") Link: https://lore.kernel.org/[email protected] Reported-by: "Dmitry V. Levin" <[email protected]> Reviewed-by: Mathieu Desnoyers <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-07-03	nilfs2: fix incorrect inode allocation from reserved inodes	Ryusuke Konishi	4	-12/+20
	If the bitmap block that manages the inode allocation status is corrupted, nilfs_ifile_create_inode() may allocate a new inode from the reserved inode area where it should not be allocated. Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of struct nilfs_root"), fixed the problem that reserved inodes with inode numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to bitmap corruption, but since the start number of non-reserved inodes is read from the super block and may change, in which case inode allocation may occur from the extended reserved inode area. If that happens, access to that inode will cause an IO error, causing the file system to degrade to an error state. Fix this potential issue by adding a wraparound option to the common metadata object allocation routine and by modifying nilfs_ifile_create_inode() to disable the option so that it only allocates inodes with inode numbers greater than or equal to the inode number read in "nilfs->ns_first_ino", regardless of the bitmap status of reserved inodes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Jan Kara <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-03	nilfs2: add missing check for inode numbers on directory entries	Ryusuke Konishi	2	-0/+11
	Syzbot reported that mounting and unmounting a specific pattern of corrupted nilfs2 filesystem images causes a use-after-free of metadata file inodes, which triggers a kernel bug in lru_add_fn(). As Jan Kara pointed out, this is because the link count of a metadata file gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(), tries to delete that inode (ifile inode in this case). The inconsistency occurs because directories containing the inode numbers of these metadata files that should not be visible in the namespace are read without checking. Fix this issue by treating the inode numbers of these internal files as errors in the sanity check helper when reading directory folios/pages. Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer analysis. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8 Reported-by: Jan Kara <[email protected]> Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3 Tested-by: Ryusuke Konishi <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-03	nilfs2: fix inode number range checks	Ryusuke Konishi	3	-3/+10
	Patch series "nilfs2: fix potential issues related to reserved inodes". This series fixes one use-after-free issue reported by syzbot, caused by nilfs2's internal inode being exposed in the namespace on a corrupted filesystem, and a couple of flaws that cause problems if the starting number of non-reserved inodes written in the on-disk super block is intentionally (or corruptly) changed from its default value. This patch (of 3): In the current implementation of nilfs2, "nilfs->ns_first_ino", which gives the first non-reserved inode number, is read from the superblock, but its lower limit is not checked. As a result, if a number that overlaps with the inode number range of reserved inodes such as the root directory or metadata files is set in the super block parameter, the inode number test macros (NILFS_MDT_INODE and NILFS_VALID_INODE) will not function properly. In addition, these test macros use left bit-shift calculations using with the inode number as the shift count via the BIT macro, but the result of a shift calculation that exceeds the bit width of an integer is undefined in the C specification, so if "ns_first_ino" is set to a large value other than the default value NILFS_USER_INO (=11), the macros may potentially malfunction depending on the environment. Fix these issues by checking the lower bound of "nilfs->ns_first_ino" and by preventing bit shifts equal to or greater than the NILFS_USER_INO constant in the inode number test macros. Also, change the type of "ns_first_ino" from signed integer to unsigned integer to avoid the need for type casting in comparisons such as the lower bound check introduced this time. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Jan Kara <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-03	mm: avoid overflows in dirty throttling logic	Jan Kara	1	-4/+26
	The dirty throttling logic is interspersed with assumptions that dirty limits in PAGE_SIZE units fit into 32-bit (so that various multiplications fit into 64-bits). If limits end up being larger, we will hit overflows, possible divisions by 0 etc. Fix these problems by never allowing so large dirty limits as they have dubious practical value anyway. For dirty_bytes / dirty_background_bytes interfaces we can just refuse to set so large limits. For dirty_ratio / dirty_background_ratio it isn't so simple as the dirty limit is computed from the amount of available memory which can change due to memory hotplug etc. So when converting dirty limits from ratios to numbers of pages, we just don't allow the result to exceed UINT_MAX. This is root-only triggerable problem which occurs when the operator sets dirty limits to >16 TB. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Reported-by: Zach O'Keefe <[email protected]> Reviewed-By: Zach O'Keefe <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-03	Revert "mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again"	Jan Kara	1	-1/+1
	Patch series "mm: Avoid possible overflows in dirty throttling". Dirty throttling logic assumes dirty limits in page units fit into 32-bits. This patch series makes sure this is true (see patch 2/2 for more details). This patch (of 2): This reverts commit 9319b647902cbd5cc884ac08a8a6d54ce111fc78. The commit is broken in several ways. Firstly, the removed (u64) cast from the multiplication will introduce a multiplication overflow on 32-bit archs if wb_thresh * bg_thresh >= 1<<32 (which is actually common - the default settings with 4GB of RAM will trigger this). Secondly, the div64_u64() is unnecessarily expensive on 32-bit archs. We have div64_ul() in case we want to be safe & cheap. Thirdly, if dirty thresholds are larger than 1<<32 pages, then dirty balancing is going to blow up in many other spectacular ways anyway so trying to fix one possible overflow is just moot. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 9319b647902c ("mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again") Signed-off-by: Jan Kara <[email protected]> Reviewed-By: Zach O'Keefe <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-03	mm: optimize the redundant loop of mm_update_owner_next()	Jinliang Zheng	1	-0/+2
	When mm_update_owner_next() is racing with swapoff (try_to_unuse()) or /proc or ptrace or page migration (get_task_mm()), it is impossible to find an appropriate task_struct in the loop whose mm_struct is the same as the target mm_struct. If the above race condition is combined with the stress-ng-zombie and stress-ng-dup tests, such a long loop can easily cause a Hard Lockup in write_lock_irq() for tasklist_lock. Recognize this situation in advance and exit early. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jinliang Zheng <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Tycho Andersen <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-03	Merge tag 'io_uring-6.10-20240703' of git://git.kernel.dk/linux	Linus Torvalds	1	-4/+6
	Pull io_uring fix from Jens Axboe: "A fix for a feature that went into the 6.10 merge window actually ended up causing a regression in building bundles for receives. Fix that up by ensuring we don't overwrite msg_inq before we use it in the loop" * tag 'io_uring-6.10-20240703' of git://git.kernel.dk/linux: io_uring/net: don't clear msg_inq before io_recv_buf_select() needs it
2024-07-03	Merge tag 'media/v6.10-3' of ↵	Linus Torvalds	3	-2/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Some fixes related to the IPU6 driver" * tag 'media/v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: ivsc: Depend on IPU_BRIDGE or not IPU_BRIDGE media: intel/ipu6: Fix a null pointer dereference in ipu6_isys_query_stream_by_source media: ipu6: Use the ISYS auxdev device as the V4L2 device's device
2024-07-03	s390/dasd: Fix invalid dereferencing of indirect CCW data pointer	Stefan Haberland	2	-3/+3
	Fix invalid dereferencing of indirect CCW data pointer in dasd_eckd_dump_sense() that leads to a kernel panic in error cases. When using indirect addressing for DASD CCWs (IDAW) the CCW CDA pointer does not contain the data address itself but a pointer to the IDAL. This needs to be translated from physical to virtual as well before using it. This dereferencing is also used for dasd_page_cache and also fixed although it is very unlikely that this code path ever gets used. Fixes: c0bd39601c13 ("s390/dasd: use new address translation helpers") Cc: [email protected] Signed-off-by: Stefan Haberland <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2024-07-03	wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference	Miri Korenblit	3	-4/+6
	iwl_mvm_get_bss_vif might return a NULL or ERR_PTR. Some of the callers check only the NULL case, and some doesn't check at all. Some of the callers even have a pointer to the mvmvif of the bss vif, so we don't even need to call this function, and can simply get the vif from mvmvif. Do it for those cases, and for the others - properly check if IS_ERR_OR_NULL Fixes: ec0d43d26f2c ("wifi: iwlwifi: mvm: Activate EMLSR based on traffic volume") Signed-off-by: Miri Korenblit <[email protected]> Link: https://patch.msgid.link/20240703064027.a661f8c65aac.I45cf09b01af8ee3d55828863958ead741ea43b7f@changeid Signed-off-by: Johannes Berg <[email protected]>
2024-07-03	wifi: iwlwifi: mvm: avoid link lookup in statistics	Johannes Berg	1	-7/+6
	We already iterate the link bss_conf/link_info and have the pointer, or know that deflink/bss_conf is used, so avoid an extra lookup and just pass the pointer. This may also avoid a crash when this is processed during restart, where the FW to link conf array (link_id_to_link_conf) may be NULLed out. Fixes: c1e458b987f2 ("wifi: iwlwifi: mvm: Move beacon filtering to be per link") Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Ilan Peer <[email protected]> Signed-off-by: Miri Korenblit <[email protected]> Link: https://patch.msgid.link/20240703064026.346a6ef67a86.Iba5d65d728ca9f58518c88d029496c1250670544@changeid Signed-off-by: Johannes Berg <[email protected]>
2024-07-03	wifi: iwlwifi: mvm: don't wake up rx_sync_waitq upon RFKILL	Emmanuel Grumbach	2	-8/+4
	Since we now want to sync the queues even when we're in RFKILL, we shouldn't wake up the wait queue since we still expect to get all the notifications from the firmware. Fixes: 4d08c0b3357c ("wifi: iwlwifi: mvm: handle BA session teardown in RF-kill") Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Miri Korenblit <[email protected]> Link: https://patch.msgid.link/20240703064027.be7a9dbeacde.I5586cb3ca8d6e44f79d819a48a0c22351ff720c9@changeid Signed-off-by: Johannes Berg <[email protected]>
2024-07-03	wifi: iwlwifi: properly set WIPHY_FLAG_SUPPORTS_EXT_KEK_KCK	Daniel Gabay	1	-1/+1
	The WIPHY_FLAG_SUPPORTS_EXT_KEK_KCK should be set based on the WOWLAN_KEK_KCK_MATERIAL command version. Currently, the command version in the firmware has advanced to 4, which prevents the flag from being set correctly, fix that. Signed-off-by: Daniel Gabay <[email protected]> Signed-off-by: Miri Korenblit <[email protected]> Link: https://patch.msgid.link/20240703064026.a0f162108575.If1a9785727d2a1b0197a396680965df1b53d4096@changeid Signed-off-by: Johannes Berg <[email protected]>
2024-07-03	wifi: wilc1000: fix ies_len type in connect path	Jozef Hopko	1	-1/+2
	Commit 205c50306acf ("wifi: wilc1000: fix RCU usage in connect path") made sure that the IEs data was manipulated under the relevant RCU section. Unfortunately, while doing so, the commit brought a faulty implicit cast from int to u8 on the ies_len variable, making the parsing fail to be performed correctly if the IEs block is larger than 255 bytes. This failure can be observed with Access Points appending a lot of IEs TLVs in their beacon frames (reproduced with a Pixel phone acting as an Access Point, which brough 273 bytes of IE data in my testing environment). Fix IEs parsing by removing this undesired implicit cast. Fixes: 205c50306acf ("wifi: wilc1000: fix RCU usage in connect path") Signed-off-by: Jozef Hopko <[email protected]> Signed-off-by: Alexis Lothoré <[email protected]> Acked-by: Ajay Singh <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://patch.msgid.link/[email protected]
2024-07-03	s390/vfio_ccw: Fix target addresses of TIC CCWs	Eric Farman	1	-4/+5
	The processing of a Transfer-In-Channel (TIC) CCW requires locating the target of the CCW in the channel program, and updating the address to reflect what will actually be sent to hardware. An error exists where the 64-bit virtual address is truncated to 32-bits (variable "cda") when performing this math. Since s390 addresses of that size are 31-bits, this leaves that additional bit enabled such that the resulting I/O triggers a channel program check. This shows up occasionally when booting a KVM guest from a passthrough DASD device: ..snip... Interrupt Response Block Data: : 0x0000000000003990 Function Ctrl : [Start] Activity Ctrl : Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending] Device Status : Channel Status : [Program-Check] cpa=: 0x00000000008d0018 prev_ccw=: 0x0000000000000000 this_ccw=: 0x0000000000000000 ...snip... dasd-ipl: Failed to run IPL1 channel program The channel program address of "0x008d0018" in the IRB doesn't look wrong, but tracing the CCWs shows the offending bit enabled: ccw=0x0000012e808d0000 cda=00a0b030 ccw=0x0000012e808d0008 cda=00a0b038 ccw=0x0000012e808d0010 cda=808d0008 ccw=0x0000012e808d0018 cda=00a0b040 Fix the calculation of the TIC CCW's data address such that it points to a valid 31-bit address regardless of the input address. Fixes: bd36cfbbb9e1 ("s390/vfio_ccw_cp: use new address translation helpers") Signed-off-by: Eric Farman <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2024-07-03	sctp: cancel a blocking accept when shutdown a listen socket	Xin Long	1	-4/+10
	As David Laight noticed, "In a multithreaded program it is reasonable to have a thread blocked in accept(). With TCP a subsequent shutdown(listen_fd, SHUT_RDWR) causes the accept to fail. But nothing happens for SCTP." sctp_disconnect() is eventually called when shutdown a listen socket, but nothing is done in this function. This patch sets RCV_SHUTDOWN flag in sk->sk_shutdown there, and adds the check (sk->sk_shutdown & RCV_SHUTDOWN) to break and return in sctp_accept(). Note that shutdown() is only supported on TCP-style SCTP socket. Reported-by: David Laight <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-07-03	net: dsa: microchip: lan937x: disable VPHY support	Lucas Stach	2	-0/+7
	As described by the microchip article "LAN937X - The required configuration for the external MAC port to operate at RGMII-to-RGMII 1Gbps link speed." [1]: "When VPHY is enabled, the auto-negotiation process following IEEE 802.3 standard will be triggered and will result in RGMII-to-RGMII signal failure on the interface because VPHY will try to poll the PHY status that is not available in the scenario of RGMII-to-RGMII connection (normally the link partner is usually an external processor). Note that when VPHY fails on accessing PHY registers, it will fall back to 100Mbps speed, it indicates disabling VPHY is optional if you only need the port to link at 100Mbps speed. Again, VPHY must and can only be disabled by writing VPHY_DISABLE bit in the register below as there is no strapping pin for the control." This patch was tested on LAN9372, so far it seems to not to affect VPHY based clock crossing optimization for the ports with integrated PHYs. [1]: https://microchip.my.site.com/s/article/LAN937X-The-required-configuration-for-the-external-MAC-port-to-operate-at-RGMII-to-RGMII-1Gbps-link-speed Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-07-03	net: dsa: microchip: lan937x: disable in-band status support for RGMII ↵	Lucas Stach	1	-1/+2
	interfaces This driver do not support in-band mode and in case of CPU<->Switch link, this mode is not working any way. So, disable it otherwise ingress path of the switch MAC will stay disabled. Note: lan9372 manual do not document 0xN301 BIT(2) for the RGMII mode and recommend[1] to disable in-band link status update for the RGMII RX path by clearing 0xN302 BIT(0). But, 0xN301 BIT(2) seems to work too, so keep it unified with other KSZ switches. [1] https://microchip.my.site.com/s/article/LAN937X-The-required-configuration-for-the-external-MAC-port-to-operate-at-RGMII-to-RGMII-1Gbps-link-speed Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-07-03	net: dsa: microchip: lan9371/2: add 100BaseTX PHY support	Lucas Stach	3	-0/+6
	On the LAN9371 and LAN9372, the 4th internal PHY is a 100BaseTX PHY instead of a 100BaseT1 PHY. The 100BaseTX PHYs have a different base register offset. Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-07-03	net: stmmac: enable HW-accelerated VLAN stripping for gmac4 only	Furong Xu	1	-3/+4
	Commit 750011e239a5 ("net: stmmac: Add support for HW-accelerated VLAN stripping") enables MAC level VLAN tag stripping for all MAC cores, but leaves set_hw_vlan_mode() and rx_hw_vlan() un-implemented for both gmac and xgmac. On gmac and xgmac, ethtool reports rx-vlan-offload is on, both MAC and driver do nothing about VLAN packets actually, although VLAN works well. Driver level stripping should be used on gmac and xgmac for now. Fixes: 750011e239a5 ("net: stmmac: Add support for HW-accelerated VLAN stripping") Signed-off-by: Furong Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-07-02	Merge branch 'device-memory-tcp'	Jakub Kicinski	11	-186/+316
	Prep patches for Device Memory TCP Pick up a couple of prep patches for Device Memory TCP which stand on their own. Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-02	tools: net: package libynl for use in selftests	Jakub Kicinski	3	-2/+29
	Support building the C YNL userspace library into one big static file. We can then link selftests against it for easy to use C netlink interface. Signed-off-by: Mina Almasry <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-02	page_pool: convert to use netmem	Mina Almasry	8	-184/+287
	Abstract the memory type from the page_pool so we can later add support for new memory types. Convert the page_pool to use the new netmem type abstraction, rather than use struct page directly. As of this patch the netmem type is a no-op abstraction: it's always a struct page underneath. All the page pool internals are converted to use struct netmem instead of struct page, and the page pool now exports 2 APIs: 1. The existing struct page API. 2. The new struct netmem API. Keeping the existing API is transitional; we do not want to refactor all the current drivers using the page pool at once. The netmem abstraction is currently a no-op. The page_pool uses page_to_netmem() to convert allocated pages to netmem, and uses netmem_to_page() to convert the netmem back to pages to pass to mm APIs, Follow up patches to this series add non-paged netmem support to the page_pool. This change is factored out on its own to limit the code churn to this 1 patch, for ease of code review. Signed-off-by: Mina Almasry <[email protected]> Reviewed-by: Pavel Begunkov <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-02	net: ntb_netdev: Move ntb_netdev_rx_handler() to call netif_rx() from ↵	Dave Jiang	1	-1/+1
	__netif_rx() The following is emitted when using idxd (DSA) dmanegine as the data mover for ntb_transport that ntb_netdev uses. [74412.546922] BUG: using smp_processor_id() in preemptible [00000000] code: irq/52-idxd-por/14526 [74412.556784] caller is netif_rx_internal+0x42/0x130 [74412.562282] CPU: 6 PID: 14526 Comm: irq/52-idxd-por Not tainted 6.9.5 #5 [74412.569870] Hardware name: Intel Corporation ArcherCity/ArcherCity, BIOS EGSDCRB1.E9I.1752.P05.2402080856 02/08/2024 [74412.581699] Call Trace: [74412.584514] <TASK> [74412.586933] dump_stack_lvl+0x55/0x70 [74412.591129] check_preemption_disabled+0xc8/0xf0 [74412.596374] netif_rx_internal+0x42/0x130 [74412.600957] __netif_rx+0x20/0xd0 [74412.604743] ntb_netdev_rx_handler+0x66/0x150 [ntb_netdev] [74412.610985] ntb_complete_rxc+0xed/0x140 [ntb_transport] [74412.617010] ntb_rx_copy_callback+0x53/0x80 [ntb_transport] [74412.623332] idxd_dma_complete_txd+0xe3/0x160 [idxd] [74412.628963] idxd_wq_thread+0x1a6/0x2b0 [idxd] [74412.634046] irq_thread_fn+0x21/0x60 [74412.638134] ? irq_thread+0xa8/0x290 [74412.642218] irq_thread+0x1a0/0x290 [74412.646212] ? __pfx_irq_thread_fn+0x10/0x10 [74412.651071] ? __pfx_irq_thread_dtor+0x10/0x10 [74412.656117] ? __pfx_irq_thread+0x10/0x10 [74412.660686] kthread+0x100/0x130 [74412.664384] ? __pfx_kthread+0x10/0x10 [74412.668639] ret_from_fork+0x31/0x50 [74412.672716] ? __pfx_kthread+0x10/0x10 [74412.676978] ret_from_fork_asm+0x1a/0x30 [74412.681457] </TASK> The cause is due to the idxd driver interrupt completion handler uses threaded interrupt and the threaded handler is not hard or soft interrupt context. However __netif_rx() can only be called from interrupt context. Change the call to netif_rx() in order to allow completion via normal context for dmaengine drivers that utilize threaded irq handling. While the following commit changed from netif_rx() to __netif_rx(), baebdf48c360 ("net: dev: Makes sure netif_rx() can be invoked in any context."), the change should've been a noop instead. However, the code precedes this fix should've been using netif_rx_ni() or netif_rx_any_context(). Fixes: 548c237c0a99 ("net: Add support for NTB virtual ethernet device") Reported-by: Jerry Dai <[email protected]> Tested-by: Jerry Dai <[email protected]> Signed-off-by: Dave Jiang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-02	net: phy: aquantia: add missing include guards	Bartosz Golaszewski	1	-0/+5
	The header is missing the include guards so add them. Reviewed-by: Andrew Lunn <[email protected]> Fixes: fb470f70fea7 ("net: phy: aquantia: add hwmon support") Signed-off-by: Bartosz Golaszewski <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>