aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2024-02-07ipv4: add __unregister_nexthop_notifier()Eric Dumazet1-0/+1
unregister_nexthop_notifier() assumes the caller does not hold rtnl. We need in the following patch to use it from a context already holding rtnl. Add __unregister_nexthop_notifier(). unregister_nexthop_notifier() becomes a wrapper. Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Antoine Tenart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-07net: add exit_batch_rtnl() methodEric Dumazet1-0/+3
Many (struct pernet_operations)->exit_batch() methods have to acquire rtnl. In presence of rtnl mutex pressure, this makes cleanup_net() very slow. This patch adds a new exit_batch_rtnl() method to reduce number of rtnl acquisitions from cleanup_net(). exit_batch_rtnl() handlers are called while rtnl is locked, and devices to be killed can be queued in a list provided as their second argument. A single unregister_netdevice_many() is called right before rtnl is released. exit_batch_rtnl() handlers are called before ->exit() and ->exit_batch() handlers. Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Antoine Tenart <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-07Merge tag 'mlx5-updates-2024-02-01' of ↵Jakub Kicinski3-8/+9
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2024-02-01 1) IPSec global stats for xfrm and mlx5 2) XSK memory improvements for non-linear SKBs 3) Software steering debug dump to use seq_file ops 4) Various code clean-ups * tag 'mlx5-updates-2024-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: XDP, Exclude headroom and tailroom from memory calculations net/mlx5e: XSK, Exclude tailroom from non-linear SKBs memory calculations net/mlx5: DR, Change SWS usage to debug fs seq_file interface net/mlx5: Change missing SyncE capability print to debug net/mlx5: Remove initial segmentation duplicate definitions net/mlx5: Return specific error code for timeout on wait_fw_init net/mlx5: SF, Stop waiting for FW as teardown was called net/mlx5: remove fw reporter dump option for non PF net/mlx5: remove fw_fatal reporter dump option for non PF net/mlx5: Rename mlx5_sf_dev_remove Documentation: Fix counter name of mlx5 vnic reporter net/mlx5e: Delete obsolete IPsec code net/mlx5e: Connect mlx5 IPsec statistics with XFRM core xfrm: get global statistics from the offloaded device xfrm: generalize xdo_dev_state_update_curlft to allow statistics update ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-07net: mdio: add 2.5g and 5g related PMA speed constantsMarek Behún1-0/+2
Add constants indicating 2.5g and 5g ability in the MMD PMA speed register. Signed-off-by: Marek Behún <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-07net: Do not return value from init_dummy_netdev()Amit Cohen1-1/+1
init_dummy_netdev() always returns zero and all the callers do not check the returned value. Set the function to not return value, as it is not really used today. Signed-off-by: Amit Cohen <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-07netlabel: cleanup struct netlbl_lsm_catmapGeorge Guo1-4/+3
Simplify the code from macro NETLBL_CATMAP_MAPTYPE to u64, and fix warning "Macros with complex values should be enclosed in parentheses" on "#define NETLBL_CATMAP_BIT (NETLBL_CATMAP_MAPTYPE)0x01", which is modified to "#define NETLBL_CATMAP_BIT ((u64)0x01)". Signed-off-by: George Guo <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-06net: phy: add helper phy_advertise_eee_allHeiner Kallweit1-0/+1
Per default phylib preserves the EEE advertising at the time of phy probing. The EEE advertising can be changed from user space, in addition this helper allows to set the EEE advertising to all supported modes from drivers in kernel space. Suggested-by: Andrew Lunn <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-06bonding: Add independent control state machineAahil Awatramani4-0/+27
Add support for the independent control state machine per IEEE 802.1AX-2008 5.4.15 in addition to the existing implementation of the coupled control state machine. Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in the LACP MUX state machine for separated handling of an initial Collecting state before the Collecting and Distributing state. This enables a port to be in a state where it can receive incoming packets while not still distributing. This is useful for reducing packet loss when a port begins distributing before its partner is able to collect. Added new functions such as bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to precisely manage the port's collecting and distributing states. Previously, there was no dedicated method to disable TX while keeping RX enabled, which this patch addresses. Note that the regular flow process in the kernel's bonding driver remains unaffected by this patch. The extension requires explicit opt-in by the user (in order to ensure no disruptions for existing setups) via netlink support using the new bonding parameter coupled_control. The default value for coupled_control is set to 1 so as to preserve existing behaviour. Signed-off-by: Aahil Awatramani <[email protected]> Reviewed-by: Hangbin Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-06net: phy: constify phydev->drvRussell King (Oracle)1-1/+1
Device driver structures are shared between all devices that they match, and thus nothing should never write to the device driver structure through the phydev->drv pointer. Let's make this pointer const to catch code that attempts to do so. Suggested-by: Christian Marangi <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-06net: dst: Make dst_destroy() static and return void.Sebastian Andrzej Siewior1-1/+0
Since commit 52df157f17e56 ("xfrm: take refcnt of dst when creating struct xfrm_dst bundle") dst_destroy() returns only NULL and no caller cares about the return value. There are no in in-tree users of dst_destroy() outside of the file. Make dst_destroy() static and return void. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-05net/mlx5: Remove initial segmentation duplicate definitionsGal Pressman1-0/+1
Device definitions belong in mlx5_ifc, remove the duplicates in mlx5_core.h. Signed-off-by: Gal Pressman <[email protected]> Reviewed-by: Jianbo Liu <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2024-02-05xfrm: get global statistics from the offloaded deviceLeon Romanovsky1-0/+3
Iterate over all SAs in order to fill global IPsec statistics. Acked-by: Steffen Klassert <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2024-02-05xfrm: generalize xdo_dev_state_update_curlft to allow statistics updateLeon Romanovsky2-8/+5
In order to allow drivers to fill all statistics, change the name of xdo_dev_state_update_curlft to be xdo_dev_state_update_stats. Acked-by: Steffen Klassert <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2024-02-05sctp: preserve const qualifier in sctp_sk()Eric Dumazet1-4/+1
We can change sctp_sk() to propagate its argument const qualifier, thanks to container_of_const(). Signed-off-by: Eric Dumazet <[email protected]> Cc: Marcelo Ricardo Leitner <[email protected]> Cc: Xin Long <[email protected]> Acked-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-04net: make dev_unreg_count globalEric Dumazet2-2/+1
We can use a global dev_unreg_count counter instead of a per netns one. As a bonus we can factorize the changes done on it for bulk device removals. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-04tun: Fix code style issues in <linux/if_tun.h>Yunjian Wang1-3/+13
This fixes the following code style problem: - WARNING: please, no spaces at the start of a line - CHECK: Please use a blank line after function/struct/union/enum declarations Signed-off-by: Yunjian Wang <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-02net/sched: Add helper macros with module namesMichal Koutný3-0/+6
The macros are preparation for adding module aliases en mass in a separate commit. Although it would be tempting to create aliases like cls-foo for name cls_foo, this could not be used because modprobe utilities treat '-' and '_' interchangeably. In the end, the naming follows pattern of proto modules in linux/net.h. Signed-off-by: Michal Koutný <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski12-30/+50
Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-01Merge tag 'net-6.8-rc3' of ↵Linus Torvalds4-7/+21
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter. As Paolo promised we continue to hammer out issues in our selftests. This is not the end but probably the peak. Current release - regressions: - smc: fix incorrect SMC-D link group matching logic Current release - new code bugs: - eth: bnxt: silence WARN() when device skips a timestamp, it happens Previous releases - regressions: - ipmr: fix null-deref when forwarding mcast packets - conntrack: evaluate window negotiation only for packets in the REPLY direction, otherwise SYN retransmissions trigger incorrect window scale negotiation - ipset: fix performance regression in swap operation Previous releases - always broken: - tcp: add sanity checks to types of pages getting into the rx zerocopy path, we only support basic NIC -> user, no page cache pages etc. - ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv() - nt_tables: more input sanitization changes - dsa: mt7530: fix 10M/100M speed on MediaTek MT7988 switch - bridge: mcast: fix loss of snooping after long uptime, jiffies do wrap on 32bit - xen-netback: properly sync TX responses, protect with locking - phy: mediatek-ge-soc: sync calibration values with MediaTek SDK, increase connection stability - eth: pds: fixes for various teardown, and reset races Misc: - hsr: silence WARN() if we can't alloc supervision frame, it happens" * tag 'net-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) doc/netlink/specs: Add missing attr in rt_link spec idpf: avoid compiler padding in virtchnl2_ptype struct selftests: mptcp: join: stop transfer when check is done (part 2) selftests: mptcp: join: stop transfer when check is done (part 1) selftests: mptcp: allow changing subtests prefix selftests: mptcp: decrease BW in simult flows selftests: mptcp: increase timeout to 30 min selftests: mptcp: add missing kconfig for NF Mangle selftests: mptcp: add missing kconfig for NF Filter in v6 selftests: mptcp: add missing kconfig for NF Filter mptcp: fix data re-injection from stale subflow selftests: net: enable some more knobs selftests: net: add missing config for NF_TARGET_TTL selftests: forwarding: List helper scripts in TEST_FILES Makefile variable selftests: net: List helper scripts in TEST_FILES Makefile variable selftests: net: Remove executable bits from library scripts selftests: bonding: Check initial state selftests: team: Add missing config options hv_netvsc: Fix race condition between netvsc_probe and netvsc_remove xen-netback: properly sync TX responses ...
2024-02-01Merge tag 'hid-for-linus-2024020101' of ↵Linus Torvalds1-11/+0
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - cleanups in the error path in hid-steam (Dan Carpenter) - fixes for Wacom tablets selftests that sneaked in while the CI was taking a break during the year end holidays (Benjamin Tissoires) - null pointer check in nvidia-shield (Kunwu Chan) - memory leak fix in hidraw (Su Hui) - another null pointer fix in i2c-hid-of (Johan Hovold) - another memory leak fix in HID-BPF this time, as well as a double fdget() fix reported by Dan Carpenter (Benjamin Tissoires) - fix for Cirque touchpad when they go on suspend (Kai-Heng Feng) - new device ID in hid-logitech-hidpp: "Logitech G Pro X SuperLight 2" (Jiri Kosina) * tag 'hid-for-linus-2024020101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: bpf: use __bpf_kfunc instead of noinline HID: bpf: actually free hdev memory after attaching a HID-BPF program HID: bpf: remove double fdget() HID: i2c-hid-of: fix NULL-deref on failed power up HID: hidraw: fix a problem of memory leak in hidraw_release() HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend HID: nvidia-shield: Add missing null pointer checks to LED initialization HID: logitech-hidpp: add support for Logitech G Pro X Superlight 2 selftests/hid: wacom: fix confidence tests HID: hid-steam: Fix cleanup in probe() HID: hid-steam: remove pointless error message
2024-02-01Merge tag 'lsm-pr-20240131' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm fixes from Paul Moore: "Two small patches to fix some problems relating to LSM hook return values and how the individual LSMs interact" * tag 'lsm-pr-20240131' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: fix default return value of the socket_getpeersec_*() hooks lsm: fix the logic in security_inode_getsecctx()
2024-02-01Merge tag 'nf-24-01-31' of ↵Jakub Kicinski2-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) TCP conntrack now only evaluates window negotiation for packets in the REPLY direction, from Ryan Schaefer. Otherwise SYN retransmissions trigger incorrect window scale negotiation. From Ryan Schaefer. 2) Restrict tunnel objects to NFPROTO_NETDEV which is where it makes sense to use this object type. 3) Fix conntrack pick up from the middle of SCTP_CID_SHUTDOWN_ACK packets. From Xin Long. 4) Another attempt from Jozsef Kadlecsik to address the slow down of the swap command in ipset. 5) Replace a BUG_ON by WARN_ON_ONCE in nf_log, and consolidate check for the case that the logger is NULL from the read side lock section. 6) Address lack of sanitization for custom expectations. Restrict layer 3 and 4 families to what it is supported by userspace. * tag 'nf-24-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations netfilter: nf_log: replace BUG_ON by WARN_ON_ONCE when putting logger netfilter: ipset: fix performance regression in swap operation netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new netfilter: nf_tables: restrict tunnel object to NFPROTO_NETDEV netfilter: conntrack: correct window scaling with retransmitted SYN ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-01net/mlx5: DPLL, Implement lock status error valueJiri Pirko1-0/+8
Fill-up the lock status error value properly. Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Vadim Fedorenko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-02-01dpll: extend lock_status_get() op by status error and expose to userJiri Pirko1-0/+1
Pass additional argunent status_error over lock_status_get() so drivers can fill it up. In case they do, expose the value over previously introduced attribute to user. Do it only in case the current lock_status is either "unlocked" or "holdover". Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Vadim Fedorenko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-02-01dpll: extend uapi by lock status error attributeJiri Pirko1-0/+30
If the dpll devices goes to state "unlocked" or "holdover", it may be caused by an error. In that case, allow user to see what the error was. Introduce a new attribute and values it can carry. Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Vadim Fedorenko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-02-01cpumask: define cleanup function for cpumasksYury Norov1-0/+3
Now we can simplify code that allocates cpumasks for local needs. Signed-off-by: Yury Norov <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-02-01cpumask: add cpumask_weight_andnot()Yury Norov2-0/+25
Similarly to cpumask_weight_and(), cpumask_weight_andnot() is a handy helper that may help to avoid creating an intermediate mask just to calculate number of bits that set in a 1st given mask, and clear in 2nd one. Signed-off-by: Yury Norov <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-02-01net: dsa: Add KSZ8567 switch supportPhilippe Schenker1-0/+1
This commit introduces support for the KSZ8567, a robust 7-port Ethernet switch. The KSZ8567 features two RGMII/MII/RMII interfaces, each capable of gigabit speeds, complemented by five 10/100 Mbps MAC/PHYs. Signed-off-by: Philippe Schenker <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-01-31af_unix: fix lockdep positive in sk_diag_dump_icons()Eric Dumazet1-6/+14
syzbot reported a lockdep splat [1]. Blamed commit hinted about the possible lockdep violation, and code used unix_state_lock_nested() in an attempt to silence lockdep. It is not sufficient, because unix_state_lock_nested() is already used from unix_state_double_lock(). We need to use a separate subclass. This patch adds a distinct enumeration to make things more explicit. Also use swap() in unix_state_double_lock() as a clean up. v2: add a missing inline keyword to unix_state_lock_nested() [1] WARNING: possible circular locking dependency detected 6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0 Not tainted syz-executor.1/2542 is trying to acquire lock: ffff88808b5df9e8 (rlock-AF_UNIX){+.+.}-{2:2}, at: skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863 but task is already holding lock: ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&u->lock/1){+.+.}-{2:2}: lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 _raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378 sk_diag_dump_icons net/unix/diag.c:87 [inline] sk_diag_fill+0x6ea/0xfe0 net/unix/diag.c:157 sk_diag_dump net/unix/diag.c:196 [inline] unix_diag_dump+0x3e9/0x630 net/unix/diag.c:220 netlink_dump+0x5c1/0xcd0 net/netlink/af_netlink.c:2264 __netlink_dump_start+0x5d7/0x780 net/netlink/af_netlink.c:2370 netlink_dump_start include/linux/netlink.h:338 [inline] unix_diag_handler_dump+0x1c3/0x8f0 net/unix/diag.c:319 sock_diag_rcv_msg+0xe3/0x400 netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2543 sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280 netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline] netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1367 netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] sock_write_iter+0x39a/0x520 net/socket.c:1160 call_write_iter include/linux/fs.h:2085 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xa74/0xca0 fs/read_write.c:590 ksys_write+0x1a0/0x2c0 fs/read_write.c:643 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b -> #0 (rlock-AF_UNIX){+.+.}-{2:2}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863 unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x592/0x890 net/socket.c:2584 ___sys_sendmsg net/socket.c:2638 [inline] __sys_sendmmsg+0x3b2/0x730 net/socket.c:2724 __do_sys_sendmmsg net/socket.c:2753 [inline] __se_sys_sendmmsg net/socket.c:2750 [inline] __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&u->lock/1); lock(rlock-AF_UNIX); lock(&u->lock/1); lock(rlock-AF_UNIX); *** DEADLOCK *** 1 lock held by syz-executor.1/2542: #0: ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089 stack backtrace: CPU: 1 PID: 2542 Comm: syz-executor.1 Not tainted 6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 check_noncircular+0x366/0x490 kernel/locking/lockdep.c:2187 check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863 unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x592/0x890 net/socket.c:2584 ___sys_sendmsg net/socket.c:2638 [inline] __sys_sendmmsg+0x3b2/0x730 net/socket.c:2724 __do_sys_sendmmsg net/socket.c:2753 [inline] __se_sys_sendmmsg net/socket.c:2750 [inline] __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7f26d887cda9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f26d95a60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00007f26d89abf80 RCX: 00007f26d887cda9 RDX: 000000000000003e RSI: 00000000200bd000 RDI: 0000000000000004 RBP: 00007f26d88c947a R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000008c0 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007f26d89abf80 R15: 00007ffcfe081a68 Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-31af_unix: Remove CONFIG_UNIX_SCM.Kuniyuki Iwashima1-3/+4
Originally, the code related to garbage collection was all in garbage.c. Commit f4e65870e5ce ("net: split out functions related to registering inflight socket files") moved some functions to scm.c for io_uring and added CONFIG_UNIX_SCM just in case AF_UNIX was built as module. However, since commit 97154bcf4d1b ("af_unix: Kconfig: make CONFIG_UNIX bool"), AF_UNIX is no longer built separately. Also, io_uring does not support SCM_RIGHTS now. Let's move the functions back to garbage.c Signed-off-by: Kuniyuki Iwashima <[email protected]> Acked-by: Jens Axboe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-31af_unix: Remove io_uring code for GC.Kuniyuki Iwashima1-1/+0
Since commit 705318a99a13 ("io_uring/af_unix: disable sending io_uring over sockets"), io_uring's unix socket cannot be passed via SCM_RIGHTS, so it does not contribute to cyclic reference and no longer be candidate for garbage collection. Also, commit 6e5e6d274956 ("io_uring: drop any code related to SCM_RIGHTS") cleaned up SCM_RIGHTS code in io_uring. Let's do it in AF_UNIX as well by reverting commit 0091bfc81741 ("io_uring/af_unix: defer registered files gc to io_uring release") and commit 10369080454d ("net: reclaim skb->scm_io_uring bit"). Signed-off-by: Kuniyuki Iwashima <[email protected]> Acked-by: Jens Axboe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-31netfilter: ipset: fix performance regression in swap operationJozsef Kadlecsik1-0/+4
The patch "netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test", commit 28628fa9 fixes a race condition. But the synchronize_rcu() added to the swap function unnecessarily slows it down: it can safely be moved to destroy and use call_rcu() instead. Eric Dumazet pointed out that simply calling the destroy functions as rcu callback does not work: sets with timeout use garbage collectors which need cancelling at destroy which can wait. Therefore the destroy functions are split into two: cancelling garbage collectors safely at executing the command received by netlink and moving the remaining part only into the rcu callback. Link: https://lore.kernel.org/lkml/[email protected]/ Fixes: 28628fa952fe ("netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test") Reported-by: Ale Crismani <[email protected]> Reported-by: David Wang <[email protected]> Tested-by: David Wang <[email protected]> Signed-off-by: Jozsef Kadlecsik <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-01-31netfilter: nf_tables: restrict tunnel object to NFPROTO_NETDEVPablo Neira Ayuso1-0/+2
Bail out on using the tunnel dst template from other than netdev family. Add the infrastructure to check for the family in objects. Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support") Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-01-31Merge tag 'nf-next-24-01-29' of ↵David S. Miller2-1/+11
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== nf-next pr 2024-01-29 This batch contains updates for your *next* tree. First three changes, from Phil Sutter, allow userspace to define a table that is exclusively owned by a daemon (via netlink socket aliveness) without auto-removing this table when the userspace program exits. Such table gets marked as orphaned and a restarting management daemon may re-attach/reassume ownership. Next patch, from Pablo, passes already-validated flags variable around rather than having called code re-fetch it from netlnik message. Patches 5 and 6 update ipvs and nf_conncount to use the recently introduced KMEM_CACHE() macro. Last three patches, from myself, tweak kconfig logic a little to permit a kernel configuration that can run iptables-over-nftables but not classic (setsockopt) iptables. Such builds lack the builtin-filter/mangle/raw/nat/security tables, the set/getsockopt interface and the "old blob format" interpreter/traverser. For now, this is 'oldconfig friendly', users need to manually deselect existing config options for this. ==================== Signed-off-by: David S. Miller <[email protected]>
2024-01-31ethtool: add linkmode bitmap support to struct ethtool_keeeHeiner Kallweit1-0/+3
Add linkmode bitmap members to struct ethtool_keee, but keep the legacy u32 bitmaps for compatibility with existing drivers. Use linkmode "supported" not being empty as indicator that a user wants to use the linkmode bitmap members instead of the legacy bitmaps. Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-31ethtool: add suffix _u32 to legacy bitmap members of struct ethtool_keeeHeiner Kallweit1-3/+3
This is in preparation of using the existing names for linkmode bitmaps. Suggested-by: Andrew Lunn <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-31ethtool: adjust struct ethtool_keee to kernel needsHeiner Kallweit1-5/+3
This patch changes the following in struct ethtool_keee - remove member cmd, it's not needed on kernel side - remove reserved fields - switch the semantically boolean members to type bool We don't have to change any user of the boolean members due to the implicit casting from/to bool. A small change is needed where a pointer to bool members is used, in addition remove few now unneeded double negations. Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-31ethtool: replace struct ethtool_eee with a new struct ethtool_keee on kernel ↵Heiner Kallweit4-10/+22
side In order to pass EEE link modes beyond bit 32 to userspace we have to complement the 32 bit bitmaps in struct ethtool_eee with linkmode bitmaps. Therefore, similar to ethtool_link_settings and ethtool_link_ksettings, add a struct ethtool_keee. In a first step it's an identical copy of ethtool_eee. This patch simply does a s/ethtool_eee/ethtool_keee/g for all users. No functional change intended. Suggested-by: Andrew Lunn <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-31net: stmmac: Offload queueMaxSDU from tc-taprioRohan G Thomas1-0/+1
Add support for configuring queueMaxSDU. As DWMAC IPs doesn't support queueMaxSDU table handle this in the SW. The maximum 802.3 frame size that is allowed to be transmitted by any queue is queueMaxSDU + 16 bytes (i.e. 6 bytes SA + 6 bytes DA + 4 bytes FCS). Inspired from intel i225 driver. Signed-off-by: Rohan G Thomas <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-31HID: bpf: use __bpf_kfunc instead of noinlineBenjamin Tissoires1-11/+0
Follow the docs at Documentation/bpf/kfuncs.rst: - declare the function with `__bpf_kfunc` - disables missing prototype warnings, which allows to remove them from include/linux/hid-bpf.h Removing the prototypes is not an issue because we currently have to redeclare them when writing the BPF program. They will eventually be generated by bpftool directly AFAIU. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]>
2024-01-30lsm: fix default return value of the socket_getpeersec_*() hooksOndrej Mosnacek1-2/+2
For these hooks the true "neutral" value is -EOPNOTSUPP, which is currently what is returned when no LSM provides this hook and what LSMs return when there is no security context set on the socket. Correct the value in <linux/lsm_hooks.h> and adjust the dispatch functions in security/security.c to avoid issues when the BPF LSM is enabled. Cc: [email protected] Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Signed-off-by: Ondrej Mosnacek <[email protected]> [PM: subject line tweak] Signed-off-by: Paul Moore <[email protected]>
2024-01-29Merge tag 'mm-hotfixes-stable-2024-01-28-23-21' of ↵Linus Torvalds2-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "22 hotfixes. 11 are cc:stable and the remainder address post-6.7 issues or aren't considered appropriate for backporting" * tag 'mm-hotfixes-stable-2024-01-28-23-21' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) mm: thp_get_unmapped_area must honour topdown preference mm: huge_memory: don't force huge page alignment on 32 bit userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb selftests/mm: ksm_tests should only MADV_HUGEPAGE valid memory scs: add CONFIG_MMU dependency for vfree_atomic() mm/memory: fix folio_set_dirty() vs. folio_mark_dirty() in zap_pte_range() mm/huge_memory: fix folio_set_dirty() vs. folio_mark_dirty() selftests/mm: Update va_high_addr_switch.sh to check CPU for la57 flag selftests: mm: fix map_hugetlb failure on 64K page size systems MAINTAINERS: supplement of zswap maintainers update stackdepot: make fast paths lock-less again stackdepot: add stats counters exported via debugfs mm, kmsan: fix infinite recursion due to RCU critical section mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again selftests/mm: switch to bash from sh MAINTAINERS: add man-pages git trees mm: memcontrol: don't throttle dying tasks on memory.high mm: mmap: map MAP_STACK to VM_NOHUGEPAGE uprobes: use pagesize-aligned virtual address when replacing pages selftests/mm: mremap_test: fix build warning ...
2024-01-29netfilter: nf_tables: Implement table adoption supportPhil Sutter1-0/+6
Allow a new process to take ownership of a previously owned table, useful mostly for firewall management services restarting or suspending when idle. By extending __NFT_TABLE_F_UPDATE, the on/off/on check in nf_tables_updtable() also covers table adoption, although it is actually not needed: Table adoption is irreversible because nf_tables_updtable() rejects attempts to drop NFT_TABLE_F_OWNER so table->nlpid setting can happen just once within the transaction. If the transaction commences, table's nlpid and flags fields are already set and no further action is required. If it aborts, the table returns to orphaned state. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2024-01-29netfilter: nf_tables: Introduce NFT_TABLE_F_PERSISTPhil Sutter1-1/+4
This companion flag to NFT_TABLE_F_OWNER requests the kernel to keep the table around after the process has exited. It marks such table as orphaned (by dropping OWNER flag but keeping PERSIST flag in place), which opens it for other processes to manipulate. For the sake of simplicity, PERSIST flag may not be altered though. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2024-01-29netfilter: uapi: Document NFT_TABLE_F_OWNER flagPhil Sutter1-0/+1
Add at least this one-liner describing the obvious. Fixes: 6001a930ce03 ("netfilter: nftables: introduce table ownership") Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2024-01-29ptp: add FemtoClock3 Wireless as ptp hardware clockMin Li1-0/+273
The RENESAS FemtoClock3 Wireless is a high-performance jitter attenuator, frequency translator, and clock synthesizer. The device is comprised of 3 digital PLLs (DPLL) to track CLKIN inputs and three independent low phase noise fractional output dividers (FOD) that output low phase noise clocks. FemtoClock3 supports one Time Synchronization (Time Sync) channel to enable an external processor to control the phase and frequency of the Time Sync channel and to take phase measurements using the TDC. Intended applications are synchronization using the precision time protocol (PTP) and synchronization with 0.5 Hz and 1 Hz signals from GNSS. Signed-off-by: Min Li <[email protected]> Acked-by: Lee Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-29ptp: introduce PTP_CLOCK_EXTOFF event for the measured external offsetMin Li2-3/+13
This change is for the PHC devices that can measure the phase offset between PHC signal and the external signal, such as the 1PPS signal of GNSS. Reporting PTP_CLOCK_EXTOFF to user space will be piggy-backed to the existing ptp_extts_event so that application such as ts2phc can poll the external offset the same way as extts. Hence, ts2phc can use the offset to achieve the alignment between PHC and the external signal by the help of either SW or HW filters. Signed-off-by: Min Li <[email protected]> Acked-by: Richard Cochran <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-28Merge tag 'x86_urgent_for_v6.8_rc2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Make sure 32-bit syscall registers are properly sign-extended - Add detection for AMD's Zen5 generation CPUs and Intel's Clearwater Forest CPU model number - Make a stub function export non-GPL because it is part of the paravirt alternatives and that can be used by non-GPL code * tag 'x86_urgent_for_v6.8_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Add more models to X86_FEATURE_ZEN5 x86/entry/ia32: Ensure s32 is sign extended to s64 x86/cpu: Add model number for Intel Clearwater Forest processor x86/CPU/AMD: Add X86_FEATURE_ZEN5 x86/paravirt: Make BUG_func() usable by non-GPL modules
2024-01-26Merge tag 'for-netdev' of ↵Jakub Kicinski10-65/+341
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-01-26 We've added 107 non-merge commits during the last 4 day(s) which contain a total of 101 files changed, 6009 insertions(+), 1260 deletions(-). The main changes are: 1) Add BPF token support to delegate a subset of BPF subsystem functionality from privileged system-wide daemons such as systemd through special mount options for userns-bound BPF fs to a trusted & unprivileged application. With addressed changes from Christian and Linus' reviews, from Andrii Nakryiko. 2) Support registration of struct_ops types from modules which helps projects like fuse-bpf that seeks to implement a new struct_ops type, from Kui-Feng Lee. 3) Add support for retrieval of cookies for perf/kprobe multi links, from Jiri Olsa. 4) Bigger batch of prep-work for the BPF verifier to eventually support preserving boundaries and tracking scalars on narrowing fills, from Maxim Mikityanskiy. 5) Extend the tc BPF flavor to support arbitrary TCP SYN cookies to help with the scenario of SYN floods, from Kuniyuki Iwashima. 6) Add code generation to inline the bpf_kptr_xchg() helper which improves performance when stashing/popping the allocated BPF objects, from Hou Tao. 7) Extend BPF verifier to track aligned ST stores as imprecise spilled registers, from Yonghong Song. 8) Several fixes to BPF selftests around inline asm constraints and unsupported VLA code generation, from Jose E. Marchesi. 9) Various updates to the BPF IETF instruction set draft document such as the introduction of conformance groups for instructions, from Dave Thaler. 10) Fix BPF verifier to make infinite loop detection in is_state_visited() exact to catch some too lax spill/fill corner cases, from Eduard Zingerman. 11) Refactor the BPF verifier pointer ALU check to allow ALU explicitly instead of implicitly for various register types, from Hao Sun. 12) Fix the flaky tc_redirect_dtime BPF selftest due to slowness in neighbor advertisement at setup time, from Martin KaFai Lau. 13) Change BPF selftests to skip callback tests for the case when the JIT is disabled, from Tiezhu Yang. 14) Add a small extension to libbpf which allows to auto create a map-in-map's inner map, from Andrey Grafin. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (107 commits) selftests/bpf: Add missing line break in test_verifier bpf, docs: Clarify definitions of various instructions bpf: Fix error checks against bpf_get_btf_vmlinux(). bpf: One more maintainer for libbpf and BPF selftests selftests/bpf: Incorporate LSM policy to token-based tests selftests/bpf: Add tests for LIBBPF_BPF_TOKEN_PATH envvar libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar selftests/bpf: Add tests for BPF object load with implicit token selftests/bpf: Add BPF object loading tests with explicit token passing libbpf: Wire up BPF token support at BPF object level libbpf: Wire up token_fd into feature probing logic libbpf: Move feature detection code into its own file libbpf: Further decouple feature checking logic from bpf_object libbpf: Split feature detectors definitions from cached results selftests/bpf: Utilize string values for delegate_xxx mount options bpf: Support symbolic BPF FS delegation mount options bpf: Fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS bpf,selinux: Allocate bpf_security_struct per BPF token selftests/bpf: Add BPF token-enabled tests libbpf: Add BPF token support to bpf_prog_load() API ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-26ipmr: fix kernel panic when forwarding mcast packetsNicolas Dichtel1-1/+1
The stacktrace was: [ 86.305548] BUG: kernel NULL pointer dereference, address: 0000000000000092 [ 86.306815] #PF: supervisor read access in kernel mode [ 86.307717] #PF: error_code(0x0000) - not-present page [ 86.308624] PGD 0 P4D 0 [ 86.309091] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 86.309883] CPU: 2 PID: 3139 Comm: pimd Tainted: G U 6.8.0-6wind-knet #1 [ 86.311027] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014 [ 86.312728] RIP: 0010:ip_mr_forward (/build/work/knet/net/ipv4/ipmr.c:1985) [ 86.313399] Code: f9 1f 0f 87 85 03 00 00 48 8d 04 5b 48 8d 04 83 49 8d 44 c5 00 48 8b 40 70 48 39 c2 0f 84 d9 00 00 00 49 8b 46 58 48 83 e0 fe <80> b8 92 00 00 00 00 0f 84 55 ff ff ff 49 83 47 38 01 45 85 e4 0f [ 86.316565] RSP: 0018:ffffad21c0583ae0 EFLAGS: 00010246 [ 86.317497] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 86.318596] RDX: ffff9559cb46c000 RSI: 0000000000000000 RDI: 0000000000000000 [ 86.319627] RBP: ffffad21c0583b30 R08: 0000000000000000 R09: 0000000000000000 [ 86.320650] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 86.321672] R13: ffff9559c093a000 R14: ffff9559cc00b800 R15: ffff9559c09c1d80 [ 86.322873] FS: 00007f85db661980(0000) GS:ffff955a79d00000(0000) knlGS:0000000000000000 [ 86.324291] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 86.325314] CR2: 0000000000000092 CR3: 000000002f13a000 CR4: 0000000000350ef0 [ 86.326589] Call Trace: [ 86.327036] <TASK> [ 86.327434] ? show_regs (/build/work/knet/arch/x86/kernel/dumpstack.c:479) [ 86.328049] ? __die (/build/work/knet/arch/x86/kernel/dumpstack.c:421 /build/work/knet/arch/x86/kernel/dumpstack.c:434) [ 86.328508] ? page_fault_oops (/build/work/knet/arch/x86/mm/fault.c:707) [ 86.329107] ? do_user_addr_fault (/build/work/knet/arch/x86/mm/fault.c:1264) [ 86.329756] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223) [ 86.330350] ? __irq_work_queue_local (/build/work/knet/kernel/irq_work.c:111 (discriminator 1)) [ 86.331013] ? exc_page_fault (/build/work/knet/./arch/x86/include/asm/paravirt.h:693 /build/work/knet/arch/x86/mm/fault.c:1515 /build/work/knet/arch/x86/mm/fault.c:1563) [ 86.331702] ? asm_exc_page_fault (/build/work/knet/./arch/x86/include/asm/idtentry.h:570) [ 86.332468] ? ip_mr_forward (/build/work/knet/net/ipv4/ipmr.c:1985) [ 86.333183] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223) [ 86.333920] ipmr_mfc_add (/build/work/knet/./include/linux/rcupdate.h:782 /build/work/knet/net/ipv4/ipmr.c:1009 /build/work/knet/net/ipv4/ipmr.c:1273) [ 86.334583] ? __pfx_ipmr_hash_cmp (/build/work/knet/net/ipv4/ipmr.c:363) [ 86.335357] ip_mroute_setsockopt (/build/work/knet/net/ipv4/ipmr.c:1470) [ 86.336135] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223) [ 86.336854] ? ip_mroute_setsockopt (/build/work/knet/net/ipv4/ipmr.c:1470) [ 86.337679] do_ip_setsockopt (/build/work/knet/net/ipv4/ip_sockglue.c:944) [ 86.338408] ? __pfx_unix_stream_read_actor (/build/work/knet/net/unix/af_unix.c:2862) [ 86.339232] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223) [ 86.339809] ? aa_sk_perm (/build/work/knet/security/apparmor/include/cred.h:153 /build/work/knet/security/apparmor/net.c:181) [ 86.340342] ip_setsockopt (/build/work/knet/net/ipv4/ip_sockglue.c:1415) [ 86.340859] raw_setsockopt (/build/work/knet/net/ipv4/raw.c:836) [ 86.341408] ? security_socket_setsockopt (/build/work/knet/security/security.c:4561 (discriminator 13)) [ 86.342116] sock_common_setsockopt (/build/work/knet/net/core/sock.c:3716) [ 86.342747] do_sock_setsockopt (/build/work/knet/net/socket.c:2313) [ 86.343363] __sys_setsockopt (/build/work/knet/./include/linux/file.h:32 /build/work/knet/net/socket.c:2336) [ 86.344020] __x64_sys_setsockopt (/build/work/knet/net/socket.c:2340) [ 86.344766] do_syscall_64 (/build/work/knet/arch/x86/entry/common.c:52 /build/work/knet/arch/x86/entry/common.c:83) [ 86.345433] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223) [ 86.346161] ? syscall_exit_work (/build/work/knet/./include/linux/audit.h:357 /build/work/knet/kernel/entry/common.c:160) [ 86.346938] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223) [ 86.347657] ? syscall_exit_to_user_mode (/build/work/knet/kernel/entry/common.c:215) [ 86.348538] ? srso_return_thunk (/build/work/knet/arch/x86/lib/retpoline.S:223) [ 86.349262] ? do_syscall_64 (/build/work/knet/./arch/x86/include/asm/cpufeature.h:171 /build/work/knet/arch/x86/entry/common.c:98) [ 86.349971] entry_SYSCALL_64_after_hwframe (/build/work/knet/arch/x86/entry/entry_64.S:129) The original packet in ipmr_cache_report() may be queued and then forwarded with ip_mr_forward(). This last function has the assumption that the skb dst is set. After the below commit, the skb dst is dropped by ipv4_pktinfo_prepare(), which causes the oops. Fixes: bb7403655b3c ("ipmr: support IP_PKTINFO on cache report IGMP msg") Signed-off-by: Nicolas Dichtel <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>