path: root/net
Age | Commit message | Author | Files | Lines
2022-04-13 | net: rtnetlink: add ndm flags and state mask attributes | Nikolay Aleksandrov | 1 | -0/+2
Add ndm flags/state masks which will be used for bulk delete filtering. All of these are used by the bridge and vxlan drivers. Minimal attr policy validation is also added; it is up to ndo_fdb_del_bulk implementers to further validate them. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | net: bridge: fdb: add support for fine-grained flushing | Nikolay Aleksandrov | 4 | -12/+54
Add the ability to specify exactly which fdbs are to be flushed. They are described by a new structure, net_bridge_fdb_flush_desc. Currently it can match on port/bridge ifindex, vlan id and fdb flags. It is used to describe the existing dynamic fdb flush operation. Note that this flush operation doesn't treat permanent entries in a special way (fdb_delete vs fdb_delete_local): it will delete them regardless of whether any port is using them, so currently it can't directly replace deletes which need to handle that case, although we can extend it later for that too. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
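The matching described above can be sketched in userspace roughly as follows. This is only an illustration: the `fdb_entry` type, the field layout, the flag names and the convention that a zero ifindex or vlan id means "match any" are assumptions made for the example, not the bridge code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits describing an fdb entry. */
#define FDB_STATIC (1UL << 0)
#define FDB_LOCAL  (1UL << 1)

/* Hypothetical stand-in for a forwarding database entry. */
struct fdb_entry {
	int port_ifindex;       /* port the entry points to */
	uint16_t vlan_id;
	unsigned long flags;
};

/* Hypothetical flush descriptor: what to match before deleting. */
struct fdb_flush_desc {
	int port_ifindex;          /* 0 is taken to mean "any port" */
	uint16_t vlan_id;          /* 0 is taken to mean "any vlan" */
	unsigned long flags;       /* required values of the selected bits */
	unsigned long flags_mask;  /* which flag bits take part in the match */
};

/* Returns true when the entry is selected by the descriptor. */
bool fdb_flush_matches(const struct fdb_entry *f,
		       const struct fdb_flush_desc *d)
{
	if (d->port_ifindex && f->port_ifindex != d->port_ifindex)
		return false;
	if (d->vlan_id && f->vlan_id != d->vlan_id)
		return false;
	/* Only the bits named in flags_mask are compared. */
	return (f->flags & d->flags_mask) == d->flags;
}
```

With this shape, a "flush dynamic entries" request would be a descriptor whose mask selects the static bit and whose flags value leaves it clear.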
2022-04-13 | net: bridge: fdb: add ndo_fdb_del_bulk | Nikolay Aleksandrov | 3 | -0/+27
Add a minimal ndo_fdb_del_bulk implementation which flushes all entries. Support for more fine-grained filtering will be added in the following patches. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | net: rtnetlink: add NLM_F_BULK support to rtnl_fdb_del | Nikolay Aleksandrov | 1 | -19/+48
When NLM_F_BULK is specified in an fdb del message, we need to handle it differently. First, since this is a new call, we can strictly validate the passed attributes: at first only ifindex and vlan are allowed, as these will be the initially supported filter attributes, and any other attribute is rejected. The MAC address is no longer mandatory, but we use it to error out in older kernels, because it cannot be specified with a bulk request (the attribute is not allowed). We then have to dispatch the call to ndo_fdb_del_bulk if the device supports it. The del bulk callback can do further validation of the attributes if necessary. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | net: rtnetlink: add bulk delete support flag | Nikolay Aleksandrov | 1 | -0/+8
Add a new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to verify that the delete operation allows bulk object deletion. Also emit a warning if anyone tries to set it for a non-delete kind. Suggested-by: David Ahern <[email protected]> Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | net: rtnetlink: add helper to extract msg type's kind | Nikolay Aleksandrov | 1 | -1/+1
Add a helper which extracts the msg type's kind using the kind mask (0x3). Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
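A rough userspace sketch of what such a helper does. The enum and function names here are made up for the example; the RTM_* values are the ones from the rtnetlink uapi header, where each message family occupies four consecutive type numbers, so the low two bits select new/del/get/set.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names, not necessarily the kernel's identifiers. */
enum msg_kind { KIND_NEW = 0, KIND_DEL = 1, KIND_GET = 2, KIND_SET = 3 };
#define KIND_MASK 0x3

/* Message type numbers as found in linux/rtnetlink.h. */
#define RTM_NEWLINK  16
#define RTM_DELLINK  17
#define RTM_GETLINK  18
#define RTM_SETLINK  19
#define RTM_NEWNEIGH 28
#define RTM_DELNEIGH 29

/* Extract the kind from a message type by masking the low two bits. */
enum msg_kind msgtype_kind(int msgtype)
{
	return (enum msg_kind)(msgtype & KIND_MASK);
}

/* A bulk flag only makes sense on delete-kind messages. */
bool msgtype_is_delete(int msgtype)
{
	return msgtype_kind(msgtype) == KIND_DEL;
}
```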
2022-04-13 | net: rtnetlink: add msg kind names | Nikolay Aleksandrov | 1 | -3/+3
Add rtnl kind names instead of using raw values. We'll need to check for DEL kind later to validate bulk flag support. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | Revert "net: dsa: setup master before ports" | Vladimir Oltean | 1 | -13/+10
This reverts commit 11fd667dac315ea3f2469961f6d2869271a46cae. dsa_slave_change_mtu() updates the MTU of the DSA master and of the associated CPU port, but only if it detects a change to the master MTU. The blamed commit in the Fixes: tag below addressed a regression where dsa_slave_change_mtu() would return early and not do anything due to ds->ops->port_change_mtu() not being implemented. However, that commit also had the effect that the master MTU got set up to the correct value by dsa_master_setup(), but the associated CPU port's MTU did not get updated. This causes breakage for drivers that rely on the ->port_change_mtu() DSA call to account for the tagging overhead on the CPU port, and don't set up the initial MTU during the setup phase. Things actually worked before because they were in a fragile equilibrium where dsa_slave_change_mtu() was called before dsa_master_setup() was. So dsa_slave_change_mtu() could actually detect a change and update the CPU port MTU too. Restore the code to the way things used to work by reverting the reorder of dsa_tree_setup_master() and dsa_tree_setup_ports(). That change did not have a concrete motivation going for it anyway, it just looked better. Fixes: 066dfc429040 ("Revert "net: dsa: stop updating master MTU from master.c"") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | NFC: NULL out the dev->rfkill to prevent UAF | Lin Ma | 1 | -0/+1
Commit 3e3b5dfcd16a ("NFC: reorder the logic in nfc_{un,}register_device") assumes that the device_is_registered() check in nfc_dev_up() will detect when the rfkill is unregistered. However, this check only takes effect once device_del(&dev->dev) is done in nfc_unregister_device(). Hence, it is still possible for the rfkill object to be dereferenced.

The crash trace on the latest kernel (5.18-rc2):

[ 68.760105] ==================================================================
[ 68.760330] BUG: KASAN: use-after-free in __lock_acquire+0x3ec1/0x6750
[ 68.760756] Read of size 8 at addr ffff888009c93018 by task fuzz/313
[ 68.760756]
[ 68.760756] CPU: 0 PID: 313 Comm: fuzz Not tainted 5.18.0-rc2 #4
[ 68.760756] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 68.760756] Call Trace:
[ 68.760756] <TASK>
[ 68.760756] dump_stack_lvl+0x57/0x7d
[ 68.760756] print_report.cold+0x5e/0x5db
[ 68.760756] ? __lock_acquire+0x3ec1/0x6750
[ 68.760756] kasan_report+0xbe/0x1c0
[ 68.760756] ? __lock_acquire+0x3ec1/0x6750
[ 68.760756] __lock_acquire+0x3ec1/0x6750
[ 68.760756] ? lockdep_hardirqs_on_prepare+0x410/0x410
[ 68.760756] ? register_lock_class+0x18d0/0x18d0
[ 68.760756] lock_acquire+0x1ac/0x4f0
[ 68.760756] ? rfkill_blocked+0xe/0x60
[ 68.760756] ? lockdep_hardirqs_on_prepare+0x410/0x410
[ 68.760756] ? mutex_lock_io_nested+0x12c0/0x12c0
[ 68.760756] ? nla_get_range_signed+0x540/0x540
[ 68.760756] ? _raw_spin_lock_irqsave+0x4e/0x50
[ 68.760756] _raw_spin_lock_irqsave+0x39/0x50
[ 68.760756] ? rfkill_blocked+0xe/0x60
[ 68.760756] rfkill_blocked+0xe/0x60
[ 68.760756] nfc_dev_up+0x84/0x260
[ 68.760756] nfc_genl_dev_up+0x90/0xe0
[ 68.760756] genl_family_rcv_msg_doit+0x1f4/0x2f0
[ 68.760756] ? genl_family_rcv_msg_attrs_parse.constprop.0+0x230/0x230
[ 68.760756] ? security_capable+0x51/0x90
[ 68.760756] genl_rcv_msg+0x280/0x500
[ 68.760756] ? genl_get_cmd+0x3c0/0x3c0
[ 68.760756] ? lock_acquire+0x1ac/0x4f0
[ 68.760756] ? nfc_genl_dev_down+0xe0/0xe0
[ 68.760756] ? lockdep_hardirqs_on_prepare+0x410/0x410
[ 68.760756] netlink_rcv_skb+0x11b/0x340
[ 68.760756] ? genl_get_cmd+0x3c0/0x3c0
[ 68.760756] ? netlink_ack+0x9c0/0x9c0
[ 68.760756] ? netlink_deliver_tap+0x136/0xb00
[ 68.760756] genl_rcv+0x1f/0x30
[ 68.760756] netlink_unicast+0x430/0x710
[ 68.760756] ? memset+0x20/0x40
[ 68.760756] ? netlink_attachskb+0x740/0x740
[ 68.760756] ? __build_skb_around+0x1f4/0x2a0
[ 68.760756] netlink_sendmsg+0x75d/0xc00
[ 68.760756] ? netlink_unicast+0x710/0x710
[ 68.760756] ? netlink_unicast+0x710/0x710
[ 68.760756] sock_sendmsg+0xdf/0x110
[ 68.760756] __sys_sendto+0x19e/0x270
[ 68.760756] ? __ia32_sys_getpeername+0xa0/0xa0
[ 68.760756] ? fd_install+0x178/0x4c0
[ 68.760756] ? fd_install+0x195/0x4c0
[ 68.760756] ? kernel_fpu_begin_mask+0x1c0/0x1c0
[ 68.760756] __x64_sys_sendto+0xd8/0x1b0
[ 68.760756] ? lockdep_hardirqs_on+0xbf/0x130
[ 68.760756] ? syscall_enter_from_user_mode+0x1d/0x50
[ 68.760756] do_syscall_64+0x3b/0x90
[ 68.760756] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 68.760756] RIP: 0033:0x7f67fb50e6b3
...
[ 68.760756] RSP: 002b:00007f67fa91fe90 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
[ 68.760756] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f67fb50e6b3
[ 68.760756] RDX: 000000000000001c RSI: 0000559354603090 RDI: 0000000000000003
[ 68.760756] RBP: 00007f67fa91ff00 R08: 00007f67fa91fedc R09: 000000000000000c
[ 68.760756] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe824d496e
[ 68.760756] R13: 00007ffe824d496f R14: 00007f67fa120000 R15: 0000000000000003
[ 68.760756] </TASK>
[ 68.760756]
[ 68.760756] Allocated by task 279:
[ 68.760756] kasan_save_stack+0x1e/0x40
[ 68.760756] __kasan_kmalloc+0x81/0xa0
[ 68.760756] rfkill_alloc+0x7f/0x280
[ 68.760756] nfc_register_device+0xa3/0x1a0
[ 68.760756] nci_register_device+0x77a/0xad0
[ 68.760756] nfcmrvl_nci_register_dev+0x20b/0x2c0
[ 68.760756] nfcmrvl_nci_uart_open+0xf2/0x1dd
[ 68.760756] nci_uart_tty_ioctl+0x2c3/0x4a0
[ 68.760756] tty_ioctl+0x764/0x1310
[ 68.760756] __x64_sys_ioctl+0x122/0x190
[ 68.760756] do_syscall_64+0x3b/0x90
[ 68.760756] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 68.760756]
[ 68.760756] Freed by task 314:
[ 68.760756] kasan_save_stack+0x1e/0x40
[ 68.760756] kasan_set_track+0x21/0x30
[ 68.760756] kasan_set_free_info+0x20/0x30
[ 68.760756] __kasan_slab_free+0x108/0x170
[ 68.760756] kfree+0xb0/0x330
[ 68.760756] device_release+0x96/0x200
[ 68.760756] kobject_put+0xf9/0x1d0
[ 68.760756] nfc_unregister_device+0x77/0x190
[ 68.760756] nfcmrvl_nci_unregister_dev+0x88/0xd0
[ 68.760756] nci_uart_tty_close+0xdf/0x180
[ 68.760756] tty_ldisc_kill+0x73/0x110
[ 68.760756] tty_ldisc_hangup+0x281/0x5b0
[ 68.760756] __tty_hangup.part.0+0x431/0x890
[ 68.760756] tty_release+0x3a8/0xc80
[ 68.760756] __fput+0x1f0/0x8c0
[ 68.760756] task_work_run+0xc9/0x170
[ 68.760756] exit_to_user_mode_prepare+0x194/0x1a0
[ 68.760756] syscall_exit_to_user_mode+0x19/0x50
[ 68.760756] do_syscall_64+0x48/0x90
[ 68.760756] entry_SYSCALL_64_after_hwframe+0x44/0xae

This patch just adds the NULL-out of dev->rfkill to make sure such a dereference cannot happen. This is safe since device_lock() already protects the check/write from data races.

Fixes: 3e3b5dfcd16a ("NFC: reorder the logic in nfc_{un,}register_device")
Signed-off-by: Lin Ma <[email protected]>
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | ipv6: exthdrs: use swap() instead of open coding it | Guo Zhengkui | 1 | -4/+1
Address the following coccicheck warning:

net/ipv6/exthdrs.c:620:44-45: WARNING opportunity for swap()

by using swap() to exchange the variable values, and drop the tmp (`addr`) variable, which is no longer needed. Signed-off-by: Guo Zhengkui <[email protected]> Signed-off-by: David S. Miller <[email protected]>
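For readers unfamiliar with the pattern, here is a small userspace sketch of the before/after shape. The `swap()` macro below is a re-creation for the example (it relies on the GNU `__typeof__` extension), and `struct addr128` is a made-up stand-in for an IPv6 address.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace re-creation of a generic swap helper (GNU __typeof__). */
#define swap(a, b) \
	do { __typeof__(a) tmp__ = (a); (a) = (b); (b) = tmp__; } while (0)

/* Made-up stand-in for a 128-bit address. */
struct addr128 {
	uint8_t b[16];
};

/* Before: open-coded exchange through a named temporary. */
void exchange_open_coded(struct addr128 *x, struct addr128 *y)
{
	struct addr128 addr = *x;

	*x = *y;
	*y = addr;
}

/* After: same effect, the temporary lives inside the macro. */
void exchange_with_swap(struct addr128 *x, struct addr128 *y)
{
	swap(*x, *y);
}
```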
2022-04-13 | tls: rx: only copy IV from the packet for TLS 1.2 | Jakub Kicinski | 1 | -10/+10
TLS 1.3 and ChaChaPoly don't carry IV in the packet. The code before this change would copy out iv_size worth of whatever followed the TLS header in the packet and then for TLS 1.3 | ChaCha overwrite that with the sequence number. Waste of cycles especially with TLS 1.2 being close to dead and TLS 1.3 being the common case. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | tls: rx: use MAX_IV_SIZE for allocations | Jakub Kicinski | 1 | -1/+1
IVs are 8 or 16 bytes, no point reading out the exact value for quantities this small. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | tls: rx: use async as an in-out argument | Jakub Kicinski | 1 | -15/+16
Propagating EINPROGRESS thru multiple layers of functions is error prone. Use darg->async as an in/out argument, like we use darg->zc today. On input it tells the code if async is allowed, on output if it took place. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
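The in/out flag pattern can be illustrated with a toy example. The structure below only borrows the two field names mentioned in the message; the function and its `backend_defers` parameter are invented to model a crypto engine that may or may not complete asynchronously.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy decrypt arguments; only the in/out flag pattern is the point. */
struct decrypt_args {
	bool zc;     /* in: zero-copy allowed, out: zero-copy used */
	bool async;  /* in: async allowed,     out: async took place */
};

/*
 * Hypothetical decrypt step. Instead of returning a special "in progress"
 * code that every caller has to forward, it returns plain success and
 * records in darg->async whether completion was deferred.
 */
int do_decrypt(struct decrypt_args *darg, bool backend_defers)
{
	/* Deferred completion happens only if allowed and the backend defers. */
	darg->async = darg->async && backend_defers;
	return 0;
}
```

Callers then check `darg.async` after the call rather than comparing the return value against a sentinel at each layer.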
2022-04-13 | tls: rx: return the already-copied data on crypto error | Jakub Kicinski | 1 | -6/+10
The async crypto handler will report the socket error, so there is no need to report it again. We can, however, let the data we already copied be reported to user space, but we need to make sure the error will be reported next time around. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | tls: rx: treat process_rx_list() errors as transient | Jakub Kicinski | 1 | -12/+8
process_rx_list() only fails if it can't copy data to user space. There is no point recording the error onto sk->sk_err or giving up on the data which was read partially. Treat the return value like a normal socket partial read. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
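The "normal socket partial read" convention mentioned above boils down to the following, shown here as a standalone sketch with an invented helper name: bytes that were already delivered take precedence over a late error.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: `copied` is the number of bytes already handed to
 * the caller, `err` is a negative errno from a later step (or 0).
 */
long finish_read(long copied, int err)
{
	/* Report progress if there was any; otherwise report the error. */
	return copied > 0 ? copied : err;
}
```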
2022-04-13 | tls: rx: assume crypto always calls our callback | Jakub Kicinski | 1 | -3/+0
If crypto didn't always invoke our callback for async we'd not be clearing skb->sk and would crash in the skb core when freeing it. This if must be dead code. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | tls: rx: don't handle TLS 1.3 in the async crypto callback | Jakub Kicinski | 1 | -10/+5
Async crypto never worked with TLS 1.3 and was explicitly disabled in commit 8497ded2d16c ("net/tls: Disable async decrytion for tls1.3"). There's no need for us to handle TLS 1.3 padding in the async cb. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | tls: rx: move counting TlsDecryptErrors for sync | Jakub Kicinski | 1 | -2/+2
Move counting TlsDecryptErrors to tls_do_decryption() where differences between sync and async crypto are reconciled. No functional changes; this code just always gave me pause. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | tls: rx: reuse leave_on_list label for psock | Jakub Kicinski | 1 | -8/+4
The code is identical, we can save a few LoC. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | tls: rx: consistently use unlocked accessors for rx_list | Jakub Kicinski | 1 | -5/+5
rx_list is protected by the socket lock, no need to take the built-in spin lock on accesses. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-13 | esp: limit skb_page_frag_refill use to a single page | Sabrina Dubroca | 2 | -6/+4
Commit ebe48d368e97 ("esp: Fix possible buffer overflow in ESP transformation") tried to fix skb_page_frag_refill usage in ESP by capping allocsize to 32k, but that doesn't completely solve the issue, as skb_page_frag_refill may return a single page. If that happens, we will write out of bounds, despite the check introduced in the previous patch. This patch forces COW in cases where we would end up calling skb_page_frag_refill with a size larger than a page (first in esp_output_head with tailen, then in esp_output_tail with skb->data_len). Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible") Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible") Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2022-04-12 | Merge tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux | Linus Torvalds | 2 | -1/+4
Pull nfsd fixes from Chuck Lever:
 - Fix a write performance regression
 - Fix crashes during request deferral on RDMA transports

* tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix the svc_deferred_event trace class
  SUNRPC: Fix NFSD's request deferral on RDMA transports
  nfsd: Clean up nfsd_file_put()
  nfsd: Fix a write performance regression
  SUNRPC: Return true/false (not 1/0) from bool functions
2022-04-12 | fou: Remove XRFM from NET_FOU Kconfig | Coco Li | 2 | -2/+0
XFRM is no longer needed for configuring FOU tunnels (CONFIG_NET_FOU_IP_TUNNELS), so remove it from Kconfig. Also remove the xfrm.h dependency in fou.c. It was added in '23461551c006 ("fou: Support for foo-over-udp RX path")' for dependencies of udp_del_offload and udp_offloads, which were removed in 'd92283e338f6 ("fou: change to use UDP socket GRO")'. Built and installed the kernel and set up GUE/FOU tunnels. Signed-off-by: Coco Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-12 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf | Jakub Kicinski | 2 | -5/+4
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Fix cgroupv2 from the input path, from Florian Westphal.
2) Fix incorrect return value of nft_parse_register(), from Antoine Tenart.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: nft_parse_register can return a negative value
  netfilter: nft_socket: make cgroup match work in input too
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-12 | page_pool: Add recycle stats to page_pool_put_page_bulk | Lorenzo Bianconi | 1 | -2/+13
Add missing recycle stats to page_pool_put_page_bulk routine. Reviewed-by: Joe Damato <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Reviewed-by: Ilias Apalodimas <[email protected]> Link: https://lore.kernel.org/r/3712178b51c007cfaed910ea80e68f00c916b1fa.1649685634.git.lorenzo@kernel.org Signed-off-by: Paolo Abeni <[email protected]>
2022-04-12 | net: remove noblock parameter from recvmsg() entities | Oliver Hartkopp | 29 | -99/+71
The internal recvmsg() functions have two parameters, 'flags' and 'noblock', that were merged inside skb_recv_datagram(). As a follow-up to commit f4b41f062c42 ("net: remove noblock parameter from skb_recv_datagram()"), this patch removes the separate 'noblock' parameter for recvmsg().

Analogous to the referenced patch for skb_recv_datagram(), the 'flags' and 'noblock' parameters are unnecessarily split up with e.g.

err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len);

or in

err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len);

instead of simply using only flags all the time and checking for MSG_DONTWAIT where needed (to preserve the formerly separate no(n)block condition).

Signed-off-by: Oliver Hartkopp <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
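The effect on a callee can be sketched with two toy functions. Both are invented for illustration (they only model the "no data queued" decision, with `queued` as a stand-in for socket state); `MSG_DONTWAIT` and `MSG_PEEK` come from the standard socket header.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>   /* MSG_DONTWAIT, MSG_PEEK */

/* Old shape: the caller pre-splits one flags word into two arguments. */
int recv_old(int noblock, int flags, int queued)
{
	(void)flags;  /* remaining flags are not needed for this decision */
	return (!queued && noblock) ? -EAGAIN : 0;
}

/* New shape: a single flags word, tested where the decision is made. */
int recv_new(int flags, int queued)
{
	return (!queued && (flags & MSG_DONTWAIT)) ? -EAGAIN : 0;
}
```

Both variants give the same answer for the same input; the second just avoids the split at every call site.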
2022-04-12 | netfilter: nf_tables: nft_parse_register can return a negative value | Antoine Tenart | 1 | -1/+1
Since commit 6e1acfa387b9 ("netfilter: nf_tables: validate registers coming from userspace.") nft_parse_register can return a negative value, but the function prototype is still returning an unsigned int. Fixes: 6e1acfa387b9 ("netfilter: nf_tables: validate registers coming from userspace.") Signed-off-by: Antoine Tenart <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
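Why the prototype matters can be shown with a small standalone sketch. The functions and the `REG_MAX` bound are made up for the example and do not reproduce the nf_tables code; the point is only what happens to a negative errno when the return type is unsigned.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define REG_MAX 20  /* made-up upper bound for this sketch */

/* Problematic shape: the negative errno becomes a huge unsigned value. */
unsigned int parse_reg_unsigned(unsigned int reg)
{
	if (reg > REG_MAX)
		return -ERANGE;  /* implicitly converted to unsigned int */
	return reg;
}

/* Fixed shape: a signed return type keeps the error distinguishable. */
int parse_reg_signed(unsigned int reg)
{
	if (reg > REG_MAX)
		return -ERANGE;
	return (int)reg;
}
```

A caller of the unsigned variant cannot tell the error from a very large register number unless it converts the result back to a signed type.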
2022-04-12 | net: bridge: add support for host l2 mdb entries | Joachim Wiberg | 1 | -5/+7
This patch expands on the earlier work on layer-2 mdb entries by adding support for host entries. Because host-joined entries do not have any flag field, we infer the permanent flag when reporting the entries to userspace; otherwise they would be listed as 'temp'.

Before patch:

~# bridge mdb add dev br0 port br0 grp 01:00:00:c0:ff:ee permanent
Error: bridge: Flags are not allowed for host groups.
~# bridge mdb add dev br0 port br0 grp 01:00:00:c0:ff:ee
Error: bridge: Only permanent L2 entries allowed.

After patch:

~# bridge mdb add dev br0 port br0 grp 01:00:00:c0:ff:ee permanent
~# bridge mdb show
dev br0 port br0 grp 01:00:00:c0:ff:ee permanent vid 1

Signed-off-by: Joachim Wiberg <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
2022-04-11 | net: bridge: offload BR_HAIRPIN_MODE, BR_ISOLATED, BR_MULTICAST_TO_UNICAST | Arınç ÜNAL | 1 | -1/+2
Add BR_HAIRPIN_MODE, BR_ISOLATED and BR_MULTICAST_TO_UNICAST port flags to BR_PORT_FLAGS_HW_OFFLOAD so that switchdev drivers which have an offloaded data plane have a chance to reject these bridge port flags if they don't support them yet. It makes the code path go through the SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS driver handlers, which return -EINVAL for everything they don't recognize. For drivers that don't catch SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS at all, switchdev will return -EOPNOTSUPP for those which is then ignored, but those are in the minority. Signed-off-by: Arınç ÜNAL <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-11 | sctp: Initialize daddr on peeled off socket | Petr Malat | 1 | -1/+1
Function sctp_do_peeloff() wrongly initializes daddr of the original socket instead of the peeled off socket, which makes getpeername() return zeroes instead of the primary address. Initialize the new socket instead. Fixes: d570ee490fb1 ("[SCTP]: Correctly set daddr for IPv6 sockets during peeloff") Signed-off-by: Petr Malat <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-11 | net/smc: Fix af_ops of child socket pointing to released memory | Karsten Graul | 1 | -2/+12
Child sockets may inherit the af_ops from the parent listen socket. When the listen socket is released then the af_ops of the child socket points to released memory. Solve that by restoring the original af_ops for child sockets which inherited the parent af_ops. And clear any inherited user_data of the parent socket. Fixes: 8270d9c21041 ("net/smc: Limit backlog connections") Reviewed-by: Wenjia Zhang <[email protected]> Signed-off-by: Karsten Graul <[email protected]> Reviewed-by: D. Wythe <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-11 | net/smc: Fix NULL pointer dereference in smc_pnet_find_ib() | Karsten Graul | 1 | -2/+3
dev_name() was called with dev.parent as argument, but without NULL-checking it first. Solve this by checking the pointer before the call to dev_name(). Fixes: af5f60c7e3d5 ("net/smc: allow PCI IDs as ib device names in the pnet table") Reported-by: [email protected] Signed-off-by: Karsten Graul <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-11 | net/smc: use memcpy instead of snprintf to avoid out of bounds read | Karsten Graul | 1 | -2/+4
Using snprintf() to convert not null-terminated strings to null terminated strings may cause out of bounds read in the source string. Therefore use memcpy() and terminate the target string with a null afterwards. Fixes: fa0866625543 ("net/smc: add support for user defined EIDs") Fixes: 3c572145c24e ("net/smc: add generic netlink support for system EID") Signed-off-by: Karsten Graul <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
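The replacement pattern, sketched in userspace. `EID_LEN` and the function name are assumptions for the example. The underlying issue is that a `%s` conversion keeps scanning the source for a terminating NUL (it needs the full length for its return value), whereas `memcpy()` reads exactly the number of bytes it is told to.

```c
#include <assert.h>
#include <string.h>

#define EID_LEN 32  /* assumed fixed field width for this sketch */

/*
 * `src` holds exactly EID_LEN bytes and is NOT NUL-terminated.
 * `dst` must have room for EID_LEN + 1 bytes.
 */
void eid_to_cstring(char *dst, const char *src)
{
	memcpy(dst, src, EID_LEN);  /* reads EID_LEN bytes, never more */
	dst[EID_LEN] = '\0';        /* terminate the target explicitly */
}
```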
2022-04-11 | ipv4: Use dscp_t in struct fib_entry_notifier_info | Guillaume Nault | 1 | -2/+2
Use the new dscp_t type to replace the tos field of struct fib_entry_notifier_info. This ensures ECN bits are ignored and makes it compatible with the dscp field of struct fib_rt_info. This also allows sparse to flag potential incorrect uses of DSCP and ECN bits. Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-11 | ipv4: Use dscp_t in struct fib_rt_info | Guillaume Nault | 3 | -7/+7
Use the new dscp_t type to replace the tos field of struct fib_rt_info. This ensures ECN bits are ignored and makes it compatible with the fa_dscp field of struct fib_alias. This also allows sparse to flag potential incorrect uses of DSCP and ECN bits. Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
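The idea behind a dedicated DSCP type can be approximated in userspace as below. In the kernel the separation is checked by sparse; this sketch instead wraps the byte in a one-member struct so that the C compiler rejects accidental mixing with a raw TOS byte. The helper names and the `DSCP_MASK` macro are chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Distinct type: cannot be assigned from or to a plain uint8_t by accident. */
typedef struct {
	uint8_t v;
} dscp_t;

/* Upper six bits of the DS field carry DSCP; the low two bits are ECN. */
#define DSCP_MASK 0xfc

/* Convert a raw DS field (TOS byte) to DSCP, dropping the ECN bits. */
dscp_t dsfield_to_dscp(uint8_t dsfield)
{
	dscp_t d = { (uint8_t)(dsfield & DSCP_MASK) };

	return d;
}

/* Convert back to a DS field value; the ECN bits are zero. */
uint8_t dscp_to_dsfield(dscp_t d)
{
	return d.v;
}
```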
2022-04-11 | bpf: Fix release of page_pool in BPF_PROG_RUN in test runner | Toke Høiland-Jørgensen | 1 | -2/+3
The live packet mode in BPF_PROG_RUN allocates a page_pool instance for each test run instance and uses it for the packet data. On setup it creates the page_pool, and calls xdp_reg_mem_model() to allow pages to be returned properly from the XDP data path. However, xdp_reg_mem_model() also raises the reference count of the page_pool itself, so the single page_pool_destroy() count on teardown was not enough to actually release the pool. To fix this, add an additional xdp_unreg_mem_model() call on teardown. Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN") Reported-by: Freysteinn Alfredsson <[email protected]> Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
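The reference-count imbalance described above can be modelled with a toy pool. Everything here is invented for illustration (names, fields, and the single-threaded counter); it only shows why one destroy call is not enough once registration has taken its own reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reference-counted pool. */
struct pool {
	int refs;
	bool freed;
};

static void pool_put(struct pool *p)
{
	if (--p->refs == 0)
		p->freed = true;
}

/* Creation hands one reference to the creator. */
void pool_create(struct pool *p)
{
	p->refs = 1;
	p->freed = false;
}

/* Destroy drops the creator's reference only. */
void pool_destroy(struct pool *p)
{
	pool_put(p);
}

/* Registration pins the pool with an extra reference... */
void mem_model_reg(struct pool *p)
{
	p->refs++;
}

/* ...which has to be dropped again on teardown. */
void mem_model_unreg(struct pool *p)
{
	pool_put(p);
}
```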
2022-04-11 | mac80211: prepare sta handling for MLO support | Sriram R | 27 | -332/+385
Currently in mac80211, each STA object is represented using the sta_info data structure, which holds the associated STA-specific information, and drivers access the ieee80211_sta part of it.

With MLO (Multi Link Operation) support being added in the 802.11be standard, though the association is logically with a single Multi Link capable STA, at the physical level communication can happen via different advertised links (uniquely identified by channel, operating class, BSSID), hence the need to handle multiple link STA parameters within a composite sta_info object called the MLD STA. The different link STA parts of the MLD STA are identified using the link address, which can be the same as or different from the MLD STA address, and a unique link id based on the link vif.

To support extension of such a model, the sta_info data structure is modified to hold multiple link STA objects, with the link-specific params currently within sta_info moved to this new structure. The same is done for ieee80211_sta, which is accessed within mac80211 as well as by drivers, hence trivial driver changes are expected to support this.

For current non-MLO drivers, only one link STA is present and link information is accessed via the 'deflink' member.

For MLO drivers, we still need to define the APIs etc. to get the correct link ID and access the correct part of the station info.

Currently in mac80211, all link STA info is accessed directly via deflink. These accesses will be updated to go via link pointers indexed by link id with the MLO support patches, with the link id being 0 for non-MLO cases.

Except for a couple of macro-related changes, the spatch below takes care of updating mac80211 and driver code to access the link STA info via deflink.

@ieee80211_sta@
struct ieee80211_sta *s;
struct sta_info *si;
identifier var = {supp_rates, ht_cap, vht_cap, he_cap, he_6ghz_capa, eht_cap, rx_nss, bandwidth, txpwr};
@@

(
  s->
-    var
+    deflink.var
|
  si->sta.
-    var
+    deflink.var
)

@sta_info@
struct sta_info *si;
identifier var = {gtk, pcpu_rx_stats, rx_stats, rx_stats_avg, status_stats, tx_stats, cur_max_bandwidth};
@@

(
  si->
-    var
+    deflink.var
)

Signed-off-by: Sriram R <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[remove MLO-drivers notes from commit message, not clear yet; run spatch]
Signed-off-by: Johannes Berg <[email protected]>
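In C terms, the rewrite performed by the semantic patch amounts to moving per-link members into an embedded structure and reaching them through `deflink`. A heavily simplified sketch follows; the structure names and the two members shown are stand-ins, and the real structures carry many more fields.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the per-link station data. */
struct link_sta {
	uint8_t rx_nss;
	uint8_t bandwidth;
};

/* Simplified stand-in for the station object. */
struct sta {
	uint8_t addr[6];
	struct link_sta deflink;  /* the single link of a non-MLO station */
};

/* Before the change this access would have read sta->rx_nss. */
uint8_t sta_rx_nss(const struct sta *sta)
{
	return sta->deflink.rx_nss;
}

/* Before the change this access would have written sta->bandwidth. */
void sta_set_bandwidth(struct sta *sta, uint8_t bw)
{
	sta->deflink.bandwidth = bw;
}
```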
2022-04-11 | mac80211: minstrel_ht: fix where rate stats are stored (fixes debugfs output) | Peter Seiderer | 1 | -0/+3
Using an ath9k card, the debugfs output of minstrel_ht looks like the following (note the zero values for the first four rates' sum-of success/attempts):

best ____________rate__________ ____statistics___ _____last____ ______sum-of________
mode guard # rate [name idx airtime max_tp] [avg(tp) avg(prob)] [retry|suc|att] [#success | #attempts]
OFDM 1 DP 6.0M 272 1640 5.2 3.1 53.8 3 0 0 0 0
OFDM 1 C 9.0M 273 1104 7.7 4.6 53.8 4 0 0 0 0
OFDM 1 B 12.0M 274 836 10.0 6.0 53.8 4 0 0 0 0
OFDM 1 A S 18.0M 275 568 14.3 8.5 53.8 5 0 0 0 0
OFDM 1 S 24.0M 276 436 18.1 0.0 0.0 5 0 1 80 1778
OFDM 1 36.0M 277 300 24.9 0.0 0.0 0 0 1 0 107
OFDM 1 S 48.0M 278 236 30.4 0.0 0.0 0 0 0 0 75
OFDM 1 54.0M 279 212 33.0 0.0 0.0 0 0 0 0 72

Total packet count:: ideal 16582 lookaround 885
Average # of aggregated frames per A-MPDU: 1.0

Debugging showed that the rate statistics for the first four rates were stored in the MINSTREL_CCK_GROUP instead of the MINSTREL_OFDM_GROUP, because in minstrel_ht_get_stats() the supported check was not honoured as done in various other places, e.g. in net/mac80211/rc80211_minstrel_ht_debugfs.c:

if (!(mi->supported[i] & BIT(j)))
	continue;

With the patch applied the output looks good:

best ____________rate__________ ____statistics___ _____last____ ______sum-of________
mode guard # rate [name idx airtime max_tp] [avg(tp) avg(prob)] [retry|suc|att] [#success | #attempts]
OFDM 1 D 6.0M 272 1640 5.2 5.2 100.0 3 0 0 1 1
OFDM 1 C 9.0M 273 1104 7.7 7.7 100.0 4 0 0 38 38
OFDM 1 B 12.0M 274 836 10.0 9.9 89.5 4 2 2 372 395
OFDM 1 A P 18.0M 275 568 14.3 14.3 97.2 5 52 53 6956 7181
OFDM 1 S 24.0M 276 436 18.1 0.0 0.0 0 0 1 6 163
OFDM 1 36.0M 277 300 24.9 0.0 0.0 0 0 1 0 35
OFDM 1 S 48.0M 278 236 30.4 0.0 0.0 0 0 0 0 38
OFDM 1 S 54.0M 279 212 33.0 0.0 0.0 0 0 0 0 38

Total packet count:: ideal 7097 lookaround 287
Average # of aggregated frames per A-MPDU: 1.0

Signed-off-by: Peter Seiderer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
2022-04-11 | nl80211: show SSID for P2P_GO interfaces | Johannes Berg | 1 | -0/+1
There's no real reason not to send the SSID to userspace when it requests information about P2P_GO, it is, in that respect, exactly the same as AP interfaces. Fix that. Fixes: 44905265bc15 ("nl80211: don't expose wdev->ssid for most interfaces") Signed-off-by: Johannes Berg <[email protected]> Link: https://lore.kernel.org/r/20220318134656.14354ae223f0.Ia25e85a512281b92e1645d4160766a4b1a471597@changeid Signed-off-by: Johannes Berg <[email protected]>
2022-04-11 | mac80211: introduce BSS color collision detection | Lorenzo Bianconi | 2 | -0/+47
Add the ieee80211_rx_check_bss_color_collision routine in order to introduce BSS color collision detection in mac80211 if it is not supported in HW/FW (e.g. for the mt7915 chipset). Add the IEEE80211_HW_DETECTS_COLOR_COLLISION flag to let the driver indicate that BSS color collision detection is supported in HW/FW. Set this for ath11k, which apparently didn't need this code. Tested-by: Peter Chiu <[email protected]> Co-developed-by: Ryder Lee <[email protected]> Signed-off-by: Ryder Lee <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Link: https://lore.kernel.org/r/a05eeeb1841a84560dc5aaec77894fcb69a54f27.1648204871.git.lorenzo@kernel.org [clarify commit message a bit, move flag to mac80211] Signed-off-by: Johannes Berg <[email protected]>
2022-04-11 | mac80211: protect ieee80211_assign_beacon with next_beacon check | Lorenzo Bianconi | 1 | -10/+12
Even if it is not a real issue since ieee80211_set_after_csa_beacon() or ieee80211_set_after_color_change_beacon() are run only when csa or bcc is active, move next_beacon check before running ieee80211_assign_beacon routine. Signed-off-by: Lorenzo Bianconi <[email protected]> Link: https://lore.kernel.org/r/041764ed7e9781bcee66c33b41f1365aa4205932.1649327683.git.lorenzo@kernel.org Signed-off-by: Johannes Berg <[email protected]>
2022-04-11 | ipv6: fix panic when forwarding a pkt with no in6 dev | Nicolas Dichtel | 1 | -1/+1
kongweibin reported a kernel panic in ip6_forward() when the input interface has no in6 dev associated. The following tc commands were used to reproduce this panic:

tc qdisc del dev vxlan100 root
tc qdisc add dev vxlan100 root netem corrupt 5%

CC: [email protected]
Fixes: ccd27f05ae7b ("ipv6: fix 'disable_policy' for fwd packets")
Reported-by: kongweibin <[email protected]>
Signed-off-by: Nicolas Dichtel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2022-04-11mptcp: listen diag dump supportFlorian Westphal1-0/+91
Make 'ss -Ml' show mptcp listen sockets. Iterate over the tcp listen sockets and pick those that have mptcp ulp info attached. mptcp_diag_get_info() is modified to prefer msk->first for mptcp sockets in listen state. This reports accurate numbers for the recv and send queues (pending / max connection backlog counters). Sample output: ss -Mil State Recv-Q Send-Q Local Address:Port Peer Address:Port LISTEN 0 20 127.0.0.1:12000 0.0.0.0:* subflows_max:2 Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11mptcp: remove locking in mptcp_diag_fill_infoFlorian Westphal1-6/+0
The problem is that listener iteration would call this from atomic context, so this locking is not allowed. One option is to drop locks before calling the helper, but as far as I can see the lock isn't really needed: all values are fetched via READ_ONCE(). Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11mptcp: diag: switch to context structureFlorian Westphal1-3/+11
Raw access to cb->arg[] is deprecated, use a context structure. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
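As a rough illustration of the change described above, the sketch below overlays a typed context structure on the callback's scratch area instead of indexing raw array slots. It is a simplified userspace model, not the kernel code: `mock_netlink_callback`, `mock_diag_dump_ctx`, `dump_ctx()` and `mock_dump()` are hypothetical names, and the field layout is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a netlink dump callback: the same scratch
 * storage is visible as raw longs (args) or as opaque bytes (ctx).
 */
struct mock_netlink_callback {
	union {
		unsigned char ctx[48];
		long args[6];
	};
};

/* Hypothetical typed view of the dump state; field names are illustrative. */
struct mock_diag_dump_ctx {
	long resume_slot;	/* bucket to resume from (unused in this sketch) */
	long resume_num;	/* number of entries already emitted */
};

/* Compile-time check that the typed context fits in the scratch area. */
_Static_assert(sizeof(struct mock_diag_dump_ctx) <=
	       sizeof(((struct mock_netlink_callback *)0)->ctx),
	       "dump context must fit in cb->ctx");

/* Named, typed access replaces magic indices such as cb->args[1].
 * The kernel is built with -fno-strict-aliasing, which makes this kind
 * of cast safe there; treat it as illustrative in portable code.
 */
static struct mock_diag_dump_ctx *dump_ctx(struct mock_netlink_callback *cb)
{
	return (struct mock_diag_dump_ctx *)cb->ctx;
}

/* Emit up to 'budget' entries out of 'total', remembering where to resume. */
static int mock_dump(struct mock_netlink_callback *cb, long total, long budget)
{
	struct mock_diag_dump_ctx *ctx = dump_ctx(cb);
	int emitted = 0;

	while (ctx->resume_num < total && budget-- > 0) {
		ctx->resume_num++;
		emitted++;
	}
	return emitted;
}
```

Calling `mock_dump()` repeatedly on a zero-initialized callback resumes where the previous call stopped, which is the state a dump callback typically keeps between invocations.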
2022-04-11mptcp: add pm_nl_pernet helpersGeliang Tang1-17/+24
This patch adds two pm_nl_pernet related helpers, named pm_nl_get_pernet() and pm_nl_get_pernet_from_msk() to get pm_nl_pernet from 'net' or 'msk'. Use these helpers instead of using net_generic() directly. Suggested-by: Florian Westphal <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11mptcp: reset the packet scheduler on PRIO changePaolo Abeni1-0/+2
Similar to the previous patch, for priority changes requested by the local PM. Reported-and-suggested-by: Davide Caratti <[email protected]> Fixes: 067065422fcd ("mptcp: add the outgoing MP_PRIO support") Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11mptcp: reset the packet scheduler on incoming MP_PRIOPaolo Abeni3-4/+18
When an incoming MP_PRIO option changes the backup status of any subflow, we need to reset the packet scheduler status, or the next send could keep using the previously selected subflow, without taking into account the new priorities. Reported-by: Davide Caratti <[email protected]> Fixes: 40453a5c61f4 ("mptcp: add the incoming MP_PRIO support") Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
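The failure mode described above can be modelled in a few lines: a scheduler that caches its last choice keeps using it until the cache is dropped. This is a minimal userspace sketch under stated assumptions, not the mptcp implementation; all `mock_*` names, the `last_snd` field and the selection policy (first non-backup subflow, else first backup) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_subflow {
	int id;
	bool backup;
};

struct mock_msk {
	struct mock_subflow *subflows;
	size_t nr_subflows;
	struct mock_subflow *last_snd;	/* cached scheduler choice */
};

/* Reuse the cached subflow if there is one; otherwise pick the first
 * non-backup subflow, falling back to the first backup one.
 */
static struct mock_subflow *mock_pick_subflow(struct mock_msk *msk)
{
	struct mock_subflow *fallback = NULL;
	size_t i;

	if (msk->last_snd)
		return msk->last_snd;

	for (i = 0; i < msk->nr_subflows; i++) {
		struct mock_subflow *sf = &msk->subflows[i];

		if (!sf->backup) {
			msk->last_snd = sf;
			return sf;
		}
		if (!fallback)
			fallback = sf;
	}
	msk->last_snd = fallback;
	return fallback;
}

/* Change a subflow's backup status; if it actually changed, drop the
 * cached choice so the next send re-evaluates the priorities.
 */
static void mock_set_backup(struct mock_msk *msk, struct mock_subflow *sf,
			    bool backup)
{
	if (sf->backup == backup)
		return;
	sf->backup = backup;
	msk->last_snd = NULL;
}
```

Without the `msk->last_snd = NULL` line in `mock_set_backup()`, the next `mock_pick_subflow()` call would still return the previously selected subflow even after it became a backup.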
2022-04-11mptcp: optimize release_cb for the common casePaolo Abeni1-7/+9
The mptcp release callback checks several flags in atomic context, but only MPTCP_CLEAN_UNA is set frequently. Reorganize the code to avoid multiple conditionals in the most common scenarios. Additionally, clarify a related comment. Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
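One way to picture the reorganization described above is to test the frequent flag on its own and guard all the rare flags behind a single combined check, so the common case pays for two conditionals rather than one per flag. The sketch below is a generic userspace illustration of that pattern, not the actual patch; the `MOCK_*` flags, `mock_counters` and `mock_release_cb()` are hypothetical.

```c
#include <stdint.h>

enum {
	MOCK_CLEAN_UNA    = 1 << 0,	/* assumed to be set frequently */
	MOCK_ERROR_REPORT = 1 << 1,	/* assumed to be rare */
	MOCK_CONNECTED    = 1 << 2,	/* assumed to be rare */
};

#define MOCK_RARE_MASK (MOCK_ERROR_REPORT | MOCK_CONNECTED)

/* Counters stand in for the real work done for each flag. */
struct mock_counters {
	int clean_una;
	int error_report;
	int connected;
};

static void mock_release_cb(uint32_t *flags, struct mock_counters *c)
{
	uint32_t pending = *flags;

	*flags = 0;

	/* Frequent case: handled with a single test. */
	if (pending & MOCK_CLEAN_UNA)
		c->clean_una++;

	/* Rare cases: one combined test skips every per-flag check below
	 * whenever none of the rare flags is pending.
	 */
	if (pending & MOCK_RARE_MASK) {
		if (pending & MOCK_CONNECTED)
			c->connected++;
		if (pending & MOCK_ERROR_REPORT)
			c->error_report++;
	}
}
```

In kernel code the combined test would typically also carry an `unlikely()` hint; it is omitted here to keep the sketch portable.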
2022-04-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextDavid S. Miller9-125/+143
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Replace unnecessary list_for_each_entry_continue() in nf_tables, from Jakob Koschel. 2) Add struct nf_conntrack_net_ecache to conntrack event cache and use it, from Florian Westphal. 3) Refactor ctnetlink_dump_list(), also from Florian. 4) Bump module reference counter on cttimeout object addition/removal, from Florian. 5) Consolidate nf_log MAC printer, from Phil Sutter. 6) Add basic logging support for unknown ethertype, from Phil Sutter. 7) Consolidate check for sysctl nf_log_all_netns toggle, also from Phil. 8) Replace hardcoded value in nft_bitwise, from Jeremy Sowden. 9) Rename BASIC-like goto tags in nft_bitwise to more meaningful names, also from Jeremy. 10) nft_fib support for reverse path filtering with policy-based routing on iif. Extend selftests to cover this new use case, from Florian. ==================== Signed-off-by: David S. Miller <[email protected]>