aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2022-09-20seg6: add NEXT-C-SID support for SRv6 End behaviorAndrea Mayer1-3/+332
The NEXT-C-SID mechanism described in [1] offers the possibility of encoding several SRv6 segments within a single 128 bit SID address. Such a SID address is called a Compressed SID (C-SID) container. In this way, the length of the SID List can be drastically reduced. A SID instantiated with the NEXT-C-SID flavor considers an IPv6 address logically structured in three main blocks: i) Locator-Block; ii) Locator-Node Function; iii) Argument. C-SID container +------------------------------------------------------------------+ | Locator-Block |Loc-Node| Argument | | |Function| | +------------------------------------------------------------------+ <--------- B -----------> <- NF -> <------------- A ---------------> (i) The Locator-Block can be any IPv6 prefix available to the provider; (ii) The Locator-Node Function represents the node and the function to be triggered when a packet is received on the node; (iii) The Argument carries the remaining C-SIDs in the current C-SID container. The NEXT-C-SID mechanism relies on the "flavors" framework defined in [2]. The flavors represent additional operations that can modify or extend a subset of the existing behaviors. This patch introduces the support for flavors in SRv6 End behavior implementing the NEXT-C-SID one. An SRv6 End behavior with NEXT-C-SID flavor works as an End behavior but it is capable of processing the compressed SID List encoded in C-SID containers. An SRv6 End behavior with NEXT-C-SID flavor can be configured to support user-provided Locator-Block and Locator-Node Function lengths. In this implementation, such lengths must be evenly divisible by 8 (i.e. must be byte-aligned), otherwise the kernel informs the user about invalid values with a meaningful error code and message through netlink_ext_ack. If Locator-Block and/or Locator-Node Function lengths are not provided by the user during configuration of an SRv6 End behavior instance with NEXT-C-SID flavor, the kernel will choose their default values i.e., 32-bit Locator-Block and 16-bit Locator-Node Function. [1] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression [2] - https://datatracker.ietf.org/doc/html/rfc8986 Signed-off-by: Andrea Mayer <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20seg6: add netlink_ext_ack support in parsing SRv6 behavior attributesAndrea Mayer1-16/+28
An SRv6 behavior instance can be set up using mandatory and/or optional attributes. In the setup phase, each supplied attribute is parsed and processed. If the parsing operation fails, the creation of the behavior instance stops and an error number/code is reported to the user. In many cases, it is challenging for the user to figure out exactly what happened by relying only on the error code. For this reason, we add the support for netlink_ext_ack in parsing SRv6 behavior attributes. In this way, when an SRv6 behavior attribute is parsed and an error occurs, the kernel can send a message to the userspace describing the error through a meaningful text message in addition to the classic error code. Signed-off-by: Andrea Mayer <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net-next: gro: Fix use of skb_gro_header_slowRichard Gobert1-1/+1
In the cited commit, the function ipv6_gro_receive was accidentally changed to use skb_gro_header_slow, without attempting the fast path. Fix it. Fixes: 35ffb6654729 ("net: gro: skb_gro_header helper function") Signed-off-by: Richard Gobert <[email protected]> Link: https://lore.kernel.org/r/20220911184835.GA105063@debian Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net: dsa: allow masters to join a LAGVladimir Oltean5-10/+298
There are 2 ways in which a DSA user port may become handled by 2 CPU ports in a LAG: (1) its current DSA master joins a LAG ip link del bond0 && ip link add bond0 type bond mode 802.3ad ip link set eno2 master bond0 When this happens, all user ports with "eno2" as DSA master get automatically migrated to "bond0" as DSA master. (2) it is explicitly configured as such by the user # Before, the DSA master was eno3 ip link set swp0 type dsa master bond0 The design of this configuration is that the LAG device dynamically becomes a DSA master through dsa_master_setup() when the first physical DSA master becomes a LAG slave, and stops being so through dsa_master_teardown() when the last physical DSA master leaves. A LAG interface is considered as a valid DSA master only if it contains existing DSA masters, and no other lower interfaces. Therefore, we mainly rely on method (1) to enter this configuration. Each physical DSA master (LAG slave) retains its dev->dsa_ptr for when it becomes a standalone DSA master again. But the LAG master also has a dev->dsa_ptr, and this is actually duplicated from one of the physical LAG slaves, and therefore needs to be balanced when LAG slaves come and go. To the switch driver, putting DSA masters in a LAG is seen as putting their associated CPU ports in a LAG. We need to prepare cross-chip host FDB notifiers for CPU ports in a LAG, by calling the driver's ->lag_fdb_add method rather than ->port_fdb_add. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net: dsa: propagate extack to port_lag_joinVladimir Oltean3-2/+4
Drivers could refuse to offload a LAG configuration for a variety of reasons, mainly having to do with its TX type. Additionally, since DSA masters may now also be LAG interfaces, and this will translate into a call to port_lag_join on the CPU ports, there may be extra restrictions there. Propagate the netlink extack to this DSA method in order for drivers to give a meaningful error message back to the user. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net: dsa: suppress device links to LAG DSA mastersVladimir Oltean1-6/+8
These don't work (print a harmless error about the operation failing) and make little sense to have anyway, because when a LAG DSA master goes away, we will introduce logic to move our CPU port back to the first physical DSA master. So suppress these device links in preparation for adding support for LAG DSA masters. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net: dsa: suppress appending ethtool stats to LAG DSA mastersVladimir Oltean1-0/+9
Similar to the discussion about tracking the admin/oper state of LAG DSA masters, we have the problem here that struct dsa_port *cpu_dp caches a single pair of orig_ethtool_ops and netdev_ops pointers. So if we call dsa_master_setup(bond0, cpu_dp) where cpu_dp is also the dev->dsa_ptr of one of the physical DSA masters, we'd effectively overwrite what we cached from that physical netdev with what replaced from the bonding interface. We don't need DSA ethtool stats on the bonding interface when used as DSA master, it's good enough to have them just on the physical DSA masters, so suppress this logic. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net: dsa: don't keep track of admin/oper state on LAG DSA mastersVladimir Oltean1-0/+12
We store information about the DSA master's state in cpu_dp->master_admin_up and cpu_dp->master_oper_up, and this assumes a bijective association between a CPU port and a DSA master. However, when we have CPU ports in a LAG (and DSA masters in a LAG too), the way in which we set up things is that the physical DSA masters still have dev->dsa_ptr pointing to our cpu_dp, but the bonding/team device itself also has its dev->dsa_ptr pointing towards one of the CPU port structures (the first one). So logically speaking, that first cpu_dp can't keep track of both the physical master's admin/oper state, and of the bonding master's state. This isn't even needed; the reason why we keep track of the DSA master's state is to know when it is available for Ethernet-based register access. For that use case, we don't even need LAG; we just need to decide upon one of the physical DSA masters (if there is more than 1 available) and use that. This change suppresses dsa_tree_master_{admin,oper}_state_change() calls on LAG DSA masters (which will be supported in a future change), to allow the tracking of just physical DSA masters. Link: https://lore.kernel.org/netdev/[email protected]/ Suggested-by: Christian Marangi <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net: dsa: allow the DSA master to be seen and changed through rtnetlinkVladimir Oltean7-1/+354
Some DSA switches have multiple CPU ports, which can be used to improve CPU termination throughput, but DSA, through dsa_tree_setup_cpu_ports(), sets up only the first one, leading to suboptimal use of hardware. The desire is to not change the default configuration but to permit the user to create a dynamic mapping between individual user ports and the CPU port that they are served by, configurable through rtnetlink. It is also intended to permit load balancing between CPU ports, and in that case, the foreseen model is for the DSA master to be a bonding interface whose lowers are the physical DSA masters. To that end, we create a struct rtnl_link_ops for DSA user ports with the "dsa" kind. We expose the IFLA_DSA_MASTER link attribute that contains the ifindex of the newly desired DSA master. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net: dsa: introduce dsa_port_get_master()Vladimir Oltean5-27/+26
There is a desire to support for DSA masters in a LAG. That configuration is intended to work by simply enslaving the master to a bonding/team device. But the physical DSA master (the LAG slave) still has a dev->dsa_ptr, and that cpu_dp still corresponds to the physical CPU port. However, we would like to be able to retrieve the LAG that's the upper of the physical DSA master. In preparation for that, introduce a helper called dsa_port_get_master() that replaces all occurrences of the dp->cpu_dp->master pattern. The distinction between LAG and non-LAG will be made later within the helper itself. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20flow_offload: Introduce flow_match_l2tpv3Wojciech Drewek1-0/+7
Allow to offload L2TPv3 filters by adding flow_rule_match_l2tpv3. Drivers can extract L2TPv3 specific fields from now on. Signed-off-by: Wojciech Drewek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20net/sched: flower: Add L2TPv3 filterWojciech Drewek1-0/+16
Add support for matching on L2TPv3 session ID. Session ID can be specified only when ip proto was set to IPPROTO_L2TP. Example filter: # tc filter add dev $PF1 ingress prio 1 protocol ip \ flower \ ip_proto l2tp \ l2tpv3_sid 1234 \ skip_sw \ action mirred egress redirect dev $VF1_PR Acked-by: Guillaume Nault <[email protected]> Signed-off-by: Wojciech Drewek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-20flow_dissector: Add L2TPv3 dissectorsWojciech Drewek1-0/+28
Allow to dissect L2TPv3 specific field which is: - session ID (32 bits) L2TPv3 might be transported over IP or over UDP, this implementation is only about L2TPv3 over IP. IP protocol carries L2TPv3 when ip_proto is IPPROTO_L2TP (115). Acked-by: Guillaume Nault <[email protected]> Signed-off-by: Wojciech Drewek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-19openvswitch: Change the return type for vport_ops.send function hook to intNathan Huckleberry2-2/+2
All usages of the vport_ops struct have the .send field set to dev_queue_xmit or internal_dev_recv. Since most usages are set to dev_queue_xmit, the function hook should match the signature of dev_queue_xmit. The only call to vport_ops->send() is in net/openvswitch/vport.c and it throws away the return value. This mismatched return type breaks forward edge kCFI since the underlying function definition does not match the function hook definition. Reported-by: Dan Carpenter <[email protected]> Link: https://github.com/ClangBuiltLinux/linux/issues/1703 Cc: [email protected] Signed-off-by: Nathan Huckleberry <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Acked-by: Eelco Chaudron <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-19Merge tag 'batadv-next-pullrequest-20220916' of ↵Jakub Kicinski4-43/+1
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - drop unused headers in trace.h, by Sven Eckelmann - drop initialization of flexible ethtool_link_ksettings, by Sven Eckelmann - remove unused struct definitions, by Marek Lindner * tag 'batadv-next-pullrequest-20220916' of git://git.open-mesh.org/linux-merge: batman-adv: remove unused struct definitions batman-adv: Drop initialization of flexible ethtool_link_ksettings batman-adv: Drop unused headers in trace.h batman-adv: Start new development cycle ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-19Merge tag 'batadv-net-pullrequest-20220916' of ↵Jakub Kicinski1-0/+4
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here is a batman-adv bugfix: - Fix hang up with small MTU hard-interface, by Shigeru Yoshida * tag 'batadv-net-pullrequest-20220916' of git://git.open-mesh.org/linux-merge: batman-adv: Fix hang up with small MTU hard-interface ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-19net: rds: add missing __init/__exit annotations to module init/exit funcsXiu Jianfeng3-4/+4
Add missing __init/__exit annotations to module init/exit funcs. Signed-off-by: Xiu Jianfeng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-19rxrpc: remove rxrpc_max_call_lifetime declarationGaosheng Cui1-1/+0
rxrpc_max_call_lifetime has been removed since commit a158bdd3247b ("rxrpc: Fix call timeouts"), so remove it. Signed-off-by: Gaosheng Cui <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-19Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer worksTetsuo Handa2-4/+17
syzbot is reporting attempt to schedule hdev->cmd_work work from system_wq WQ into hdev->workqueue WQ which is under draining operation [1], for commit c8efcc2589464ac7 ("workqueue: allow chained queueing during destruction") does not allow such operation. The check introduced by commit 877afadad2dce8aa ("Bluetooth: When HCI work queue is drained, only queue chained work") was incomplete. Use hdev->workqueue WQ when queuing hdev->{cmd,ncmd}_timer works because hci_{cmd,ncmd}_timeout() calls queue_work(hdev->workqueue). Also, protect the queuing operation with RCU read lock in order to avoid calling queue_delayed_work() after cancel_delayed_work() completed. Link: https://syzkaller.appspot.com/bug?extid=243b7d89777f90f7613b [1] Reported-by: syzbot <[email protected]> Signed-off-by: Tetsuo Handa <[email protected]> Fixes: 877afadad2dce8aa ("Bluetooth: When HCI work queue is drained, only queue chained work") Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-09-19Bluetooth: L2CAP: initialize delayed works at l2cap_chan_create()Tetsuo Handa1-4/+6
syzbot is reporting cancel_delayed_work() without INIT_DELAYED_WORK() at l2cap_chan_del() [1], for CONF_NOT_COMPLETE flag (which meant to prevent l2cap_chan_del() from calling cancel_delayed_work()) is cleared by timer which fires before l2cap_chan_del() is called by closing file descriptor created by socket(AF_BLUETOOTH, SOCK_STREAM, BTPROTO_L2CAP). l2cap_bredr_sig_cmd(L2CAP_CONF_REQ) and l2cap_bredr_sig_cmd(L2CAP_CONF_RSP) are calling l2cap_ertm_init(chan), and they call l2cap_chan_ready() (which clears CONF_NOT_COMPLETE flag) only when l2cap_ertm_init(chan) succeeded. l2cap_sock_init() does not call l2cap_ertm_init(chan), and it instead sets CONF_NOT_COMPLETE flag by calling l2cap_chan_set_defaults(). However, when connect() is requested, "command 0x0409 tx timeout" happens after 2 seconds from connect() request, and CONF_NOT_COMPLETE flag is cleared after 4 seconds from connect() request, for l2cap_conn_start() from l2cap_info_timeout() callback scheduled by schedule_delayed_work(&conn->info_timer, L2CAP_INFO_TIMEOUT); in l2cap_connect() is calling l2cap_chan_ready(). Fix this problem by initializing delayed works used by L2CAP_MODE_ERTM mode as soon as l2cap_chan_create() allocates a channel, like I did in commit be8597239379f0f5 ("Bluetooth: initialize skb_queue_head at l2cap_chan_create()"). Link: https://syzkaller.appspot.com/bug?extid=83672956c7aa6af698b3 [1] Reported-by: syzbot <[email protected]> Signed-off-by: Tetsuo Handa <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-09-16Merge tag 'linux-can-next-for-6.1-20220915' of ↵David S. Miller6-66/+111
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== Sept. 15, 2022, 8:19 a.m. UTC Hello Jakub, hello David, this is a pull request of 23 patches for net-next/master. the first 2 patches are by me and fix a typo in the rx-offload helper and the flexcan driver. Christophe JAILLET's patch cleans up the error handling in rcar_canfd driver's probe function. Kenneth Lee's patch converts the kvaser_usb driver from kcalloc() to kzalloc(). Biju Das contributes 2 patches to the sja1000 driver which update the DT bindings and support for the RZ/N1 SJA1000 CAN controller. Jinpeng Cui provides 2 patches that remove redundant variables from the sja1000 and kvaser_pciefd driver. 2 patches by John Whittington and me add hardware timestamp support to the gs_usb driver. Gustavo A. R. Silva's patch converts the etas_es58x driver to make use of DECLARE_FLEX_ARRAY(). Krzysztof Kozlowski's patch cleans up the sja1000 DT bindings. Dario Binacchi fixes his invalid email in the flexcan driver documentation. Ziyang Xuan contributes 2 patches that clean up the CAN RAW protocol. Yang Yingliang's patch switches the flexcan driver to dev_err_probe(). The last 7 patches are by Oliver Hartkopp and add support for the next generation of the CAN protocol: CAN with eXtended data Length (CAN XL). ==================== Signed-off-by: David S. Miller <[email protected]>
2022-09-16tcp: Use WARN_ON_ONCE() in tcp_read_skb()Peilin Ye1-1/+1
Prevent tcp_read_skb() from flooding the syslog. Suggested-by: Jakub Sitnicki <[email protected]> Signed-off-by: Peilin Ye <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-16net/ieee802154: fix uninit value bug in dgram_sendmsgHaimin Zhang1-19/+23
There is uninit value bug in dgram_sendmsg function in net/ieee802154/socket.c when the length of valid data pointed by the msg->msg_name isn't verified. We introducing a helper function ieee802154_sockaddr_check_size to check namelen. First we check there is addr_type in ieee802154_addr_sa. Then, we check namelen according to addr_type. Also fixed in raw_bind, dgram_bind, dgram_connect. Signed-off-by: Haimin Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-16vsock/vmci: fix repeated words in commentsJilin Yuan1-1/+1
Delete the redundant word 'that'. Signed-off-by: Jilin Yuan <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-16rtnetlink: advertise allmulti counterNicolas Dichtel1-0/+3
Like what was done with IFLA_PROMISCUITY, add IFLA_ALLMULTI to advertise the allmulti counter. The flag IFF_ALLMULTI is advertised only if it was directly set by a userland app. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-15Bluetooth: RFCOMM: Fix possible deadlock on socket shutdown/releaseLuiz Augusto von Dentz1-0/+3
Due to change to switch to use lock_sock inside rfcomm_sk_state_change the socket shutdown/release procedure can cause a deadlock: rfcomm_sock_shutdown(): lock_sock(); __rfcomm_sock_close(): rfcomm_dlc_close(): __rfcomm_dlc_close(): rfcomm_dlc_lock(); rfcomm_sk_state_change(): lock_sock(); To fix this when the call __rfcomm_sock_close is now done without holding the lock_sock since rfcomm_dlc_lock exists to protect the dlc data there is no need to use lock_sock in that code path. Link: https://lore.kernel.org/all/CAD+dNTsbuU4w+Y_P7o+VEN7BYCAbZuwZx2+tH+OTzCdcZF82YA@mail.gmail.com/ Fixes: b7ce436a5d79 ("Bluetooth: switch to lock_sock in RFCOMM") Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-09-15mptcp: account memory allocation in mptcp_nl_cmd_add_addr() to userThomas Haller1-1/+1
Now that non-root users can configure MPTCP endpoints, account the memory allocation to the user. Signed-off-by: Thomas Haller <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-15mptcp: allow privileged operations from user namespacesThomas Haller1-9/+9
GENL_ADMIN_PERM checks that the user has CAP_NET_ADMIN in the initial namespace by calling netlink_capable(). Instead, use GENL_UNS_ADMIN_PERM which uses netlink_ns_capable(). This checks that the caller has CAP_NET_ADMIN in the current user namespace. See also commit 4a92602aa1cd ("openvswitch: allow management from inside user namespaces") which introduced this mechanism. See also commit 5617c6cd6f84 ("nl80211: Allow privileged operations from user namespaces") which introduced this for nl80211. Signed-off-by: Thomas Haller <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-15mptcp: add do_check_data_fin to replace copiedGeliang Tang1-3/+4
This patch adds a new bool variable 'do_check_data_fin' to replace the original int variable 'copied' in __mptcp_push_pending(), check it to determine whether to call __mptcp_check_send_data_fin(). Suggested-by: Mat Martineau <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Reviewed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-15mptcp: add mptcp_for_each_subflow_safe helperMatthieu Baerts3-4/+6
Similar to mptcp_for_each_subflow(): this is clearer now that the _safe version is used in multiple places. Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2022-09-15can: raw: add CAN XL supportOliver Hartkopp1-12/+43
Enable CAN_RAW sockets to read and write CAN XL frames analogue to the CAN FD extension (new CAN_RAW_XL_FRAMES sockopt). A CAN XL network interface is capable to handle Classical CAN, CAN FD and CAN XL frames. When CAN_RAW_XL_FRAMES is enabled, the CAN_RAW socket checks whether the addressed CAN network interface is capable to handle the provided CAN frame. In opposite to the fixed number of bytes for - CAN frames (CAN_MTU = sizeof(struct can_frame)) - CAN FD frames (CANFD_MTU = sizeof(struct can_frame)) the number of bytes when reading/writing CAN XL frames depends on the number of data bytes. For efficiency reasons the length of the struct canxl_frame is truncated to the needed size for read/write operations. This leads to a calculated size of CANXL_HDR_SIZE + canxl_frame::len which is enforced on write() operations and guaranteed on read() operations. NB: Valid length values are 1 .. 2048 (CANXL_MIN_DLEN .. CANXL_MAX_DLEN). Acked-by: Vincent Mailhol <[email protected]> Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-09-15can: canxl: update CAN infrastructure for CAN XL framesOliver Hartkopp1-1/+24
- add new ETH_P_CANXL ethernet protocol type - update skb checks for CAN XL - add alloc_canxl_skb() which now needs a data length parameter - introduce init_can_skb_reserve() to reduce code duplication Acked-by: Vincent Mailhol <[email protected]> Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-09-15can: set CANFD_FDF flag in all CAN FD frame structuresOliver Hartkopp1-0/+5
To simplify the testing in user space all struct canfd_frame's provided by the CAN subsystem of the Linux kernel now have the CANFD_FDF flag set in canfd_frame::flags. NB: Handcrafted ETH_P_CANFD frames introduced via PF_PACKET socket might not set this bit correctly. During the check for sufficient headroom in PF_PACKET sk_buffs the uninitialized CAN sk_buff data structures are filled. In the case of a CAN FD frame the CANFD_FDF flag is set accordingly. As the CAN frame content is already zero initialized in alloc_canfd_skb() the obsolete initialization of cf->flags in the CTU CAN FD driver has been removed as it would overwrite the already set CANFD_FDF flag. Acked-by: Vincent Mailhol <[email protected]> Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-09-15can: skb: unify skb CAN frame identification helpersOliver Hartkopp5-45/+24
Replace open coded checks for sk_buffs containing Classical CAN and CAN FD frame structures as a preparation for CAN XL support. With the added length check the unintended processing of CAN XL frames having the CANXL_XLF bit set can be suppressed even when the skb->len fits to non CAN XL frames. The CAN_RAW socket needs a rework to use these helpers. Therefore the use of these helpers is postponed to the CAN_RAW CAN XL integration. The J1939 protocol gets a check for Classical CAN frames too. Acked-by: Vincent Mailhol <[email protected]> Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-09-15batman-adv: remove unused struct definitionsMarek Lindner1-39/+0
Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]>
2022-09-14Bluetooth: hci_sync: allow advertise when scan without RPAZhengping Jiang1-1/+1
Address resolution will be paused during active scan to allow any advertising reports reach the host. If LL privacy is enabled, advertising will rely on the controller to generate new RPA. If host is not using RPA, there is no need to stop advertising during active scan because there is no need to generate RPA in the controller. Signed-off-by: Zhengping Jiang <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-09-14Bluetooth: avoid hci_dev_test_and_set_flag() in mgmt_init_hdev()Tetsuo Handa1-1/+3
syzbot is again reporting attempt to cancel uninitialized work at mgmt_index_removed() [1], for setting of HCI_MGMT flag from mgmt_init_hdev() from hci_mgmt_cmd() from hci_sock_sendmsg() can race with testing of HCI_MGMT flag from mgmt_index_removed() from hci_sock_bind() due to lack of serialization via hci_dev_lock(). Since mgmt_init_hdev() is called with mgmt_chan_list_lock held, we can safely split hci_dev_test_and_set_flag() into hci_dev_test_flag() and hci_dev_set_flag(). Thus, in order to close this race, set HCI_MGMT flag after INIT_DELAYED_WORK() completed. This is a local fix based on mgmt_chan_list_lock. Lack of serialization via hci_dev_lock() might be causing different race conditions somewhere else. But a global fix based on hci_dev_lock() should deserve a future patch. Link: https://syzkaller.appspot.com/bug?extid=844c7bf1b1aa4119c5de Reported-by: [email protected] Signed-off-by: Tetsuo Handa <[email protected]> Fixes: 3f2893d3c142986a ("Bluetooth: don't try to cancel uninitialized works at mgmt_index_removed()") Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2022-09-13mptcp: fix fwd memory accounting on coalescePaolo Abeni1-1/+7
The intel bot reported a memory accounting related splat: [ 240.473094] ------------[ cut here ]------------ [ 240.478507] page_counter underflow: -4294828518 nr_pages=4294967290 [ 240.485500] WARNING: CPU: 2 PID: 14986 at mm/page_counter.c:56 page_counter_cancel+0x96/0xc0 [ 240.570849] CPU: 2 PID: 14986 Comm: mptcp_connect Tainted: G S 5.19.0-rc4-00739-gd24141fe7b48 #1 [ 240.581637] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.63 10/05/2017 [ 240.590600] RIP: 0010:page_counter_cancel+0x96/0xc0 [ 240.596179] Code: 00 00 00 45 31 c0 48 89 ef 5d 4c 89 c6 41 5c e9 40 fd ff ff 4c 89 e2 48 c7 c7 20 73 39 84 c6 05 d5 b1 52 04 01 e8 e7 95 f3 01 <0f> 0b eb a9 48 89 ef e8 1e 25 fc ff eb c3 66 66 2e 0f 1f 84 00 00 [ 240.615639] RSP: 0018:ffffc9000496f7c8 EFLAGS: 00010082 [ 240.621569] RAX: 0000000000000000 RBX: ffff88819c9c0120 RCX: 0000000000000000 [ 240.629404] RDX: 0000000000000027 RSI: 0000000000000004 RDI: fffff5200092deeb [ 240.637239] RBP: ffff88819c9c0120 R08: 0000000000000001 R09: ffff888366527a2b [ 240.645069] R10: ffffed106cca4f45 R11: 0000000000000001 R12: 00000000fffffffa [ 240.652903] R13: ffff888366536118 R14: 00000000fffffffa R15: ffff88819c9c0000 [ 240.660738] FS: 00007f3786e72540(0000) GS:ffff888366500000(0000) knlGS:0000000000000000 [ 240.669529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 240.675974] CR2: 00007f966b346000 CR3: 0000000168cea002 CR4: 00000000003706e0 [ 240.683807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 240.691641] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 240.699468] Call Trace: [ 240.702613] <TASK> [ 240.705413] page_counter_uncharge+0x29/0x80 [ 240.710389] drain_stock+0xd0/0x180 [ 240.714585] refill_stock+0x278/0x580 [ 240.718951] __sk_mem_reduce_allocated+0x222/0x5c0 [ 240.729248] __mptcp_update_rmem+0x235/0x2c0 [ 240.734228] __mptcp_move_skbs+0x194/0x6c0 [ 240.749764] mptcp_recvmsg+0xdfa/0x1340 [ 240.763153] inet_recvmsg+0x37f/0x500 [ 240.782109] sock_read_iter+0x24a/0x380 [ 240.805353] new_sync_read+0x420/0x540 [ 240.838552] vfs_read+0x37f/0x4c0 [ 240.842582] ksys_read+0x170/0x200 [ 240.864039] do_syscall_64+0x5c/0x80 [ 240.872770] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 240.878526] RIP: 0033:0x7f3786d9ae8e [ 240.882805] Code: c0 e9 b6 fe ff ff 50 48 8d 3d 6e 18 0a 00 e8 89 e8 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28 [ 240.902259] RSP: 002b:00007fff7be81e08 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 240.910533] RAX: ffffffffffffffda RBX: 0000000000002000 RCX: 00007f3786d9ae8e [ 240.918368] RDX: 0000000000002000 RSI: 00007fff7be87ec0 RDI: 0000000000000005 [ 240.926206] RBP: 0000000000000005 R08: 00007f3786e6a230 R09: 00007f3786e6a240 [ 240.934046] R10: fffffffffffff288 R11: 0000000000000246 R12: 0000000000002000 [ 240.941884] R13: 00007fff7be87ec0 R14: 00007fff7be87ec0 R15: 0000000000002000 [ 240.949741] </TASK> [ 240.952632] irq event stamp: 27367 [ 240.956735] hardirqs last enabled at (27366): [<ffffffff81ba50ea>] mem_cgroup_uncharge_skmem+0x6a/0x80 [ 240.966848] hardirqs last disabled at (27367): [<ffffffff81b8fd42>] refill_stock+0x282/0x580 [ 240.976017] softirqs last enabled at (27360): [<ffffffff83a4d8ef>] mptcp_recvmsg+0xaf/0x1340 [ 240.985273] softirqs last disabled at (27364): [<ffffffff83a4d30c>] __mptcp_move_skbs+0x18c/0x6c0 [ 240.994872] ---[ end trace 0000000000000000 ]--- After commit d24141fe7b48 ("mptcp: drop SK_RECLAIM_* macros"), if rmem_fwd_alloc become negative, mptcp_rmem_uncharge() can try to reclaim a negative amount of pages, since the expression: reclaimable >= PAGE_SIZE will evaluate to true for any negative value of the int 'reclaimable': 'PAGE_SIZE' is an unsigned long and the negative integer will be promoted to a (very large) unsigned long value. Still after the mentioned commit, kfree_skb_partial() in mptcp_try_coalesce() will reclaim most of just released fwd memory, so that following charging of the skb delta size will lead to negative fwd memory values. At that point a racing recvmsg() can trigger the splat. Address the issue switching the order of the memory accounting operations. The fwd memory can still transiently reach negative values, but that will happen in an atomic scope and no code path could touch/use such value. Reported-by: kernel test robot <[email protected]> Fixes: d24141fe7b48 ("mptcp: drop SK_RECLAIM_* macros") Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2022-09-12Merge tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2-4/+7
Pull NFS client bugfixes from Trond Myklebust: - Fix SUNRPC call completion races with call_decode() that trigger a WARN_ON() - NFSv4.0 cannot support open-by-filehandle and NFS re-export - Revert "SUNRPC: Remove unreachable error condition" to allow handling of error conditions - Update suid/sgid mode bits after ALLOCATE and DEALLOCATE * tag 'nfs-for-5.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "SUNRPC: Remove unreachable error condition" NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE NFSv4: Turn off open-by-filehandle and NFS re-export for NFSv4.0 SUNRPC: Fix call completion races with call_decode()
2022-09-10bpf: Add support for writing to nf_conn:markDaniel Xu3-1/+120
Support direct writes to nf_conn:mark from TC and XDP prog types. This is useful when applications want to store per-connection metadata. This is also particularly useful for applications that run both bpf and iptables/nftables because the latter can trivially access this metadata. One example use case would be if a bpf prog is responsible for advanced packet classification and iptables/nftables is later used for routing due to pre-existing/legacy code. Signed-off-by: Daniel Xu <[email protected]> Link: https://lore.kernel.org/r/ebca06dea366e3e7e861c12f375a548cc4c61108.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <[email protected]>
2022-09-10bpf: Use 0 instead of NOT_INIT for btf_struct_access() writesDaniel Xu1-1/+1
Returning a bpf_reg_type only makes sense in the context of a BPF_READ. For writes, prefer to explicitly return 0 for clarity. Note that is non-functional change as it just so happened that NOT_INIT == 0. Signed-off-by: Daniel Xu <[email protected]> Link: https://lore.kernel.org/r/01772bc1455ae16600796ac78c6cc9fff34f95ff.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <[email protected]>
2022-09-09bpf: Invoke cgroup/connect{4,6} programs for unprivileged ICMP pingYiFei Zhu2-0/+31
Usually when a TCP/UDP connection is initiated, we can bind the socket to a specific IP attached to an interface in a cgroup/connect hook. But for pings, this is impossible, as the hook is not being called. This adds the hook invocation to unprivileged ICMP ping (i.e. ping sockets created with SOCK_DGRAM IPPROTO_ICMP(V6) as opposed to SOCK_RAW. Logic is mirrored from UDP sockets where the hook is invoked during pre_connect, after a check for suficiently sized addr_len. Signed-off-by: YiFei Zhu <[email protected]> Link: https://lore.kernel.org/r/5764914c252fad4cd134fb6664c6ede95f409412.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <[email protected]>
2022-09-09net: core: fix flow symmetric hashLudovic Cintrat1-3/+2
__flow_hash_consistentify() wrongly swaps ipv4 addresses in few cases. This function is indirectly used by __skb_get_hash_symmetric(), which is used to fanout packets in AF_PACKET. Intrusion detection systems may be impacted by this issue. __flow_hash_consistentify() computes the addresses difference then swaps them if the difference is negative. In few cases src - dst and dst - src are both negative. The following snippet mimics __flow_hash_consistentify(): ``` #include <stdio.h> #include <stdint.h> int main(int argc, char** argv) { int diffs_d, diffd_s; uint32_t dst = 0xb225a8c0; /* 178.37.168.192 --> 192.168.37.178 */ uint32_t src = 0x3225a8c0; /* 50.37.168.192 --> 192.168.37.50 */ uint32_t dst2 = 0x3325a8c0; /* 51.37.168.192 --> 192.168.37.51 */ diffs_d = src - dst; diffd_s = dst - src; printf("src:%08x dst:%08x, diff(s-d)=%d(0x%x) diff(d-s)=%d(0x%x)\n", src, dst, diffs_d, diffs_d, diffd_s, diffd_s); diffs_d = src - dst2; diffd_s = dst2 - src; printf("src:%08x dst:%08x, diff(s-d)=%d(0x%x) diff(d-s)=%d(0x%x)\n", src, dst2, diffs_d, diffs_d, diffd_s, diffd_s); return 0; } ``` Results: src:3225a8c0 dst:b225a8c0, \ diff(s-d)=-2147483648(0x80000000) \ diff(d-s)=-2147483648(0x80000000) src:3225a8c0 dst:3325a8c0, \ diff(s-d)=-16777216(0xff000000) \ diff(d-s)=16777216(0x1000000) In the first case the addresses differences are always < 0, therefore __flow_hash_consistentify() always swaps, thus dst->src and src->dst packets have differents hashes. Fixes: c3f8324188fa8 ("net: Add full IPv6 addresses to flow_keys") Signed-off-by: Ludovic Cintrat <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-09net: openvswitch: fix repeated words in commentsJilin Yuan1-1/+1
Delete the redundant word 'is'. Signed-off-by: Jilin Yuan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfDavid S. Miller3-9/+33
Florian Westhal says: ==================== netfilter: bugfixes for net The following set contains four netfilter patches for your *net* tree. When there are multiple Contact headers in a SIP message its possible the next headers won't be found because the SIP helper confuses relative and absolute offsets in the message. From Igor Ryzhov. Make the nft_concat_range self-test support socat, this makes the selftest pass on my test VM, from myself. nf_conntrack_irc helper can be tricked into opening a local port forward that the client never requested by embedding a DCC message in a PING request sent to the client. Fix from David Leadbeater. Both have been broken since the kernel 2.6.x days. The 'osf' match might indicate success while it could not find anything, broken since 5.2 . Fix from Pablo Neira. ==================== Signed-off-by: David S. Miller <[email protected]>
2022-09-09net: sched: act_vlan: get rid of tcf_vlan_walker and tcf_vlan_searchZhengchao Shao1-19/+0
tcf_vlan_walker() and tcf_vlan_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-09net: sched: act_tunnel_key: get rid of tunnel_key_walker and tunnel_key_searchZhengchao Shao1-19/+0
tunnel_key_walker() and tunnel_key_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-09net: sched: act_skbmod: get rid of tcf_skbmod_walker and tcf_skbmod_searchZhengchao Shao1-19/+0
tcf_skbmod_walker() and tcf_skbmod_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-09net: sched: act_skbedit: get rid of tcf_skbedit_walker and tcf_skbedit_searchZhengchao Shao1-19/+0
tcf_skbedit_walker() and tcf_skbedit_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-09-09net: sched: act_simple: get rid of tcf_simp_walker and tcf_simp_searchZhengchao Shao1-19/+0
tcf_simp_walker() and tcf_simp_search() do the same thing as generic walk/search function, so remove them. Signed-off-by: Zhengchao Shao <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>