aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2024-01-02net/sched: Retire ipt actionJamal Hadi Salim2-465/+0
The tc ipt action was intended to run all netfilter/iptables target. Unfortunately it has not benefitted over the years from proper updates when netfilter changes, and for that reason it has remained rudimentary. Pinging a bunch of people that i was aware were using this indicates that removing it wont affect them. Retire it to reduce maintenance efforts. Buh-bye. Reviewed-by: Victor Noguiera <[email protected]> Reviewed-by: Pedro Tammela <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-02net-device: move gso_partial_features to net_device_read_txEric Dumazet1-1/+2
dev->gso_partial_features is read from tx fast path for GSO packets. Move it to appropriate section to avoid a cache line miss. Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <[email protected]> Cc: Coco Li <[email protected]> Cc: David Ahern <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: qrtr: ns: Return 0 if server port is not presentSarannya S1-1/+3
When a 'DEL_CLIENT' message is received from the remote, the corresponding server port gets deleted. A DEL_SERVER message is then announced for this server. As part of handling the subsequent DEL_SERVER message, the name- server attempts to delete the server port which results in a '-ENOENT' error. The return value from server_del() is then propagated back to qrtr_ns_worker, causing excessive error prints. To address this, return 0 from control_cmd_del_server() without checking the return value of server_del(), since the above scenario is not an error case and hence server_del() doesn't have any other error return value. Signed-off-by: Sarannya Sasikumar <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: ethtool: strset: Allow querying phy stats by indexMaxime Chevallier1-7/+8
The ETH_SS_PHY_STATS command gets PHY statistics. Use the phydev pointer from the ethnl request to allow query phy stats from each PHY on the link. Signed-off-by: Maxime Chevallier <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: ethtool: cable-test: Target the command to the requested PHYMaxime Chevallier1-6/+6
Cable testing is a PHY-specific command. Instead of targeting the command towards dev->phydev, use the request to pick the targeted PHY. Signed-off-by: Maxime Chevallier <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: ethtool: pse-pd: Target the command to the requested PHYMaxime Chevallier1-6/+3
PSE and PD configuration is a PHY-specific command. Instead of targeting the command towards dev->phydev, use the request to pick the targeted PHY device. Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: ethtool: plca: Target the command to the requested PHYMaxime Chevallier1-7/+6
PLCA is a PHY-specific command. Instead of targeting the command towards dev->phydev, use the request to pick the targeted PHY. Signed-off-by: Maxime Chevallier <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: ethtool: Introduce a command to list PHYs on an interfaceMaxime Chevallier4-1/+321
As we have the ability to track the PHYs connected to a net_device through the link_topology, we can expose this list to userspace. This allows userspace to use these identifiers for phy-specific commands and take the decision of which PHY to target by knowing the link topology. Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list devices on only one interface. Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: ethtool: Allow passing a phy index for some commandsMaxime Chevallier2-2/+29
Some netlink commands are target towards ethernet PHYs, to control some of their features. As there's several such commands, add the ability to pass a PHY index in the ethnl request, which will populate the generic ethnl_req_info with the relevant phydev when the command targets a PHY. Signed-off-by: Maxime Chevallier <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: phy: Introduce ethernet link topology representationMaxime Chevallier1-0/+3
Link topologies containing multiple network PHYs attached to the same net_device can be found when using a PHY as a media converter for use with an SFP connector, on which an SFP transceiver containing a PHY can be used. With the current model, the transceiver's PHY can't be used for operations such as cable testing, timestamping, macsec offload, etc. The reason being that most of the logic for these configuration, coming from either ethtool netlink or ioctls tend to use netdev->phydev, which in multi-phy systems will reference the PHY closest to the MAC. Introduce a numbering scheme allowing to enumerate PHY devices that belong to any netdev, which can in turn allow userspace to take more precise decisions with regard to each PHY's configuration. The numbering is maintained per-netdev, in a phy_device_list. The numbering works similarly to a netdevice's ifindex, with identifiers that are only recycled once INT_MAX has been reached. This prevents races that could occur between PHY listing and SFP transceiver removal/insertion. The identifiers are assigned at phy_attach time, as the numbering depends on the netdevice the phy is attached to. Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01Merge tag 'nf-next-23-12-22' of ↵David S. Miller3-33/+128
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== netfilter pull request 23-12-22 The following patchset contains Netfilter updates for net-next: 1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a race scenario with two concurrent processes running a dump-and-reset which exposes negative counters to userspace, from Phil Sutter. 2) Use GFP_KERNEL in pipapo GC, from Florian Westphal. 3) Reorder nf_flowtable struct members, place the read-mostly parts accessed by the datapath first. From Florian Westphal. 4) Set on dead flag for NFT_MSG_NEWSET in abort path, from Florian Westphal. 5) Support filtering zone in ctnetlink, from Felix Huettner. 6) Bail out if user tries to redefine an existing chain with different type in nf_tables. ==================== Signed-off-by: David S. Miller <[email protected]>
2024-01-01net/tcp_sigpool: Use kref_get_unless_zero()Dmitry Safonov1-3/+2
The freeing and re-allocation of algorithm are protected by cpool_mutex, so it doesn't fix an actual use-after-free, but avoids a deserved refcount_warn_saturate() warning. A trivial fix for the racy behavior. Fixes: 8c73b26315aa ("net/tcp: Prepare tcp_md5sig_pool for TCP-AO") Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Tested-by: Bagas Sanjaya <[email protected]> Reported-by: syzbot <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-01net: sched: em_text: fix possible memory leak in em_text_destroy()Hangyu Hua1-1/+3
m->data needs to be freed when em_text_destroy is called. Fixes: d675c989ed2d ("[PKT_SCHED]: Packet classification based on textsearch (ematch)") Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Hangyu Hua <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-29skbuff: use mempool KASAN hooksAndrey Konovalov1-4/+6
Instead of using slab-internal KASAN hooks for poisoning and unpoisoning cached objects, use the proper mempool KASAN hooks. Also check the return value of kasan_mempool_poison_object to prevent double-free and invali-free bugs. Link: https://lkml.kernel.org/r/a3482c41395c69baa80eb59dbb06beef213d2a14.1703024586.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Cc: Alexander Lobakin <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Breno Leitao <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Marco Elver <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29kasan: rename and document kasan_(un)poison_object_dataAndrey Konovalov1-4/+4
Rename kasan_unpoison_object_data to kasan_unpoison_new_object and add a documentation comment. Do the same for kasan_poison_object_data. The new names and the comments should suggest the users that these hooks are intended for internal use by the slab allocator. The following patch will remove non-slab-internal uses of these hooks. No functional changes. [[email protected]: update references to renamed functions in comments] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/eab156ebbd635f9635ef67d1a4271f716994e628.1703024586.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Marco Elver <[email protected]> Cc: Alexander Lobakin <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Breno Leitao <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Evgenii Stepanov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-29genetlink: Use internal flags for multicast groupsIdo Schimmel4-5/+5
As explained in commit e03781879a0d ("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the multicast group structure reuses uAPI flags despite the field not being exposed to user space. This makes it impossible to extend its use without adding new uAPI flags, which is inappropriate for internal kernel checks. Solve this by adding internal flags (i.e., "GENL_MCAST_*") and convert the existing users to use them instead of the uAPI flags. Tested using the reproducers in commit 44ec98ea5ea9 ("psample: Require 'CAP_NET_ADMIN' when joining "packets" group") and commit e03781879a0d ("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group"). No functional changes intended. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-29Merge tag 'nf-23-12-20' of ↵David S. Miller2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablu Neira Syuso says: ==================== netfilter pull request 23-12-20 The following patchset contains Netfilter fixes for net: 1) Skip set commit for deleted/destroyed sets, this might trigger double deactivation of expired elements. 2) Fix packet mangling from egress, set transport offset from mac header for netdev/egress. Both fixes address bugs already present in several releases. ==================== Signed-off-by: David S. Miller <[email protected]>
2023-12-29iucv: make iucv_bus constGreg Kroah-Hartman1-1/+1
Now that the driver core can properly handle constant struct bus_type, move the iucv_bus variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Wenjia Zhang <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: [email protected] Cc: [email protected] Acked-by: Alexandra Winter <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-27Kill sched.h dependency on rcupdate.hKent Overstreet5-0/+6
by moving cond_resched_rcu() to rcupdate_wait.h, we can kill another big sched.h dependency. Signed-off-by: Kent Overstreet <[email protected]>
2023-12-27net: pktgen: Use wait_event_freezable_timeout() for freezable kthreadKevin Hao1-4/+2
A freezable kernel thread can enter frozen state during freezing by either calling try_to_freeze() or using wait_event_freezable() and its variants. So for the following snippet of code in a kernel thread loop: wait_event_interruptible_timeout(); try_to_freeze(); We can change it to a simple wait_event_freezable_timeout() and then eliminate a function call. Signed-off-by: Kevin Hao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-27Merge tag 'wireless-2023-12-19' of ↵David S. Miller2-8/+15
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Just a couple of things: * debugfs fixes * rfkill fix in iwlwifi * remove mostly-not-working list ==================== Signed-off-by: David S. Miller <[email protected]>
2023-12-27net: rename dsa_realloc_skb to skb_ensure_writable_head_tailRadu Pirea (NXP OSS)2-26/+28
Rename dsa_realloc_skb to skb_ensure_writable_head_tail and move it to skbuff.c to use it as helper. Signed-off-by: Radu Pirea (NXP OSS) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26bridge: cfm: fix enum typo in br_cc_ccm_tx_parseLin Ma1-1/+1
It appears that there is a typo in the code where the nlattr array is being parsed with policy br_cfm_cc_ccm_tx_policy, but the instance is being accessed via IFLA_BRIDGE_CFM_CC_RDI_INSTANCE, which is associated with the policy br_cfm_cc_rdi_policy. This problem was introduced by commit 2be665c3940d ("bridge: cfm: Netlink SET configuration Interface."). Though it seems like a harmless typo since these two enum owns the exact same value (1 here), it is quite misleading hence fix it by using the correct enum IFLA_BRIDGE_CFM_CC_CCM_TX_INSTANCE here. Signed-off-by: Lin Ma <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26mptcp: sockopt: support IP_LOCAL_PORT_RANGE and IP_BIND_ADDRESS_NO_PORTMaxim Galaganov1-1/+20
Support for IP_BIND_ADDRESS_NO_PORT sockopt was introduced in [1]. Recently [2] allowed its value to be accessed without locking the socket. Support for (newer) IP_LOCAL_PORT_RANGE sockopt was introduced in [3]. In the same series a selftest was added in [4]. This selftest also covers the IP_BIND_ADDRESS_NO_PORT sockopt. This patch enables getsockopt()/setsockopt() on MPTCP sockets for these socket options, syncing set values to subflows in sync_socket_options(). Ephemeral port range is synced to subflows, enabling NAT usecase described in [3]. [1] commit 90c337da1524 ("inet: add IP_BIND_ADDRESS_NO_PORT to overcome bind(0) limitations") [2] commit ca571e2eb7eb ("inet: move inet->bind_address_no_port to inet->inet_flags") [3] commit 91d0b78c5177 ("inet: Add IP_LOCAL_PORT_RANGE socket option") [4] commit ae5439658cce ("selftests/net: Cover the IP_LOCAL_PORT_RANGE socket option") Signed-off-by: Maxim Galaganov <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26mptcp: rename mptcp_setsockopt_sol_ip_set_transparent()Maxim Galaganov1-3/+3
Next patch extends this function so that it's not specific to IP_TRANSPARENT. Change function name to mptcp_setsockopt_sol_ip_set(). Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Maxim Galaganov <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26mptcp: don't overwrite sock_ops in mptcp_is_tcpsk()Davide Caratti1-64/+44
Eric Dumazet suggests: > The fact that mptcp_is_tcpsk() was able to write over sock->ops was a > bit strange to me. > mptcp_is_tcpsk() should answer a question, with a read-only argument. re-factor code to avoid overwriting sock_ops inside that function. Also, change the helper name to reflect the semantics and to disambiguate from its dual, sk_is_mptcp(). While at it, collapse mptcp_stream_accept() and mptcp_accept() into a single function, where fallback / non-fallback are separated into a single sk_is_mptcp() conditional. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/432 Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Acked-by: Paolo Abeni <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/sched: act_mirred: Allow mirred to blockVictor Nogueira1-2/+117
So far the mirred action has dealt with syntax that handles mirror/redirection for netdev. A matching packet is redirected or mirrored to a target netdev. In this patch we enable mirred to mirror to a tc block as well. IOW, the new syntax looks as follows: ... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> > Examples of mirroring or redirecting to a tc block: $ tc filter add block 22 protocol ip pref 25 \ flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22 $ tc filter add block 22 protocol ip pref 25 \ flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22 Co-developed-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Co-developed-by: Pedro Tammela <[email protected]> Signed-off-by: Pedro Tammela <[email protected]> Signed-off-by: Victor Nogueira <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/sched: act_mirred: Add helper function tcf_mirred_replace_devVictor Nogueira1-4/+12
The act of replacing a device will be repeated by the init logic for the block ID in the patch that allows mirred to a block. Therefore we encapsulate this functionality in a function (tcf_mirred_replace_dev) so that we can reuse it and avoid code repetition. Co-developed-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Co-developed-by: Pedro Tammela <[email protected]> Signed-off-by: Pedro Tammela <[email protected]> Signed-off-by: Victor Nogueira <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/sched: act_mirred: Create function tcf_mirred_to_dev and improve readabilityVictor Nogueira1-57/+72
As a preparation for adding block ID to mirred, separate the part of mirred that redirect/mirrors to a dev into a specific function so that it can be called by blockcast for each dev. Also improve readability. Eg. rename use_reinsert to dont_clone and skb2 to skb_to_send. Co-developed-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Co-developed-by: Pedro Tammela <[email protected]> Signed-off-by: Pedro Tammela <[email protected]> Signed-off-by: Victor Nogueira <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/sched: cls_api: Expose tc block to the datapathVictor Nogueira1-1/+2
The datapath can now find the block of the port in which the packet arrived at. In the next patch we show a possible usage of this patch in a new version of mirred that multicasts to all ports except for the port in which the packet arrived on. Co-developed-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Co-developed-by: Pedro Tammela <[email protected]> Signed-off-by: Pedro Tammela <[email protected]> Signed-off-by: Victor Nogueira <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/sched: Introduce tc block netdev tracking infraVictor Nogueira3-1/+60
This commit makes tc blocks track which ports have been added to them. And, with that, we'll be able to use this new information to send packets to the block's ports. Which will be done in the patch #3 of this series. Suggested-by: Jiri Pirko <[email protected]> Co-developed-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Co-developed-by: Pedro Tammela <[email protected]> Signed-off-by: Pedro Tammela <[email protected]> Signed-off-by: Victor Nogueira <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26keys, dns: Fix missing size check of V1 server-list headerEdward Adam Davis1-10/+9
The dns_resolver_preparse() function has a check on the size of the payload for the basic header of the binary-style payload, but is missing a check for the size of the V1 server-list payload header after determining that's what we've been given. Fix this by getting rid of the the pointer to the basic header and just assuming that we have a V1 server-list payload and moving the V1 server list pointer inside the if-statement. Dealing with other types and versions can be left for when such have been defined. This can be tested by doing the following with KASAN enabled: echo -n -e '\x0\x0\x1\x2' | keyctl padd dns_resolver foo @p and produces an oops like the following: BUG: KASAN: slab-out-of-bounds in dns_resolver_preparse+0xc9f/0xd60 net/dns_resolver/dns_key.c:127 Read of size 1 at addr ffff888028894084 by task syz-executor265/5069 ... Call Trace: dns_resolver_preparse+0xc9f/0xd60 net/dns_resolver/dns_key.c:127 __key_create_or_update+0x453/0xdf0 security/keys/key.c:842 key_create_or_update+0x42/0x50 security/keys/key.c:1007 __do_sys_add_key+0x29c/0x450 security/keys/keyctl.c:134 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x62/0x6a This patch was originally by Edward Adam Davis, but was modified by Linus. Fixes: b946001d3bb1 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry") Reported-and-tested-by: [email protected] Link: https://lore.kernel.org/r/[email protected]/ Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Edward Adam Davis <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: David Howells <[email protected]> Cc: Edward Adam Davis <[email protected]> Cc: Jarkko Sakkinen <[email protected]> Cc: Jeffrey E Altman <[email protected]> Cc: Wang Lei <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Steve French <[email protected]> Cc: Marc Dionne <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Paolo Abeni <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2023-12-26net: remove SOCK_DEBUG leftoversDenis Kirjanov5-24/+24
SOCK_DEBUG comes from the old days. Let's move logging to standard net core ratelimited logging functions Signed-off-by: Denis Kirjanov <[email protected]> changes in v2: - remove SOCK_DEBUG macro altogether Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: manage system EID in SMC stack instead of ISM driverWen Gu2-9/+31
The System EID (SEID) is an internal EID that is used by the SMCv2 software stack that has a predefined and constant value representing the s390 physical machine that the OS is executing on. So it should be managed by SMC stack instead of ISM driver and be consistent for all ISMv2 device (including virtual ISM devices) on s390 architecture. Suggested-by: Alexandra Winter <[email protected]> Signed-off-by: Wen Gu <[email protected]> Reviewed-and-tested-by: Wenjia Zhang <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: disable SEID on non-s390 archs where virtual ISM may be usedWen Gu1-0/+14
The system EID (SEID) is an internal EID used by SMC-D to represent the s390 physical machine that OS is executing on. On s390 architecture, it predefined by fixed string and part of cpuid and is enabled regardless of whether underlay device is virtual ISM or platform firmware ISM. However on non-s390 architectures where SMC-D can be used with virtual ISM devices, there is no similar information to identify physical machines, especially in virtualization scenarios. So in such cases, SEID is forcibly disabled and the user-defined UEID will be used to represent the communicable space. Signed-off-by: Wen Gu <[email protected]> Reviewed-and-tested-by: Wenjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: support extended GID in SMC-D lgr netlink attributeWen Gu2-0/+8
Virtual ISM devices introduced in SMCv2.1 requires a 128 bit extended GID vs. the existing ISM 64bit GID. So the 2nd 64 bit of extended GID should be included in SMC-D linkgroup netlink attribute as well. Signed-off-by: Wen Gu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: compatible with 128-bits extended GID of virtual ISM deviceWen Gu10-55/+146
According to virtual ISM support feature defined by SMCv2.1, GIDs of virtual ISM device are UUIDs defined by RFC4122, which are 128-bits long. So some adaptation work is required. And note that the GIDs of existing platform firmware ISM devices still remain 64-bits long. Signed-off-by: Wen Gu <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: define a reserved CHID range for virtual ISM devicesWen Gu1-0/+20
According to virtual ISM support feature defined by SMCv2.1, CHIDs in the range 0xFF00 to 0xFFFF are reserved for use by virtual ISM devices. And two helpers are introduced to distinguish virtual ISM devices from the existing platform firmware ISM devices. Signed-off-by: Wen Gu <[email protected]> Reviewed-and-tested-by: Wenjia Zhang <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: introduce virtual ISM device support featureWen Gu1-3/+6
This introduces virtual ISM device support feature to SMCv2.1 as the first supplemental feature. Signed-off-by: Wen Gu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: support SMCv2.x supplemental features negotiationWen Gu4-4/+22
This patch adds SMCv2.x supplemental features negotiation. Supported SMCv2.x supplemental features are represented by feature_mask in FCE field. The negotiation process is as follows. Server Client Proposal(features(c-mask bits)) <----------------------------------------- Accept(features(s-mask bits)) -----------------------------------------> Confirm(features(s&c-mask bits)) <----------------------------------------- Signed-off-by: Wen Gu <[email protected]> Reviewed-and-tested-by: Wenjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: unify the structs of accept or confirm message for v1 and v2Wen Gu3-97/+62
The structs of CLC accept and confirm messages for SMCv1 and SMCv2 are separately defined and often casted to each other in the code, which may increase the risk of errors caused by future divergence of them. So unify them into one struct for better maintainability. Suggested-by: Alexandra Winter <[email protected]> Signed-off-by: Wen Gu <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: introduce sub-functions for smc_clc_send_confirm_accept()Wen Gu1-82/+115
There is a large if-else block in smc_clc_send_confirm_accept() and it is better to split it into two sub-functions. Suggested-by: Alexandra Winter <[email protected]> Signed-off-by: Wen Gu <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-26net/smc: rename some 'fce' to 'fce_v2x' for clarityWen Gu1-14/+16
Rename some functions or variables with 'fce' in their name but used in SMCv2.1 as 'fce_v2x' for clarity. Signed-off-by: Wen Gu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-25nfc: Do not send datagram if socket state isn't LLCP_BOUNDSiddh Raman Pant1-0/+5
As we know we cannot send the datagram (state can be set to LLCP_CLOSED by nfc_llcp_socket_release()), there is no need to proceed further. Thus, bail out early from llcp_sock_sendmsg(). Signed-off-by: Siddh Raman Pant <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Suman Ghosh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-25nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_localSiddh Raman Pant1-3/+36
llcp_sock_sendmsg() calls nfc_llcp_send_ui_frame() which in turn calls nfc_alloc_send_skb(), which accesses the nfc_dev from the llcp_sock for getting the headroom and tailroom needed for skb allocation. Parallelly the nfc_dev can be freed, as the refcount is decreased via nfc_free_device(), leading to a UAF reported by Syzkaller, which can be summarized as follows: (1) llcp_sock_sendmsg() -> nfc_llcp_send_ui_frame() -> nfc_alloc_send_skb() -> Dereference *nfc_dev (2) virtual_ncidev_close() -> nci_free_device() -> nfc_free_device() -> put_device() -> nfc_release() -> Free *nfc_dev When a reference to llcp_local is acquired, we do not acquire the same for the nfc_dev. This leads to freeing even when the llcp_local is in use, and this is the case with the UAF described above too. Thus, when we acquire a reference to llcp_local, we should acquire a reference to nfc_dev, and release the references appropriately later. References for llcp_local is initialized in nfc_llcp_register_device() (which is called by nfc_register_device()). Thus, we should acquire a reference to nfc_dev there. nfc_unregister_device() calls nfc_llcp_unregister_device() which in turn calls nfc_llcp_local_put(). Thus, the reference to nfc_dev is appropriately released later. Reported-and-tested-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=bbe84a4010eeea00982d Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket") Reviewed-by: Suman Ghosh <[email protected]> Signed-off-by: Siddh Raman Pant <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-24rxrpc: Create a procfile to display outstanding client conn bundlesDavid Howells4-0/+94
Create /proc/net/rxrpc/bundles to display outstanding rxrpc client connection bundles. Signed-off-by: David Howells <[email protected]> cc: Marc Dionne <[email protected]> cc: [email protected]
2023-12-24rxrpc, afs: Allow afs to pin rxrpc_peer objectsDavid Howells5-39/+111
Change rxrpc's API such that: (1) A new function, rxrpc_kernel_lookup_peer(), is provided to look up an rxrpc_peer record for a remote address and a corresponding function, rxrpc_kernel_put_peer(), is provided to dispose of it again. (2) When setting up a call, the rxrpc_peer object used during a call is now passed in rather than being set up by rxrpc_connect_call(). For afs, this meenat passing it to rxrpc_kernel_begin_call() rather than the full address (the service ID then has to be passed in as a separate parameter). (3) A new function, rxrpc_kernel_remote_addr(), is added so that afs can get a pointer to the transport address for display purposed, and another, rxrpc_kernel_remote_srx(), to gain a pointer to the full rxrpc address. (4) The function to retrieve the RTT from a call, rxrpc_kernel_get_srtt(), is then altered to take a peer. This now returns the RTT or -1 if there are insufficient samples. (5) Rename rxrpc_kernel_get_peer() to rxrpc_kernel_call_get_peer(). (6) Provide a new function, rxrpc_kernel_get_peer(), to get a ref on a peer the caller already has. This allows the afs filesystem to pin the rxrpc_peer records that it is using, allowing faster lookups and pointer comparisons rather than comparing sockaddr_rxrpc contents. It also makes it easier to get hold of the RTT. The following changes are made to afs: (1) The addr_list struct's addrs[] elements now hold a peer struct pointer and a service ID rather than a sockaddr_rxrpc. (2) When displaying the transport address, rxrpc_kernel_remote_addr() is used. (3) The port arg is removed from afs_alloc_addrlist() since it's always overridden. (4) afs_merge_fs_addr4() and afs_merge_fs_addr6() do peer lookup and may now return an error that must be handled. (5) afs_find_server() now takes a peer pointer to specify the address. (6) afs_find_server(), afs_compare_fs_alists() and afs_merge_fs_addr[46]{} now do peer pointer comparison rather than address comparison. Signed-off-by: David Howells <[email protected]> cc: Marc Dionne <[email protected]> cc: [email protected]
2023-12-24rxrpc_find_service_conn_rcu: fix the usage of read_seqbegin_or_lock()Oleg Nesterov1-1/+2
rxrpc_find_service_conn_rcu() should make the "seq" counter odd on the second pass, otherwise read_seqbegin_or_lock() never takes the lock. Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: David Howells <[email protected]> cc: Marc Dionne <[email protected]> cc: [email protected] Link: https://lore.kernel.org/r/[email protected]/
2023-12-22tipc: Remove some excess struct member documentationJonathan Corbet1-15/+0
Remove documentation for nonexistent struct members, addressing these warnings: ./net/tipc/link.c:228: warning: Excess struct member 'media_addr' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'timer' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'refcnt' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'proto_msg' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'pmsg' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'backlog_limit' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'exp_msg_count' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'reset_rcv_checkpt' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'transmitq' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'snt_nxt' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'deferred_queue' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'unacked_window' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'next_out' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'long_msg_seq_no' description in 'tipc_link' ./net/tipc/link.c:228: warning: Excess struct member 'bc_rcvr' description in 'tipc_link' Signed-off-by: Jonathan Corbet <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-22tcp: Remove dead code and fields for bhash2.Kuniyuki Iwashima2-23/+1
Now all sockets including TIME_WAIT are linked to bhash2 using sock_common.skc_bind_node. We no longer use inet_bind2_bucket.deathrow, sock.sk_bind2_node, and inet_timewait_sock.tw_bind2_node. Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>