aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-03-17ipv4: raw: constify raw_v4_match() socket argumentEric Dumazet1-1/+1
This clarifies raw_v4_match() intent. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17ipv6: raw: constify raw_v6_match() socket argumentEric Dumazet1-1/+1
This clarifies raw_v6_match() intent. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17ipv6: constify inet6_mc_check()Eric Dumazet1-1/+1
inet6_mc_check() is essentially a read-only function. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17ipv4: constify ip_mc_sf_allow() socket argumentEric Dumazet1-1/+1
This clarifies ip_mc_sf_allow() intent. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17inet: preserve const qualifier in inet_sk()Eric Dumazet3-7/+4
We can change inet_sk() to propagate const qualifier of its argument. This should avoid some potential errors caused by accidental (const -> not_const) promotion. Other helpers like tcp_sk(), udp_sk(), raw_sk() will be handled in separate patch series. v2: use container_of_const() as advised by Jakub and Linus Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/netdev/20230315142841.3a2ac99a@kernel.org/ Link: https://lore.kernel.org/netdev/CAHk-=wiOf12nrYEF2vJMcucKjWPN-Ns_SW9fA7LwST_2Dzp7rw@mail.gmail.com/ Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17vxlan: mdb: Add an internal flag to indicate MDB usageIdo Schimmel1-0/+1
Add an internal flag to indicate whether MDB entries are configured or not. Set the flag after installing the first MDB entry and clear it before deleting the last one. The flag will be consulted by the data path which will only perform an MDB lookup if the flag is set, thereby keeping the MDB overhead to a minimum when the MDB is not used. Another option would have been to use a static key, but it is global and not per-device, unlike the current approach. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17vxlan: mdb: Add MDB control path supportIdo Schimmel2-0/+15
Implement MDB control path support, enabling the creation, deletion, replacement and dumping of MDB entries in a similar fashion to the bridge driver. Unlike the bridge driver, each entry stores a list of remote VTEPs to which matched packets need to be replicated to and not a list of bridge ports. The motivating use case is the installation of MDB entries by a user space control plane in response to received EVPN routes. As such, only allow permanent MDB entries to be installed and do not implement snooping functionality, avoiding a lot of unnecessary complexity. Since entries can only be modified by user space under RTNL, use RTNL as the write lock. Use RCU to ensure that MDB entries and remotes are not freed while being accessed from the data path during transmission. In terms of uAPI, reuse the existing MDB netlink interface, but add a few new attributes to request and response messages: * IP address of the destination VXLAN tunnel endpoint where the multicast receivers reside. * UDP destination port number to use to connect to the remote VXLAN tunnel endpoint. * VXLAN VNI Network Identifier to use to connect to the remote VXLAN tunnel endpoint. Required when Ingress Replication (IR) is used and the remote VTEP is not a member of originating broadcast domain (VLAN/VNI) [1]. * Source VNI Network Identifier the MDB entry belongs to. Used only when the VXLAN device is in external mode. * Interface index of the outgoing interface to reach the remote VXLAN tunnel endpoint. This is required when the underlay destination IP is multicast (P2MP), as the multicast routing tables are not consulted. All the new attributes are added under the 'MDBA_SET_ENTRY_ATTRS' nest which is strictly validated by the bridge driver, thereby automatically rejecting the new attributes. [1] https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-irb-mcast#section-3.2.2 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17net: Add MDB net device operationsIdo Schimmel1-0/+21
Add MDB net device operations that will be invoked by rtnetlink code in response to received RTM_{NEW,DEL,GET}MDB messages. Subsequent patches will implement these operations in the bridge and VXLAN drivers. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17net: mana: Add new MANA VF performance counters for easier troubleshootingShradha Gupta1-0/+18
Extended performance counter stats in 'ethtool -S <interface>' output for MANA VF to facilitate troubleshooting. Tested-on: Ubuntu22 Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-16nfc: mrvl: Move platform_data struct into driverRob Herring1-48/+0
There are no users of nfcmrvl platform_data struct outside of the driver and none will be added, so move it into the driver. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net/mlx5: Move needed PTYS functions to core layerGal Pressman1-0/+16
Downstream patches require devlink params to access the PTYS register, move the needed functions from mlx5e to the core layer. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-11-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15net/mlx5: Implement thermal zoneSandipan Patra2-0/+29
Implement thermal zone support for mlx5 based HW. The NIC uses temperature sensor provided by ASIC to report current temperature to thermal core. Signed-off-by: Sandipan Patra <spatra@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-15scm: fix MSG_CTRUNC setting condition for SO_PASSSECAlexander Mikhalitsyn1-1/+12
Currently, kernel would set MSG_CTRUNC flag if msg_control buffer wasn't provided and SO_PASSCRED was set or if there was pending SCM_RIGHTS. For some reason we have no corresponding check for SO_PASSSEC. In the recvmsg(2) doc we have: MSG_CTRUNC indicates that some control data was discarded due to lack of space in the buffer for ancillary data. So, we need to set MSG_CTRUNC flag for all types of SCM. This change can break applications those don't check MSG_CTRUNC flag. Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Leon Romanovsky <leon@kernel.org> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> v2: - commit message was rewritten according to Eric's suggestion Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15net/smc: Introduce explicit check for v2 supportStefan Raspl1-0/+1
Previously, v2 support was derived from a very specific format of the SEID as part of the SMC-D codebase. Make this part of the SMC-D device API, so implementers do not need to adhere to a specific SEID format. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Reviewed-and-tested-by: Jan Karcher <jaka@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-15neighbour: annotate lockless accesses to n->nud_stateEric Dumazet1-1/+1
We have many lockless accesses to n->nud_state. Before adding another one in the following patch, add annotations to readers and writers. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-13net: virtio_net: implement exact header length guest featureJiri Pirko1-0/+1
Virtio spec introduced a feature VIRTIO_NET_F_GUEST_HDRLEN which when set implicates that device benefits from knowing the exact size of the header. For compatibility, to signal to the device that the header is reliable driver also needs to set this feature. Without this feature set by driver, device has to figure out the header size itself. Quoting the original virtio spec: "hdr_len is a hint to the device as to how much of the header needs to be kept to copy into each packet" "a hint" might not be clear for the reader what does it mean, if it is "maybe like that" of "exactly like that". This feature just makes it crystal clear and let the device count on the hdr_len being filled up by the exact length of header. Also note the spec already has following note about hdr_len: "Due to various bugs in implementations, this field is not useful as a guarantee of the transport header size." Without this feature the device needs to parse the header in core data path handling. Accurate information helps the device to eliminate such header parsing and directly use the hardware accelerators for GSO operation. virtio_net_hdr_from_skb() fills up hdr_len to skb_headlen(skb). The driver already complies to fill the correct value. Introduce the feature and advertise it. Note that virtio spec also includes following note for device implementation: "Caution should be taken by the implementation so as to prevent a malicious driver from attacking the device by setting an incorrect hdr_len." There is a plan to support this feature in our emulated device. A device of SolidRun offers this feature bit. They claim this feature will save the device a few cycles for every GSO packet. Link: https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.html#x1-230006x3 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230309094559.917857-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10Merge tag 'wireless-next-2023-03-10' of ↵Jakub Kicinski4-46/+334
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== wireless-next patches for 6.4 Major changes: cfg80211 * 6 GHz improvements * HW timestamping support * support for randomized auth/deauth TA for PASN privacy (also for mac80211) mac80211 * radiotap TLV and EHT support for the iwlwifi sniffer * HW timestamping support * per-link debugfs for multi-link brcmfmac * support for Apple (M1 Pro/Max) devices iwlwifi * support for a few new devices * EHT sniffer support rtw88 * better support for some SDIO devices (e.g. MAC address from efuse) rtw89 * HW scan support for 8852b * better support for 6 GHz scanning * tag 'wireless-next-2023-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (84 commits) wifi: iwlwifi: mvm: fix EOF bit reporting wifi: iwlwifi: Do not include radiotap EHT user info if not needed wifi: iwlwifi: mvm: add EHT RU allocation to radiotap wifi: iwlwifi: Update logs for yoyo reset sw changes wifi: iwlwifi: mvm: clean up duplicated defines wifi: iwlwifi: rs-fw: break out for unsupported bandwidth wifi: iwlwifi: Add support for B step of BnJ-Fm4 wifi: iwlwifi: mvm: make flush code a bit clearer wifi: iwlwifi: mvm: avoid UB shift of snif_queue wifi: iwlwifi: mvm: add primary 80 known for EHT radiotap wifi: iwlwifi: mvm: parse FW frame metadata for EHT sniffer mode wifi: iwlwifi: mvm: decode USIG_B1_B7 RU to nl80211 RU width wifi: iwlwifi: mvm: rename define to generic name wifi: iwlwifi: mvm: allow Microsoft to use TAS wifi: iwlwifi: mvm: add all EHT based on data0 info from HW wifi: iwlwifi: mvm: add EHT radiotap info based on rate_n_flags wifi: iwlwifi: mvm: add an helper function radiotap TLVs wifi: radiotap: separate vendor TLV into header/content wifi: iwlwifi: reduce verbosity of some logging events wifi: iwlwifi: Adding the code to get RF name for MsP device ... ==================== Link: https://lore.kernel.org/r/20230310120159.36518-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-09Merge branch 'main' of ↵Jakub Kicinski2-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== Netfilter updates for net-next 1. nf_tables 'brouting' support, from Sriram Yagnaraman. 2. Update bridge netfilter and ovs conntrack helpers to handle IPv6 Jumbo packets properly, i.e. fetch the packet length from hop-by-hop extension header, from Xin Long. This comes with a test BIG TCP test case, added to tools/testing/selftests/net/. 3. Fix spelling and indentation in conntrack, from Jeremy Sowden. * 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nat: fix indentation of function arguments netfilter: conntrack: fix typo selftests: add a selftest for big tcp netfilter: use nf_ip6_check_hbh_len in nf_ct_skb_network_trim netfilter: move br_nf_check_hbh_len to utils netfilter: bridge: move pskb_trim_rcsum out of br_nf_check_hbh_len netfilter: bridge: check len before accessing more nh data netfilter: bridge: call pskb_may_pull in br_nf_check_hbh_len netfilter: bridge: introduce broute meta statement ==================== Link: https://lore.kernel.org/r/20230308193033.13965-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-09neighbour: delete neigh_lookup_nodev as not usedLeon Romanovsky1-2/+0
neigh_lookup_nodev isn't used in the kernel after removal of DECnet. So let's remove it. Fixes: 1202cdd66531 ("Remove DECnet support from kernel") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/eb5656200d7964b2d177a36b77efa3c597d6d72d.1678267343.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-09net: sched: remove qdisc_watchdog->last_expiresEric Dumazet1-1/+0
This field mirrors hrtimer softexpires, we can instead use the existing helpers. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230308182648.1150762-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-09netlink: remove unused 'compare' functionFlorian Westphal1-1/+0
No users in the tree. Tested with allmodconfig build. Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20230308142006.20879-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski32-325/+237
Documentation/bpf/bpf_devel_QA.rst b7abcd9c656b ("bpf, doc: Link to submitting-patches.rst for general patch submission info") d56b0c461d19 ("bpf, docs: Fix link to netdev-FAQ target") https://lore.kernel.org/all/20230307095812.236eb1be@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-09Merge tag 'net-6.3-rc2' of ↵Linus Torvalds3-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter and bpf. Current release - regressions: - core: avoid skb end_offset change in __skb_unclone_keeptruesize() - sched: - act_connmark: handle errno on tcf_idr_check_alloc - flower: fix fl_change() error recovery path - ieee802154: prevent user from crashing the host Current release - new code bugs: - eth: bnxt_en: fix the double free during device removal - tools: ynl: - fix enum-as-flags in the generic CLI - fully inherit attrs in subsets - re-license uniformly under GPL-2.0 or BSD-3-clause Previous releases - regressions: - core: use indirect calls helpers for sk_exit_memory_pressure() - tls: - fix return value for async crypto - avoid hanging tasks on the tx_lock - eth: ice: copy last block omitted in ice_get_module_eeprom() Previous releases - always broken: - core: avoid double iput when sock_alloc_file fails - af_unix: fix struct pid leaks in OOB support - tls: - fix possible race condition - fix device-offloaded sendpage straddling records - bpf: - sockmap: fix an infinite loop error - test_run: fix &xdp_frame misplacement for LIVE_FRAMES - fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR - netfilter: tproxy: fix deadlock due to missing BH disable - phylib: get rid of unnecessary locking - eth: bgmac: fix *initial* chip reset to support BCM5358 - eth: nfp: fix csum for ipsec offload - eth: mtk_eth_soc: fix RX data corruption issue Misc: - usb: qmi_wwan: add telit 0x1080 composition" * tag 'net-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits) tools: ynl: fix enum-as-flags in the generic CLI tools: ynl: move the enum classes to shared code net: avoid double iput when sock_alloc_file fails af_unix: fix struct pid leaks in OOB support eth: fealnx: bring back this old driver net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC net: microchip: sparx5: fix deletion of existing DSCP mappings octeontx2-af: Unlock contexts in the queue context cache in case of fault detection net/smc: fix fallback failed while sendmsg with fastopen ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause mailmap: update entries for Stephen Hemminger mailmap: add entry for Maxim Mikityanskiy nfc: change order inside nfc_se_io error path ethernet: ice: avoid gcc-9 integer overflow warning ice: don't ignore return codes in VSI related code ice: Fix DSCP PFC TLV creation net: usb: qmi_wwan: add Telit 0x1080 composition net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990 netfilter: conntrack: adopt safer max chain length net: tls: fix device-offloaded sendpage straddling records ...
2023-03-09Merge tag 'for-linus-2023030901' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - fix potential out of bound write of zeroes in HID core with a specially crafted uhid device (Lee Jones) - fix potential use-after-free in work function in intel-ish-hid (Reka Norman) - selftests config fixes (Benjamin Tissoires) - few device small fixes and support * tag 'for-linus-2023030901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: intel-ish-hid: ipc: Fix potential use-after-free in work function HID: logitech-hidpp: Add support for Logitech MX Master 3S mouse HID: cp2112: Fix driver not registering GPIO IRQ chip as threaded selftest: hid: fix hid_bpf not set in config HID: uhid: Over-ride the default maximum data buffer value with our own HID: core: Provide new max_buffer_size attribute to over-ride the default
2023-03-09sctp: add weighted fair queueing stream schedulerXin Long3-1/+4
As it says in rfc8260#section-3.6 about the weighted fair queueing scheduler: A Weighted Fair Queueing scheduler between the streams is used. The weight is configurable per outgoing SCTP stream. This scheduler considers the lengths of the messages of each stream and schedules them in a specific way to use the capacity according to the given weights. If the weight of stream S1 is n times the weight of stream S2, the scheduler should assign to stream S1 n times the capacity it assigns to stream S2. The details are implementation dependent. Interleaving user messages allows for a better realization of the capacity usage according to the given weights. This patch adds Weighted Fair Queueing Scheduler actually based on the code of Fair Capacity Scheduler by adding fc_weight into struct sctp_stream_out_ext and taking it into account when sorting stream-> fc_list in sctp_sched_fc_sched() and sctp_sched_fc_dequeue_done(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-09sctp: add fair capacity stream schedulerXin Long3-1/+10
As it says in rfc8260#section-3.5 about the fair capacity scheduler: A fair capacity distribution between the streams is used. This scheduler considers the lengths of the messages of each stream and schedules them in a specific way to maintain an equal capacity for all streams. The details are implementation dependent. interleaving user messages allows for a better realization of the fair capacity usage. This patch adds Fair Capacity Scheduler based on the foundations added by commit 5bbbbe32a431 ("sctp: introduce stream scheduler foundations"): A fc_list and a fc_length are added into struct sctp_stream_out_ext and a fc_list is added into struct sctp_stream. In .enqueue, when there are chunks enqueued into a stream, this stream will be linked into stream-> fc_list by its fc_list ordered by its fc_length. In .dequeue, it always picks up the 1st skb from stream->fc_list. In .dequeue_done, fc_length is increased by chunk's len and update its location in stream->fc_list according to the its new fc_length. Note that when the new fc_length overflows in .dequeue_done, instead of resetting all fc_lengths to 0, we only reduced them by U32_MAX / 4 to avoid a moment of imbalance in the scheduling, as Marcelo suggested. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-08Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski4-1/+11
Andrii Nakryiko says: ==================== pull-request: bpf-next 2023-03-08 We've added 23 non-merge commits during the last 2 day(s) which contain a total of 28 files changed, 414 insertions(+), 104 deletions(-). The main changes are: 1) Add more precise memory usage reporting for all BPF map types, from Yafang Shao. 2) Add ARM32 USDT support to libbpf, from Puranjay Mohan. 3) Fix BTF_ID_LIST size causing problems in !CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor. 4) IMA selftests fix, from Roberto Sassu. 5) libbpf fix in APK support code, from Daniel Müller. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits) selftests/bpf: Fix IMA test libbpf: USDT arm arg parsing support libbpf: Refactor parse_usdt_arg() to re-use code libbpf: Fix theoretical u32 underflow in find_cd() function bpf: enforce all maps having memory usage callback bpf: offload map memory usage bpf, net: xskmap memory usage bpf, net: sock_map memory usage bpf, net: bpf_local_storage memory usage bpf: local_storage memory usage bpf: bpf_struct_ops memory usage bpf: queue_stack_maps memory usage bpf: devmap memory usage bpf: cpumap memory usage bpf: bloom_filter memory usage bpf: ringbuf memory usage bpf: reuseport_array memory usage bpf: stackmap memory usage bpf: arraymap memory usage bpf: hashtab memory usage ... ==================== Link: https://lore.kernel.org/r/20230308193533.1671597-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08netfilter: move br_nf_check_hbh_len to utilsXin Long1-0/+2
Rename br_nf_check_hbh_len() to nf_ip6_check_hbh_len() and move it to netfilter utils, so that it can be used by other modules, like ovs and tc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-03-08net: reclaim skb->scm_io_uring bitEric Dumazet2-2/+1
Commit 0091bfc81741 ("io_uring/af_unix: defer registered files gc to io_uring release") added one bit to struct sk_buff. This structure is critical for networking, and we try very hard to not add bloat on it, unless absolutely required. For instance, we can use a specific destructor as a wrapper around unix_destruct_scm(), to identify skbs that unix_gc() has to special case. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Jens Axboe <axboe@kernel.dk> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08netfilter: bridge: introduce broute meta statementSriram Yagnaraman1-0/+2
nftables equivalent for ebtables -t broute. Implement broute meta statement to set br_netfilter_broute flag in skb to force a packet to be routed instead of being bridged. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-03-07net: remove enum skb_free_reasonEric Dumazet1-11/+7
enum skb_drop_reason is more generic, we can adopt it instead. Provide dev_kfree_skb_irq_reason() and dev_kfree_skb_any_reason(). This means drivers can use more precise drop reasons if they want to. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Link: https://lore.kernel.org/r/20230306204313.10492-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-07net: phy: improve phy_read_poll_timeoutHeiner Kallweit1-3/+2
cond sometimes is (val & MASK) what may result in a false positive if val is a negative errno. We shouldn't evaluate cond if val < 0. This has no functional impact here, but it's not nice. Therefore switch order of the checks. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/6d8274ac-4344-23b4-d9a3-cad4c39517d4@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-07ynl: re-license uniformly under GPL-2.0 OR BSD-3-ClauseJakub Kicinski2-2/+2
I was intending to make all the Netlink Spec code BSD-3-Clause to ease the adoption but it appears that: - I fumbled the uAPI and used "GPL WITH uAPI note" there - it gives people pause as they expect GPL in the kernel As suggested by Chuck re-license under dual. This gives us benefit of full BSD freedom while fulfilling the broad "kernel is under GPL" expectations. Link: https://lore.kernel.org/all/20230304120108.05dd44c5@kernel.org/ Link: https://lore.kernel.org/r/20230306200457.3903854-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-07cpumask: be more careful with 'cpumask_setall()'Linus Torvalds1-5/+5
Commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask optimizations") changed cpumask_setall() to use "bitmap_set()" instead of "bitmap_fill()", because bitmap_fill() would explicitly set all the bits of a constant sized small bitmap, and that's exactly what we don't want: we want to only set bits up to 'nr_cpu_ids', which is what "bitmap_set()" does. However, Yury correctly points out that while "bitmap_set()" does indeed only set bits up to the required bitmap size, it doesn't _clear_ bits above that size, so the upper bits would still not have well-defined values. Now, none of this should really matter, since any bits set past 'nr_cpu_ids' should always be ignored in the first place. Yes, the bit scanning functions might return them as a result, but since users should always consider the ">= nr_cpu_ids" condition to mean "no more bits", that shouldn't have any actual effect (see previous commit 8ca09d5fa354 "cpumask: fix incorrect cpumask scanning result checks"). But let's just do it right, the way the code was _intended_ to work. We have had enough lazy code that works but bites us in the *rse later (again, see previous commit) that there's no reason to not just do this properly. It turns out that "bitmap_fill()" gets this all right for the complex case, and really only fails for the inlined optimized case that just fills the whole word. And while we could just fix bitmap_fill() to use the proper last word mask, there's two issues with that: - the cpumask case wants to do the _optimization_ based on "NR_CPUS is a small constant", but then wants to do the actual bit _fill_ based on "nr_cpu_ids" that isn't necessarily that same constant - we have lots of non-cpumask users of bitmap_fill(), and while they hopefully don't care, and probably would want the proper semantics anyway ("only set bits up to the limit"), I do not want the cpumask changes to impact other parts So this ends up just doing the single-word optimization by hand in the cpumask code. If our cpumask is fundamentally limited to a single word, just do the proper "fill in that word" exactly. And if it's the more complex multi-word case, then the generic bitmap_fill() will DTRT. This is all an example of how our bitmap function optimizations really are somewhat broken. They conflate the "this is size of the bitmap" optimizations with the actual bit(s) we want to set. In many cases we really want to have the two be separate things: sometimes we base our optimizations on the size of the whole bitmap ("I know this whole bitmap fits in a single word, so I'll just use single-word accesses"), and sometimes we base them on the bit we are looking at ("this is just acting on bits that are in the first word, so I'll use single-word accesses"). Notice how the end result of the two optimizations are the same, but the way we get to them are quite different. And all our cpumask optimization games are really about that fundamental distinction, and we'd often really want to pass in both the "this is the bit I'm working on" (which _can_ be a small constant but might be variable), and "I know it's in this range even if it's variable" (based on CONFIG_NR_CPUS). So this cpumask_setall() implementation just makes that explicit. It checks the "I statically know the size is small" using the known static size of the cpumask (which is what that 'small_cpumask_bits' is all about), but then sets the actual bits using the exact number of cpus we have (ie 'nr_cpumask_bits') Of course, in a perfect world, the compiler would have done all the range analysis (possibly with help from us just telling it that "this value is always in this range"), and would do all of this for us. But that is not the world we live in. While we dream of that perfect world, this does that manual logic to make it all work out. And this was a very long explanation for a small code change that shouldn't even matter. Reported-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/lkml/ZAV9nGG9e1%2FrV+L%2F@yury-laptop/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-07wifi: radiotap: separate vendor TLV into header/contentMordechay Goodstein1-6/+14
To be able to use a general function later for any kind of TLV, separate the vendor TLV header/content in the structs. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230305124407.8ac5195bb3e6.I19ad99c1ad3108453aede64bddf6ef1a7c4a0b74@changeid [separate from the original combined patch] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07bpf: offload map memory usageYafang Shao1-0/+6
A new helper is introduced to calculate offload map memory usage. But currently the memory dynamically allocated in netdev dev_ops, like nsim_map_update_elem, is not counted. Let's just put it aside now. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20230305124615.12358-18-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-07bpf, net: xskmap memory usageYafang Shao1-0/+1
A new helper is introduced to calculate xskmap memory usage. The xfsmap memory usage can be dynamically changed when we add or remove a xsk_map_node. Hence we need to track the count of xsk_map_node to get its memory usage. The result as follows, - before 10: xskmap name count_map flags 0x0 key 4B value 4B max_entries 65536 memlock 524288B - after 10: xskmap name count_map flags 0x0 <<< no elements case key 4B value 4B max_entries 65536 memlock 524608B Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20230305124615.12358-17-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-07bpf, net: bpf_local_storage memory usageYafang Shao1-0/+1
A new helper is introduced into bpf_local_storage map to calculate the memory usage. This helper is also used by other maps like bpf_cgrp_storage, bpf_inode_storage, bpf_task_storage and etc. Note that currently the dynamically allocated storage elements are not counted in the usage, since it will take extra runtime overhead in the elements update or delete path. So let's put it aside now, and implement it in the future when someone really need it. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20230305124615.12358-15-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-07bpf: add new map ops ->map_mem_usageYafang Shao1-0/+2
Add a new map ops ->map_mem_usage to print the memory usage of a bpf map. This is a preparation for the followup change. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20230305124615.12358-2-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-07bpf: Increase size of BTF_ID_LIST without CONFIG_DEBUG_INFO_BTF againNathan Chancellor1-1/+1
After commit 66e3a13e7c2c ("bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr"), clang builds without CONFIG_DEBUG_INFO_BTF warn: kernel/bpf/verifier.c:10298:24: warning: array index 16 is past the end of the array (that has type 'u32[16]' (aka 'unsigned int[16]')) [-Warray-bounds] meta.func_id == special_kfunc_list[KF_bpf_dynptr_slice_rdwr]) { ^ ~~~~~~~~~~~~~~~~~~~~~~~~ kernel/bpf/verifier.c:9150:1: note: array 'special_kfunc_list' declared here BTF_ID_LIST(special_kfunc_list) ^ include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST' #define BTF_ID_LIST(name) static u32 __maybe_unused name[16]; ^ 1 warning generated. A warning of this nature was previously addressed by commit beb3d47d1d3d ("bpf: Fix a BTF_ID_LIST bug with CONFIG_DEBUG_INFO_BTF not set") but there have been new kfuncs added since then. Quadruple the size of the CONFIG_DEBUG_INFO_BTF=n definition so that this problem is unlikely to show up for some time. Link: https://github.com/ClangBuiltLinux/linux/issues/1810 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20230307-bpf-kfuncs-warray-bounds-v1-1-00ad3191f3a6@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-07wifi: nl80211: Add support for randomizing TA of auth and deauth framesVeerendranath Jakkam1-0/+5
Add support to use a random local address in authentication and deauthentication frames sent to unassociated peer when the driver supports. The driver needs to configure receive behavior to accept frames with random transmit address specified in TX path authentication frames during the time of the frame exchange is pending and such frames need to be acknowledged similarly to frames sent to the local permanent address when this random address functionality is used. This capability allows use of randomized transmit address for PASN authentication frames to improve privacy of WLAN clients. Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com> Link: https://lore.kernel.org/r/20230112012415.167556-2-quic_vjakkam@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add LDPC related flags in ieee80211_bss_confRyder Lee1-0/+6
This is utilized to pass LDPC configurations from user space (i.e. hostapd) to driver. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Link: https://lore.kernel.org/r/1de696aaa34efd77a926eb657b8c0fda05aaa177.1676628065.git.ryder.lee@mediatek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add EHT MU-MIMO related flags in ieee80211_bss_confRyder Lee1-0/+9
Similar to VHT/HE. This is utilized to pass MU-MIMO configurations from user space (i.e. hostapd) to driver. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Link: https://lore.kernel.org/r/8d9966c4c1e77cb1ade77d42bdc49905609192e9.1676628065.git.ryder.lee@mediatek.com [move into combined if statement, reset on !eht] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: introduce ieee80211_refresh_tx_agg_session_timer()Ryder Lee1-0/+14
This allows low level drivers to refresh the tx agg session timer, based on querying stats from the firmware usually. Especially for some mt76 devices support .net_fill_forward_path would bypass mac80211, which leads to tx BA session timeout clients that set a timeout in their AddBA response to our request, even if our request is without a timeout. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Link: https://lore.kernel.org/r/7c3f72eac1c34921cd84a462e60d71e125862152.1676616450.git.ryder.lee@mediatek.com [slightly clarify commit message, add note about RCU] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add support for driver adding radiotap TLVsMordechay Goodstein2-37/+27
The new TLV format enables adding TLVs after the fixed fields in radiotap, as part of the radiotap header. Support this and move vendor data to the TLV format, allowing a reuse of the RX_FLAG_RADIOTAP_VENDOR_DATA as the new RX_FLAG_RADIOTAP_TLV_AT_END flag. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.b18fd5da8477.I576400ec40a7b35ef97a3b09a99b3a49e9174786@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: radiotap: Add EHT radiotap definitionsMordechay Goodstein1-2/+185
This is based on https://www.radiotap.org/fields/EHT.html and https://www.radiotap.org/fields/U-SIG.html new EHT TLV definition for 11be standard. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.254b19fffe41.I4ce78e2c558da6e5a708a8d68d61b5d7b3eb3746@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add netdev per-link debugfs data and driver hookBenjamin Berg1-0/+10
This adds the infrastructure to have netdev specific per-link data both for mac80211 and the driver in debugfs. For the driver, a new callback is added which is only used if MLO is supported. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.fb4c947e4df8.I69b3516ddf4c8a7501b395f652d6063444ecad63@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add pointer from bss_conf to vifBenjamin Berg1-0/+3
While often not needed, this considerably simplifies going from a link specific bss_config to the vif. This helps with e.g. creating link specific debugfs entries inside drivers. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.46f701a10ed5.I20390b2a8165ff222d66585915689206ea93222b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: cfg80211/mac80211: report link ID on control port RXJohannes Berg1-2/+3
For control port RX, report the link ID for MLO. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.fe06dfc3791b.Iddcab94789cafe336417be406072ce8a6312fc2d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-07wifi: mac80211: add support for set_hw_timestamp commandAvraham Stern1-0/+6
Support the set_hw_timestamp callback for enabling and disabling HW timestamping if the low level driver supports it. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230301115906.700ded7badde.Ib2f7c228256ce313a04d3d9f9ecc6c7b9aa602bb@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>