aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2023-01-30net/mlx5: Reuse DEKs after executing SYNC_CRYPTO commandJianbo Liu1-9/+183
To fast update encryption keys, those freed keys with need_sync bit 1 and in_use bit 0 in a bulk, can be recycled. The keys are cached internally by the NIC, so invalidating internal NIC caches by SYNC_CRYPTO command is required before reusing them. A threshold in driver is added to avoid invalidating for every update. Only when the number of DEKs, which need to be synced, is over this threshold, the sync process will start. Besides, it is done in system workqueue. After SYNC_CRYPTO command is executed successfully, the bitmaps of each bulk must be reset accordingly, so that the freed DEKs can be reused. From the analysis in previous patch, the number of reused DEKs can be calculated by hweight_long(need_sync XOR in_use), and the need_sync bits can be reset by simply copying from in_use bits. Two more list (avail_list and sync_list) are added for each pool. The avail_list is for a bulk when all bits in need_sync are reset after sync. If there is no avail deks, and all are be freed by users, the bulk is moved to sync_list, instead of being destroyed in previous patch, and waiting for the invalidation. While syncing, they are simply reset need_sync bits, and moved to avail_list. Besides, add a wait_for_free list for the to-be-free DEKs. It is to avoid this corner case: when thread A is done with SYNC_CRYPTO but just before starting to reset the bitmaps, thread B is alloc dek, and free it immediately. It's obvious that this DEK can't be reused this time, so put it to waiting list, and do free after bulk bitmaps reset is finished. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Use bulk allocation for fast update encryption keyJianbo Liu1-7/+208
We create a pool for each key type. For the pool, there is a struct to store the info for all DEK objects of one bulk allocation. As we use crypto->log_dek_obj_range, which is set to 12 in previous patch, for the log_obj_range of bulk allocation, 4096 DEKs are allocated in one time. To trace the state of all the keys in a bulk, two bitmaps are created. The need_sync bitmap is used to indicate the available state of the corresponding key. If the bit is 0, it can be used (available) as it either is newly created by FW, or SYNC_CRYPTO is executed and bit is reset after it is freed by upper layer user (this is the case to be handled in later patch). Otherwise, the key need to be synced. The in_use bitmap is used to indicate the key is being used, and reset when user free it. When ktls, ipsec or macsec need a key from a bulk, it get one with need_sync bit 0, then set both need_sync and in_used bit to 1. When user free a key, only in_use bit is reset to 0. So, for the combinations of (need_sync, in_use) of one DEK object, - (0,0) means the key is ready for use, - (1,1) means the key is currently being used by a user, - (1,0) means the key is freed, and waiting for being synced, - (0,1) is invalid state. There are two lists in each pool, partial_list and full_list, according to the number for available DEKs in a bulk. When user need a key, it get a bulk, either from partial list, or create new one from FW. Then the bulk is put in the different pool's lists according to the num of avail deks it has. If there is no avail deks, and all of them are be freed by users, for now, the bulk is destroyed. To speed up the bitmap search, a variable (avail_start) is added to indicate where to start to search need_sync bitmap for available key. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Add bulk allocation and modify_dek operationJianbo Liu1-2/+83
To support fast update of keys into hardware, we optimize firmware to achieve the maximum rate. The approach is to create DEK objects in bulk, and update each of them with modify command. This patch supports bulk allocation and modify_dek commands for new firmware. However, as log_obj_range is 0 for now, only one DEK obj is allocated each time, and then updated with user key by modify_dek. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Add support SYNC_CRYPTO commandJianbo Liu2-0/+39
Add support for SYNC_CRYPTO command. For now, it is executed only when initializing DEK, but needed when reusing keys in later patch. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Add new APIs for fast update encryption keyJianbo Liu2-5/+98
New APIs are added to support fast update DEKs. As a pool is created for each key purpose (type), one pair of pool APIs to get/put pool. Anotehr pair of DEKs APIs is to get DEK object from pool and update it with user key, or free it back to the pool. As The bulk allocation and destruction will be supported in later patches, old implementation is used here. To support these APIs, pool and dek structs are defined first. Only small number of fields are stored in them. For example, key_purpose and refcnt in pool struct, DEK object id in dek struct. More fields will be added to these structs in later patches, for example, the different bulk lists for pool struct, the bulk pointer dek struct belongs to, and a list_entry for the list in a pool, which is used to save keys waiting for being freed while other thread is doing sync. Besides the creation and destruction interfaces, new one is also added to get obj id. Currently these APIs are planned to used by TLS only. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Refactor the encryption key creationJianbo Liu1-24/+53
Move the common code to general functions which can be used by fast update encryption key in later patches. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Add const to the key pointer of encryption key creationJianbo Liu3-3/+3
Change key pointer to const void *, as there is no need to change the key content. This is also to avoid modifying the key by mistake. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Prepare for fast crypto key update if hardware supports itJianbo Liu5-0/+55
Add CAP for crypto offload, do the simple initialization if hardware supports it. Currently set log_dek_obj_range to 12, so 4k DEKs will be created in one bulk allocation. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Change key type to key purposeJianbo Liu2-4/+4
Change the naming of key type in DEK fields and macros, to be consistent with the device spec. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Add IFC bits for general obj create paramJianbo Liu1-2/+4
Before this patch, the log_obj_range was defined inside general_obj_in_cmd_hdr to support bulk allocation. However, we need to modify/query one of the object in the bulk in later patch, so change those fields to param bits for parameters specific for cmd header, and add general_obj_create_param according to what was updated in spec. We will also add general_obj_query_param for modify/query later. Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30net/mlx5: Header file for cryptoTariq Toukan6-15/+23
Take crypto API out of the generic mlx5.h header into a dedicated header. Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Jianbo Liu <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-30ixgbe: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-5/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30igc: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-5/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Tested-by: Naama Meir <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30igb: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-5/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30ice: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-3/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30iavf: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-5/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30i40e: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-4/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30fm10k: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-5/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30e1000e: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-7/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30devlink: remove devlink featuresJiri Pirko7-10/+5
Devlink features were introduced to disallow devlink reload calls of userspace before the devlink was fully initialized. The reason for this workaround was the fact that devlink reload was originally called without devlink instance lock held. However, with recent changes that converted devlink reload to be performed under devlink instance lock, this is redundant so remove devlink features entirely. Note that mlx5 used this to enable devlink reload conditionally only when device didn't act as multi port slave. Move the multi port check into mlx5_devlink_reload_down() callback alongside with the other checks preventing the device from reload in certain states. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Add KUNIT tests for enabling/disabling chainsSteen Hegelund1-0/+66
This enhances the KUNIT test of the VCAP API with tests of the chaining functionality. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Add TC support for the ES2 VCAPSteen Hegelund2-4/+39
This enables the TC command to use the Sparx5 ES2 VCAP, and provides a new ES2 ethertype table and handling of rule links between IS0 and ES2. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Add ingress information to VCAP instanceSteen Hegelund11-19/+46
This allows the check of the goto action to be specific to the ingress and egress VCAP instances. The debugfs support is also updated to show this information. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Add ES2 VCAP keyset configuration for Sparx5Steen Hegelund4-119/+803
This adds the ES2 VCAP port keyset configuration for Sparx5 and also updates the debugFS support to show the keyset configuration and the egress port mask. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Add ES2 VCAP model and updated KUNIT VCAP modelSteen Hegelund5-18/+1401
This provides the VCAP model for the Sparx5 ES2 (Egress Stage 2) VCAP. This VCAP provides tagging and remarking functionality This also renames a VCAP keyfield: VCAP_KF_MIRROR_ENA becomes VCAP_KF_MIRROR_PROBE, as the first name was caused by a mistake in the model transformation. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Improve error message when parsing CVLAN filterSteen Hegelund1-1/+4
This improves the error message when a TC filter with CVLAN tag is used and the selected VCAP instance does not support this. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Improve the IP frame key match for IPv6 framesSteen Hegelund1-0/+8
This ensures that it will be possible for a VCAP rule to distinguish IPv6 frames from non-IP frames, as the IS0 keyset usually selected for the IPv6 traffic class in (7TUPLE) does not offer a key that specifies IPv6 directly: only non-IPv4. The IP_SNAP key ensures that we select (at least) IP frames. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: microchip: sparx5: Add support for getting keysets without a type idSteen Hegelund1-1/+23
When there is only one keyset available for a certain VCAP rule size, the particular keyset does not need a type id when encoded in the VCAP Hardware. This provides support for getting a keyset from a rule, when this is the case: only one keyset fits this rule size. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: bcmgenet: Add a check for oversized packetsFlorian Fainelli1-0/+8
Occasionnaly we may get oversized packets from the hardware which exceed the nomimal 2KiB buffer size we allocate SKBs with. Add an early check which drops the packet to avoid invoking skb_over_panic() and move on to processing the next packet. Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoCAndrey Konovalov2-1/+4
Currently in phy_init_eee() the driver unconditionally configures the PHY to stop RX_CLK after entering Rx LPI state. This causes an LPI interrupt storm on my qcs404-base board. Change the PHY initialization so that for "qcom,qcs404-ethqos" compatible device RX_CLK continues to run even in Rx LPI state. Signed-off-by: Andrey Konovalov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-30fec: convert to gpio descriptorArnd Bergmann1-16/+9
The driver can be trivially converted, as it only triggers the gpio pin briefly to do a reset, and it already only supports DT. Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-28sfc: correctly advertise tunneled IPv6 segmentationÍñigo Huguet1-1/+4
Recent sfc NICs are TSO capable for some tunnel protocols. However, it was not working properly because the feature was not advertised in hw_enc_features, but in hw_features only. Setting up a GENEVE tunnel and using iperf3 to send IPv4 and IPv6 traffic to the tunnel show, with tcpdump, that the IPv4 packets still had ~64k size but the IPv6 ones had only ~1500 bytes (they had been segmented by software, not offloaded). With this patch segmentation is offloaded as expected and the traffic is correctly received at the other end. Fixes: 24b2c3751aa3 ("sfc: advertise encapsulated offloads on EF10") Reported-by: Tianhao Zhao <[email protected]> Signed-off-by: Íñigo Huguet <[email protected]> Acked-by: Martin Habets <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-28Merge tag 'for-netdev' of ↵Jakub Kicinski12-82/+205
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2023-01-28 We've added 124 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 6386 insertions(+), 1827 deletions(-). The main changes are: 1) Implement XDP hints via kfuncs with initial support for RX hash and timestamp metadata kfuncs, from Stanislav Fomichev and Toke Høiland-Jørgensen. Measurements on overhead: https://lore.kernel.org/bpf/[email protected] 2) Extend libbpf's bpf_tracing.h support for tracing arguments of kprobes/uprobes and syscall as a special case, from Andrii Nakryiko. 3) Significantly reduce the search time for module symbols by livepatch and BPF, from Jiri Olsa and Zhen Lei. 4) Enable cpumasks to be used as kptrs, which is useful for tracing programs tracking which tasks end up running on which CPUs in different time intervals, from David Vernet. 5) Fix several issues in the dynptr processing such as stack slot liveness propagation, missing checks for PTR_TO_STACK variable offset, etc, from Kumar Kartikeya Dwivedi. 6) Various performance improvements, fixes, and introduction of more than just one XDP program to XSK selftests, from Magnus Karlsson. 7) Big batch to BPF samples to reduce deprecated functionality, from Daniel T. Lee. 8) Enable struct_ops programs to be sleepable in verifier, from David Vernet. 9) Reduce pr_warn() noise on BTF mismatches when they are expected under the CONFIG_MODULE_ALLOW_BTF_MISMATCH config anyway, from Connor O'Brien. 10) Describe modulo and division by zero behavior of the BPF runtime in BPF's instruction specification document, from Dave Thaler. 11) Several improvements to libbpf API documentation in libbpf.h, from Grant Seltzer. 12) Improve resolve_btfids header dependencies related to subcmd and add proper support for HOSTCC, from Ian Rogers. 13) Add ipip6 and ip6ip decapsulation support for bpf_skb_adjust_room() helper along with BPF selftests, from Ziyang Xuan. 14) Simplify the parsing logic of structure parameters for BPF trampoline in the x86-64 JIT compiler, from Pu Lehui. 15) Get BTF working for kernels with CONFIG_RUST enabled by excluding Rust compilation units with pahole, from Martin Rodriguez Reboredo. 16) Get bpf_setsockopt() working for kTLS on top of TCP sockets, from Kui-Feng Lee. 17) Disable stack protection for BPF objects in bpftool given BPF backends don't support it, from Holger Hoffstätte. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (124 commits) selftest/bpf: Make crashes more debuggable in test_progs libbpf: Add documentation to map pinning API functions libbpf: Fix malformed documentation formatting selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata selftests/bpf: Calls bpf_setsockopt() on a ktls enabled socket. bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt(). bpf/selftests: Verify struct_ops prog sleepable behavior bpf: Pass const struct bpf_prog * to .check_member libbpf: Support sleepable struct_ops.s section bpf: Allow BPF_PROG_TYPE_STRUCT_OPS programs to be sleepable selftests/bpf: Fix vmtest static compilation error tools/resolve_btfids: Alter how HOSTCC is forced tools/resolve_btfids: Install subcmd headers bpf/docs: Document the nocast aliasing behavior of ___init bpf/docs: Document how nested trusted fields may be defined bpf/docs: Document cpumask kfuncs in a new file selftests/bpf: Add selftest suite for cpumask kfuncs selftests/bpf: Add nested trust selftests suite bpf: Enable cpumasks to be queried and used as kptrs bpf: Disallow NULLable pointers for trusted kfuncs ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski18-114/+150
Conflicts: drivers/net/ethernet/intel/ice/ice_main.c 418e53401e47 ("ice: move devlink port creation/deletion") 643ef23bd9dd ("ice: Introduce local var for readability") https://lore.kernel.org/all/[email protected]/ https://lore.kernel.org/all/[email protected]/ drivers/net/ethernet/engleder/tsnep_main.c 3d53aaef4332 ("tsnep: Fix TX queue stop/wake for multiple queues") 25faa6a4c5ca ("tsnep: Replace TX spin_lock with __netif_tx_lock") https://lore.kernel.org/all/[email protected]/ net/netfilter/nf_conntrack_proto_sctp.c 13bd9b31a969 ("Revert "netfilter: conntrack: add sctp DATA_SENT state"") a44b7651489f ("netfilter: conntrack: unify established states for SCTP paths") f71cb8f45d09 ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets") https://lore.kernel.org/all/[email protected]/ https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27dpaa2-eth: execute xdp_do_flush() before napi_complete_done()Magnus Karlsson1-3/+6
Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: d678be1dc1ec ("dpaa2-eth: add XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27dpaa_eth: execute xdp_do_flush() before napi_complete_done()Magnus Karlsson1-3/+3
Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: a1e031ffb422 ("dpaa_eth: add XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Acked-by: Camelia Groza <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27lan966x: execute xdp_do_flush() before napi_complete_done()Magnus Karlsson1-3/+3
Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: a825b611c7c1 ("net: lan966x: Add support for XDP_REDIRECT") Signed-off-by: Magnus Karlsson <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Acked-by: Steen Hegelund <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27qede: execute xdp_do_flush() before napi_complete_done()Magnus Karlsson1-3/+4
Make sure that xdp_do_flush() is always executed before napi_complete_done(). This is important for two reasons. First, a redirect to an XSKMAP assumes that a call to xdp_do_redirect() from napi context X on CPU Y will be followed by a xdp_do_flush() from the same napi context and CPU. This is not guaranteed if the napi_complete_done() is executed before xdp_do_flush(), as it tells the napi logic that it is fine to schedule napi context X on another CPU. Details from a production system triggering this bug using the veth driver can be found following the first link below. The second reason is that the XDP_REDIRECT logic in itself relies on being inside a single NAPI instance through to the xdp_do_flush() call for RCU protection of all in-kernel data structures. Details can be found in the second link below. Fixes: d1b25b79e162b ("qede: add .ndo_xdp_xmit() and XDP_REDIRECT support") Signed-off-by: Magnus Karlsson <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27net/i40e: Replace 0-length array with flexible arrayKees Cook1-1/+1
Zero-length arrays are deprecated[1]. Replace struct i40e_lump_tracking's "list" 0-length array with a flexible array. Detected with GCC 13, using -fstrict-flex-arrays=3: In function 'i40e_put_lump', inlined from 'i40e_clear_interrupt_scheme' at drivers/net/ethernet/intel/i40e/i40e_main.c:5145:2: drivers/net/ethernet/intel/i40e/i40e_main.c:278:27: warning: array subscript <unknown> is outside array bounds of 'u16[0]' {aka 'short unsigned int[]'} [-Warray-bounds=] 278 | pile->list[i] = 0; | ~~~~~~~~~~^~~ drivers/net/ethernet/intel/i40e/i40e.h: In function 'i40e_clear_interrupt_scheme': drivers/net/ethernet/intel/i40e/i40e.h:179:13: note: while referencing 'list' 179 | u16 list[0]; | ^~~~ [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Paolo Abeni <[email protected]> Cc: "Gustavo A. R. Silva" <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Reviewed-by: Gustavo A. R. Silva <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-27ice: Prevent set_channel from changing queues while RDMA activeDave Ertman5-19/+43
The PF controls the set of queues that the RDMA auxiliary_driver requests resources from. The set_channel command will alter that pool and trigger a reconfiguration of the VSI, which breaks RDMA functionality. Prevent set_channel from executing when RDMA driver bound to auxiliary device. Adding a locked variable to pass down the call chain to avoid double locking the device_lock. Fixes: 348048e724a0 ("ice: Implement iidc operations") Signed-off-by: Dave Ertman <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-27Merge back thermal control material for 6.3.Rafael J. Wysocki3-195/+55
2023-01-27net/mlx5: Move eswitch port metadata devlink param to flow eswitch codeJiri Pirko4-55/+94
Move the param registration and handling code into the eswitch offloads code as they are related to each other. No point in having the devlink param registration done in separate file. Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27net/mlx5: Move flow steering devlink param to flow steering codeJiri Pirko2-70/+83
Move the param registration and handling code into the flow steering code as they are related to each other. No point in having the devlink param registration done in separate file. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27net/mlx5: Move fw reset devlink param to fw reset codeJiri Pirko3-29/+38
Move the param registration and handling code into the fw reset code as they are related to each other. No point in having the devlink param registration done in separate file. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27devlink: protect devlink param list by instance lockJiri Pirko10-133/+136
Commit 1d18bb1a4ddd ("devlink: allow registering parameters after the instance") as the subject implies introduced possibility to register devlink params even for already registered devlink instance. This is a bit problematic, as the consistency or params list was originally secured by the fact it is static during devlink lifetime. So in order to protect the params list, take devlink instance lock during the params operations. Introduce unlocked function variants and use them in drivers in locked context. Put lock assertions to appropriate places. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27qed: remove pointless call to devlink_param_driverinit_value_set()Jiri Pirko1-6/+0
devlink_param_driverinit_value_set() call makes sense only for " driverinit" params. However here, the param is "runtime". devlink_param_driverinit_value_set() returns -EOPNOTSUPP in such case and does not do anything. So remove the pointless call to devlink_param_driverinit_value_set() entirely. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27ice: remove pointless calls to devlink_param_driverinit_value_set()Jiri Pirko1-18/+2
devlink_param_driverinit_value_set() call makes sense only for "driverinit" params. However here, both params are "runtime". devlink_param_driverinit_value_set() returns -EOPNOTSUPP in such case and does not do anything. So remove the pointless calls to devlink_param_driverinit_value_set() entirely. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27net/mlx5: Covert devlink params registration to use ↵Jiri Pirko1-34/+46
devlink_params_register/unregister() Since mlx5 is the only user of devlink API to register/unregister a single param, convert it to use array registration function allowing to simplify the devlink API by removing the single param registration functions. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27net/mlx5: Change devlink param register/unregister function namesJiri Pirko3-9/+9
The functions are registering and unregistering devlink params, so change the names accordingly. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27net: add missing includes of linux/sched/clock.hJakub Kicinski2-0/+3
Number of files depend on linux/sched/clock.h getting included by linux/skbuff.h which soon will no longer be the case. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>