AgeCommit message (Collapse)AuthorFilesLines
2023-04-21net: stmmac:fix system hang when setting up tag_8021q VLAN for DSA portsYan Wang1-3/+9
The system hangs because of dsa_tag_8021q_port_setup() -> stmmac_vlan_rx_add_vid(). I found that calling pm_runtime_put() in stmmac_drv_probe() disables the clock.

First, when the kernel is compiled with CONFIG_PM=y, stmmac's runtime suspend/resume is active. Second, with stmmac as the DSA master, dsa_tag_8021q_port_setup() calls back into stmmac_vlan_rx_add_vid() when the DSA driver starts. The system then hangs because stmmac_vlan_rx_add_vid() accesses its registers after stmmac's clock has been turned off.

I suggest adding pm_runtime_resume_and_get() to stmmac_vlan_rx_add_vid(). This guarantees that the clock is running while the registers are in use.

Fixes: b3dcb3127786 ("net: stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()")
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Yan Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
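The effect of the fix can be modelled in user space. The sketch below is not the stmmac code: every `mock_*` name is hypothetical, the "clock" is a boolean, and a failed register write stands in for the hang. It only illustrates why taking a runtime PM reference around the register access helps.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct mock_dev {
    int usage_count;        /* stand-in for the runtime PM usage counter */
    bool clk_enabled;       /* true while the (modelled) clock is running */
    unsigned int vlan_reg;  /* stand-in for a clock-gated VLAN register */
};

/* Models the role of pm_runtime_resume_and_get(): take a reference, clock on. */
int mock_pm_get(struct mock_dev *d)
{
    d->usage_count++;
    d->clk_enabled = true;
    return 0;
}

/* Models the role of pm_runtime_put(): drop a reference, gate the clock at zero. */
void mock_pm_put(struct mock_dev *d)
{
    if (d->usage_count > 0 && --d->usage_count == 0)
        d->clk_enabled = false;
}

/* A write only succeeds while clocked; the real hardware would hang instead. */
int mock_write_vlan_reg(struct mock_dev *d, unsigned int vid)
{
    if (!d->clk_enabled)
        return -1;
    d->vlan_reg = vid;
    return 0;
}

/* Unguarded path: touches the register without resuming the device. */
int mock_vlan_add_vid_unguarded(struct mock_dev *d, unsigned int vid)
{
    return mock_write_vlan_reg(d, vid);
}

/* Guarded path: resume first, access the register, then release. */
int mock_vlan_add_vid(struct mock_dev *d, unsigned int vid)
{
    int ret = mock_pm_get(d);

    if (ret < 0)
        return ret;
    ret = mock_write_vlan_reg(d, vid);
    mock_pm_put(d);
    return ret;
}
```

With a freshly probed (clock-gated) `struct mock_dev`, the unguarded call fails while the guarded call succeeds and leaves the usage count balanced at zero.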
2023-04-21Merge branch 'pds_core'David S. Miller20-0/+4481
Shannon Nelson says:

====================
pds_core driver

Summary:
--------
This patchset implements a new driver for use with the AMD/Pensando Distributed Services Card (DSC), intended to provide core configuration services through the auxiliary_bus and through a couple of EXPORTed functions for use initially in VFio and vDPA feature specific drivers.

To keep this patchset to a manageable size, the pds_vdpa and pds_vfio drivers have been split out into their own patchsets to be reviewed separately.

Detail:
-------
AMD/Pensando is making available a new set of devices for supporting vDPA, VFio, and potentially other features in the Distributed Services Card (DSC). These features are implemented through a PF that serves as a Core device for controlling and configuring its VF devices. These VF devices have separate drivers that use the auxiliary_bus to work through the Core device as the control path.

Currently, the DSC supports standard ethernet operations using the ionic driver. This is not replaced by the Core-based devices - these new devices are in addition to the existing Ethernet device. Typical DSC configurations will include both PDS devices and Ionic Eth devices. However, there is a potential future path for ethernet services to come through this device as well.

The Core device is a new PCI PF/VF device managed by a new driver 'pds_core'. The PF device has access to an admin queue for configuring the services used by the VFs, and sets up auxiliary_bus devices for each vDPA VF for communicating with the drivers for the vDPA devices. The VFs may be for VFio or vDPA, and other services in the future; these VF types are selected as part of the DSC internal FW configurations, which is out of the scope of this patchset.

When the vDPA support set is enabled in the core PF through its devlink param, auxiliary_bus devices are created for each VF that supports the feature. The vDPA driver then connects to and uses this auxiliary_device to do control path configuration through the PF device. This can then be used with the vdpa kernel module to provide devices for virtio_vdpa kernel module for host interfaces, or vhost_vdpa kernel module for interfaces exported into your favorite VM.

A cheap ASCII diagram of a vDPA instance looks something like this:

                        ,----------.
                        |   vdpa   |
                        '----------'
                          |     ||
                         ctl   data
                          |     ||
                  .----------.  ||
                  | pds_vdpa |  ||
                  '----------'  ||
                       |        ||
               pds_core.vDPA.1  ||
                       |        ||
            .---------------.   ||
            |   pds_core    |   ||
            '---------------'   ||
                ||              ||
              09:00.0        09:00.1
== PCI ============================================
                ||              ||
           .----------.   .----------.
    ,------|    PF    |---|    VF    |-------,
    |      '----------'   '----------'       |
    |                  DSC                   |
    |                                        |
    ------------------------------------------

Changes:

v11:
 - change strncpy to strscpy
   Reported-by: kernel test robot <[email protected]>
   Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

v10:
Link: https://lore.kernel.org/netdev/[email protected]/
 - remove CONFIG_DEBUG_FS guard static inline stuff
 - remove unnecessary 0 and null initializations
 - verify in driver load that PDS_CORE_DRV_NAME matches KBUILD_MODNAME
 - remove debugfs irqs_show(), redundant with /proc
 - return -ENOMEM if intr_info = kcalloc() fails
 - move the status code enum into pds_core_if.h as part of API definition
 - fix up one place in pdsc_devcmd_wait() we're using the status codes where we could use the errno
 - remove redundant calls to flush_workqueue()
 - grab config_lock before testing state bits in pdsc_fw_reporter_diagnose()
 - change pdsc_color_match() to return bool
 - remove useless VIF setup loop and just setup vDPA services for now
 - remove pf pointer from struct padev and have clients use pci_physfn()
 - drop use of "vf" in auxdev.c function names, make more generic
 - remove last of client ops struct and simply export the functions
 - drop [email protected] from MAINTAINERS and add new include dir
 - include dynamic_debug.h in adminq.c to protect dynamic_hex_dump()
 - fixed fw_slot type from u8 to int for handling error returns
 - fixed comment spelling
 - changed void arg in pdsc_adminq_post() to struct pdsc *

v9:
Link: https://lore.kernel.org/netdev/[email protected]/
 - change pdsc field name id to uid to clarify the unique id used for aux device
 - remove unnecessary pf->state and other checks in aux device creation
 - hardcode fw slotnames for devlink info, don't use strings from FW
 - handle errors from PDS_CORE_CMD_INIT devcmd call
 - tighten up health thread use of config_lock
 - remove pdsc_queue_health_check() layer over queuing health check
 - start pds_core.rst file in first patch, add to it incrementally
 - give more user interaction info in commit messages
 - removed a few more extraneous includes

v8:
Link: https://lore.kernel.org/netdev/[email protected]/
 - fixed deadlock problem, use devl_health_reporter_destroy() when devlink is locked
 - don't clear client_id until after auxiliary_device_uninit()

v7:
Link: https://lore.kernel.org/netdev/[email protected]/
 - use explicit devlink locking and devl_* APIs
 - move some of devlink setup logic into probe and remove
 - use debugfs_create_u{type}() for state and queue head and tail
 - add include for linux/vmalloc.h
   Reported-by: kernel test robot <[email protected]>
   Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

v6:
Link: https://lore.kernel.org/netdev/[email protected]/
 - removed version.h include noticed by kernel test robot's version check
   Reported-by: kernel test robot <[email protected]>
   Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
 - fixed up the more egregious checkpatch line length complaints
 - make sure pdsc_auxbus_dev_register() checks padev pointer errcode

v5:
Link: https://lore.kernel.org/netdev/[email protected]/
 - added devlink health reporter for FW issues
 - removed asic_type, asic_rev, serial_num, fw_version from debugfs as they are available through other means
 - trimmed OS info in pdsc_identify(), we don't need to send that much info to the FW
 - removed reg/unreg from auxbus client API, they are now in the core when VF is started
 - removed need for pdsc definition in client by simplifying the padev to only carry struct pci_dev pointers rather than full struct pdsc to the pf and vf
 - removed the unused pdsc argument in pdsc_notify()
 - moved include/linux/pds/pds_core.h to driver/../pds_core/core.h
 - restored a few pds_core_if.h interface values and structs that are shared with FW source
 - moved final config_lock unlock to before tear down of timer and workqueue to be sure there are no deadlocks while waiting for any stragglers
 - changed use of PAGE_SIZE to local PDS_PAGE_SIZE to keep with FW layout needs without regard to kernel PAGE_SIZE configuration
 - removed the redundant *adminqcq argument from pdsc_adminq_post()

v4:
Link: https://lore.kernel.org/netdev/[email protected]/
 - reworked to attach to both Core PF and vDPA VF PCI devices
 - now creates auxiliary_device as part of each VF PCI probe, removes them on PCI remove
 - auxiliary devices now use simple unique id rather than PCI address for identifier
 - replaced home-grown event publishing with kernel-based notifier service
 - dropped live_migration parameter, not needed when not creating aux device for it
 - replaced devm_* functions with traditional interfaces
 - added MAINTAINERS entry
 - removed lingering traces of set/get_vf attribute adminq commands
 - trimmed some include lists
 - cleaned a kernel test robot complaint about a stray unused variable
   Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

v3:
Link: https://lore.kernel.org/netdev/[email protected]/
 - changed names from "pensando" to "amd" and updated copyright strings
 - dropped the DEVLINK_PARAM_GENERIC_ID_FW_BANK for future development
 - changed the auxiliary device creation to be triggered by the PCI bus event BOUND_DRIVER, and torn down at UNBIND_DRIVER in order to properly handle users using the sysfs bind/unbind functions
 - dropped some noisy log messages
 - rebased to current net-next

RFC to v2:
Link: https://lore.kernel.org/netdev/[email protected]/
 - added separate devlink param patches for DEVLINK_PARAM_GENERIC_ID_ENABLE_MIGRATION and DEVLINK_PARAM_GENERIC_ID_FW_BANK, and dropped the driver specific implementations
 - updated descriptions for the new devlink parameters
 - dropped netdev support
 - dropped vDPA patches, will followup later
 - separated fw update and fw bank select into their own patches

RFC:
Link: https://lore.kernel.org/netdev/[email protected]/
====================

Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: Kconfig and pds_core.rstShannon Nelson4-0/+38
Remaining documentation and Kconfig hook for building the driver. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: publish events to the clientsShannon Nelson4-0/+39
When the Core device gets an event from the device, or notices the device FW to be up or down, it needs to send those events on to the clients that have an event handler. Add the code to pass along the events to the clients. The entry points pdsc_register_notify() and pdsc_unregister_notify() are EXPORTed for other drivers that want to listen for these events. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: add the aux client APIShannon Nelson3-1/+158
Add the client API operations for running adminq commands. The core registers the client with the FW, then the client has a context for requesting adminq services. We expect to add additional operations for other clients, including requesting additional private adminqs and IRQs, but don't have the need yet. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: devlink params for enabling VIF supportShannon Nelson4-6/+127
Add the devlink parameter switches so the user can enable the features supported by the VFs. The only feature supported at the moment is vDPA.

Example:
    devlink dev param set pci/0000:2b:00.0 \
        name enable_vnet cmode runtime value true

Signed-off-by: Shannon Nelson <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: add auxiliary_bus devicesShannon Nelson6-2/+171
An auxiliary_bus device is created for each vDPA type VF at VF probe and destroyed at VF remove. The aux device name comes from the driver name + VIF type + the unique id assigned at PCI probe. The VFs are always removed on PF remove, so there should be no issues with VFs trying to access missing PF structures. The auxiliary_device names will look like "pds_core.vDPA.nn" where 'nn' is the VF's uid. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
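The naming scheme above (driver name + VIF type + uid) can be illustrated with a small formatting helper. This is only a sketch of the resulting string shape: `mock_auxdev_name()` is a hypothetical function, and it says nothing about how the driver or the auxiliary bus actually assembles the name.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the driver's code: join the driver name, the
 * VIF type and the uid into "<driver>.<VIF type>.<uid>".
 * Returns the name length, or -1 if the buffer is too small.
 */
int mock_auxdev_name(char *buf, size_t len, const char *drv,
                     const char *vif, unsigned int uid)
{
    int n = snprintf(buf, len, "%s.%s.%u", drv, vif, uid);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

Called with "pds_core", "vDPA" and a uid of 1, the helper fills the buffer with the same "pds_core.vDPA.1" form described in the commit message.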
2023-04-21pds_core: add initial VF device handlingShannon Nelson2-1/+56
This is the initial VF PCI driver framework for the new pds_vdpa VF device, which will work in conjunction with an auxiliary_bus client of the pds_core driver. This does the very basics of registering for the new VF device, setting up debugfs entries, and registering with devlink. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: set up the VIF definitions and defaultsShannon Nelson4-0/+102
The Virtual Interfaces (VIFs) supported by the DSC's configuration (vDPA, Eth, RDMA, etc) are reported in the dev_ident struct and made visible in debugfs. At this point only vDPA is supported in this driver so we only setup devices for that feature. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: add FW update feature to devlinkShannon Nelson6-1/+221
Add in the support for doing firmware updates. Of the two main banks available, a and b, this updates the one not in use and then selects it for the next boot.

Example:
    devlink dev flash pci/0000:b2:00.0 \
        file pensando/dsc_fw_1.63.0-22.tar

Signed-off-by: Shannon Nelson <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
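The bank selection described above (update the main bank that is not in use, then select it for the next boot) can be sketched as follows. The types and function names are hypothetical and the "image" is reduced to an integer id; the real update goes through devlink flash and device commands that this model does not attempt to reproduce.

```c
/* Hypothetical model of the two main firmware banks, a and b. */
enum mock_fw_bank { MOCK_FW_BANK_A = 0, MOCK_FW_BANK_B = 1 };

struct mock_fw_state {
    enum mock_fw_bank running;    /* bank the device is currently running from */
    enum mock_fw_bank next_boot;  /* bank selected for the next boot */
    unsigned int image_id[2];     /* stand-in for the image stored in each bank */
};

/* Pick the main bank that is not currently in use. */
enum mock_fw_bank mock_fw_inactive_bank(enum mock_fw_bank running)
{
    return running == MOCK_FW_BANK_A ? MOCK_FW_BANK_B : MOCK_FW_BANK_A;
}

/* Write the new image to the idle bank, then select it for the next boot. */
void mock_fw_update(struct mock_fw_state *fw, unsigned int new_image_id)
{
    enum mock_fw_bank target = mock_fw_inactive_bank(fw->running);

    fw->image_id[target] = new_image_id;
    fw->next_boot = target;
}
```

The running bank is never overwritten in this model, so the device keeps a known-good image until the next boot switches to the updated one.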
2023-04-21pds_core: Add adminq processing and commandsShannon Nelson4-12/+299
Add the service routines for submitting and processing the adminq messages and for handling notifyq events. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: set up device and adminqShannon Nelson7-4/+1418
Set up the basic adminq and notifyq queue structures. These are used mostly by the client drivers for feature configuration. These are essentially the same adminq and notifyq as in the ionic driver.

Part of this includes querying for device identity and FW information, so we can make that available to devlink dev info.

    $ devlink dev info pci/0000:b5:00.0
    pci/0000:b5:00.0:
      driver pds_core
      serial_number FLM18420073
      versions:
          fixed:
            asic.id 0x0
            asic.rev 0x0
          running:
            fw 1.51.0-73
          stored:
            fw.goldfw 1.15.9-C-22
            fw.mainfwa 1.60.0-73
            fw.mainfwb 1.60.0-57

Signed-off-by: Shannon Nelson <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: add devlink health facilitiesShannon Nelson6-1/+88
Add devlink health reporting on top of our fw watchdog.

Example:
    # devlink health show pci/0000:2b:00.0 reporter fw
    pci/0000:2b:00.0:
      reporter fw
        state healthy error 0 recover 0
    # devlink health diagnose pci/0000:2b:00.0 reporter fw
     Status: healthy State: 1 Generation: 0 Recoveries: 0

Signed-off-by: Shannon Nelson <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: health timer and workqueueShannon Nelson4-0/+103
Add in the periodic health check and the related workqueue, as well as the handlers for when a FW reset is seen. The firmware is polled every 5 seconds to be sure that it is still alive and that the FW generation didn't change. The alive check looks to see that the PCI bus is still readable and the fw_status still has the RUNNING bit on. If not alive, the driver stops activity and tears things down. When the FW recovers and the alive check again succeeds, the driver sets back up for activity. The generation check looks at the fw_generation to see if it has changed, which can happen if the FW crashed and recovered or was updated in between health checks. If changed, the driver counts that as though the alive test failed and forces the fw_down/fw_up cycle. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
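The alive check and the generation check described above can be sketched as a single polling step. This is a user-space model under stated assumptions: the RUNNING bit value, the all-ones "bus unreadable" value, and every `mock_*` name are invented for illustration and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for this sketch only. */
#define MOCK_FW_STATUS_RUNNING 0x01u  /* hypothetical RUNNING bit */
#define MOCK_FW_STATUS_BAD_PCI 0xffu  /* assumed: all-ones read means the bus is unreadable */

enum mock_health_action {
    MOCK_HEALTH_NONE,      /* nothing changed */
    MOCK_HEALTH_FW_DOWN,   /* stop activity and tear things down */
    MOCK_HEALTH_FW_UP,     /* FW is back: set up for activity again */
    MOCK_HEALTH_FW_CYCLE,  /* generation changed: force fw_down then fw_up */
};

struct mock_health {
    bool fw_up;          /* driver's current view of the FW */
    uint8_t generation;  /* last FW generation seen while up */
};

/* Alive means the bus is readable and the RUNNING bit is still set. */
bool mock_fw_alive(uint8_t fw_status)
{
    return fw_status != MOCK_FW_STATUS_BAD_PCI &&
           (fw_status & MOCK_FW_STATUS_RUNNING) != 0;
}

/* One periodic poll: decide what the driver should do next. */
enum mock_health_action mock_health_check(struct mock_health *h,
                                          uint8_t fw_status,
                                          uint8_t fw_generation)
{
    bool alive = mock_fw_alive(fw_status);

    if (h->fw_up && !alive) {
        h->fw_up = false;
        return MOCK_HEALTH_FW_DOWN;
    }
    if (!h->fw_up && alive) {
        h->fw_up = true;
        h->generation = fw_generation;
        return MOCK_HEALTH_FW_UP;
    }
    /* Still alive: a changed generation counts as a failed alive test. */
    if (h->fw_up && fw_generation != h->generation) {
        h->generation = fw_generation;
        return MOCK_HEALTH_FW_CYCLE;
    }
    return MOCK_HEALTH_NONE;
}
```

A caller would run `mock_health_check()` from a timer (every 5 seconds in the commit's description) and dispatch the returned action to a workqueue.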
2023-04-21pds_core: add devcmd device interfacesShannon Nelson8-3/+699
The devcmd interface is the basic connection to the device through the PCI BAR for low level identification and command services. This does the early device initialization and finds the identity data, and adds devcmd routines to be used by later driver bits. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: initial framework for pds_core PF driverShannon Nelson8-0/+993
This is the initial PCI driver framework for the new pds_core device driver and its family of devices. This does the very basics of registering for the new PF PCI device 1dd8:100c, setting up debugfs entries, and registering with devlink. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21Merge branch 'bridge-neigh-suppression'David S. Miller15-19/+936
Ido Schimmel says:

====================
bridge: Add per-{Port, VLAN} neighbor suppression

Background
==========

In order to minimize the flooding of ARP and ND messages in the VXLAN network, EVPN includes provisions [1] that allow participating VTEPs to suppress such messages in case they know the MAC-IP binding and can reply on behalf of the remote host. In Linux, the above is implemented in the bridge driver using a per-port option called "neigh_suppress" that was added in kernel version 4.15 [2].

Motivation
==========

Some applications use ARP messages as keepalives between the application nodes in the network. This works perfectly well when two nodes are connected to the same VTEP. When a node goes down it will stop responding to ARP requests and the other node will notice it immediately.

However, when the two nodes are connected to different VTEPs and neighbor suppression is enabled, the local VTEP will reply to ARP requests even after the remote node went down, until certain timers expire and the EVPN control plane decides to withdraw the MAC/IP Advertisement route for the address. Therefore, some users would like to be able to disable neighbor suppression on VLANs where such applications reside and keep it enabled on the rest.

Implementation
==============

The proposed solution is to allow user space to control neighbor suppression on a per-{Port, VLAN} basis, in a similar fashion to other per-port options that gained per-{Port, VLAN} counterparts such as "mcast_router". This allows users to benefit from the operational simplicity and scalability associated with shared VXLAN devices (i.e., external / collect-metadata mode), while still allowing for per-VLAN/VNI neighbor suppression control.

The user interface is extended with a new "neigh_vlan_suppress" bridge port option that allows user space to enable per-{Port, VLAN} neighbor suppression on the bridge port. When enabled, the existing "neigh_suppress" option has no effect and neighbor suppression is controlled using a new "neigh_suppress" VLAN option. Example usage:

    # bridge link set dev vxlan0 neigh_vlan_suppress on
    # bridge vlan add vid 10 dev vxlan0
    # bridge vlan set vid 10 dev vxlan0 neigh_suppress on

Testing
=======

Tested using existing bridge selftests. Added a dedicated selftest in the last patch.

Patchset overview
=================

Patches #1-#5 are preparations.

Patch #6 adds per-{Port, VLAN} neighbor suppression support to the bridge's data path.

Patches #7-#8 add the required netlink attributes to enable the feature.

Patch #9 adds a selftest.

iproute2 patches can be found here [3].

Changelog
=========

Since RFC [4]:

No changes.

[1] https://www.rfc-editor.org/rfc/rfc7432#section-10
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a42317785c898c0ed46db45a33b0cc71b671bf29
[3] https://github.com/idosch/iproute2/tree/submit/neigh_suppress_v1
[4] https://lore.kernel.org/netdev/[email protected]/
====================

Signed-off-by: David S. Miller <[email protected]>
2023-04-21selftests: net: Add bridge neighbor suppression testIdo Schimmel2-0/+863
Add test cases for bridge neighbor suppression, testing both per-port and per-{Port, VLAN} neighbor suppression with both ARP and NS packets.

Example truncated output:

    # ./test_bridge_neigh_suppress.sh
    [...]
    Tests passed: 148
    Tests failed: 0

Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: Allow setting per-{Port, VLAN} neighbor suppression stateIdo Schimmel3-2/+9
Add a new bridge port attribute that allows user space to enable per-{Port, VLAN} neighbor suppression. Example:

    # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
    false
    # bridge link set dev swp1 neigh_vlan_suppress on
    # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
    true
    # bridge link set dev swp1 neigh_vlan_suppress off
    # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
    false

Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: vlan: Allow setting VLAN neighbor suppression stateIdo Schimmel3-1/+21
Add a new VLAN attribute that allows user space to set the neighbor suppression state of the port VLAN. Example:

    # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
    false
    # bridge vlan set vid 10 dev swp1 neigh_suppress on
    # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
    true
    # bridge vlan set vid 10 dev swp1 neigh_suppress off
    # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
    false

    # bridge vlan set vid 10 dev br0 neigh_suppress on
    Error: bridge: Can't set neigh_suppress for non-port vlans.

Signed-off-by: Ido Schimmel <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: Add per-{Port, VLAN} neighbor suppression data path supportIdo Schimmel1-1/+17
When the bridge is not VLAN-aware (i.e., VLAN ID is 0), determine if neighbor suppression is enabled on a given bridge port solely based on the existing 'BR_NEIGH_SUPPRESS' flag. Otherwise, if the bridge is VLAN-aware, first check if per-{Port, VLAN} neighbor suppression is enabled on the given bridge port using the 'BR_NEIGH_VLAN_SUPPRESS' flag. If so, look up the VLAN and check whether it has neighbor suppression enabled based on the per-VLAN 'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED' flag. If the bridge is VLAN-aware, but the bridge port does not have per-{Port, VLAN} neighbor suppression enabled, then fallback to determine neighbor suppression based on the 'BR_NEIGH_SUPPRESS' flag. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
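The decision order described above can be written down as one small function. The sketch is a user-space model: the `MOCK_*` flag values, the structs and the linear VLAN lookup are all hypothetical, and returning "not suppressed" when the VLAN is not found on the port is an assumption that the commit message does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the flags named in the commit message. */
#define MOCK_BR_NEIGH_SUPPRESS             (1u << 0)  /* per-port */
#define MOCK_BR_NEIGH_VLAN_SUPPRESS        (1u << 1)  /* per-port: defer to the VLAN */
#define MOCK_VLFLAG_NEIGH_SUPPRESS_ENABLED (1u << 0)  /* per-VLAN */

struct mock_vlan {
    uint16_t vid;
    unsigned int flags;
};

struct mock_port {
    unsigned int flags;
    const struct mock_vlan *vlans;
    size_t n_vlans;
};

/* Linear search standing in for the (more expensive) VLAN lookup. */
const struct mock_vlan *mock_vlan_find(const struct mock_port *p, uint16_t vid)
{
    for (size_t i = 0; i < p->n_vlans; i++)
        if (p->vlans[i].vid == vid)
            return &p->vlans[i];
    return NULL;
}

bool mock_neigh_suppress_enabled(const struct mock_port *p, uint16_t vid)
{
    /* Not VLAN-aware (VLAN ID 0): only the per-port flag matters. */
    if (vid == 0)
        return (p->flags & MOCK_BR_NEIGH_SUPPRESS) != 0;

    /* Per-{Port, VLAN} mode: the per-port flag is ignored. */
    if (p->flags & MOCK_BR_NEIGH_VLAN_SUPPRESS) {
        const struct mock_vlan *v = mock_vlan_find(p, vid);

        /* Assumption: an unknown VLAN is treated as not suppressed. */
        return v && (v->flags & MOCK_VLFLAG_NEIGH_SUPPRESS_ENABLED) != 0;
    }

    /* VLAN-aware, but per-{Port, VLAN} mode is off: fall back to the port flag. */
    return (p->flags & MOCK_BR_NEIGH_SUPPRESS) != 0;
}
```

Note that a port with both per-port flags set still reports suppression as disabled for a VLAN whose own flag is clear, which matches the statement that 'BR_NEIGH_SUPPRESS' has no effect once per-{Port, VLAN} mode is on.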
2023-04-21bridge: Encapsulate data path neighbor suppression logicIdo Schimmel3-6/+13
Currently, there are various places in the bridge data path that check whether neighbor suppression is enabled on a given bridge port. As a preparation for per-{Port, VLAN} neighbor suppression, encapsulate this logic in a function and pass the VLAN ID of the packet as an argument. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: Take per-{Port, VLAN} neighbor suppression into accountIdo Schimmel2-2/+2
The bridge driver gates the neighbor suppression code behind an internal per-bridge flag called 'BROPT_NEIGH_SUPPRESS_ENABLED'. The flag is set when at least one bridge port has neighbor suppression enabled. As a preparation for per-{Port, VLAN} neighbor suppression, make sure the global flag is also set if per-{Port, VLAN} neighbor suppression is enabled. That is, when the 'BR_NEIGH_VLAN_SUPPRESS' flag is set on at least one bridge port. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
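The rule for the per-bridge gate can be sketched as a scan over the port flags. The flag values and the function name below are hypothetical; the sketch only shows that either per-port flag is now enough to turn the global gate on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-port flags named above. */
#define MOCK_BR_NEIGH_SUPPRESS      (1u << 0)
#define MOCK_BR_NEIGH_VLAN_SUPPRESS (1u << 1)

/*
 * Recompute the per-bridge gate: it is on when at least one port has
 * either per-port or per-{Port, VLAN} neighbor suppression enabled.
 */
bool mock_bridge_neigh_suppress_gate(const unsigned int *port_flags,
                                     size_t n_ports)
{
    const unsigned int mask = MOCK_BR_NEIGH_SUPPRESS |
                              MOCK_BR_NEIGH_VLAN_SUPPRESS;

    for (size_t i = 0; i < n_ports; i++)
        if (port_flags[i] & mask)
            return true;
    return false;
}
```

Without the second flag in the mask, a bridge whose ports only use per-{Port, VLAN} suppression would leave the gate off and skip the suppression code entirely.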
2023-04-21bridge: Add internal flags for per-{Port, VLAN} neighbor suppressionIdo Schimmel2-0/+2
Add two internal flags that will be used to enable / disable per-{Port, VLAN} neighbor suppression: 1. 'BR_NEIGH_VLAN_SUPPRESS': A per-port flag used to indicate that per-{Port, VLAN} neighbor suppression is enabled on the bridge port. When set, 'BR_NEIGH_SUPPRESS' has no effect. 2. 'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED': A per-VLAN flag used to indicate that neighbor suppression is enabled on the given VLAN. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: Pass VLAN ID to br_flood()Ido Schimmel4-7/+9
Subsequent patches are going to add per-{Port, VLAN} neighbor suppression, which will require br_flood() to potentially suppress ARP / NS packets on a per-{Port, VLAN} basis. As a preparation, pass the VLAN ID of the packet as another argument to br_flood(). Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: Reorder neighbor suppression check when floodingIdo Schimmel1-2/+2
The bridge does not flood ARP / NS packets for which a reply was sent to bridge ports that have neighbor suppression enabled. Subsequent patches are going to add per-{Port, VLAN} neighbor suppression, which is going to make it more expensive to check whether neighbor suppression is enabled since a VLAN lookup will be required. Therefore, instead of unnecessarily performing this lookup for every packet, only perform it for ARP / NS packets for which a reply was sent. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
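The reordering amounts to putting the cheap test first and relying on short-circuit evaluation. In the sketch below, a counter stands in for the cost of the VLAN lookup; all names are hypothetical and the suppression result is passed in directly rather than computed.

```c
#include <stdbool.h>

/* Counts how often the (modelled) expensive lookup runs. */
struct mock_lookup_stats {
    unsigned int vlan_lookups;
};

/* Stand-in for the suppression check that may need a VLAN lookup. */
bool mock_suppress_lookup(struct mock_lookup_stats *s, bool suppress_enabled)
{
    s->vlan_lookups++;
    return suppress_enabled;
}

/*
 * Skip flooding to a port only for ARP / NS packets that already got a
 * reply, and only then pay for the suppression check.
 */
bool mock_skip_flood(struct mock_lookup_stats *s, bool reply_sent,
                     bool suppress_enabled)
{
    return reply_sent && mock_suppress_lookup(s, suppress_enabled);
}
```

For ordinary traffic `reply_sent` is false, so the lookup counter never moves; only answered ARP / NS packets reach the second operand.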
2023-04-21Merge branch 'macsec-vlan'David S. Miller5-18/+288
Emeel Hakim says:

====================
Support MACsec VLAN

This patch series introduces support for hardware (HW) offload MACsec devices with VLAN configuration. The patches address both scenarios, where the VLAN header is the inner header and where it is the outer header for MACsec.

The changes include:

1. Adding MACsec offload operation for VLAN.
2. Considering VLAN when accessing MACsec net device.
3. Currently offloading MACsec when it's configured over VLAN with current MACsec TX steering rules would wrongly insert the MACsec sec tag after inserting the VLAN header. This resulted in an ETHERNET | SECTAG | VLAN packet when ETHERNET | VLAN | SECTAG is configured. The patch handles this issue when configuring steering rules.
4. Adding MACsec rx_handler change support in case of a marked skb and a mismatch on the dst MAC address.

Please review these changes and let me know if you have any feedback or concerns.

Updates since v1:
- Consult vlan_features when adding NETIF_F_HW_MACSEC.
- Allow grep for the functions.
- Add helper function to get the macsec operation to allow the compiler to make some choice.

Updates since v2:
- Don't use macros to allow direct navigation from mdo functions to its implementation.
- Make the vlan_get_macsec_ops argument a const.
- Check if the specific mdo function is available before calling it.
- Enable NETIF_F_HW_MACSEC by default when the lower device has it enabled and in case the lower device currently has NETIF_F_HW_MACSEC but disabled let the new vlan device also have it disabled.

Updates since v3:
- Split patch ("vlan: Add MACsec offload operations for VLAN interface") to prevent mixing generic vlan code changes with driver changes.
- Add mdo_open, stop and stats to support drivers which have those.
- Don't fail if macsec offload operations are available but a specific function is not, to support drivers which do not implement all macsec offload operations.
- Don't call find_rx_sc twice in the same loop, instead save the result in a parameter and re-use it.
- Completely remove _BUILD_VLAN_MACSEC_MDO macro, to prevent returning from a macro.
- Reorder the functions inside struct macsec_ops to match the struct declaration.

Updates since v4:
- Change subject line of ("macsec: Add MACsec rx_handler change support") and adapt commit message.
- Don't separate the new check in patch ("macsec: Add MACsec rx_handler change support") from the previous if/else if.
- Drop "_found" from the parameter naming "rx_sc_found" and move the definition to the relevant block.
- Remove "{}" since not needed around a single line.

Updates since v5:
- Consider promiscuous mode case.

Updates since v6:
- Use IS_ENABLED instead of checking for ifdef.
- Don't add inline keyword in c files, let the compiler make its own decisions.
====================

Signed-off-by: David S. Miller <[email protected]>
2023-04-21macsec: Don't rely solely on the dst MAC address to identify destination MACsec deviceEmeel Hakim1-2/+12
Offloading device drivers will mark offloaded MACsec SKBs with the corresponding SCI in the skb_metadata_dst so the macsec rx handler will know to which interface to divert those skbs. In case of a marked skb and a mismatch on the dst MAC address, divert the skb to the macsec net_device where the macsec rx_handler will be called, to cover cases where relying solely on the dst MAC address is insufficient.

One such instance is when using MACsec with a VLAN as an inner header, where the packet structure is ETHERNET | SECTAG | VLAN. In such a scenario, the dst MAC address in the ethernet header will correspond to the VLAN MAC address, resulting in a mismatch.

Signed-off-by: Emeel Hakim <[email protected]>
Reviewed-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21net/mlx5: Consider VLAN interface in MACsec TX steering rulesEmeel Hakim1-0/+7
Offloading MACsec when it is configured over VLAN with the current MACsec TX steering rules will wrongly insert the MACsec sec tag after inserting the VLAN header, leading to an ETHERNET | SECTAG | VLAN packet when ETHERNET | VLAN | SECTAG is configured.

The above issue arises because the SECTAG is added by HW at a later stage than the VLAN header insertion stage.

Detect such a case and adjust TX steering rules to insert the SECTAG in the correct place by using the reformat_param_0 field in the packet reformat to indicate the offset of the SECTAG from the end of the MAC header, to account for VLANs in granularity of 4 bytes.

Signed-off-by: Emeel Hakim <[email protected]>
Reviewed-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21net/mlx5: Support MACsec over VLANEmeel Hakim1-16/+26
MACsec device may have a VLAN device on top of it. Detect MACsec state correctly under this condition, and return the correct net device accordingly. Signed-off-by: Emeel Hakim <[email protected]> Reviewed-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21net/mlx5: Enable MACsec offload feature for VLAN interfaceEmeel Hakim1-0/+1
Enable MACsec offload feature over VLAN by adding NETIF_F_HW_MACSEC to the device vlan_features. Signed-off-by: Emeel Hakim <[email protected]> Reviewed-by: Subbaraya Sundeep <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21vlan: Add MACsec offload operations for VLAN interfaceEmeel Hakim1-0/+242
Add support for MACsec offload operations to the VLAN driver, so that MACsec can be offloaded when the VLAN's real device supports MACsec offload, by forwarding the offload request to it.

Signed-off-by: Emeel Hakim <[email protected]>
Reviewed-by: Subbaraya Sundeep <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2023-04-21Merge branch 'sctp-nested-flex-arrays'David S. Miller12-48/+49
Xin Long says:

====================
sctp: fix a plenty of flexible-array-nested warnings

Paolo noticed a compile warning in SCTP,

    ../net/sctp/stream_sched_fc.c: note: in included file (through ../include/net/sctp/sctp.h):
    ../include/net/sctp/structs.h:335:41: warning: array of flexible structures

But not only this, there are actually quite a lot of such warnings in some SCTP structs. This patchset fixes most of warnings by deleting these nested flexible array members.

After this patchset, there are still some warnings left:

    # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
    ./include/net/sctp/structs.h:1145:41: warning: nested flexible array
    ./include/uapi/linux/sctp.h:641:34: warning: nested flexible array
    ./include/uapi/linux/sctp.h:643:34: warning: nested flexible array
    ./include/uapi/linux/sctp.h:644:33: warning: nested flexible array
    ./include/uapi/linux/sctp.h:650:40: warning: nested flexible array
    ./include/uapi/linux/sctp.h:653:39: warning: nested flexible array

the 1st is caused by __data[] in struct ip_options, not in SCTP; the others are in uapi, and we should not touch them.

Note that instead of completely deleting it, we just leave it as a comment in the struct, signalling to the reader that we do expect such variable parameters over there, as Marcelo suggested.
====================

Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array payloadXin Long1-1/+1
This patch deletes the flexible-array payload[] from the structure sctp_datahdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/socket.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:230:29: warning: nested flexible array This member is not even used anywhere. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array hmacXin Long3-3/+3
This patch deletes the flexible-array hmac[] from the structure sctp_authhdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/auth.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:735:29: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array peer_initXin Long4-10/+9
This patch deletes the flexible-array peer_init[] from the structure sctp_cookie to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/sm_make_chunk.c: note: in included file (through include/net/sctp/sctp.h): ./include/net/sctp/structs.h:1588:28: warning: nested flexible array ./include/net/sctp/structs.h:343:28: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array variableXin Long4-9/+11
This patch deletes the flexible-array variable[] from the structures sctp_sackhdr and sctp_errhdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/sm_statefuns.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:451:28: warning: nested flexible array ./include/linux/sctp.h:393:29: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array skipXin Long3-6/+6
This patch deletes the flexible-array skip[] from the structure sctp_ifwdtsn/fwdtsn_hdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/stream_interleave.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:611:32: warning: nested flexible array ./include/linux/sctp.h:628:33: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array paramsXin Long6-19/+19
This patch deletes the flexible-array params[] from the structures sctp_inithdr, sctp_addiphdr and sctp_reconf_chunk to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/input.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:278:29: warning: nested flexible array ./include/linux/sctp.h:675:30: warning: nested flexible array This warning is reported when a structure that has a flexible array member is embedded in another structure. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
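The nested flexible array pattern and the fix described in this series can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions: the struct names (hdr_with_fam, hdr_plain, chunk) and the helper hdr_params() are hypothetical and are not the SCTP definitions, and reaching the trailing bytes with pointer arithmetic on the header is assumed here as one common replacement for the deleted member.

```c
#include <assert.h>
#include <stdint.h>

/* Before (hypothetical): the header ends in a flexible array member.
 * Embedding this struct inside another struct is what a
 * "nested flexible array" warning complains about. */
struct hdr_with_fam {
	uint16_t type;
	uint16_t length;
	uint8_t params[];
};

/* After (hypothetical): the member is kept only as a comment, so the
 * reader still knows that variable-length data follows the header. */
struct hdr_plain {
	uint16_t type;
	uint16_t length;
	/* uint8_t params[]; */
};

/* The outer struct can now embed the header without nesting a
 * flexible array. */
struct chunk {
	uint32_t chunk_hdr;
	struct hdr_plain hdr;
};

/* Assumed access pattern: the variable-length data starts right after
 * the fixed-size header, so step one header past its address. */
static inline uint8_t *hdr_params(struct hdr_plain *hdr)
{
	assert(hdr);
	return (uint8_t *)(hdr + 1);
}
```

Because a flexible array member adds no size of its own in this layout, both header variants have the same sizeof, and hdr_params() returns the address where params[] would have started.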
2023-04-20Merge branch 'net-extend-drop-reasons'Jakub Kicinski12-410/+606
Johannes Berg says: ==================== net: extend drop reasons Here's v4 of the extended drop reasons, with fixes to kernel-doc and checkpatch. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20mac80211: use the new drop reasons infrastructureJohannes Berg6-48/+138
It can be really hard to analyse or debug why packets are going missing in mac80211, so add the needed infrastructure to use the new per-subsystem drop reasons. We actually use two drop reason subsystems here because of the different handling of frames that are dropped but still go to monitor for old versions of hostapd, and those that are just completely unusable (e.g. crypto failed). Annotate a few reasons here just to illustrate this; we'll need to go through and annotate more of them later. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20net: extend drop reasons for multiple subsystemsJohannes Berg4-16/+121
Extend drop reasons to make them usable by subsystems other than core by reserving the high 16 bits for a new subsystem ID, of which 0 is used for the existing reasons. To still be able to have string reasons, restructure that code a bit to make the lookup under RCU; the only user of this (right now) is drop_monitor. Link: https://lore.kernel.org/netdev/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
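The encoding described above (subsystem ID in the high 16 bits, per-subsystem code in the low 16 bits, subsystem 0 for the existing reasons) can be sketched as follows. All macro, enum and function names here are hypothetical and chosen for illustration; they are not claimed to match the identifiers in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: the subsystem ID lives in the high 16 bits. */
#define DROP_SUBSYS_SHIFT 16
#define DROP_SUBSYS_MASK  0xffff0000u
#define DROP_CODE_MASK    0x0000ffffu

enum drop_subsys {
	DROP_SUBSYS_CORE = 0,    /* existing reasons keep their old values */
	DROP_SUBSYS_EXAMPLE = 1, /* hypothetical additional subsystem */
};

/* Combine a subsystem ID and a per-subsystem code into one 32-bit value. */
static inline uint32_t drop_reason_make(uint32_t subsys, uint32_t code)
{
	assert(subsys <= 0xffffu && code <= DROP_CODE_MASK);
	return (subsys << DROP_SUBSYS_SHIFT) | code;
}

/* Extract the subsystem ID from a combined reason. */
static inline uint32_t drop_reason_subsys(uint32_t reason)
{
	return (reason & DROP_SUBSYS_MASK) >> DROP_SUBSYS_SHIFT;
}

/* Extract the per-subsystem code from a combined reason. */
static inline uint32_t drop_reason_code(uint32_t reason)
{
	return reason & DROP_CODE_MASK;
}
```

With subsystem 0 reserved for core, a core reason encodes to its own numeric value, so existing users that only know the core reasons see no change.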
2023-04-20net: move dropreason.h to dropreason-core.hJohannes Berg4-6/+7
This will, after the next patch, hold only the core drop reasons and minimal infrastructure. Fix a small kernel-doc issue while at it, to avoid the move triggering a checker. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20ipv6: add icmpv6_error_anycast_as_unicast for ICMPv6Mahesh Bandewar4-2/+22
ICMPv6 error packets are not sent to the anycast destinations and this prevents things like traceroute from working. So create a setting similar to ECHO when dealing with Anycast sources (icmpv6_echo_ignore_anycast). Signed-off-by: Mahesh Bandewar <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Maciej Żenczykowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20Merge branch 'ethtool-mm-api-consolidation'Jakub Kicinski20-66/+486
Vladimir Oltean says: ==================== ethtool mm API consolidation This series consolidates the behavior of the 2 drivers that implement the ethtool MAC Merge layer by making NXP ENETC commit its preemptible traffic classes to hardware only when MM TX is active (same as Ocelot). Then, after resolving an issue with the ENETC driver, it restricts user space from entering 2 states which don't make sense: - pmac-enabled off tx-enabled on verify-enabled * - pmac-enabled * tx-enabled off verify-enabled on Then, it introduces a selftest (ethtool_mm.sh) which puts everything together and tests all valid configurations known to me. This is simultaneously the v2 of "[PATCH net-next 0/2] ethtool mm API improvements": https://lore.kernel.org/netdev/[email protected]/ which had caused some problems to openlldp. Those were solved in the meantime, see: https://github.com/intel/openlldp/commit/11171b474f6f3cbccac5d608b7f26b32ff72c651 and of "[RFC PATCH net-next] selftests: forwarding: add a test for MAC Merge layer": https://lore.kernel.org/netdev/[email protected]/ ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20selftests: forwarding: add a test for MAC Merge layerVladimir Oltean3-0/+307
The MAC Merge layer (IEEE 802.3-2018 clause 99) does all the heavy lifting for Frame Preemption (IEEE 802.1Q-2018 clause 6.7.2), a TSN feature for minimizing latency. Preemptible traffic is different on the wire from normal traffic in incompatible ways. If we send a preemptible packet and the link partner doesn't support preemption, it will drop it as an error frame and we will never know. The MAC Merge layer has a control plane of its own, which can be manipulated (using ethtool) in order to negotiate this capability with the link partner (through LLDP). Actually the TLV format for LLDP solves this problem only partly, because both partners only advertise: - if they support preemption (RX and TX) - if they have enabled preemption (TX) so we cannot tell the link partner what to do - we cannot force it to enable reception of our preemptible packets. That is fully solved by the verification feature, where the local device generates some small probe frames which look like preemptible frames with no useful content, and the link partner is obliged to respond to them if it supports the standard. If the verification times out, we know that preemption isn't active in our TX direction on the link. Having clarified the definition, this selftest exercises the manual (ethtool) configuration path of 2 link partners (with and without verification), and the LLDP code path, using the openlldp project. The test also verifies the TX activity of the MAC Merge layer by sending traffic through a traffic class configured as preemptible (using mqprio). There isn't a good way to make this really portable (user space cannot find out how many traffic classes there are for a device), but I chose num_tc 4 here, that should work reasonably well. I also know that some devices (stmmac) only permit TXQ0 to be preemptible, so this is why PREEMPTIBLE_PRIO was strategically chosen as 0. Even if other hardware is more configurable, this test should cover the baseline. 
This is not really a "forwarding" selftest, but I put it near the other "ethtool" selftests. $ ./ethtool_mm.sh eno0 swp0 TEST: Manual configuration with verification: eno0 to swp0 [ OK ] TEST: Manual configuration with verification: swp0 to eno0 [ OK ] TEST: Manual configuration without verification: eno0 to swp0 [ OK ] TEST: Manual configuration without verification: swp0 to eno0 [ OK ] TEST: Manual configuration with failed verification: eno0 to swp0 [ OK ] TEST: Manual configuration with failed verification: swp0 to eno0 [ OK ] TEST: LLDP [ OK ] Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20selftests: forwarding: introduce helper for standard ethtool countersVladimir Oltean1-0/+11
Counters for the MAC Merge layer and preemptible MAC have standardized so far on using structured ethtool stats as opposed to the driver specific names and meanings. Benefit from that rare opportunity and introduce a helper to lib.sh for querying standardized counters, in the hope that these will take off for other uses as well. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20selftests: forwarding: generalize bail_on_lldpad from mlxswPetr Machata11-46/+39
mlxsw selftests often invoke a bail_on_lldpad() helper to make sure LLDPAD is not running, to prevent conflicts between the QoS configuration applied through TC or DCB command line tool, and the DCB configuration that LLDPAD might apply. This helper might be useful to others. Move the function to lib.sh, and parameterize it to make it reusable in other contexts. Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Danielle Ratson <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20selftests: forwarding: sch_tbf_*: Add a pre-run hookPetr Machata5-3/+23
The driver-specific wrappers of these selftests invoke bail_on_lldpad to make sure that LLDPAD doesn't trample the configuration. The function bail_on_lldpad is going to move to lib.sh in the next patch. With that, it won't be visible for the wrappers before sourcing the framework script. And after sourcing it, it is too late: the selftest will have run by then. One option might be to source NUM_NETIFS=0 lib.sh from the wrapper, but even if that worked (it might, it might not), that seems cumbersome. lib.sh is doing a fair amount of stuff, and even if it works today, it does not look particularly solid as a solution. Instead, introduce a hook, sch_tbf_pre_hook(), that when available, gets invoked. Move the bail to the hook. Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Danielle Ratson <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20net: ethtool: mm: sanitize some UAPI configurationsVladimir Oltean1-0/+10
The verify-enabled boolean (ETHTOOL_A_MM_VERIFY_ENABLED) was intended to be a sub-setting of tx-enabled (ETHTOOL_A_MM_TX_ENABLED). IOW, MAC Merge TX can be enabled with or without verification, but verification with TX disabled makes no sense. The pmac-enabled boolean (ETHTOOL_A_MM_PMAC_ENABLED) was intended to be a global toggle from an API perspective, whereas tx-enabled just handles the TX direction. IOW, the pMAC can be enabled with or without TX, but it doesn't make sense to enable TX if the pMAC is not enabled. Add two checks which sanitize and reject these invalid cases. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
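The two rejected combinations can be expressed as a small validation function. This is a minimal standalone sketch, not the code added by the patch: struct mm_cfg and mm_cfg_validate() are hypothetical names, and returning -EINVAL is an assumption about how an invalid request would be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mirror of the three booleans discussed above. */
struct mm_cfg {
	bool pmac_enabled;   /* global toggle for the pMAC */
	bool tx_enabled;     /* MAC Merge TX direction */
	bool verify_enabled; /* sub-setting of tx_enabled */
};

/* Return 0 for a sensible configuration, -EINVAL otherwise. */
static inline int mm_cfg_validate(const struct mm_cfg *cfg)
{
	assert(cfg);

	/* Verification only makes sense when TX is enabled. */
	if (cfg->verify_enabled && !cfg->tx_enabled)
		return -EINVAL;

	/* TX only makes sense when the pMAC is enabled. */
	if (cfg->tx_enabled && !cfg->pmac_enabled)
		return -EINVAL;

	return 0;
}
```

Enabling the pMAC alone, or the pMAC with TX but without verification, passes both checks, which matches the "with or without" wording in the commit message.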