aboutsummaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2023-03-19qed/qed_sriov: guard against NULL derefs from qed_iov_get_vf_infoDaniil Tatianin1-1/+4
We have to make sure that the info returned by the helper is valid before using it. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: f990c82c385b ("qed*: Add support for ndo_set_vf_trust") Fixes: 733def6a04bf ("qed*: IOV link control") Signed-off-by: Daniil Tatianin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-19net: macb: Set MDIO clock divisor for pclk higher than 160MHzBartosz Wawrzyniak2-1/+7
Currently macb sets clock divisor for pclk up to 160 MHz. Function gem_mdc_clk_div was updated to enable divisor for higher values of pclk. Signed-off-by: Bartosz Wawrzyniak <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Acked-by: Nicolas Ferre <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net/mlx5e: TC, Add support for VxLAN GBP encap/decap flows offloadGavin Li1-2/+70
Add HW offloading support for TC flows with VxLAN GBP encap/decap. Example of encap rule: tc filter add dev eth0 protocol ip ingress flower \ action tunnel_key set id 42 vxlan_opts 512 \ action mirred egress redirect dev vxlan1 Example of decap rule: tc filter add dev vxlan1 protocol ip ingress flower \ enc_key_id 42 enc_dst_port 4789 vxlan_opts 1024 \ action tunnel_key unset action mirred egress redirect dev eth0 Signed-off-by: Gavin Li <[email protected]> Reviewed-by: Gavi Teitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Acked-by: Saeed Mahameed <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net/mlx5e: Add helper for encap_info_equal for tunnels with optionsGavin Li3-23/+36
For tunnels with options, eg, geneve and vxlan with gbp, they share the same way to compare the headers and options. Extract the code as a common function for them. Signed-off-by: Gavin Li <[email protected]> Reviewed-by: Gavi Teitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Acked-by: Saeed Mahameed <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17vxlan: Expose helper vxlan_build_gbp_hdrGavin Li1-19/+0
The function vxlan_build_gbp_hdr will be used by other modules to build gbp option in vxlan header according to gbp flags. Signed-off-by: Gavin Li <[email protected]> Reviewed-by: Gavi Teitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Acked-by: Saeed Mahameed <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17vxlan: Remove unused argument from vxlan_build_gbp_hdr( ) and ↵Gavin Li1-6/+4
vxlan_build_gpe_hdr( ) Remove unused argument (i.e. u32 vxflags) in vxlan_build_gbp_hdr( ) and vxlan_build_gpe_hdr( ) function arguments. Signed-off-by: Gavin Li <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17Merge branch '40GbE' of ↵Jakub Kicinski4-4/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-03-16 (iavf) This series contains updates to iavf driver only. Alex fixes incorrect check against Rx hash feature and corrects payload value for IPv6 UDP packet. Ahmed removes bookkeeping of VLAN 0 filter as it always exists and can cause a false max filter error message. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: do not track VLAN 0 filters iavf: fix non-tunneled IPv6 UDP packet type and hashing iavf: fix inverted Rx hash condition leading to disabled hash ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17wwan: core: Support slicing in port TX flow of WWAN subsystemhaozhe chang6-35/+68
wwan_port_fops_write inputs the SKB parameter to the TX callback of the WWAN device driver. However, the WWAN device (e.g., t7xx) may have an MTU less than the size of SKB, causing the TX buffer to be sliced and copied once more in the WWAN device driver. This patch implements the slicing in the WWAN subsystem and gives the WWAN devices driver the option to slice(by frag_len) or not. By doing so, the additional memory copy is reduced. Meanwhile, this patch gives WWAN devices driver the option to reserve headroom in fragments for the device-specific metadata. Signed-off-by: haozhe chang <[email protected]> Reviewed-by: Loic Poulain <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: ethernet: ti: am65-cpts: reset pps genf adj settings on enableGrygorii Strashko1-0/+4
The CPTS PPS GENf adjustment settings are invalid after it has been disabled for a while, so reset them. Fixes: eb9233ce6751 ("net: ethernet: ti: am65-cpts: adjust pps following ptp changes") Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: Siddharth Vadapalli <[email protected]> Reviewed-by: Roger Quadros <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: macb: Increase halt timeout to accommodate 10Mbps linkHarini Katakam1-2/+1
Increase halt timeout to accommodate for 16K SRAM at 10Mbps rounded. Signed-off-by: Harini Katakam <[email protected]> Signed-off-by: Michal Simek <[email protected]> Signed-off-by: Radhey Shyam Pandey <[email protected]> Acked-by: Nicolas Ferre <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: dsa: mv88e6xxx: mask apparently non-existing phys during probingKlaus Kudielka1-0/+1
To avoid excessive mdio bus transactions during probing, mask all phy addresses that do not exist (there is a 1:1 mapping between switch port number and phy address). Suggested-by: Andrew Lunn <[email protected]> Signed-off-by: Klaus Kudielka <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: dsa: mv88e6xxx: move call to mv88e6xxx_mdios_register()Klaus Kudielka1-12/+14
Call the rather expensive mv88e6xxx_mdios_register() at the beginning of mv88e6xxx_setup(). This avoids the double call via mv88e6xxx_probe() during boot. For symmetry, call mv88e6xxx_mdios_unregister() at the end of mv88e6xxx_teardown(). Link: https://lore.kernel.org/lkml/[email protected]/ Suggested-by: Andrew Lunn <[email protected]> Signed-off-by: Klaus Kudielka <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Tested-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: dsa: mv88e6xxx: re-order functionsKlaus Kudielka1-179/+179
Move mv88e6xxx_setup() below mv88e6xxx_mdios_register(), so that we are able to call the latter one from here. Do the same thing for the inverse functions. Signed-off-by: Klaus Kudielka <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: dsa: mv88e6xxx: don't dispose of Global2 IRQ mappings from mdiobus codeVladimir Oltean1-16/+4
irq_find_mapping() does not need irq_dispose_mapping(), only irq_create_mapping() does. Calling irq_dispose_mapping() from mv88e6xxx_g2_irq_mdio_free() and from the error path of mv88e6xxx_g2_irq_mdio_setup() effectively means that the mdiobus logic (for internal PHY interrupts) is disposing of a hwirq->virq mapping which it is not responsible of (but instead, the function pair mv88e6xxx_g2_irq_setup() + mv88e6xxx_g2_irq_free() is). With the current code structure, this isn't such a huge problem, because mv88e6xxx_g2_irq_mdio_free() is called relatively close to the real owner of the IRQ mappings: mv88e6xxx_remove() -> mv88e6xxx_unregister_switch() -> mv88e6xxx_mdios_unregister() -> mv88e6xxx_g2_irq_mdio_free() -> mv88e6xxx_g2_irq_free() and the switch isn't 'live' in any way such that it would be able of generating interrupts at this point (mv88e6xxx_unregister_switch() has been called). However, there is a desire to split mv88e6xxx_mdios_unregister() and mv88e6xxx_g2_irq_free() such that mv88e6xxx_mdios_unregister() only gets called from mv88e6xxx_teardown(). This is much more problematic, as can be seen below. In a cross-chip scenario (say 3 switches d0032004.mdio-mii:10, d0032004.mdio-mii:11 and d0032004.mdio-mii:12 which form a single DSA tree), it is possible to unbind the device driver from a single switch (say d0032004.mdio-mii:10). When that happens, mv88e6xxx_remove() will be called for just that one switch, and this will call mv88e6xxx_unregister_switch() which will tear down the entire tree (calling mv88e6xxx_teardown() for all 3 switches). Assuming mv88e6xxx_mdios_unregister() was moved to mv88e6xxx_teardown(), at this stage, all 3 switches will have called irq_dispose_mapping() on their mdiobus virqs. When we bind again the device driver to d0032004.mdio-mii:10, mv88e6xxx_probe() is called for it, which calls dsa_register_switch(). The DSA tree is now complete again, and mv88e6xxx_setup() is called for all 3 switches. Also assuming that mv88e6xxx_mdios_register() is moved to mv88e6xxx_setup() (the 2 assumptions go together), at this point, d0032004.mdio-mii:11 and d0032004.mdio-mii:12 don't have an IRQ mapping for the internal PHYs anymore, as they've disposed of it in mv88e6xxx_teardown(). Whereas switch d0032004.mdio-mii:10 has re-created it, because its code path comes from mv88e6xxx_probe(). Simply put, this change prepares the driver to handle the movement of mv88e6xxx_mdios_register() to mv88e6xxx_setup() for cross-chip DSA trees. Also, the code being deleted was partially wrong anyway (in a way which may have hidden this other issue). mv88e6xxx_g2_irq_mdio_setup() populates bus->irq[] starting with offset chip->info->phy_base_addr, but the teardown path doesn't apply that offset too. So it disposes of virq 0 for phy = [ 0, phy_base_addr ). All switch families have phy_base_addr = 0, except for MV88E6141 and MV88E6341 which have it as 0x10. I guess those families would have happened to work by mistake in cross-chip scenarios too. I'm deleting the body of mv88e6xxx_g2_irq_mdio_free() but leaving its call sites and prototype in place. This is because, if we ever need to add back some teardown procedure in the future, it will be perhaps error-prone to deduce the proper call sites again. Whereas like this, no extra code should get generated, it shouldn't bother anybody. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Klaus Kudielka <[email protected]> Tested-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: wangxun: Remove macro that is redefinedmengyuanlou1-5/+0
Remove PCI_VENDOR_ID_WANGXUN which is redefined in drivers/pci/quirks. Signed-off-by: mengyuanlou <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: usb: smsc95xx: Limit packet length to skb->lenSzymon Heidrich1-0/+6
Packet length retrieved from descriptor may be larger than the actual socket buffer length. In such case the cloned skb passed up the network stack will leak kernel memory contents. Fixes: 2f7ca802bdae ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver") Signed-off-by: Szymon Heidrich <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17net: dsa: b53: mmap: fix device tree supportÁlvaro Fernández Rojas1-1/+1
CPU port should also be enabled in order to get a working switch. Fixes: a5538a777b73 ("net: dsa: b53: mmap: Add device tree support") Signed-off-by: Álvaro Fernández Rojas <[email protected]> Acked-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski75-430/+664
net/wireless/nl80211.c b27f07c50a73 ("wifi: nl80211: fix puncturing bitmap policy") cbbaf2bb829b ("wifi: nl80211: add a command to enable/disable HW timestamping") https://lore.kernel.org/all/[email protected] tools/testing/selftests/net/Makefile 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 13715acf8ab5 ("selftest: Add test for bind() conflicts.") Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-17driver core: class: remove module * from class_create()Greg Kroah-Hartman5-5/+5
The module pointer in class_create() never actually did anything, and it shouldn't have been requred to be set as a parameter even if it did something. So just remove it and fix up all callers of the function in the kernel tree at the same time. Cc: "Rafael J. Wysocki" <[email protected]> Acked-by: Benjamin Tissoires <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-03-17drivers: remove struct module * setting from struct classGreg Kroah-Hartman2-2/+0
There is no need to manually set the owner of a struct class, as the registering function does it automatically, so remove all of the explicit settings from various drivers that did so as it is unneeded. This allows us to remove this pointer entirely from this structure going forward. Cc: "Rafael J. Wysocki" <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-03-17gve: Add AF_XDP zero-copy support for GQI-QPL formatPraveen Kaligineedi5-9/+274
Adding AF_XDP zero-copy support. Note: Although these changes support AF_XDP socket in zero-copy mode, there is still a copy happening within the driver between XSK buffer pool and QPL bounce buffers in GQI-QPL format. In GQI-QPL queue format, the driver needs to allocate a fixed size memory, the size specified by vNIC device, for RX/TX and register this memory as a bounce buffer with the vNIC device when a queue is created. The number of pages in the bounce buffer is limited and the pages need to be made available to the vNIC by copying the RX data out to prevent head-of-line blocking. Therefore, we cannot pass the XSK buffer pool to the vNIC. The number of copies on RX path from the bounce buffer to XSK buffer is 2 for AF_XDP copy mode (bounce buffer -> allocated page frag -> XSK buffer) and 1 for AF_XDP zero-copy mode (bounce buffer -> XSK buffer). This patch contains the following changes: 1) Enable and disable XSK buffer pool 2) Copy XDP packets from QPL bounce buffers to XSK buffer on rx 3) Copy XDP packets from XSK buffer to QPL bounce buffers and ring the doorbell as part of XDP TX napi poll 4) ndo_xsk_wakeup callback support Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17gve: Add XDP REDIRECT support for GQI-QPL formatPraveen Kaligineedi5-17/+138
This patch contains the following changes: 1) Support for XDP REDIRECT action on rx 2) ndo_xdp_xmit callback support In GQI-QPL queue format, the driver needs to allocate a fixed size memory, the size specified by vNIC device, for RX/TX and register this memory as a bounce buffer with the vNIC device when a queue is created. The number of pages in the bounce buffer is limited and the pages need to be made available to the vNIC by copying the RX data out to prevent head-of-line blocking. The XDP_REDIRECT packets are therefore immediately copied to a newly allocated page. Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17gve: Add XDP DROP and TX support for GQI-QPL formatPraveen Kaligineedi5-39/+687
Add support for XDP PASS, DROP and TX actions. This patch contains the following changes: 1) Support installing/uninstalling XDP program 2) Add dedicated XDP TX queues 3) Add support for XDP DROP action 4) Add support for XDP TX action Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17gve: Changes to add new TX queuesPraveen Kaligineedi6-50/+104
Changes to enable adding and removing TX queues without calling gve_close() and gve_open(). Made the following changes: 1) priv->tx, priv->rx and priv->qpls arrays are allocated based on max tx queues and max rx queues 2) Changed gve_adminq_create_tx_queues(), gve_adminq_destroy_tx_queues(), gve_tx_alloc_rings() and gve_tx_free_rings() functions to add/remove a subset of TX queues rather than all the TX queues. Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17gve: XDP support GQI-QPL: helper function changesPraveen Kaligineedi8-44/+70
This patch adds/modifies helper functions needed to add XDP support. Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net: pcs: lynx: don't print an_enabled in pcs_get_state()Russell King (Oracle)1-2/+2
an_enabled will be going away, and in any case, pcs_get_state() should not be updating this member. Remove the print. Signed-off-by: Russell King (Oracle) <[email protected]> Reviewed-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net: pcs: xpcs: remove double-read of link state when using ANRussell King (Oracle)1-11/+2
Phylink does not want the current state of the link when reading the PCS link state - it wants the latched state. Don't double-read the MII status register. Phylink will re-read as necessary to capture transient link-down events as of dbae3388ea9c ("net: phylink: Force retrigger in case of latched link-fail indicator"). The above referenced commit is a dependency for this change, and thus this change should not be backported to any kernel that does not contain the above referenced commit. Fixes: fcb26bd2b6ca ("net: phy: Add Synopsys DesignWare XPCS MDIO module") Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17vxlan: Enable MDB supportIdo Schimmel1-0/+3
Now that the VXLAN MDB control and data paths are in place we can expose the VXLAN MDB functionality to user space. Set the VXLAN MDB net device operations to the appropriate functions, thereby allowing the rtnetlink code to reach the VXLAN driver. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17vxlan: Add MDB data path supportIdo Schimmel3-0/+135
Integrate MDB support into the Tx path of the VXLAN driver, allowing it to selectively forward IP multicast traffic according to the matched MDB entry. If MDB entries are configured (i.e., 'VXLAN_F_MDB' is set) and the packet is an IP multicast packet, perform up to three different lookups according to the following priority: 1. For an (S, G) entry, using {Source VNI, Source IP, Destination IP}. 2. For a (*, G) entry, using {Source VNI, Destination IP}. 3. For the catchall MDB entry (0.0.0.0 or ::), using the source VNI. The catchall MDB entry is similar to the catchall FDB entry (00:00:00:00:00:00) that is currently used to transmit BUM (broadcast, unknown unicast and multicast) traffic. However, unlike the catchall FDB entry, this entry is only used to transmit unregistered IP multicast traffic that is not link-local. Therefore, when configured, the catchall FDB entry will only transmit BULL (broadcast, unknown unicast, link-local multicast) traffic. The catchall MDB entry is useful in deployments where inter-subnet multicast forwarding is used and not all the VTEPs in a tenant domain are members in all the broadcast domains. In such deployments it is advantageous to transmit BULL (broadcast, unknown unicast and link-local multicast) and unregistered IP multicast traffic on different tunnels. If the same tunnel was used, a VTEP only interested in IP multicast traffic would also pull all the BULL traffic and drop it as it is not a member in the originating broadcast domain [1]. If the packet did not match an MDB entry (or if the packet is not an IP multicast packet), return it to the Tx path, allowing it to be forwarded according to the FDB. If the packet did match an MDB entry, forward it to the associated remote VTEPs. However, if the entry is a (*, G) entry and the associated remote is in INCLUDE mode, then skip over it as the source IP is not in its source list (otherwise the packet would have matched on an (S, G) entry). Similarly, if the associated remote is marked as BLOCKED (can only be set on (S, G) entries), then skip over it as well as the remote is in EXCLUDE mode and the source IP is in its source list. [1] https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-irb-mcast#section-2.6 Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17vxlan: mdb: Add an internal flag to indicate MDB usageIdo Schimmel1-0/+7
Add an internal flag to indicate whether MDB entries are configured or not. Set the flag after installing the first MDB entry and clear it before deleting the last one. The flag will be consulted by the data path which will only perform an MDB lookup if the flag is set, thereby keeping the MDB overhead to a minimum when the MDB is not used. Another option would have been to use a static key, but it is global and not per-device, unlike the current approach. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17vxlan: mdb: Add MDB control path supportIdo Schimmel4-1/+1381
Implement MDB control path support, enabling the creation, deletion, replacement and dumping of MDB entries in a similar fashion to the bridge driver. Unlike the bridge driver, each entry stores a list of remote VTEPs to which matched packets need to be replicated to and not a list of bridge ports. The motivating use case is the installation of MDB entries by a user space control plane in response to received EVPN routes. As such, only allow permanent MDB entries to be installed and do not implement snooping functionality, avoiding a lot of unnecessary complexity. Since entries can only be modified by user space under RTNL, use RTNL as the write lock. Use RCU to ensure that MDB entries and remotes are not freed while being accessed from the data path during transmission. In terms of uAPI, reuse the existing MDB netlink interface, but add a few new attributes to request and response messages: * IP address of the destination VXLAN tunnel endpoint where the multicast receivers reside. * UDP destination port number to use to connect to the remote VXLAN tunnel endpoint. * VXLAN VNI Network Identifier to use to connect to the remote VXLAN tunnel endpoint. Required when Ingress Replication (IR) is used and the remote VTEP is not a member of originating broadcast domain (VLAN/VNI) [1]. * Source VNI Network Identifier the MDB entry belongs to. Used only when the VXLAN device is in external mode. * Interface index of the outgoing interface to reach the remote VXLAN tunnel endpoint. This is required when the underlay destination IP is multicast (P2MP), as the multicast routing tables are not consulted. All the new attributes are added under the 'MDBA_SET_ENTRY_ATTRS' nest which is strictly validated by the bridge driver, thereby automatically rejecting the new attributes. [1] https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-irb-mcast#section-3.2.2 Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17vxlan: Expose vxlan_xmit_one()Ido Schimmel2-3/+4
Given a packet and a remote destination, the function will take care of encapsulating the packet and transmitting it to the destination. Expose it so that it could be used in subsequent patches by the MDB code to transmit a packet to the remote destination(s) stored in the MDB entry. It will allow us to keep the MDB code self-contained, not exposing its data structures to the rest of the VXLAN driver. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17vxlan: Move address helpers to private headersIdo Schimmel2-47/+45
Move the helpers out of the core C file to the private header so that they could be used by the upcoming MDB code. While at it, constify the second argument of vxlan_nla_get_addr(). Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net: mana: Add new MANA VF performance counters for easier troubleshootingShradha Gupta2-4/+110
Extended performance counter stats in 'ethtool -S <interface>' output for MANA VF to facilitate troubleshooting. Tested-on: Ubuntu22 Signed-off-by: Shradha Gupta <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave failsNikolay Aleksandrov1-3/+1
syzbot reported a warning[1] where the bond device itself is a slave and we try to enslave a non-ethernet device as the first slave which fails but then in the error path when ether_setup() restores the bond device it also clears all flags. In my previous fix[2] I restored the IFF_MASTER flag, but I didn't consider the case that the bond device itself might also be a slave with IFF_SLAVE set, so we need to restore that flag as well. Use the bond_ether_setup helper which does the right thing and restores the bond's flags properly. Steps to reproduce using a nlmon dev: $ ip l add nlmon0 type nlmon $ ip l add bond1 type bond $ ip l add bond2 type bond $ ip l set bond1 master bond2 $ ip l set dev nlmon0 master bond1 $ ip -d l sh dev bond1 22: bond1: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue master bond2 state DOWN mode DEFAULT group default qlen 1000 (now bond1's IFF_SLAVE flag is gone and we'll hit a warning[3] if we try to delete it) [1] https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef [2] commit 7d5cd2ce5292 ("bonding: correctly handle bonding type change on enslave failure") [3] example warning: [ 27.008664] bond1: (slave nlmon0): The slave device specified does not support setting the MAC address [ 27.008692] bond1: (slave nlmon0): Error -95 calling set_mac_address [ 32.464639] bond1 (unregistering): Released all slaves [ 32.464685] ------------[ cut here ]------------ [ 32.464686] WARNING: CPU: 1 PID: 2004 at net/core/dev.c:10829 unregister_netdevice_many+0x72a/0x780 [ 32.464694] Modules linked in: br_netfilter bridge bonding virtio_net [ 32.464699] CPU: 1 PID: 2004 Comm: ip Kdump: loaded Not tainted 5.18.0-rc3+ #47 [ 32.464703] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014 [ 32.464704] RIP: 0010:unregister_netdevice_many+0x72a/0x780 [ 32.464707] Code: 99 fd ff ff ba 90 1a 00 00 48 c7 c6 f4 02 66 96 48 c7 c7 20 4d 35 96 c6 05 fa c7 2b 02 01 e8 be 6f 4a 00 0f 0b e9 73 fd ff ff <0f> 0b e9 5f fd ff ff 80 3d e3 c7 2b 02 00 0f 85 3b fd ff ff ba 59 [ 32.464710] RSP: 0018:ffffa006422d7820 EFLAGS: 00010206 [ 32.464712] RAX: ffff8f6e077140a0 RBX: ffffa006422d7888 RCX: 0000000000000000 [ 32.464714] RDX: ffff8f6e12edbe58 RSI: 0000000000000296 RDI: ffffffff96d4a520 [ 32.464716] RBP: ffff8f6e07714000 R08: ffffffff96d63600 R09: ffffa006422d7728 [ 32.464717] R10: 0000000000000ec0 R11: ffffffff9698c988 R12: ffff8f6e12edb140 [ 32.464719] R13: dead000000000122 R14: dead000000000100 R15: ffff8f6e12edb140 [ 32.464723] FS: 00007f297c2f1740(0000) GS:ffff8f6e5d900000(0000) knlGS:0000000000000000 [ 32.464725] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 32.464726] CR2: 00007f297bf1c800 CR3: 00000000115e8000 CR4: 0000000000350ee0 [ 32.464730] Call Trace: [ 32.464763] <TASK> [ 32.464767] rtnl_dellink+0x13e/0x380 [ 32.464776] ? cred_has_capability.isra.0+0x68/0x100 [ 32.464780] ? __rtnl_unlock+0x33/0x60 [ 32.464783] ? bpf_lsm_capset+0x10/0x10 [ 32.464786] ? security_capable+0x36/0x50 [ 32.464790] rtnetlink_rcv_msg+0x14e/0x3b0 [ 32.464792] ? _copy_to_iter+0xb1/0x790 [ 32.464796] ? post_alloc_hook+0xa0/0x160 [ 32.464799] ? rtnl_calcit.isra.0+0x110/0x110 [ 32.464802] netlink_rcv_skb+0x50/0xf0 [ 32.464806] netlink_unicast+0x216/0x340 [ 32.464809] netlink_sendmsg+0x23f/0x480 [ 32.464812] sock_sendmsg+0x5e/0x60 [ 32.464815] ____sys_sendmsg+0x22c/0x270 [ 32.464818] ? import_iovec+0x17/0x20 [ 32.464821] ? sendmsg_copy_msghdr+0x59/0x90 [ 32.464823] ? do_set_pte+0xa0/0xe0 [ 32.464828] ___sys_sendmsg+0x81/0xc0 [ 32.464832] ? mod_objcg_state+0xc6/0x300 [ 32.464835] ? refill_obj_stock+0xa9/0x160 [ 32.464838] ? memcg_slab_free_hook+0x1a5/0x1f0 [ 32.464842] __sys_sendmsg+0x49/0x80 [ 32.464847] do_syscall_64+0x3b/0x90 [ 32.464851] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 32.464865] RIP: 0033:0x7f297bf2e5e7 [ 32.464868] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 32.464869] RSP: 002b:00007ffd96c824c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 32.464872] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f297bf2e5e7 [ 32.464874] RDX: 0000000000000000 RSI: 00007ffd96c82540 RDI: 0000000000000003 [ 32.464875] RBP: 00000000640f19de R08: 0000000000000001 R09: 000000000000007c [ 32.464876] R10: 00007f297bffabe0 R11: 0000000000000246 R12: 0000000000000001 [ 32.464877] R13: 00007ffd96c82d20 R14: 00007ffd96c82610 R15: 000055bfe38a7020 [ 32.464881] </TASK> [ 32.464882] ---[ end trace 0000000000000000 ]--- Fixes: 7d5cd2ce5292 ("bonding: correctly handle bonding type change on enslave failure") Reported-by: [email protected] Link: https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef Signed-off-by: Nikolay Aleksandrov <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Acked-by: Jonathan Toppins <[email protected]> Acked-by: Jay Vosburgh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type changeNikolay Aleksandrov1-4/+15
Add bond_ether_setup helper which is used to fix ether_setup() calls in the bonding driver. It takes care of both IFF_MASTER and IFF_SLAVE flags, the former is always restored and the latter only if it was set. If the bond enslaves non-ARPHRD_ETHER device (changes its type), then releases it and enslaves ARPHRD_ETHER device (changes back) then we use ether_setup() to restore the bond device type but it also resets its flags and removes IFF_MASTER and IFF_SLAVE[1]. Use the bond_ether_setup helper to restore both after such transition. [1] reproduce (nlmon is non-ARPHRD_ETHER): $ ip l add nlmon0 type nlmon $ ip l add bond2 type bond mode active-backup $ ip l set nlmon0 master bond2 $ ip l set nlmon0 nomaster $ ip l add bond1 type bond (we use bond1 as ARPHRD_ETHER device to restore bond2's mode) $ ip l set bond1 master bond2 $ ip l sh dev bond2 37: bond2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether be:d7:c5:40:5b:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500 (notice bond2's IFF_MASTER is missing) Fixes: e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type") Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net: wangxun: Implement the ndo change mtu interfaceMengyuan Lou7-5/+31
Add ngbe and txgbe ndo_change_mtu support. Signed-off-by: Mengyuan Lou <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net: renesas: rswitch: Fix GWTSDIE register handlingYoshihiro Shimoda2-2/+8
Since the GWCA has the TX timestamp feature, this driver should not disable it if one of ports is opened. So, fix it. Reported-by: Phong Hoang <[email protected]> Fixes: 33f5d733b589 ("net: renesas: rswitch: Improve TX timestamp accuracy") Signed-off-by: Yoshihiro Shimoda <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net: renesas: rswitch: Fix the output value of quote from rswitch_rx()Yoshihiro Shimoda1-3/+7
If the RX descriptor doesn't have any data, the output value of quote from rswitch_rx() will be increased unexpectedily. So, fix it. Reported-by: Volodymyr Babchuk <[email protected]> Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17ethernet: sun: add check for the mdesc_grab()Liang He2-0/+6
In vnet_port_probe() and vsw_port_probe(), we should check the return value of mdesc_grab() as it may return NULL which can caused NPD bugs. Fixes: 5d01fa0c6bd8 ("ldmvsw: Add ldmvsw.c driver code") Fixes: 43fdf27470b2 ("[SPARC64]: Abstract out mdesc accesses for better MD update handling.") Signed-off-by: Liang He <[email protected]> Reviewed-by: Piotr Raczynski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-17net: dsa: realtek: rtl8365mb: add change_mtuLuiz Angelo Daros de Luca1-4/+36
The rtl8365mb was using a fixed MTU size of 1536, which was probably inspired by the rtl8366rb's initial frame size. However, unlike that family, the rtl8365mb family can specify the max frame size in bytes, rather than in fixed steps. DSA calls change_mtu for the CPU port once the max MTU value among the ports changes. As the max frame size is defined globally, the switch is configured only when the call affects the CPU port. The available specifications do not directly define the max supported frame size, but it mentions a 16k limit. This driver will use the 0x3FFF limit as it is used in the vendor API code. However, the switch sets the max frame size to 16368 bytes (0x3FF0) after it resets. change_mtu uses MTU size, or ethernet payload size, while the switch works with frame size. The frame size is calculated considering the ethernet header (14 bytes), a possible 802.1Q tag (4 bytes), the payload size (MTU), and the Ethernet FCS (4 bytes). The CPU tag (8 bytes) is consumed before the switch enforces the limit. During setup, the driver will use the default 1500-byte MTU of DSA to set the maximum frame size. The current sum will be VLAN_ETH_HLEN+1500+ETH_FCS_LEN, which results in 1522 bytes. Although it is lower than the previous initial value of 1536 bytes, the driver will increase the frame size for a larger MTU. However, if something requires more space without increasing the MTU, such as QinQ, we would need to add the extra length to the rtl8365mb_port_change_mtu() formula. MTU was tested up to 2018 (with 802.1Q) as that is as far as mt7620 (where rtl8367s is stacked) can go. The register was manually manipulated byte-by-byte to ensure the MTU to frame size conversion was correct. For frames without 802.1Q tag, the frame size limit will be 4 bytes over the required size. There is a jumbo register, enabled by default at 6k frame size. However, the jumbo settings do not seem to limit nor expand the maximum tested MTU (2018), even when jumbo is disabled. More tests are needed with a device that can handle larger frames. Signed-off-by: Luiz Angelo Daros de Luca <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Reviewed-by: Alvin Šipraga <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-03-16net: ipa: fix some register validity checksAlex Elder3-15/+30
A recent commit defined HW_PARAM_4 as a GSI register ID but did not add it to gsi_reg_id_valid() to indicate it's valid (for IPA v5.0+). Add version checks for the HW_PARAM_2 and INTER_EE IRQ GSI registers there as well. IPA v5.0 supports up to 8 source and destination resource groups. Update the validity check (and the comments where the register IDs are defined) to reflect that. Similarly update comments and validity checks for the hash/cache-related registers. Note that this patch fixes an omission and constrains things further, but these don't technically represent bugs. Fixes: f651334e1ef5 ("net: ipa: add HW_PARAM_4 GSI register") Signed-off-by: Alex Elder <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16net: ipa: kill FILT_ROUT_CACHE_CFG IPA registerAlex Elder2-11/+2
A recent commit defined a few IPA registers used for IPA v5.0+. One of those was a mistake. Although the filter and router caches get *flushed* using a single register, they use distinct registers (ENDP_FILTER_CACHE_CFG and ENDP_ROUTER_CACHE_CFG) for configuration. And although there *exists* a FILT_ROUT_CACHE_CFG register, it is not needed in upstream code. So get rid of definitions related to FILT_ROUT_CACHE_CFG, because they are not needed. Fixes: 8ba59716d16a ("net: ipa: define IPA v5.0+ registers") Signed-off-by: Alex Elder <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16net: ipa: add two missing declarationsAlex Elder1-0/+4
When gsi_reg_init() got added, its declaration was added to "gsi_reg.h" without declaring the two struct pointer types it uses. Add these struct declarations to "gsi_reg.h". Fixes: 3c506add35c7 ("net: ipa: introduce gsi_reg_init()") Signed-off-by: Alex Elder <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16net: ipa: reg: include <linux/bug.h>Alex Elder1-1/+2
When "reg.h" got created, it included calls to WARN() and WARN_ON(). Those macros are defined via <linux/bug.h>. In addition, it uses is_power_of_2(), which is defined in <linux/log2.h>. Include those files so IPA "reg.h" has access to all definitions it requires. Meanwhile, <linux/bits.h> is included but nothing defined therein is required directly in "reg.h", so get rid of that. Fixes: 81772e444dbe ("net: ipa: start generalizing "ipa_reg"") Signed-off-by: Alex Elder <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16net: dsa: microchip: fix RGMII delay configuration on KSZ8765/KSZ8794/KSZ8795Marek Vasut1-1/+1
The blamed commit has replaced a ksz_write8() call to address REG_PORT_5_CTRL_6 (0x56) with a ksz_set_xmii() -> ksz_pwrite8() call to regs[P_XMII_CTRL_1], which is also defined as 0x56 for ksz8795_regs[]. The trouble is that, when compared to ksz_write8(), ksz_pwrite8() also adjusts the register offset with the port base address. So in reality, ksz_pwrite8(offset=0x56) accesses register 0x56 + 0x50 = 0xa6, which in this switch appears to be unmapped, and the RGMII delay configuration on the CPU port does nothing. So if the switch wasn't fine with the RGMII delay configuration done through pin strapping and relied on Linux to apply a different one in order to pass traffic, this is now broken. Using the offset translation logic imposed by ksz_pwrite8(), the correct value for regs[P_XMII_CTRL_1] should have been 0x6 on ksz8795_regs[], in order to really end up accessing register 0x56. Static code analysis shows that, despite there being multiple other accesses to regs[P_XMII_CTRL_1] in this driver, the only code path that is applicable to ksz8795_regs[] and ksz8_dev_ops is ksz_set_xmii(). Therefore, the problem is isolated to RGMII delays. In its current form, ksz8795_regs[] contains the same value for P_XMII_CTRL_0 and for P_XMII_CTRL_1, and this raises valid suspicions that writes made by the driver to regs[P_XMII_CTRL_0] might overwrite writes made to regs[P_XMII_CTRL_1] or vice versa. Again, static analysis shows that the only accesses to P_XMII_CTRL_0 from the driver are made from code paths which are not reachable with ksz8_dev_ops. So the accesses made by ksz_set_xmii() are safe for this switch family. [ vladimiroltean: rewrote commit message ] Fixes: c476bede4b0f ("net: dsa: microchip: ksz8795: use common xmii function") Signed-off-by: Marek Vasut <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16Merge tag 'mlx5-fixes-2023-03-15' of ↵Jakub Kicinski13-47/+81
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-03-15 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: TC, Remove error message log print net/mlx5e: TC, fix cloned flow attribute net/mlx5e: TC, fix missing error code net/sched: TC, fix raw counter initialization net/mlx5e: Lower maximum allowed MTU in XSK to match XDP prerequisites net/mlx5: Set BREAK_FW_WAIT flag first when removing driver net/mlx5e: kTLS, Fix missing error unwind on unsupported cipher type net/mlx5e: Fix cleanup null-ptr deref on encap lock net/mlx5: E-switch, Fix missing set of split_count when forward to ovs internal port net/mlx5: E-switch, Fix wrong usage of source port rewrite in split rules net/mlx5: Disable eswitch before waiting for VF pages net/mlx5: Fix setting ec_function bit in MANAGE_PAGES net/mlx5e: Don't cache tunnel offloads capability net/mlx5e: Fix macsec ASO context alignment ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16qed/qed_mng_tlv: correctly zero out ->min instead of ->hourDaniil Tatianin1-1/+1
This fixes an issue where ->hour would erroneously get zeroed out instead of ->min because of a bad copy paste. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: f240b6882211 ("qed: Add support for processing fcoe tlv request.") Signed-off-by: Daniil Tatianin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16i825xx: sni_82596: use eth_hw_addr_set()Thomas Bogendoerfer1-6/+8
netdev->dev_addr is now const, we can't write to it directly. Copy scrambled mac address octects into an array then eth_hw_addr_set(). Fixes: adeef3e32146 ("net: constify netdev->dev_addr") Signed-off-by: Thomas Bogendoerfer <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-16net: atlantic: Fix crash when XDP is enabled but no program is loadedToke Høiland-Jørgensen1-7/+21
The aq_xdp_run_prog() function falls back to the XDP_ABORTED action handler (using a goto) if the operations for any of the other actions fail. The XDP_ABORTED handler in turn calls the bpf_warn_invalid_xdp_action() tracepoint. However, the function also jumps into the XDP_PASS helper if no XDP program is loaded on the device, which means the XDP_ABORTED handler can be run with a NULL program pointer. This results in a NULL pointer deref because the tracepoint dereferences the 'prog' pointer passed to it. This situation can happen in multiple ways: - If a packet arrives between the removal of the program from the interface and the static_branch_dec() in aq_xdp_setup() - If there are multiple devices using the same driver in the system and one of them has an XDP program loaded and the other does not. Fix this by refactoring the aq_xdp_run_prog() function to remove the 'goto pass' handling if there is no XDP program loaded. Instead, factor out the skb building in a separate small helper function. Fixes: 26efaef759a1 ("net: atlantic: Implement xdp data plane") Reported-by: Freysteinn Alfredsson <[email protected]> Tested-by: Freysteinn Alfredsson <[email protected]> Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>