path: root/drivers/net/ethernet
Age | Commit message | Author | Files | Lines
2023-01-19 | ice: add missing checks for PF vsi type | Jesse Brandeburg | 1 | -9/+8
There were a few places where we had missed checking the VSI type to make sure it was definitely a PF VSI before calling setup functions intended only for the PF VSI. This doesn't fix any explicit bugs but cleans up the code in a few places and removes one explicit != vsi->type check that can be superseded by this code (it's a superset). Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
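As a minimal illustration of the pattern this commit extends (not code from the patch; the helper name is hypothetical, while vsi->type and ICE_VSI_PF are existing ice driver identifiers):

static int example_setup_pf_only(struct ice_vsi *vsi)
{
    /* Guard PF-only setup: skip any VSI that is not the PF VSI. */
    if (vsi->type != ICE_VSI_PF)
        return 0;

    /* ... setup steps that are valid only for the PF VSI ... */
    return 0;
}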
2023-01-19 | ice: remove redundant non-null check in ice_setup_pf_sw() | Anirudh Venkataramanan | 1 | -7/+5
Remove a redundant null check, as vsi could not be null at this point. Signed-off-by: Anirudh Venkataramanan <[email protected]> Signed-off-by: Przemek Kitszel <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19 | ice: restrict PTP HW clock freq adjustments to 100,000,000 PPB | Siddaraju DH | 1 | -1/+1
The PHY provides only a 39-bit timestamp. With the current timing implementation, we discard the lower 7 bits, leaving a 32-bit timestamp. The driver reconstructs the full 64-bit timestamp by correlating the 32-bit timestamp with cached_time, for performance. The reconstruction algorithm does both forward and backward interpolation. The 32-bit timeval has an overflow duration of 2^32 counts ~= 4.23 seconds. Due to interpolation in both directions, the usable window is now ~= 2.125 seconds. Going with at least half of that duration, cached_time is updated by a periodic thread with 1 second (worst-case) periodicity. But the 1 second periodicity is based on the system timer. With PPB adjustments, if the 1588 timer increments at, say, double the rate (2 s in place of 1 s), the Nyquist-rate/half-duration sampling of cached_time by the 1 second periodic thread will lead to incorrect interpolations. Hence we should restrict the PPB adjustments to half the duration of the cached_time update, which translates to 500,000,000 PPB. Since the periodicity of the cached-time system thread can vary, it is good to have some buffer, and considering the practicality of PPB adjustments, limit max_adj to 100,000,000. Signed-off-by: Siddaraju DH <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
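A minimal sketch of how such a limit is typically exposed through struct ptp_clock_info (a standard kernel structure from <linux/ptp_clock_kernel.h>); the initializer below is illustrative, not the actual ice code:

#include <linux/module.h>
#include <linux/ptp_clock_kernel.h>

/* max_adj caps the frequency offset (in PPB) that callers of the adjfine/adjfreq
 * callbacks may request; the value mirrors the limit derived above. */
static const struct ptp_clock_info example_ptp_caps = {
    .owner   = THIS_MODULE,
    .name    = "example-phc",
    .max_adj = 100000000,   /* 100,000,000 PPB */
    /* .adjfine, .adjtime, .gettimex64, .settime64, .enable, ... */
};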
2023-01-19 | ice: Support drop action | Amritha Nambiar | 2 | -20/+40
Currently the drop action is supported only in switchdev mode. Add support for offloading receive filters with action drop in ADQ/non-ADQ modes. This is in addition to other actions such as forwarding to a VSI (ADQ) or a queue (ADQ/non-ADQ). Also rename 'ch_vsi' to 'dest_vsi', as it is valid for multiple actions such as forward to VSI/queue, which may or may not create a channel VSI. Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Amritha Nambiar <[email protected]> Tested-by: Bharathi Sreenivas <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19 | ice: Handle LLDP MIB Pending change | Anatolii Gerasymenko | 3 | -15/+91
If the number of Traffic Classes (TC) is decreased, the FW will no longer remove TC nodes, but will send a pending change notification. This will allow RDMA to destroy corresponding Control QP markers. After RDMA finishes outstanding operations, the ice driver will send an execute MIB Pending change admin queue command to FW to finish DCB configuration change. The FW will buffer all incoming Pending changes, so there can be only one active Pending change. RDMA driver guarantees to remove Control QP markers within 5000 ms. Hence, LLDP response timeout txTTL (default 30 sec) will be met. In the case of a Pending change, LLDP MIB Change Event (opcode 0x0A01) will contain the whole new MIB. But Get LLDP MIB (opcode 0x0A00) AQ call would still return an old MIB, as the Pending change hasn't been applied yet. Add ice_get_dcb_cfg_from_mib_change() function to retrieve DCBX config from LLDP MIB Change Event's buffer for Pending changes. Co-developed-by: Dave Ertman <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Signed-off-by: Anatolii Gerasymenko <[email protected]> Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19 | ice: Add 'Execute Pending LLDP MIB' Admin Queue command | Tsotne Chakhvadze | 4 | -2/+33
In DCB Willing Mode (FW managed LLDP), when the link partner changes to a configuration which requires fewer TCs, the TCs that are no longer needed are suspended by EMP FW, removed, and never resumed. This occurs before a MIB change event is indicated to SW. The permanent suspension and removal of these TC nodes in the scheduler prevents RDMA from being able to destroy QPs associated with this TC, requiring a CORE reset to recover. A new DCBX configuration change flow is defined to allow the SW driver and other SW components (RDMA) to properly adjust to the configuration changes before they take effect in HW. This flow includes a two-way handshake between EMP FW<->LAN SW<->RDMA SW.

List of changes:
- Add 'Execute Pending LLDP MIB' AQC.
- Add 'Pending Event Enable' bit.
- Add additional logic to ignore the 'Pending Event Enable' request while the 'LLDP MIB Change' event is disabled.
- Add a function that sends the 'Execute Pending LLDP MIB' AQC to FW, which is needed for the pending MIB change to take effect.

Signed-off-by: Tsotne Chakhvadze <[email protected]> Co-developed-by: Karen Sornek <[email protected]> Signed-off-by: Karen Sornek <[email protected]> Co-developed-by: Dave Ertman <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Co-developed-by: Anatolii Gerasymenko <[email protected]> Signed-off-by: Anatolii Gerasymenko <[email protected]> Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19 | net: phy: Remove probe_capabilities | Andrew Lunn | 5 | -8/+0
Deciding whether to probe PHYs using C45 is now determined by whether the bus provides the C45 read method. This makes probe_capabilities redundant, so remove it. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Michael Walle <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Acked-by: Andrew Jeffery <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
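A sketch of the check implied above; the read_c45 method on struct mii_bus is introduced by this same patch series, so treat the field access as an assumption rather than a long-standing API:

#include <linux/phy.h>

/* A bus can be probed for C45 PHYs only if it implements the C45 read op. */
static bool example_bus_can_probe_c45(struct mii_bus *bus)
{
    return bus->read_c45 != NULL;
}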
2023-01-19 | net: mdio: Rework scanning of bus ready for quirks | Andrew Lunn | 1 | -1/+1
Some C22 PHYs do bad things when there are C45 transactions on the bus. In order to handle this, the bus needs to be scanned first for C22 at all addresses, and then C45 scanned for all addresses. The Marvell pxa168 driver scans a specific address on the bus to find its PHY. This is a C22 only device, so update it to use the c22 helper. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Michael Walle <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-01-19 | Merge tag 'mlx5-fixes-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux | Paolo Abeni | 12 | -40/+27
Saeed Mahameed says:
====================
This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2023-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net: mlx5: eliminate anonymous module_init & module_exit
  net/mlx5: E-switch, Fix switchdev mode after devlink reload
  net/mlx5e: Protect global IPsec ASO
  net/mlx5e: Remove optimization which prevented update of ESN state
  net/mlx5e: Set decap action based on attr for sample
  net/mlx5e: QoS, Fix wrongfully setting parent_element_id on MODIFY_SCHEDULING_ELEMENT
  net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT
  net/mlx5e: Remove redundant xsk pointer check in mlx5e_mpwrq_validate_xsk
  net/mlx5e: Avoid false lock dependency warning on tc_ht even more
  net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work()
====================

Link: https://lore.kernel.org/r/
Signed-off-by: Paolo Abeni <[email protected]>
2023-01-19 | octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt | Kevin Hao | 2 | -9/+4
The commit 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free") uses get/put_cpu() to protect the use of a percpu pointer in the ->aura_freeptr() callback, but it also unnecessarily disables preemption across a blockable memory allocation. The commit 87b93b678e95 ("octeontx2-pf: Avoid use of GFP_KERNEL in atomic context") tried to fix these sleep-inside-atomic warnings, but it only fixes the one for the non-rt kernel. For the rt kernel, we still get a similar warning, shown below.

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
3 locks held by swapper/0/1:
 #0: ffff800009fc5fe8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
 #1: ffff000100c276c0 (&mbox->lock){+.+.}-{3:3}, at: otx2_init_hw_resources+0x8c/0x3a4
 #2: ffffffbfef6537e0 (&cpu_rcache->lock){+.+.}-{2:2}, at: alloc_iova_fast+0x1ac/0x2ac
Preemption disabled at:
[<ffff800008b1908c>] otx2_rq_aura_pool_init+0x14c/0x284
CPU: 20 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc3-rt1-yocto-preempt-rt #1
Hardware name: Marvell OcteonTX CN96XX board (DT)
Call trace:
 dump_backtrace.part.0+0xe8/0xf4
 show_stack+0x20/0x30
 dump_stack_lvl+0x9c/0xd8
 dump_stack+0x18/0x34
 __might_resched+0x188/0x224
 rt_spin_lock+0x64/0x110
 alloc_iova_fast+0x1ac/0x2ac
 iommu_dma_alloc_iova+0xd4/0x110
 __iommu_dma_map+0x80/0x144
 iommu_dma_map_page+0xe8/0x260
 dma_map_page_attrs+0xb4/0xc0
 __otx2_alloc_rbuf+0x90/0x150
 otx2_rq_aura_pool_init+0x1c8/0x284
 otx2_init_hw_resources+0xe4/0x3a4
 otx2_open+0xf0/0x610
 __dev_open+0x104/0x224
 __dev_change_flags+0x1e4/0x274
 dev_change_flags+0x2c/0x7c
 ic_open_devs+0x124/0x2f8
 ip_auto_config+0x180/0x42c
 do_one_initcall+0x90/0x4dc
 do_basic_setup+0x10c/0x14c
 kernel_init_freeable+0x10c/0x13c
 kernel_init+0x2c/0x140
 ret_from_fork+0x10/0x20

Of course, we could shuffle the get/put_cpu() to only wrap the invocation of ->aura_freeptr(), as commit 87b93b678e95 does. But there are only two ->aura_freeptr() callbacks, otx2_aura_freeptr() and cn10k_aura_freeptr(). There is no use of a percpu variable in otx2_aura_freeptr() at all, so the get/put_cpu() is redundant for it. We can instead move the get/put_cpu() into the callback which really uses the percpu variable and avoid sprinkling get/put_cpu() in several places. Fixes: 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free") Signed-off-by: Kevin Hao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
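A hedged sketch of the approach described in the last paragraph: keep the preemption-disabled window inside the one callback that actually touches per-CPU data. get_cpu(), put_cpu() and per_cpu_ptr() are standard kernel APIs; the structures and the LMTST details are illustrative, not the actual octeontx2 code.

struct example_lmt_info { u64 lmt_addr; };
struct example_nic { struct example_lmt_info __percpu *lmt_info; };

static void example_aura_freeptr(void *dev, int aura, u64 buf)
{
    struct example_nic *nic = dev;
    struct example_lmt_info *lmt;

    lmt = per_cpu_ptr(nic->lmt_info, get_cpu()); /* disables preemption only here */
    /* ... build and issue the LMTST store for (aura, buf) using lmt ... */
    put_cpu();
}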
2023-01-19 | net: lan743x: add fixed phy support for LAN7431 device | Pavithra Sathyanarayanan | 1 | -2/+18
Add fixed_phy support at 1 Gbps full duplex for the LAN7431 device if a PHY is not found over MDIO. Tested with a MAC-to-MAC connection from a LAN7431 to a KSZ9893 switch. This avoids the driver open error in lan743x. TX delay and internal CLK125 generation are already enabled in the EEPROM. Signed-off-by: Pavithra Sathyanarayanan <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-01-19 | net: lan743x: add generic implementation for phy interface selection | Pavithra Sathyanarayanan | 2 | -8/+24
Add logic to read the PHY interface from the MAC_CR register for the LAN743x driver. Check for LAN7430/31 or PCI11x1x devices and update the adapter interface accordingly. For LAN7431, the adapter interface is set based on bit 19 of the MAC_CR register as MII or RGMII, which removes the forced RGMII/GMII configuration in lan743x_phy_open(). Signed-off-by: Pavithra Sathyanarayanan <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-01-19 | net: lan743x: remove unwanted interface select settings | Pavithra Sathyanarayanan | 1 | -8/+0
Remove the MII/RGMII selection settings from the driver, as they are preset by the EEPROM, which holds the required configuration before the driver loads for LAN743x. Signed-off-by: Pavithra Sathyanarayanan <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-01-18 | net: enetc: prioritize ability to go down over packet processing | Vladimir Oltean | 1 | -18/+63
napi_synchronize() from enetc_stop() waits until the softirq has finished execution and no longer wants to be rescheduled. However under high traffic load, this will never happen, and the interface can never be closed. The problem is the fact that the NAPI poll routine is written to update the consumer index which makes the device want to put more buffers in the RX ring, which restarts the madness again. Browsing around, it seems that some drivers like i40e keep a bit (__I40E_VSI_DOWN) which they use as communication between the control path and the data path. But that isn't my first choice, because complications ensue - since the enetc hardirq may trigger while we are in a theoretical ENETC_DOWN state, it may happen that enetc_msix() masks it, but enetc_poll() never unmasks it. To prevent a stall in that case, one would need to schedule all NAPI instances when ENETC_DOWN gets cleared, to process what's pending. I find it more desirable for the control path - enetc_stop() - to just quiesce the RX ring and let the softirq finish what remains there, without any explicit communication, just by making hardware not provide any more packets. This seems possible with the Enable bit of the RX BD ring (RBaMR[EN]). I can't seem to find an exact definition of what this bit does, but when the RX ring is disabled, the port seems to no longer update the producer index, and not react to software updates of the consumer index. In fact, the RBaMR[EN] bit is already toggled by the driver, but too late for what we want: enetc_close() -> enetc_stop() -> napi_synchronize() -> enetc_clear_bdrs() -> enetc_clear_rxbdr() The enetc_clear_bdrs() function contains not only logic to disable the RX and TX rings, but also logic to wait for the TX ring stop being busy. We split enetc_clear_bdrs() into enetc_disable_bdrs() and enetc_wait_bdrs(). One needs to run before napi_synchronize() and the other after (NAPI also processes TX completions, so we maximize our chances of not waiting for the ENETC_TBSR_BUSY bit - unless a packet is stuck for some reason, ofc). We also split off enetc_enable_bdrs() from enetc_setup_bdrs(), and call this from the mirror position in enetc_start() compared to enetc_stop(), i.e. right before netif_tx_start_all_queues(). Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
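A sketch of the resulting teardown order; enetc_disable_bdrs() and enetc_wait_bdrs() are named in the commit, while the surrounding glue (including the priv fields) is illustrative, not the actual enetc code:

static void example_enetc_quiesce(struct enetc_ndev_priv *priv)
{
    int i;

    /* 1. Stop the hardware from producing more RX work (clear RBaMR[EN]). */
    enetc_disable_bdrs(priv);

    /* 2. NAPI can now drain what is left and stop rescheduling itself. */
    for (i = 0; i < priv->bdr_int_num; i++)
        napi_synchronize(&priv->int_vector[i]->napi);

    /* 3. Only now wait for the TX rings to go idle (ENETC_TBSR_BUSY clear). */
    enetc_wait_bdrs(priv);
}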
2023-01-18 | net: enetc: set up XDP program under enetc_reconfigure() | Vladimir Oltean | 1 | -19/+32
Offloading a BPF program to the RX path of the driver suffers from the same problems as the PTP reconfiguration - improper error checking can leave the driver in an invalid state, and the link on the PHY is lost. Reuse the enetc_reconfigure() procedure, but here, we need to run some code in the middle of the ring reconfiguration procedure - while the interface is still down. Introduce a callback which makes that possible. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: rename "xdp" and "dev" in enetc_setup_bpf() | Vladimir Oltean | 2 | -9/+9
Follow the convention from this driver, which is to name "struct net_device *" as "ndev", and the convention from other drivers, to name "struct netdev_bpf *" as "bpf". Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: implement ring reconfiguration procedure for PTP RX timestamping | Vladimir Oltean | 1 | -12/+56
The crude enetc_stop() -> enetc_open() mechanism suffers from 2 problems: 1. improper error checking 2. it involves phylink_stop() -> phylink_start() which loses the link Right now, the driver is prepared to offer a better alternative: a ring reconfiguration procedure which takes the RX BD size (normal or extended) as argument. It allocates new resources (failing if that fails), stops the traffic, and assigns the new resources to the rings. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: move phylink_start/stop out of enetc_start/stop | Vladimir Oltean | 1 | -13/+13
We want to introduce a fast interface reconfiguration procedure, which involves temporarily stopping the rings. But we want enetc_start() and enetc_stop() to not restart PHY autoneg, because that can take a few seconds until it completes again. So we need part of enetc_start() and enetc_stop(), but not all of them. Move phylink_start() right next to phylink_of_phy_connect(), and phylink_stop() right next to phylink_disconnect_phy(), both still in ndo_open() and ndo_stop(). Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: split ring resource allocation from assignment | Vladimir Oltean | 2 | -70/+180
We have a few instances in the enetc driver where the ring resources (BD ring iomem, software BD ring, software TSO headers, basically everything except RX buffers) need to be reallocated. For example, when RX timestamping is enabled, the RX BD format changes to an extended one (twice as large). Currently, this is done using a simplistic enetc_close() -> enetc_open() procedure. But this is quite crude, since it also invokes phylink_stop() -> phylink_start(), the link is lost, and a few seconds need to pass for autoneg to complete again. In fact it's bad also due to the improper (yolo) error checking. In case we fail to allocate new resources, we've already freed the old ones, so the interface is more or less stuck. To avoid that, we need a system where reconfiguration is possible in a way in which resources are allocated upfront. This means that there will be a higher memory usage temporarily, but the assignment of resources to rings can be done when both the old and new resources are still available. Introduce a struct enetc_bdr_resource which holds the resources for a ring, be it RX or TX. This structure duplicates a lot of fields from struct enetc_bdr (and access to the same fields in the ring structure was left duplicated, to not change cache characteristics in the fast path). When enetc_alloc_tx_resources() runs, it returns an array of resource elements (one per TX ring), in addition to the existing priv->tx_res. To populate priv->tx_res with that array, one must call enetc_assign_tx_resources(), and this also frees the old resources. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
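A sketch of the allocate-then-assign pattern described above; enetc_alloc_tx_resources() and enetc_assign_tx_resources() are named in the commit, while the error handling and the reconfigure wrapper are illustrative:

static int example_reconfigure_tx(struct enetc_ndev_priv *priv)
{
    struct enetc_bdr_resource *new_res;

    /* Allocate the new resources before touching the rings, so a failure
     * leaves the interface running on the old ones. */
    new_res = enetc_alloc_tx_resources(priv);
    if (IS_ERR(new_res))
        return PTR_ERR(new_res);

    /* ... quiesce the rings here ... */

    /* Swap in the new resources and free the old ones. */
    enetc_assign_tx_resources(priv, new_res);
    return 0;
}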
2023-01-18 | net: enetc: bring "bool extended" to top-level in enetc_open() | Vladimir Oltean | 1 | -9/+11
Extended RX buffer descriptors are necessary if they carry RX timestamps, which will be true when PTP timestamping is enabled. Right now, the rx_ring->ext_en is set from the function that allocates ring resources (enetc_alloc_rx_resources() -> enetc_alloc_rxbdr()), and also used later, in enetc_setup_rxbdr(). It is also used in the enetc_rxbd() and enetc_rxbd_next() fast path helpers. We want to decouple resource allocation from BD ring setup, but both procedures depend on BD size (extended or not). Move the "extended" boolean to enetc_open() and pass it both to the RX allocation procedure as well as to the RX ring setup procedure. The latter will set rx_ring->ext_en from now on. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: drop redundant enetc_free_tx_frame() call from enetc_free_txbdr() | Vladimir Oltean | 1 | -5/+0
The call path in enetc_close() is:

enetc_close()
-> enetc_free_rxtx_rings()
   -> enetc_free_tx_ring()
      -> enetc_free_tx_frame()
-> enetc_free_tx_resources()
   -> enetc_free_txbdr()
      -> enetc_free_tx_frame()

The enetc_free_tx_frame() function is written such that the second call exits without doing anything, but nonetheless, it is completely redundant. Delete it. This makes the TX teardown path more similar to the RX one, where rx_swbd freeing is done in enetc_free_rx_ring(), not in enetc_free_rxbdr(). Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: rx_swbd and tx_swbd are never NULL in enetc_free_rxtx_rings() | Vladimir Oltean | 1 | -6/+0
The call path in enetc_close() is:

enetc_close()
-> enetc_free_rxtx_rings()
   -> enetc_free_rx_ring()
      -> tests whether rx_ring->rx_swbd is NULL
   -> enetc_free_tx_ring()
      -> tests whether tx_ring->tx_swbd is NULL
-> enetc_free_rx_resources()
   -> enetc_free_rxbdr()
      -> sets rxr->rx_swbd to NULL
-> enetc_free_tx_resources()
   -> enetc_free_txbdr()
      -> sets txr->tx_swbd to NULL

From the above, it is clear that due to the function ordering, the checks for NULL are redundant, since the software buffer descriptor arrays have not yet been set to NULL. Drop these checks. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: create enetc_dma_free_bdr() | Vladimir Oltean | 1 | -14/+11
This is a refactoring change which introduces the opposite function of enetc_dma_alloc_bdr(). Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: set up RX ring indices from enetc_setup_rxbdr() | Vladimir Oltean | 1 | -7/+4
There is only one place which needs to set up indices in the RX ring. Be consistent with what was done in the TX path and do this in enetc_setup_rxbdr(). Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net: enetc: set next_to_clean/next_to_use just from enetc_setup_txbdr() | Vladimir Oltean | 1 | -6/+0
enetc_alloc_txbdr() deals with allocating resources necessary for a TX ring to work (the array of software BDs and the array of TSO headers). The next_to_clean and next_to_use pointers are overwritten with proper values which are read from hardware here:

enetc_open
-> enetc_alloc_tx_resources
   -> enetc_alloc_txbdr
      -> set to zero
-> enetc_setup_bdrs
   -> enetc_setup_txbdr
      -> read from hardware

So their initialization with zeroes is pointless and confusing. Delete it. Consequently, since enetc_setup_txbdr() has no opposite cleanup function, also delete the resetting of these indices from enetc_free_tx_ring(). Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-18 | net/mlx5e: Use read lock for eswitch get callbacks | Leon Romanovsky | 1 | -6/+6
In commit 367dfa121205 ("net/mlx5: Remove devl_unlock from mlx5_eswtich_mode_callback_enter") all functions were converted to use write lock without relation to their actual purpose. Change the devlink eswitch getters to use read and not write locks. Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: Remove redundant allocation of spec in create indirect fwd group | Maor Dickman | 1 | -9/+1
mlx5_add_flow_rules() supports creating rules without any matches by passing a NULL pointer instead of a spec; if NULL is passed, it uses a static empty spec. This makes the allocation of the spec in mlx5_create_indir_fwd_group() unnecessary. Remove the redundant allocation. Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: Support Geneve and GRE with VF tunnel offload | Maor Dickman | 6 | -195/+48
Today, VF tunnel offload (where the tunnel endpoint is on the VF) is implemented by an indirect table which uses rules that match on the VXLAN VNI to recirculate to the root table; this limits support to VXLAN tunnels only. This patch changes the indirect table to use one single match-all rule to recirculate to the root table, which is added whenever a tunnel decap rule is added with the tunnel endpoint on a VF. This allows Geneve and GRE to be supported with this configuration. Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5: E-Switch, Fix typo for egress | Roi Dayan | 1 | -1/+1
Fix engress to egress. Signed-off-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: Warn when destroying mod hdr hash table that is not empty | Roi Dayan | 1 | -0/+1
To avoid memory leaks, add a warning when destroying the mod hdr hash table if the hash table is not empty. Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: TC, Use common function allocating flow mod hdr or encap mod hdr | Roi Dayan | 3 | -51/+25
Use mlx5e_tc_attach_mod_hdr() when allocating encap mod hdr and remove mlx5e_tc_add_flow_mod_hdr() which is not being used now. Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: TC, Add tc prefix to attach/detach hdr functions | Roi Dayan | 1 | -10/+10
Currently there are confusing names for attach/detach functions. mlx5e_attach_mod_hdr() vs mlx5e_mod_hdr_attach() mlx5e_detach_mod_hdr() vs mlx5e_mod_hdr_detach() Add tc prefix to the functions that are in en_tc.c to separate from the functions in mod_hdr.c which has the mod_hdr prefix. Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: TC, Pass flow attr to attach/detach mod hdr functions | Roi Dayan | 3 | -22/+23
In preparation for removing the duplicate functions that handle mod hdr allocation, and because the modify hdr should be per flow attr rather than per flow, pass the flow attr to the attach and detach mod hdr functions. Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: Add warning when log WQE size is smaller than log stride size | Adham Faris | 1 | -3/+8
Add warning macro in the function mlx5e_mpwqe_get_log_num_strides() when log WQE size is smaller than log stride size. Theoretically this should not happen. Signed-off-by: Adham Faris <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: Fail with messages when params are not valid for XSK | Adham Faris | 2 | -4/+24
Current XSK prerequisites validation implementation (setup.c/mlx5e_validate_xsk_param()) fails silently when xsk prerequisites are not fulfilled. Add error messages to the kernel log to help the user understand what went wrong when params are not valid for XSK. Signed-off-by: Adham Faris <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5: E-switch, Remove redundant comment about meta rules | Roi Dayan | 1 | -3/+1
Meta rules are created/destroyed per vport and not in eswitch init/destroy. Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5: Add hardware extended range support for PTP adjtime and adjphase | Rahul Rameshbabu | 1 | -3/+31
Capable hardware can use an extended range for offsetting the clock. An extended range of [-200000,200000] is used instead of [-32768,32767] for the delta/phase parameter of the adjtime/adjphase ptp_clock_info callbacks. Signed-off-by: Rahul Rameshbabu <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5: Add adjphase function to support hardware-only offset control | Rahul Rameshbabu | 1 | -0/+9
The adjtime function supports using hardware to set the clock offset when the delta was supported by the hardware. When the delta is not supported by the hardware, the driver handles adjusting the clock. The newly-introduced adjphase function is similar to the adjtime function, except it guarantees that a provided clock offset will be used directly by the hardware to adjust the PTP clock. When the range is not acceptable by the hardware, an error is returned. Signed-off-by: Rahul Rameshbabu <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
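A sketch of such a hardware-only phase adjustment callback; .adjphase is a standard ptp_clock_info callback, the range constant is taken from the commit above it, and the hardware hook is hypothetical rather than the actual mlx5 implementation:

static int example_adjphase(struct ptp_clock_info *ptp, s32 delta)
{
    /* Extended range from the commit text; reject anything HW cannot apply. */
    if (delta < -200000 || delta > 200000)
        return -ERANGE;

    /* Hypothetical hook: program the offset directly into the hardware clock. */
    return example_hw_set_phase_offset(ptp, delta);
}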
2023-01-18 | net/mlx5: Suppress error logging on UCTX creation | Yishai Hadas | 1 | -1/+2
Suppress error logging that can be triggered by userspace upon DEVX UCTX creation. The reason it is not suppressed today by the uid check used to suppress DEVX errors is that the MLX5_CMD_OP_CREATE_UCTX command does not yet have a uid at the point where it is issued to create one. Signed-off-by: Yishai Hadas <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-01-18 | net/mlx5e: Suppress Send WQEBB room warning for PAGE_SIZE >= 16KB | Rahul Rameshbabu | 1 | -1/+1
Send WQEBB size is 64 bytes and the max number of WQEBBs for an SQ is 255. For 16KB pages and greater, there is always sufficient space for all WQEBBs of an SQ. Cast mlx5e_get_max_sq_wqebbs(mdev) to u16, which prevents -Wtautological-constant-out-of-range-compare warnings from occurring when PAGE_SIZE >= 16KB. Signed-off-by: Rahul Rameshbabu <[email protected]> Reported-by: kernel test robot <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
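The arithmetic behind that statement, as a self-contained check (constants from the commit text; the names are illustrative):

#define EXAMPLE_MAX_SQ_WQEBBS   255
#define EXAMPLE_WQEBB_SIZE      64      /* bytes */

/* 255 * 64 = 16320 bytes, which always fits once PAGE_SIZE >= 16384 (16KB),
 * so the room comparison becomes tautologically true on such configurations. */
static bool example_sq_fits_one_page(unsigned long page_size)
{
    return EXAMPLE_MAX_SQ_WQEBBS * EXAMPLE_WQEBB_SIZE <= page_size;
}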
2023-01-18 | net: microchip: sparx5: Add lock initialization to the KUNIT tests | Steen Hegelund | 2 | -0/+2
Ensure that the KUNIT tests lock instance is initialized before the test is executed. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-18 | net: microchip: sparx5: Improve VCAP admin locking in the VCAP API | Steen Hegelund | 1 | -30/+67
This improves the VCAP cache and the VCAP rule list protection against access from different sources. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-18 | net: microchip: sparx5: Add VCAP admin locking in debugFS | Steen Hegelund | 3 | -30/+47
This ensures that the admin lock is taken before the debugFS functions starts iterating the VCAP rules. It also adds a separate function to decode a rule, which expects the lock to have been taken before it is called. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-18 | net: microchip: sparx5: Add support to check for existing VCAP rule id | Steen Hegelund | 1 | -2/+16
Add a new function that just checks if the VCAP rule id is already used by an existing rule. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-18 | net: microchip: sparx5: Add support for rule count by cookie | Steen Hegelund | 3 | -56/+47
This adds support for TC clients to get the packet count for a TC filter identified by its cookie. Signed-off-by: Steen Hegelund <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-18 | net: macb: simplify TX timestamp handling | Robert Hancock | 3 | -94/+34
This driver was capturing the TX timestamp values from the TX ring during the TX completion path, but deferring the actual packet TX timestamp updating to a workqueue. There does not seem to be much of a reason for this with the current state of the driver. Simplify this to just do the TX timestamping as part of the TX completion path, to avoid the need for the extra timestamp buffer and workqueue. Signed-off-by: Robert Hancock <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Claudiu Beznea <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-18 | net: macb: fix PTP TX timestamp failure due to packet padding | Robert Hancock | 1 | -8/+1
PTP TX timestamp handling was observed to be broken with this driver when using the raw Layer 2 PTP encapsulation. ptp4l was not receiving the expected TX timestamp after transmitting a packet, causing it to enter a failure state. The problem appears to be due to the way that the driver pads packets which are smaller than the Ethernet minimum of 60 bytes. If headroom space was available in the SKB, this caused the driver to move the data back to utilize it. However, this appears to cause other data references in the SKB to become inconsistent. In particular, this caused the ptp_one_step_sync function to later (in the TX completion path) falsely detect the packet as a one-step SYNC packet, even when it was not, which caused the TX timestamp to not be processed when it should be. Using the headroom for this purpose seems like an unnecessary complexity as this is not a hot path in the driver, and in most cases it appears that there is sufficient tailroom to not require using the headroom anyway. Remove this usage of headroom to prevent this inconsistency from occurring and causing other problems. Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation") Signed-off-by: Robert Hancock <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Tested-by: Claudiu Beznea <[email protected]> # on SAMA7G5 Reviewed-by: Claudiu Beznea <[email protected]> Signed-off-by: David S. Miller <[email protected]>
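A minimal sketch of padding a short frame without relying on headroom, using the generic skb_put_padto() helper (which frees the skb and returns non-zero on failure); this is illustrative, not the actual macb fix:

#include <linux/etherdevice.h>
#include <linux/skbuff.h>

static netdev_tx_t example_start_xmit(struct sk_buff *skb, struct net_device *ndev)
{
    /* Pad to the Ethernet minimum using tailroom; skb is freed on failure. */
    if (skb_put_padto(skb, ETH_ZLEN))
        return NETDEV_TX_OK;

    /* ... map the (possibly padded) skb and queue it for transmission ... */
    return NETDEV_TX_OK;
}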
2023-01-18 | tsnep: Support XDP BPF program setup | Gerhard Engleder | 4 | -1/+36
Implement setup of BPF programs for the XDP RX path with command XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path support. There is no need to reinit the RX queues as they are always prepared for XDP. Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added automatically.

Test results with A53 1.2GHz:

XDP_DROP (samples/bpf/xdp1)
  proto 17: 883878 pkt/s

XDP_TX (samples/bpf/xdp2)
  proto 17: 255693 pkt/s

XDP_REDIRECT (samples/bpf/xdpsock) sock0@eth2:0 rxdrop xdp-drv
       pps        pkts       1.00
  rx   855,582    5,404,523
  tx   0          0

XDP_REDIRECT (samples/bpf/xdp_redirect) eth2->eth1
  613,267 rx/s  0 err,drop/s  613,272 xmit/s

Signed-off-by: Gerhard Engleder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
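A sketch of an XDP_SETUP_PROG handler of the kind described above (struct netdev_bpf, bpf_prog_put() and netdev_priv() are standard kernel APIs; the private structure is illustrative, not the tsnep code):

struct example_priv {
    struct bpf_prog *xdp_prog;
};

static int example_ndo_bpf(struct net_device *ndev, struct netdev_bpf *bpf)
{
    struct example_priv *priv = netdev_priv(ndev);
    struct bpf_prog *old_prog;

    if (bpf->command != XDP_SETUP_PROG)
        return -EOPNOTSUPP;

    /* No RX queue reinit needed: the rings are always prepared for XDP. */
    old_prog = xchg(&priv->xdp_prog, bpf->prog);
    if (old_prog)
        bpf_prog_put(old_prog);
    return 0;
}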
2023-01-18 | tsnep: Add XDP RX support | Gerhard Engleder | 2 | -2/+132
If BPF program is set up, then run BPF program for every received frame and execute the selected action. Signed-off-by: Gerhard Engleder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
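A generic sketch of running the program and dispatching on its verdict in the RX path (bpf_prog_run_xdp() and the XDP action codes are standard; the buffer recycling and TX/redirect handling are left as comments, and none of this is the actual tsnep code):

/* Returns true if the frame was consumed by XDP, false if it should go to
 * the regular network stack. */
static bool example_run_xdp(struct bpf_prog *prog, struct xdp_buff *xdp)
{
    u32 act = bpf_prog_run_xdp(prog, xdp);

    switch (act) {
    case XDP_PASS:
        return false;
    case XDP_TX:
        /* ... enqueue xdp->data on the local TX ring ... */
        return true;
    case XDP_REDIRECT:
        /* ... xdp_do_redirect() here, xdp_do_flush() at the end of the poll ... */
        return true;
    case XDP_ABORTED:
    case XDP_DROP:
    default:
        /* ... recycle the RX buffer/page ... */
        return true;
    }
}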
2023-01-18 | tsnep: Add RX queue info for XDP support | Gerhard Engleder | 2 | -18/+58
Register xdp_rxq_info with page_pool memory model. This is needed for XDP buffer handling. Additionally fix error path by removing call of tsnep_phy_close() after failed tsnep_phy_open(). Signed-off-by: Gerhard Engleder <[email protected]> Signed-off-by: David S. Miller <[email protected]>
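A sketch of that registration; xdp_rxq_info_reg(), xdp_rxq_info_reg_mem_model() and MEM_TYPE_PAGE_POOL are standard kernel APIs, and the wrapper itself is illustrative:

#include <net/xdp.h>
#include <net/page_pool.h>

static int example_register_xdp_rxq(struct xdp_rxq_info *rxq, struct net_device *ndev,
                                    struct page_pool *pool, u32 queue_idx, u32 napi_id)
{
    int err;

    err = xdp_rxq_info_reg(rxq, ndev, queue_idx, napi_id);
    if (err)
        return err;

    /* Tie returned XDP buffers back to the page_pool allocator. */
    err = xdp_rxq_info_reg_mem_model(rxq, MEM_TYPE_PAGE_POOL, pool);
    if (err)
        xdp_rxq_info_unreg(rxq);

    return err;
}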