aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2023-08-27sfc: add mac source and destination pedit action offloadPieter Jansen van Vuuren1-2/+207
Introduce the first pedit set offload functionality for the sfc driver. In addition to this, add offload functionality for both mac source and destination pedit set actions. Co-developed-by: Edward Cree <[email protected]> Signed-off-by: Edward Cree <[email protected]> Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-27sfc: introduce ethernet pedit set action infrastructurePieter Jansen van Vuuren4-11/+202
Introduce the initial ethernet pedit set action infrastructure in preparation for adding mac src and dst pedit action offloads. Co-developed-by: Edward Cree <[email protected]> Signed-off-by: Edward Cree <[email protected]> Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25Merge branch '1GbE' of ↵Jakub Kicinski11-64/+210
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-08-24 (igc, e1000e) This series contains updates to igc and e1000e drivers. Vinicius adds support for utilizing multiple PTP registers on igc. Sasha reduces interval time for PTM on igc and adds new device support on e1000e. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: e1000e: Add support for the next LOM generation igc: Decrease PTM short interval from 10 us to 1 us igc: Add support for multiple in-flight TX timestamps ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25pds_core: pass opcode to devcmd_waitShannon Nelson1-5/+2
Don't rely on the PCI memory for the devcmd opcode because we read a 0xff value if the PCI bus is broken, which can cause us to report a bogus dev_cmd opcode later. Fixes: 523847df1b37 ("pds_core: add devcmd device interfaces") Signed-off-by: Shannon Nelson <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25pds_core: check for work queue before useShannon Nelson1-1/+1
Add a check that the wq exists before queuing up work for a failed devcmd, as the PF is responsible for health and the VF doesn't have a wq. Fixes: c2dbb0904310 ("pds_core: health timer and workqueue") Signed-off-by: Shannon Nelson <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25pds_core: no reset command for VFShannon Nelson1-1/+2
The VF doesn't need to send a reset command, and in a PCI reset scenario it might not have a valid IO space to write to anyway. Fixes: 523847df1b37 ("pds_core: add devcmd device interfaces") Signed-off-by: Shannon Nelson <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25pds_core: no health reporter in VFShannon Nelson1-3/+5
Make sure the health reporter is set up before we use it in our devlink health updates, especially since the VF doesn't set up the health reporter. Fixes: 25b450c05a49 ("pds_core: add devlink health facilities") Signed-off-by: Shannon Nelson <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25pds_core: protect devlink callbacks from fw_down stateShannon Nelson1-0/+3
Don't access structs that have been cleared when in the fw_down state and the various structs have been cleaned and are waiting to recover. This caused a panic on rmmod when already in fw_down and devlink_param_unregister() tried to check the parameters. Fixes: 40ced8944536 ("pds_core: devlink params for enabling VIF support") Signed-off-by: Shannon Nelson <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25sfc: Check firmware supports Ethernet PTP filterAlex Austin1-1/+3
Not all firmware variants support RSS filters. Do not fail all PTP functionality when raw ethernet PTP filters fail to insert. Fixes: e4616f64726b ("sfc: support PTP over Ethernet") Signed-off-by: Alex Austin <[email protected]> Acked-by: Edward Cree <[email protected]> Reviewed-by: Pieter Jansen van Vuuren <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: convert half-duplex support to positive logicRussell King (Oracle)1-6/+7
Rather than detecting when half-duplex is not supported, and clearing the MAC capabilities, reverse the if() condition and use it to set the capabilities instead. Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: move priv->phylink_config.mac_managed_pmRussell King (Oracle)1-1/+1
Move priv->phylink_config.mac_managed_pm to be along side the other phylink initialisations. Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: move xgmac specific phylink caps to dwxgmac2 coreRussell King (Oracle)2-10/+10
Move the xgmac specific phylink capabilities to the dwxgmac2 support core. Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: move gmac4 specific phylink capabilities to gmac4Russell King (Oracle)2-3/+9
Move the setup of gmac4 speicifc phylink capabilities into gmac4 code. Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: provide stmmac_mac_phylink_get_caps()Russell King (Oracle)2-0/+7
Allow MACs to provide their own capabilities via the MAC operations struct. Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: use phylink_limit_mac_speed()Russell King (Oracle)1-21/+14
Use phylink_limit_mac_speed() to limit the MAC capabilities rather than coding this for each speed. Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: use "mdio_bus_data" local variableRussell King (Oracle)1-2/+4
We have a local variable for priv->plat->mdio_bus_data, which we use later in the conditional if() block, but we evaluate the above within the conditional expression. Use mdio_bus_data instead. Since these will be the only two users of this local variable, move its assignment just before the if(). Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: clean up passing fwnode to phylinkRussell King (Oracle)1-4/+5
Move the initialisation of the fwnode variable closer to its use site, rather than scattered throughout stmmac_phy_setup(). Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25net: stmmac: convert plat->phylink_node to fwnodeRussell King (Oracle)3-4/+5
All users of plat->phylink_node first convert it to a fwnode. Rather than repeatedly convert to a fwnode, store it as a fwnode. To reflect this change, call it plat->port_node instead - it is used for more than just phylink. Signed-off-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25cteonxt2-pf: Fix backpressure config for multiple PFC priorities to work ↵Suman Ghosh1-3/+3
simultaneously MAC (CGX or RPM) asserts backpressure at TL3 or TL2 node of the egress hierarchical scheduler tree depending on link level config done. If there are multiple PFC priorities enabled at a time and for all such flows to backoff, each priority will have to assert backpressure at different TL3/TL2 scheduler nodes and these flows will need to submit egress pkts to these nodes. Current PFC configuration has an issue where in only one backpressure scheduler node is being allocated which is resulting in only one PFC priority to work. This patch fixes this issue. Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support") Signed-off-by: Suman Ghosh <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25octeontx2-af: CN10KB: fix PFC configurationHariprasad Kelam1-8/+9
Suppose user has enabled pfc with prio 0,1 on a PF netdev(eth0) dcb pfc set dev eth0 prio-pfc o:on 1:on later user enabled pfc priorities 2 and 3 on the VF interface(eth1) dcb pfc set dev eth1 prio-pfc 2:on 3:on Instead of enabling pfc on all priorities (0..3), the driver only enables on priorities 2,3. This patch corrects the issue by using the proper CSR address. Fixes: b9d0fedc6234 ("octeontx2-af: cn10kb: Add RPM_USX MAC support") Signed-off-by: Hariprasad Kelam <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25octeontx2-pf: Fix PFC TX scheduler freeSuman Ghosh2-11/+5
During PFC TX schedulers free, flag TXSCHQ_FREE_ALL was being set which caused free up all schedulers other than the PFC schedulers. This patch fixes that to free only the PFC Tx schedulers. Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support") Signed-off-by: Suman Ghosh <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-25mlxsw: core_hwmon: Adjust module label names based on MTCAP sensor counterVadim Pasternak1-1/+2
Transceiver module temperature sensors are indexed after ASIC and platform sensors. The current label printing method does not take this into account and simply prints the index of the transceiver module sensor. On new systems that have platform sensors this results in incorrect (shifted) transceiver module labels being printed: $ sensors [...] front panel 002: +37.0°C (crit = +70.0°C, emerg = +75.0°C) front panel 003: +47.0°C (crit = +70.0°C, emerg = +75.0°C) [...] Fix by taking the sensor count into account. After the fix: $ sensors [...] front panel 001: +37.0°C (crit = +70.0°C, emerg = +75.0°C) front panel 002: +47.0°C (crit = +70.0°C, emerg = +75.0°C) [...] Fixes: a53779de6a0e ("mlxsw: core: Add QSFP module temperature label attribute to hwmon") Signed-off-by: Vadim Pasternak <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25mlxsw: i2c: Limit single transaction buffer sizeVadim Pasternak1-1/+2
Maximum size of buffer is obtained from underlying I2C adapter and in case adapter allows I2C transaction buffer size greater than 100 bytes, transaction will fail due to firmware limitation. As a result driver will fail initialization. Limit the maximum size of transaction buffer by 100 bytes to fit to firmware. Remove unnecessary calculation: max_t(u16, MLXSW_I2C_BLK_DEF, quirk_size). This condition can not happened. Fixes: 3029a693beda ("mlxsw: i2c: Allow flexible setting of I2C transactions size") Signed-off-by: Vadim Pasternak <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25mlxsw: i2c: Fix chunk size setting in output mailbox bufferVadim Pasternak1-1/+1
The driver reads commands output from the output mailbox. If the size of the output mailbox is not a multiple of the transaction / block size, then the driver will not issue enough read transactions to read the entire output, which can result in driver initialization errors. Fix by determining the number of transactions using DIV_ROUND_UP(). Fixes: 3029a693beda ("mlxsw: i2c: Allow flexible setting of I2C transactions size") Signed-off-by: Vadim Pasternak <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25octeontx2-pf: fix page_pool creation fail for rings > 32kRatheesh Kannoth2-1/+3
octeontx2 driver calls page_pool_create() during driver probe() and fails if queue size > 32k. Page pool infra uses these buffers as shock absorbers for burst traffic. These pages are pinned down over time as working sets varies, due to the recycling nature of page pool, given page pool (currently) don't have a shrinker mechanism, the pages remain pinned down in ptr_ring. Instead of clamping page_pool size to 32k at most, limit it even more to 2k to avoid wasting memory. This have been tested on octeontx2 CN10KA hardware. TCP and UDP tests using iperf shows no performance regressions. Fixes: b2e3406a38f0 ("octeontx2-pf: Add support for page pool") Suggested-by: Alexander Lobakin <[email protected]> Reviewed-by: Sunil Goutham <[email protected]> Signed-off-by: Ratheesh Kannoth <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25net: fec: add statistics for XDP_TXWei Fang1-1/+4
The FEC driver supports the statistics for XDP actions except for XDP_TX before, because the XDP_TX was not supported when adding the statistics for XDP. Now the FEC driver has supported XDP_TX since commit f601899e4321 ("net: fec: add XDP_TX feature support"). So it's reasonable and necessary to add statistics for XDP_TX. Signed-off-by: Wei Fang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25ice: avoid executing commands on other ports when driving syncJacob Keller2-6/+52
The ice hardware has a synchronization mechanism used to drive the simultaneous application of commands on both PHY ports and the source timer in the MAC. When issuing a sync via ice_ptp_exec_tmr_cmd(), the hardware will simultaneously apply the commands programmed for the main timer and each PHY port. Neither the main timer command register, nor the PHY port command registers auto clear on command execution. During the execution of a timer command intended for a single port on E822 devices, such as those used to configure a PHY during link up, the driver is not correctly clearing the previous commands. This results in unintentionally executing the last programmed command on the main timer and other PHY ports whenever performing reconfiguration on E822 ports after link up. This results in unintended side effects on other timers, depending on what command was previously programmed. To fix this, the driver must ensure that the main timer and all other PHY ports are properly initialized to perform no action. The enumeration for timer commands does not include an enumeration value for doing nothing. Introduce ICE_PTP_NOP for this purpose. When writing a timer command to hardware, leave the command bits set to zero which indicates that no operation should be performed on that port. Modify ice_ptp_one_port_cmd() to always initialize all ports. For all ports other than the one being configured, write their timer command register to ICE_PTP_NOP. This ensures that no side effect happens on the timer command. To fix this for the PHY ports, modify ice_ptp_one_port_cmd() to always initialize all other ports to ICE_PTP_NOP. This ensures that no side effects happen on the other ports. Call ice_ptp_src_cmd() with a command value if ICE_PTP_NOP in ice_sync_phy_timer_e822() and ice_start_phy_timer_e822(). With both of these changes, the driver should no longer execute a stale command on the main timer or another PHY port when reconfiguring one of the PHY ports after link up. Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support") Signed-off-by: Siddaraju DH <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25net: ngbe: move mdio access registers to libwxJiawen Wu2-61/+42
Registers of mdio accessing are common defined in libwx, remove the redundant macro definitions in ngbe driver. Signed-off-by: Jiawen Wu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25net: txgbe: support copper NIC with external PHYJiawen Wu5-22/+221
Wangxun SP chip supports to connect with external PHY (marvell 88x3310), which links to 10GBASE-T/1000BASE-T/100BASE-T. Add the identification of media types from subsystem device IDs. For sp_media_copper, register mdio bus for the external PHY. Signed-off-by: Jiawen Wu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25net: txgbe: support switching mode to 1000BASE-X and SGMIIJiawen Wu4-1/+61
Disable data path before PCS VR reset while switching PCS mode, to prevent the blocking of data path. Enable AN interrupt for CL37 auto-negotiation. Signed-off-by: Jiawen Wu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-25net: txgbe: add FW version warningJiawen Wu1-0/+3
Since XPCS device identifier is implemented in the firmware version 0x20010 and above, so add a warning to prompt the users to upgrade the firmware to make sure the driver works. Signed-off-by: Jiawen Wu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-24net/mlx5e: fix up for "net/mlx5e: Move MACsec flow steering operations to be ↵Stephen Rothwell1-0/+1
used as core library" Recent merge had a conflict in: drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c between commit: aeb660171b06 ("net/mlx5e: fix double free in macsec_fs_tx_create_crypto_table_groups") from Linus' tree and commit: cb5ebe4896f9 ("net/mlx5e: Move MACsec flow steering operations to be used as core library") from the mlx5-next tree. This was missed and the former commit got lost, bring it back. Fixes: 3c5066c6b0a5 ("Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux") Signed-off-by: Stephen Rothwell <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-24net/mlx5: Dynamic cyclecounter shift calculation for PTP free running clockRahul Rameshbabu1-5/+27
Use a dynamic calculation to determine the shift value for the internal timer cyclecounter that will lead to the highest precision frequency adjustments. Previously used a constant for the shift value assuming all devices supported by the driver had a nominal frequency of 1GHz. However, there are devices that operate at different frequencies. The previous shift value constant would break the PHC functionality for those devices. Reported-by: Vadim Fedorenko <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Fixes: 6a4010927562 ("net/mlx5: Update cyclecounter shift value to improve ptp free running mode precision") Signed-off-by: Rahul Rameshbabu <[email protected]> Tested-by: Vadim Fedorenko <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Saeed Mahameed <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-24minmax: add in_range() macroMatthew Wilcox (Oracle)1-9/+9
Patch series "New page table range API", v6. This patchset changes the API used by the MM to set up page table entries. The four APIs are: set_ptes(mm, addr, ptep, pte, nr) update_mmu_cache_range(vma, addr, ptep, nr) flush_dcache_folio(folio) flush_icache_pages(vma, page, nr) flush_dcache_folio() isn't technically new, but no architecture implemented it, so I've done that for them. The old APIs remain around but are mostly implemented by calling the new interfaces. The new APIs are based around setting up N page table entries at once. The N entries belong to the same PMD, the same folio and the same VMA, so ptep++ is a legitimate operation, and locking is taken care of for you. Some architectures can do a better job of it than just a loop, but I have hesitated to make too deep a change to architectures I don't understand well. One thing I have changed in every architecture is that PG_arch_1 is now a per-folio bit instead of a per-page bit when used for dcache clean/dirty tracking. This was something that would have to happen eventually, and it makes sense to do it now rather than iterate over every page involved in a cache flush and figure out if it needs to happen. The point of all this is better performance, and Fengwei Yin has measured improvement on x86. I suspect you'll see improvement on your architecture too. Try the new will-it-scale test mentioned here: https://lore.kernel.org/linux-mm/[email protected]/ You'll need to run it on an XFS filesystem and have CONFIG_TRANSPARENT_HUGEPAGE set. This patchset is the basis for much of the anonymous large folio work being done by Ryan, so it's received quite a lot of testing over the last few months. This patch (of 38): Determine if a value lies within a range more efficiently (subtraction + comparison vs two comparisons and an AND). It also has useful (under some circumstances) behaviour if the range exceeds the maximum value of the type. Convert all the conflicting definitions of in_range() within the kernel; some can use the generic definition while others need their own definition. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-24e1000e: Add support for the next LOM generationSasha Neftin5-0/+17
Add devices IDs for the next LOM generations that will be available on the next Intel Client platforms. This patch provides the initial support for these devices. Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Naama Meir <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-08-24igc: Decrease PTM short interval from 10 us to 1 usSasha Neftin1-1/+1
With the 10us interval, we were seeing PTM transactions take around 12us. Hardware team suggested this interval could be lowered to 1us which was confirmed with PCIe sniffer. With the 1us interval, PTM dialogs took around 2us. Suggested-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Muhammad Husaini Zulkifli <[email protected]> Reviewed-by: Muhammad Husaini Zulkifli <[email protected]> Tested-by: Naama Meir <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-08-24igc: Add support for multiple in-flight TX timestampsVinicius Costa Gomes6-63/+192
Add support for using the four sets of timestamping registers that i225/i226 have available for TX. In some workloads, where multiple applications request hardware transmission timestamps, it was possible that some of those requests were denied because the only in use register was already occupied. This is also in preparation to future support for hardware timestamping with multiple PTP domains. With multiple domains chances of multiple TX timestamps being requested at the same time increase. Before: $ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o 37 | responses | TX timestamp offset (ns) rate clients | lost invalid basic xleave | min mean max stddev 1000 100 0.00% 0.00% 0.00% 100.00% +1 +41 +73 13 1500 150 0.00% 0.00% 0.00% 100.00% +9 +49 +87 15 2250 225 0.00% 0.00% 0.00% 100.00% +9 +42 +79 13 3375 337 0.00% 0.00% 0.00% 100.00% +11 +46 +81 13 5062 506 0.00% 0.00% 0.00% 100.00% +7 +44 +80 13 7593 759 0.00% 0.00% 0.00% 100.00% +9 +44 +79 12 11389 1138 0.00% 0.00% 0.00% 100.00% +14 +51 +87 13 17083 1708 0.00% 0.00% 0.00% 100.00% +1 +41 +80 14 25624 2562 0.00% 0.00% 0.00% 100.00% +11 +50 +5107 51 38436 3843 0.00% 0.00% 0.00% 100.00% -2 +36 +7843 38 57654 5765 0.00% 0.00% 0.00% 100.00% +4 +42 +10503 69 86481 8648 0.00% 0.00% 0.00% 100.00% +11 +54 +5492 65 129721 12972 0.00% 0.00% 0.00% 100.00% +31 +2680 +6942 2606 194581 16384 16.79% 0.00% 0.87% 82.34% +73 +4444 +15879 3116 291871 16384 35.05% 0.00% 1.53% 63.42% +188 +5381 +17019 3035 437806 16384 54.95% 0.00% 2.55% 42.50% +233 +6302 +13885 2846 After: $ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o 37 | responses | TX timestamp offset (ns) rate clients | lost invalid basic xleave | min mean max stddev 1000 100 0.00% 0.00% 0.00% 100.00% -20 +12 +43 13 1500 150 0.00% 0.00% 0.00% 100.00% -23 +18 +57 14 2250 225 0.00% 0.00% 0.00% 100.00% -2 +33 +67 13 3375 337 0.00% 0.00% 0.00% 100.00% +1 +38 +76 13 5062 506 0.00% 0.00% 0.00% 100.00% +9 +52 +93 14 7593 759 0.00% 0.00% 0.00% 100.00% +11 +47 +82 13 11389 1138 0.00% 0.00% 0.00% 100.00% -9 +27 +74 13 17083 1708 0.00% 0.00% 0.00% 100.00% -13 +25 +66 14 25624 2562 0.00% 0.00% 0.00% 100.00% -8 +28 +65 13 38436 3843 0.00% 0.00% 0.00% 100.00% -13 +28 +69 13 57654 5765 0.00% 0.00% 0.00% 100.00% -11 +32 +71 14 86481 8648 0.00% 0.00% 0.00% 100.00% +2 +44 +83 14 129721 12972 15.36% 0.00% 0.35% 84.29% -2 +2248 +22907 4252 194581 16384 42.98% 0.00% 1.98% 55.04% -4 +5278 +65039 5856 291871 16384 54.33% 0.00% 2.21% 43.46% -3 +6306 +22608 5665 We can see that with 4 registers, as expected, we are able to handle a increasing number of requests more consistently, but as soon as all registers are in use, the decrease in quality of service happens in a sharp step. Signed-off-by: Vinicius Costa Gomes <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Reviewed-by: Muhammad Husaini Zulkifli <[email protected]> Reviewed-by: Kurt Kanzenbach <[email protected]> Tested-by: Naama Meir <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-08-24Merge branch 'mlx5-next' of ↵Jakub Kicinski14-1618/+2571
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Leon Romanovsky says: ==================== mlx5 MACsec RoCEv2 support From Patrisious: This series extends previously added MACsec offload support to cover RoCE traffic either. In order to achieve that, we need configure MACsec with offload between the two endpoints, like below: REMOTE_MAC=10:70:fd:43:71:c0 * ip addr add 1.1.1.1/16 dev eth2 * ip link set dev eth2 up * ip link add link eth2 macsec0 type macsec encrypt on * ip macsec offload macsec0 mac * ip macsec add macsec0 tx sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16 * ip macsec add macsec0 rx port 1 address $REMOTE_MAC * ip macsec add macsec0 rx port 1 address $REMOTE_MAC sa 0 pn 1 on key 01 ead3664f508eb06c40ac7104cdae4ce5 * ip addr add 10.1.0.1/16 dev macsec0 * ip link set dev macsec0 up And in a similar manner on the other machine, while noting the keys order would be reversed and the MAC address of the other machine. RDMA traffic is separated through relevant GID entries and in case of IP ambiguity issue - meaning we have a physical GIDs and a MACsec GIDs with the same IP/GID, we disable our physical GID in order to force the user to only use the MACsec GID. v0: https://lore.kernel.org/netdev/[email protected]/ * 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: RDMA/mlx5: Handles RoCE MACsec steering rules addition and deletion net/mlx5: Add RoCE MACsec steering infrastructure in core net/mlx5: Configure MACsec steering for ingress RoCEv2 traffic net/mlx5: Configure MACsec steering for egress RoCEv2 traffic IB/core: Reorder GID delete code for RoCE net/mlx5: Add MACsec priorities in RDMA namespaces RDMA/mlx5: Implement MACsec gid addition and deletion net/mlx5: Maintain fs_id xarray per MACsec device inside macsec steering net/mlx5: Remove netdevice from MACsec steering net/mlx5e: Move MACsec flow steering and statistics database from ethernet to core net/mlx5e: Rename MACsec flow steering functions/parameters to suit core naming style net/mlx5: Remove dependency of macsec flow steering on ethernet net/mlx5e: Move MACsec flow steering operations to be used as core library macsec: add functions to get macsec real netdevice and check offload ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski27-103/+108
Cross-merge networking fixes after downstream PR. Conflicts: include/net/inet_sock.h f866fbc842de ("ipv4: fix data-races around inet->inet_id") c274af224269 ("inet: introduce inet->inet_flags") https://lore.kernel.org/all/[email protected]/ Adjacent changes: drivers/net/bonding/bond_alb.c e74216b8def3 ("bonding: fix macvlan over alb bond support") f11e5bd159b0 ("bonding: support balance-alb with openvswitch") drivers/net/ethernet/broadcom/bgmac.c d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()") 23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()") drivers/net/ethernet/broadcom/genet/bcmmii.c 32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()") acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()") net/sctp/socket.c f866fbc842de ("ipv4: fix data-races around inet->inet_id") b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags") Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-24Merge tag 'mlx5-updates-2023-08-22' of ↵Paolo Abeni7-303/+371
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-08-22 1) Patches #1..#13 From Jiri: The goal of this patchset is to make the SF code cleaner. Benefit from previously introduced devlink_port struct containerization to avoid unnecessary lookups in devlink port ops. Also, benefit from the devlink locking changes and avoid unnecessary reference counting. 2) Patches #14,#15: Add ability to configure proto both UDP and TCP selectors in RX and TX directions. * tag 'mlx5-updates-2023-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Support IPsec upper TCP protocol selector net/mlx5e: Support IPsec upper protocol selector field offload for RX net/mlx5: Store vport in struct mlx5_devlink_port and use it in port ops net/mlx5: Check vhca_resource_manager capability in each op and add extack msg net/mlx5: Relax mlx5_devlink_eswitch_get() return value checking net/mlx5: Return -EOPNOTSUPP in mlx5_devlink_port_fn_migratable_set() directly net/mlx5: Reduce number of vport lookups passing vport pointer instead of index net/mlx5: Embed struct devlink_port into driver structure net/mlx5: Don't register ops for non-PF/VF/SF port and avoid checks in ops net/mlx5: Remove no longer used mlx5_esw_offloads_sf_vport_enable/disable() net/mlx5: Introduce mlx5_eswitch_load/unload_sf_vport() and use it from SF code net/mlx5: Allow mlx5_esw_offloads_devlink_port_register() to register SFs net/mlx5: Push devlink port PF/VF init/cleanup calls out of devlink_port_register/unregister() net/mlx5: Push out SF devlink port init and cleanup code to separate helpers net/mlx5: Rework devlink port alloc/free into init/cleanup ==================== Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Paolo Abeni <[email protected]>
2023-08-23net: ethernet: mtk_eth_soc: support 36-bit DMA addressing on MT7988Daniel Golle2-4/+48
Systems having 4 GiB of RAM and more require DMA addressing beyond the current 32-bit limit. Starting from MT7988 the hardware now supports 36-bit DMA addressing, let's use that new capability in the driver to avoid running into swiotlb on systems with 4 GiB of RAM or more. Signed-off-by: Daniel Golle <[email protected]> Link: https://lore.kernel.org/r/95b919c98876c9e49761e44662e7c937479eecb8.1692721443.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-23net: ethernet: mtk_eth_soc: add support for in-SoC SRAMDaniel Golle2-22/+78
MT7981, MT7986 and MT7988 come with in-SoC SRAM dedicated for Ethernet DMA rings. Support using the SRAM without breaking existing device tree bindings, ie. only new SoC starting from MT7988 will have the SRAM declared as additional resource in device tree. For MT7981 and MT7986 an offset on top of the main I/O base is used. Signed-off-by: Daniel Golle <[email protected]> Link: https://lore.kernel.org/r/e45e0f230c63ad58869e8fe35b95a2fb8925b625.1692721443.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-23net: ethernet: mtk_eth_soc: add reset bits for MT7988Daniel Golle2-24/+68
Add bits needed to reset the frame engine on MT7988. Fixes: 445eb6448ed3 ("net: ethernet: mtk_eth_soc: add basic support for MT7988 SoC") Signed-off-by: Daniel Golle <[email protected]> Link: https://lore.kernel.org/r/89b6c38380e7a3800c1362aa7575600717bc7543.1692721443.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-23net: ethernet: mtk_eth_soc: fix register definitions for MT7988Daniel Golle1-3/+5
More register macros need to be adjusted for the 3rd GMAC on MT7988. Account for added bit in SYSCFG0_SGMII_MASK. Fixes: 445eb6448ed3 ("net: ethernet: mtk_eth_soc: add basic support for MT7988 SoC") Signed-off-by: Daniel Golle <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/1c8da012e2ca80939906d85f314138c552139f0f.1692721443.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-23net: fec: add exception tracing for XDPWei Fang1-15/+11
As we already added the exception tracing for XDP_TX, I think it is necessary to add the exception tracing for other XDP actions, such as XDP_REDIRECT, XDP_ABORTED and unknown error actions. Signed-off-by: Wei Fang <[email protected]> Suggested-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-23net: dm9051: Use PTR_ERR_OR_ZERO() to simplify codeYu Liao1-3/+1
Use the standard error pointer macro to shorten the code and simplify. Signed-off-by: Yu Liao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-23ibmveth: Use dcbf rather than dcbflMichael Ellerman1-1/+1
When building for power4, newer binutils don't recognise the "dcbfl" extended mnemonic. dcbfl RA, RB is equivalent to dcbf RA, RB, 1. Switch to "dcbf" to avoid the build error. Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-23i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()Andrii Staikov1-2/+3
Add check for pf->vf not being NULL before dereferencing pf->vf[vsi->vf_id] in updating VSI filter sync. Add a similar check before dereferencing !pf->vf[vsi->vf_id].trusted in the condition for clearing promisc mode bit. Fixes: c87c938f62d8 ("i40e: Add VF VLAN pruning") Signed-off-by: Andrii Staikov <[email protected]> Signed-off-by: Aleksandr Loktionov <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-23bnxt: use the NAPI skb allocation cacheJakub Kicinski1-3/+3
All callers of build_skb() (*) in bnxt are in NAPI context. The budget checking is somewhat convoluted because in the shared completion queue cases Rx packets are discarded by netpoll by forcing an error (E). But that happens before skb allocation. Only a call chain starting at __bnxt_poll_work() can lead to an skb allocation and it checks budget (b). * bnxt_rx_multi_page_skb * bnxt_rx_skb ` bp->rx_skb_func * bnxt_tpa_end ` bnxt_rx_pkt E bnxt_force_rx_discard E bnxt_poll_nitroa0 b __bnxt_poll_work Use napi_build_skb() to take advantage of the skb cache. In iperf tests with HW-GRO enabled it barely makes a difference but in cases where HW-GRO is not as effective (or disabled) it can give even a >10% boost (20.7Gbps -> 23.1Gbps). Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-23mlx4: Delete custom device management logicPetr Pavlu3-156/+0
After the conversion to use the auxiliary bus, the custom device management is not needed anymore and can be deleted. Signed-off-by: Petr Pavlu <[email protected]> Tested-by: Leon Romanovsky <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Acked-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>