aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2023-04-19net/mlx5e: XDP, Add support for multi-buffer XDP redirect-inTariq Toukan3-17/+75
Handle multi-buffer XDP redirect-in requests coming through mlx5e_xdp_xmit. Extend struct mlx5e_xmit_data_frags with an additional dma_arr field, to point to the fragments dma mapping, as they cannot be retrieved via the page_pool_get_dma_addr() function. Push a dma_addr xdpi instance per each fragment, and use them in the completion flow to dma_unmap the frags. Finally, remove the restriction in mlx5e_open_xdpsq, and set the flag in xdp_features. Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net/mlx5e: XDP, Use multiple single-entry objects in xdpi_fifoTariq Toukan5-50/+101
Here we fix the current wi->num_pkts abuse, as it was used to indicate multiple xdpi entries in the xdpi_fifo. Instead, reduce mlx5e_xdp_info to the size of a single field, making it a union of unions. Per packet, use as many instances as needed to provide the information needed at the time of completion. The sequence of xdpi instances pushed is well defined, derived by the xmit_mode. Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net/mlx5e: XDP, Remove doubtful unlikely callsTariq Toukan1-5/+5
It is not likely nor unlikely that the xdp buff has fragments, it depends on the program loaded and size of the packet received. Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net/mlx5e: Introduce extended version for mlx5e_xmit_dataTariq Toukan5-34/+44
Introduce struct mlx5e_xmit_data_frags to be used for non-linear xmit buffers. Let it include sinfo pointer. Take one bit from the len field to indicate if the descriptor has fragments and can be casted-up into the extended version. Zero-init to make sure has_frags, and potentially future fields, are zero when not explicitly assigned. Another field will be added in a downstream patch to indicate and point to dma addresses of the different frags, for redirect-in requests. This simplifies the mlx5e_xmit_xdp_frame/mlx5e_xmit_xdp_frame_mpwqe functions params. Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net/mlx5e: Move struct mlx5e_xmit_data to datapath headerTariq Toukan2-6/+7
Move TX datapath struct from the generic en.h to the datapath txrx.h header, where it belongs. Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net/mlx5e: Move XDP struct and enum to XDP headerTariq Toukan2-35/+35
Move struct mlx5e_xdp_info and enum mlx5e_xdp_xmit_mode from the generic en.h to the XDP header, where they belong. Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-18net: ethernet: stmmac: dwmac-sti: remove stih415/stih416/stid127Alain Volmat1-59/+1
Remove no more supported platforms (stih415/stih416 and stid127) Signed-off-by: Alain Volmat <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Reviewed-by: Horatiu Vultur <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-18net: mscc: ocelot: remove incompatible prototypesArnd Bergmann1-3/+0
The types for the register argument changed recently, but there are still incompatible prototypes that got left behind, and gcc-13 warns about these: In file included from drivers/net/ethernet/mscc/ocelot.c:13: drivers/net/ethernet/mscc/ocelot.h:97:5: error: conflicting types for 'ocelot_port_readl' due to enum/integer mismatch; have 'u32(struct ocelot_port *, u32)' {aka 'unsigned int(struct ocelot_port *, unsigned int)'} [-Werror=enum-int-mismatch] 97 | u32 ocelot_port_readl(struct ocelot_port *port, u32 reg); | ^~~~~~~~~~~~~~~~~ Just remove the two prototypes, and rely on the copy in the global header. Fixes: 9ecd05794b8d ("net: mscc: ocelot: strengthen type of "u32 reg" in I/O accessors") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-18net: stmmac: propagate feature flags to vlanCorinna Vinschen1-0/+4
stmmac_dev_probe doesn't propagate feature flags to VLANs. So features like offloading don't correspond with the general features and it's not possible to manipulate features via ethtool -K to affect VLANs. Propagate feature flags to vlan features. Drop TSO feature because it does not work on VLANs yet. Signed-off-by: Corinna Vinschen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-18Change DEFINE_SEMAPHORE() to take a number argumentPeter Zijlstra1-1/+1
Fundamentally semaphores are a counted primitive, but DEFINE_SEMAPHORE() does not expose this and explicitly creates a binary semaphore. Change DEFINE_SEMAPHORE() to take a number argument and use that in the few places that open-coded it using __SEMAPHORE_INITIALIZER(). Signed-off-by: Peter Zijlstra (Intel) <[email protected]> [mcgrof: add some tribal knowledge about why some folks prefer binary sempahores over mutexes] Reviewed-by: Sergey Senozhatsky <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]>
2023-04-18mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next()Nikita Zhandarovich1-0/+2
Function mlxfw_mfa2_tlv_multi_get() returns NULL if 'tlv' in question does not pass checks in mlxfw_mfa2_tlv_payload_get(). This behaviour may lead to NULL pointer dereference in 'multi->total_len'. Fix this issue by testing mlxfw_mfa2_tlv_multi_get()'s return value against NULL. Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE. Fixes: 410ed13cae39 ("Add the mlxfw module for Mellanox firmware flash process") Co-developed-by: Natalia Petrova <[email protected]> Signed-off-by: Nikita Zhandarovich <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18net: stmmac: dwmac-starfive: Add phy interface settingsSamin Guo1-0/+48
dwmac supports multiple modess. When working under rmii and rgmii, you need to set different phy interfaces. According to the dwmac document, when working in rmii, it needs to be set to 0x4, and rgmii needs to be set to 0x1. The phy interface needs to be set in syscon, the format is as follows: starfive,syscon: <&syscon, offset, shift> Tested-by: Tommaso Merciai <[email protected]> Signed-off-by: Samin Guo <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18net: stmmac: Add glue layer for StarFive JH7110 SoCSamin Guo3-0/+136
This adds StarFive dwmac driver support on the StarFive JH7110 SoC. Tested-by: Tommaso Merciai <[email protected]> Co-developed-by: Emil Renner Berthing <[email protected]> Signed-off-by: Emil Renner Berthing <[email protected]> Signed-off-by: Samin Guo <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18net: stmmac: platform: Add snps,dwmac-5.20 IP compatible stringEmil Renner Berthing1-1/+2
Add "snps,dwmac-5.20" compatible string for 5.20 version that can avoid to define some platform data in the glue layer. Tested-by: Tommaso Merciai <[email protected]> Signed-off-by: Emil Renner Berthing <[email protected]> Signed-off-by: Samin Guo <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18r8169: use new macro netif_subqueue_completed_wake in the tx cleanup pathHeiner Kallweit1-12/+4
Use new net core macro netif_subqueue_completed_wake to simplify the code of the tx cleanup path. Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18r8169: use new macro netif_subqueue_maybe_stop in rtl8169_start_xmitHeiner Kallweit1-28/+11
Use new net core macro netif_subqueue_maybe_stop in the start_xmit path to simplify the code. Whilst at it, set the tx queue start threshold to twice the stop threshold. Before values were the same, resulting in stopping/starting the queue more often than needed. Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18bnxt_en: Fix a possible NULL pointer dereference in unload pathKalesh AP1-9/+10
In the driver unload path, the driver currently checks the valid BNXT_FLAG_ROCE_CAP flag in bnxt_rdma_aux_device_uninit() before proceeding. This is flawed because the flag may not be set initially during driver load. It may be set later after the NVRAM setting is changed followed by a firmware reset. Relying on the BNXT_FLAG_ROCE_CAP flag may crash in bnxt_rdma_aux_device_uninit() if the aux device was never initialized: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 8ae6aa067 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 39 PID: 42558 Comm: rmmod Kdump: loaded Tainted: G OE --------- - - 4.18.0-348.el8.x86_64 #1 Hardware name: Dell Inc. PowerEdge R750/0WT8Y6, BIOS 1.5.4 12/17/2021 RIP: 0010:device_del+0x1b/0x410 Code: 89 a5 50 03 00 00 4c 89 a5 58 03 00 00 eb 89 0f 1f 44 00 00 41 56 41 55 41 54 4c 8d a7 80 00 00 00 55 53 48 89 fb 48 83 ec 18 <48> 8b 2f 4c 89 e7 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 RSP: 0018:ff7f82bf469a7dc8 EFLAGS: 00010292 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000206 RDI: 0000000000000000 RBP: ff31b7cd114b0ac0 R08: 0000000000000000 R09: ffffffff935c3400 R10: ff31b7cd45bc3440 R11: 0000000000000001 R12: 0000000000000080 R13: ffffffffc1069f40 R14: 0000000000000000 R15: 0000000000000000 FS: 00007fc9903ce740(0000) GS:ff31b7d4ffac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000992fee004 CR4: 0000000000773ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: bnxt_rdma_aux_device_uninit+0x1f/0x30 [bnxt_en] bnxt_remove_one+0x2f/0x1f0 [bnxt_en] pci_device_remove+0x3b/0xc0 device_release_driver_internal+0x103/0x1f0 driver_detach+0x54/0x88 bus_remove_driver+0x77/0xc9 pci_unregister_driver+0x2d/0xb0 bnxt_exit+0x16/0x2c [bnxt_en] __x64_sys_delete_module+0x139/0x280 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca RIP: 0033:0x7fc98f3af71b Fix this by modifying the check inside bnxt_rdma_aux_device_uninit() to check for bp->aux_priv instead. We also need to make some changes in bnxt_rdma_aux_device_init() to make sure that bp->aux_priv is set only when the aux device is fully initialized. Fixes: d80d88b0dfff ("bnxt_en: Add auxiliary driver support") Reviewed-by: Ajit Khaparde <[email protected]> Signed-off-by: Kalesh AP <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18bnxt_en: Do not initialize PTP on older P3/P4 chipsMichael Chan1-1/+1
The driver does not support PTP on these older chips and it is assuming that firmware on these older chips will not return the PORT_MAC_PTP_QCFG_RESP_FLAGS_HWRM_ACCESS flag in __bnxt_hwrm_ptp_qcfg(), causing the function to abort quietly. But newer firmware now sets this flag and so __bnxt_hwrm_ptp_qcfg() will proceed further. Eventually it will fail in bnxt_ptp_init() -> bnxt_map_ptp_regs() because there is no code to support the older chips. The driver will then complain: "PTP initialization failed.\n" Fix it so that we abort quietly earlier without going through the unnecessary steps and alarming the user with the warning log. Fixes: ae5c42f0b92c ("bnxt_en: Get PTP hardware capability from firmware") Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18cxgb4: fix use after free bugs caused by circular dependency problemDuoming Zhou1-1/+1
The flower_stats_timer can schedule flower_stats_work and flower_stats_work can also arm the flower_stats_timer. The process is shown below: ----------- timer schedules work ------------ ch_flower_stats_cb() //timer handler schedule_work(&adap->flower_stats_work); ----------- work arms timer ------------ ch_flower_stats_handler() //workqueue callback function mod_timer(&adap->flower_stats_timer, ...); When the cxgb4 device is detaching, the timer and workqueue could still be rearmed. The process is shown below: (cleanup routine) | (timer and workqueue routine) remove_one() | free_some_resources() | ch_flower_stats_cb() //timer cxgb4_cleanup_tc_flower() | schedule_work() del_timer_sync() | | ch_flower_stats_handler() //workqueue | mod_timer() cancel_work_sync() | kfree(adapter) //FREE | ch_flower_stats_cb() //timer | adap->flower_stats_work //USE This patch changes del_timer_sync() to timer_shutdown_sync(), which could prevent rearming of the timer from the workqueue. Fixes: e0f911c81e93 ("cxgb4: fetch stats for offloaded tc flower flows") Signed-off-by: Duoming Zhou <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-04-17net: mscc: ocelot: add support for preemptible traffic classesVladimir Oltean3-1/+66
In order to not transmit (preemptible) frames which will be received by the link partner as corrupted (because it doesn't support FP), the hardware requires the driver to program the QSYS_PREEMPTION_CFG_P_QUEUES register only after the MAC Merge layer becomes active (verification succeeds, or was disabled). There are some cases when FP is known (through experimentation) to be broken. Give priority to FP over cut-through switching, and disable FP for known broken link modes. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: add support for mqprio offloadVladimir Oltean1-0/+50
This doesn't apply anything to hardware and in general doesn't do anything that the software variant doesn't do, except for checking that there isn't more than 1 TXQ per TC (TXQs for a DSA switch are a dubious concept anyway). The reason we add this is to be able to parse one more field added to struct tc_mqprio_qopt_offload, namely preemptible_tcs. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Ferenc Fejes <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: don't rely on cached verify_status in ocelot_port_get_mm()Vladimir Oltean1-0/+1
ocelot_mm_update_port_status() updates mm->verify_status, but when the verification state of a port changes, an IRQ isn't emitted, but rather, only when the verification state reaches one of the final states (like DISABLED, FAILED, SUCCEEDED) - things that would affect mm->tx_active, which is what the IRQ *is* actually emitted for. That is to say, user space may miss reports of an intermediary MAC Merge verification state (like from INITIAL to VERIFYING), unless there was an IRQ notifying the driver of the change in mm->tx_active as well. This is not a huge deal, but for reliable reporting to user space, let's call ocelot_mm_update_port_status() synchronously from ocelot_port_get_mm(), which makes user space see the current MM status. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: optimize ocelot_mm_irq()Vladimir Oltean1-2/+28
The MAC Merge IRQ of all ports is shared with the PTP TX timestamp IRQ of all ports, which means that currently, when a PTP TX timestamp is generated, felix_irq_handler() also polls for the MAC Merge layer status of all ports, looking for changes. This makes the kernel do more work, and under certain circumstances may make ptp4l require a tx_timestamp_timeout argument higher than before. Changes to the MAC Merge layer status are only to be expected under certain conditions - its TX direction needs to be enabled - so we can check early if that is the case, and omit register access otherwise. Make ocelot_mm_update_port_status() skip register access if mm->tx_enabled is unset, and also call it once more, outside IRQ context, from ocelot_port_set_mm(), when mm->tx_enabled transitions from true to false, because an IRQ is also expected in that case. Also, a port may have its MAC Merge layer enabled but it may not have generated the interrupt. In that case, there's no point in writing to DEV_MM_STATUS to acknowledge that IRQ. We can reduce the number of register writes per port with MM enabled by keeping an "ack" variable which writes the "write-one-to-clear" bits. Those are 3 in number: PRMPT_ACTIVE_STICKY, UNEXP_RX_PFRM_STICKY and UNEXP_TX_PFRM_STICKY. The other fields in DEV_MM_STATUS are read-only and it doesn't matter what is written to them, so writing zero is just fine. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: remove struct ocelot_mm_state :: lockVladimir Oltean1-12/+8
Unfortunately, the workarounds for the hardware bugs make it pointless to keep fine-grained locking for the MAC Merge state of each port. Our vsc9959_cut_through_fwd() implementation requires ocelot->fwd_domain_lock to be held, in order to serialize with changes to the bridging domains and to port speed changes (which affect which ports can be cut-through). Simultaneously, the traffic classes which can be cut-through cannot be preemptible at the same time, and this will depend on the MAC Merge layer state (which changes from threaded interrupt context). Since vsc9959_cut_through_fwd() would have to hold the mm->lock of all ports for a correct and race-free implementation with respect to ocelot_mm_irq(), in practice it means that any time a port's mm->lock is held, it would potentially block holders of ocelot->fwd_domain_lock. In the interest of simple locking rules, make all MAC Merge layer state changes (and preemptible traffic class changes) be serialized by the ocelot->fwd_domain_lock. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: export a single ocelot_mm_irq()Vladimir Oltean1-2/+10
When the switch emits an IRQ, we don't know what caused it, and we iterate through all ports to check the MAC Merge status. Move that iteration inside the ocelot lib; we will change the locking in a future change and it would be good to encapsulate that lock completely within the ocelot lib. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: stmmac: add Rx HWTS metadata to XDP ZC receive pktSong Yoong Siang1-0/+22
Add receive hardware timestamp metadata support via kfunc to XDP Zero Copy receive packets. Signed-off-by: Song Yoong Siang <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: stmmac: add Rx HWTS metadata to XDP receive pktSong Yoong Siang2-1/+42
Add receive hardware timestamp metadata support via kfunc to XDP receive packets. Suggested-by: Stanislav Fomichev <[email protected]> Signed-off-by: Song Yoong Siang <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: stmmac: introduce wrapper for struct xdp_buffSong Yoong Siang2-9/+13
Introduce struct stmmac_xdp_buff as a preparation to support XDP Rx metadata via kfuncs. Signed-off-by: Song Yoong Siang <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Accept tunnel mode for IPsec packet offloadLeon Romanovsky1-7/+8
Open mlx5 driver to accept IPsec tunnel mode. Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Create IPsec table with tunnel support only when encap is disabledLeon Romanovsky3-3/+39
Current hardware doesn't support double encapsulation which is happening when IPsec packet offload tunnel mode is configured together with eswitch encap option. Any user attempt to add new SA/policy after he/she sets encap mode, will generate the following FW syndrome: mlx5_core 0000:08:00.0: mlx5_cmd_out_err:803:(pid 1904): CREATE_FLOW_TABLE(0x930) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xa43321), err(-22) Make sure that we block encap changes before creating flow steering tables. This is applicable only for packet offload in tunnel mode, while packet offload in transport mode and crypto offload, don't have such limitation as they don't perform encapsulation. Reviewed-by: Raed Salem <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5: Allow blocking encap changes in eswitchLeon Romanovsky2-0/+62
Existing eswitch encap option enables header encapsulation. Unfortunately currently available hardware isn't able to perform double encapsulation, which can happen once IPsec packet offload tunnel mode is used together with encap mode set to BASIC. So as a solution for misconfiguration, provide an option to block encap changes, which will be used for IPsec packet offload. Reviewed-by: Emeel Hakim <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Listen to ARP events to update IPsec L2 headers in tunnel modeLeon Romanovsky2-7/+130
In IPsec packet offload mode all header manipulations are performed by hardware, which is responsible to add/remove L2 header with source and destinations MACs. CX-7 devices don't support offload of in-kernel routing functionality, as such HW needs external help to fill other side MAC as it isn't available for HW. As a solution, let's listen to neigh ARP updates and reconfigure IPsec rules on the fly once new MAC data information arrives. Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Support IPsec TX packet offload in tunnel modeLeon Romanovsky2-0/+64
Extend mlx5 driver with logic to support IPsec TX packet offload in tunnel mode. Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Support IPsec RX packet offload in tunnel modeLeon Romanovsky3-0/+88
Extend mlx5 driver with logic to support IPsec RX packet offload in tunnel mode. Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Prepare IPsec packet reformat code for tunnel modeLeon Romanovsky3-21/+63
Refactor setup_pkt_reformat() function to accommodate future extension to support tunnel mode. Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Configure IPsec SA tables to support tunnel modeLeon Romanovsky1-8/+15
Create SA flow steering tables both for RX and TX with tunnel reformat property. This allows to add and delete extra headers needed for tunnel mode. Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Check IPsec packet offload tunnel capabilitiesLeon Romanovsky2-0/+7
Validate tunnel mode support for IPsec packet offload. Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17i40e: fix i40e_setup_misc_vector() error handlingAleksandr Loktionov1-1/+4
Add error handling of i40e_setup_misc_vector() in i40e_rebuild(). In case interrupt vectors setup fails do not re-open vsi-s and do not bring up vf-s, we have no interrupts to serve a traffic anyway. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Aleksandr Loktionov <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-04-17i40e: fix accessing vsi->active_filters without holding lockAleksandr Loktionov1-2/+2
Fix accessing vsi->active_filters without holding the mac_filter_hash_lock. Move vsi->active_filters = 0 inside critical section and move clear_bit(__I40E_VSI_OVERFLOW_PROMISC, vsi->state) after the critical section to ensure the new filters from other threads can be added only after filters cleaning in the critical section is finished. Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key") Signed-off-by: Aleksandr Loktionov <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-04-17net: lan966x: Fix lan966x_ifh_getHoratiu Vultur1-1/+1
From time to time, it was observed that the nanosecond part of the received timestamp, which is extracted from the IFH, it was actually bigger than 1 second. So then when actually calculating the full received timestamp, based on the nanosecond part from IFH and the second part which is read from HW, it was actually wrong. The issue seems to be inside the function lan966x_ifh_get, which extracts information from an IFH(which is an byte array) and returns the value in a u64. When extracting the timestamp value from the IFH, which starts at bit 192 and have the size of 32 bits, then if the most significant bit was set in the timestamp, then this bit was extended then the return value became 0xffffffff... . And the reason of this is because constants without any postfix are treated as signed longs and that is the reason why '1 << 31' becomes 0xffffffff80000000. This is fixed by adding the postfix 'ULL' to 1. Fixes: fd7627833ddf ("net: lan966x: Stop using packing library") Signed-off-by: Horatiu Vultur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-17sfc: Fix use-after-free due to selftest_workDing Hui2-1/+2
There is a use-after-free scenario that is: When the NIC is down, user set mac address or vlan tag to VF, the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop() and efx_net_open(), since netif_running() is false, the port will not start and keep port_enabled false, but selftest_work is scheduled in efx_net_open(). If we remove the device before selftest_work run, the efx_stop_port() will not be called since the NIC is down, and then efx is freed, we will soon get a UAF in run_timer_softirq() like this: [ 1178.907941] ================================================================== [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90 [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0 [ 1178.907950] [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022 [ 1178.907955] Call Trace: [ 1178.907956] <IRQ> [ 1178.907960] dump_stack+0x71/0xab [ 1178.907963] print_address_description+0x6b/0x290 [ 1178.907965] ? run_timer_softirq+0xdea/0xe90 [ 1178.907967] kasan_report+0x14a/0x2b0 [ 1178.907968] run_timer_softirq+0xdea/0xe90 [ 1178.907971] ? init_timer_key+0x170/0x170 [ 1178.907973] ? hrtimer_cancel+0x20/0x20 [ 1178.907976] ? sched_clock+0x5/0x10 [ 1178.907978] ? sched_clock_cpu+0x18/0x170 [ 1178.907981] __do_softirq+0x1c8/0x5fa [ 1178.907985] irq_exit+0x213/0x240 [ 1178.907987] smp_apic_timer_interrupt+0xd0/0x330 [ 1178.907989] apic_timer_interrupt+0xf/0x20 [ 1178.907990] </IRQ> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370 If the NIC is not actually brought up, there is no need to schedule selftest_work, so let's move invoking efx_selftest_async_start() into efx_start_all(), and it will be canceled by broughting down. Fixes: dd40781e3a4e ("sfc: Run event/IRQ self-test asynchronously when interface is brought up") Fixes: e340be923012 ("sfc: add ndo_set_vf_mac() function for EF10") Debugged-by: Huang Cun <[email protected]> Cc: Donglin Peng <[email protected]> Suggested-by: Martin Habets <[email protected]> Signed-off-by: Ding Hui <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-17Merge tag 'mlx5-updates-2023-04-14' of ↵David S. Miller15-89/+1025
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2023-04-14 Yevgeny Kliteynik Says: ======================= SW Steering: Support pattern/args modify_header actions The following patch series adds support for a new pattern/arguments type of modify_header actions. Starting with ConnectX-6 DX, we use a new design of modify_header FW object. The current modify_header object allows for having only limited number of these FW objects, which means that we are limited in the number of offloaded flows that require modify_header action. The new approach comprises of two types of objects: pattern and argument. Pattern holds header modification templates, later used with corresponding argument object to create complete header modification actions. The pattern indicates which headers are modified, while the arguments provide the specific values. Therefore a single pattern can be used with different arguments in different flows, enabling offloading of large number of modify_header flows. - Patch 1, 2: Add ICM pool for modify-header-pattern objects and implement patterns cache, allowing patterns reuse for different flows - Patch 3: Allow for chunk allocation separately for STEv0 and STEv1 - Patch 4: Read related device capabilities - Patch 5: Add create/destroy functions for the new general object type - Patch 6: Add support for writing modify header argument to ICM - Patch 7, 8: Some required fixes to support pattern/arg - separate read buffer from the write buffer and fix QP continuous allocation - Patch 9: Add pool for modify header arg objects - Patch 10, 11, 12: Implement MODIFY_HEADER and TNL_L3_TO_L2 actions with the new patterns/args design - Patch 13: Optimization - set modify header action of size 1 directly on the STE instead of separate pattern/args combination - Patch 14: Adjust debug dump for patterns/args - Patch 15: Enable patterns and arguments for supporting devices =======================
2023-04-16RDMA/mlx5: Allow relaxed ordering read in VFs and VMsAvihai Horon1-3/+4
According to PCIe spec, Enable Relaxed Ordering value in the VF's PCI config space is wired to 0 and PF relaxed ordering (RO) setting should be applied to the VF. In QEMU (and maybe others), when assigning VFs, the RO bit in PCI config space is not emulated properly and is always set to 0. Therefore, pcie_relaxed_ordering_enabled() always returns 0 for VFs and VMs and thus MKeys can't be created with RO read even if the PF supports it. pcie_relaxed_ordering_enabled() check was added to avoid a syndrome when creating a MKey with relaxed ordering (RO) enabled when the driver's relaxed_ordering_read_pci_enabled HCA capability is out of sync with FW. With the new relaxed_ordering_read capability this can't happen, as it's set regardless of RO value in PCI config space and thus can't change during runtime. Hence, to allow RO read in VFs and VMs, use the new HCA capability relaxed_ordering_read without checking pcie_relaxed_ordering_enabled(). The old capability checks are kept for backward compatibility with older FWs. Allowing RO in VFs and VMs is valuable since it can greatly improve performance on some setups. For example, testing throughput of a VF on an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance improvement. Signed-off-by: Avihai Horon <[email protected]> Reviewed-by: Shay Drory <[email protected]> Reviewed-by: Aya Levin <[email protected]> Link: https://lore.kernel.org/r/e7048640d66c341a8fa0465e099926e7989184bc.1681131553.git.leon@kernel.org Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-04-16net/mlx5: Update relaxed ordering read HCA capabilitiesAvihai Horon1-1/+1
Rename existing HCA capability relaxed_ordering_read to relaxed_ordering_read_pci_enabled. This is in accordance with recent PRM change to better describe the capability, as it's set only if both the device supports relaxed ordering (RO) read and RO is enabled in PCI config space. In addition, add new HCA capability relaxed_ordering_read which is set if the device supports RO read, regardless of RO in PCI config space. This will be used in the following patch to allow RO in VFs and VMs. Signed-off-by: Avihai Horon <[email protected]> Reviewed-by: Shay Drory <[email protected]> Link: https://lore.kernel.org/r/caa0002fd8135086357dfcc368e2f5cc73b08480.1681131553.git.leon@kernel.org Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-04-16RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO writeAvihai Horon2-3/+2
pcie_relaxed_ordering_enabled() check was added to avoid a syndrome when creating a MKey with relaxed ordering (RO) enabled when the driver's relaxed_ordering_{read,write} HCA capabilities are out of sync with FW. While this can happen with relaxed_ordering_read, it can't happen with relaxed_ordering_write as it's set if the device supports RO write, regardless of RO in PCI config space, and thus can't change during runtime. Therefore, drop the pcie_relaxed_ordering_enabled() check for relaxed_ordering_write while keeping it for relaxed_ordering_read. Doing so will also allow the usage of RO write in VFs and VMs (where RO in PCI config space is not reported/emulated properly). Signed-off-by: Avihai Horon <[email protected]> Reviewed-by: Shay Drory <[email protected]> Link: https://lore.kernel.org/r/7e8f55e31572c1702d69cae015a395d3a824a38a.1681131553.git.leon@kernel.org Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-04-14bnxt: hook NAPIs to page poolsJakub Kicinski1-0/+1
bnxt has 1:1 mapping of page pools and NAPIs, so it's safe to hoook them up together. Reviewed-by: Tariq Toukan <[email protected]> Tested-by: Dragos Tatulea <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-14net/mlx5: DR, Enable patterns and arguments for supporting devicesYevgeny Kliteynik1-1/+2
Check if patterns and arguments for modify header action are supported and enable them accordingly. Signed-off-by: Muhammad Sammar <[email protected]> Signed-off-by: Yevgeny Kliteynik <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-04-14net/mlx5: DR, Add support for the pattern/arg parameters in debug dumpYevgeny Kliteynik1-3/+27
Support the pattern/args-based MODIFY_HDR and TNL_L3_TO_L2 actions in dbg dump Signed-off-by: Yevgeny Kliteynik <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-04-14net/mlx5: DR, Modify header action of size 1 optimizationYevgeny Kliteynik3-25/+52
Set modify header action of size 1 directly on the STE for supporting devices, thus reducing number of hops and cache misses. Signed-off-by: Yevgeny Kliteynik <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-04-14net/mlx5: DR, Support decap L3 action using pattern / arg mechanismYevgeny Kliteynik1-16/+5
Use the new accelerated action for decap L3 on RX side: use the mechanism of pattern and argument same as in modify-header action. Signed-off-by: Erez Shitrit <[email protected]> Signed-off-by: Yevgeny Kliteynik <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>