aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)AuthorFilesLines
2022-02-03i40e: Fix race condition while adding/deleting MAC/VLAN filtersJedrzej Jagielski1-11/+13
There was a race condition in access to hw->aq.asq_last_status while adding and deleting MAC/VLAN filters causing incorrect error status to be printed as ERROR OK instead of the correct error. Change calls to i40e_aq_add_macvlan in i40e_aqc_add_filters and i40e_aq_remove_macvlan in i40e_aqc_del_filters to _v2 versions that return Admin Queue status on the stack to avoid race conditions in access to hw->aq.asq_last_status. Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-03i40e: Add new version of i40e_aq_add_macvlan functionJedrzej Jagielski2-20/+77
ASQ send command functions are returning only i40e status codes yet some calling functions also need Admin Queue status that is stored in hw->aq.asq_last_status. Since hw object is stored on a heap it introduces a possibility for a race condition in access to hw if calling function is not fast enough to read hw->aq.asq_last_status before next send ASQ command is executed. Add new _v2 version of i40e_aq_add_macvlan that is using new _v2 versions of ASQ send command functions and returns the Admin Queue status on the stack. Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-03i40e: Add new versions of send ASQ command functionsJedrzej Jagielski3-8/+150
ASQ send command functions are returning only i40e status codes yet some calling functions also need Admin Queue status that is stored in hw->aq.asq_last_status. Since hw object is stored on a heap it introduces a possibility for a race condition in access to hw if calling function is not fast enough to read hw->aq.asq_last_status before next send ASQ command is executed. Add new versions of send ASQ command functions that return Admin Queue status on the stack to avoid race conditions in access to hw->aq.asq_last_status. Add new _v2 version of i40e_aq_remove_macvlan that is using new _v2 versions of ASQ send command functions and returns the Admin Queue status on the stack. Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-03i40e: Add sending commands in atomic contextJedrzej Jagielski1-9/+12
Change functions: - i40e_aq_add_macvlan - i40e_aq_remove_macvlan - i40e_aq_delete_element - i40e_aq_add_vsi - i40e_aq_update_vsi_params to explicitly use i40e_asq_send_command_atomic(..., true) instead of i40e_asq_send_command, as they use mutexes and do some work in an atomic context. Without this change setting vlan via netdev will fail with call trace cased by bug "BUG: scheduling while atomic". Signed-off-by: Witold Fijalkowski <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-03i40e: Remove unused RX realloc statJoe Damato2-3/+1
After commit 1a557afc4dd5 ("i40e: Refactor receive routine"), rx_stats.realloc_count is no longer being incremented, so remove it. The debugfs string was left, but hardcoded to 0. This is intended to prevent breaking any existing code / scripts that are parsing debugfs for i40e. Signed-off-by: Joe Damato <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-03i40e: Disable hw-tc-offload feature on driver loadMateusz Palczewski1-1/+4
After loading driver hw-tc-offload is enabled by default. Change the behaviour of driver to disable hw-tc-offload by default as this is the expected state. Additionally since this impacts ntuple feature state change the way of checking NETIF_F_HW_TC flag. Signed-off-by: Norbert Zulinski <[email protected]> Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Dave Switzer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-01Merge branch '1GbE' of ↵Jakub Kicinski3-19/+44
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-02-01 This series contains updates to e1000e driver only. Sasha removes CSME handshake with TGL platform as this is not supported and is causing hardware unit hangs to be reported. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Handshake with CSME starts from ADL platforms e1000e: Separate ADP board type from TGP ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-01e1000e: Handshake with CSME starts from ADL platformsSasha Neftin1-2/+4
Handshake with CSME/AMT on none provisioned platforms during S0ix flow is not supported on TGL platform and can cause to HW unit hang. Update the handshake with CSME flow to start from the ADL platform. Fixes: 3e55d231716e ("e1000e: Add handshake with the CSME to support S0ix") Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Nechama Kraus <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-01e1000e: Separate ADP board type from TGPSasha Neftin3-17/+40
We have the same LAN controller on different PCH's. Separate ADP board type from a TGP which will allow for specific fixes to be applied for ADP platforms. Suggested-by: Kai-Heng Feng <[email protected]> Suggested-by: Dima Ruinskiy <[email protected]> Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Nechama Kraus <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31i40e: Fix reset path while removing the driverKaren Sornek2-1/+19
Fix the crash in kernel while dereferencing the NULL pointer, when the driver is unloaded and simultaneously the VSI rings are being stopped. The hardware requires 50msec in order to finish RX queues disable. For this purpose the driver spins in mdelay function for the operation to be completed. For example changing number of queues which requires reset would fail in the following call stack: 1) i40e_prep_for_reset 2) i40e_pf_quiesce_all_vsi 3) i40e_quiesce_vsi 4) i40e_vsi_close 5) i40e_down 6) i40e_vsi_stop_rings 7) i40e_vsi_control_rx -> disable requires the delay of 50msecs 8) continue back in i40e_down function where i40e_clean_tx_ring(vsi->tx_rings[i]) is going to crash When the driver was spinning vsi_release called i40e_vsi_free_arrays where the vsi->tx_rings resources were freed and the pointer was set to NULL. Fixes: 5b6d4a7f20b0 ("i40e: Fix crash during removing i40e driver") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Karen Sornek <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31i40e: Fix reset bw limit when DCB enabled with 1 TCJedrzej Jagielski1-1/+11
There was an AQ error I40E_AQ_RC_EINVAL when trying to reset bw limit as part of bw allocation setup. This was caused by trying to reset bw limit with DCB enabled. Bw limit should not be reset when DCB is enabled. The code was relying on the pf->flags to check if DCB is enabled but if only 1 TC is available this flag will not be set even though DCB is enabled. Add a check for number of TC and if it is 1 don't try to reset bw limit even if pf->flags shows DCB as disabled. Fixes: fa38e30ac73f ("i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled") Suggested-by: Alexander Lobakin <[email protected]> # Flatten the condition Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Tested-by: Imam Hassan Reza Biswas <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31ixgbe: respect metadata on XSK Rx to skbAlexander Lobakin1-4/+10
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match ixgbe_construct_skb(). Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Suggested-by: Jesper Dangaard Brouer <[email protected]> Suggested-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Sandeep Penigalapati <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin1-3/+1
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, ixgbe_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Sandeep Penigalapati <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directlyAlexander Lobakin1-9/+10
To not dereference bi->xdp each time in ixgbe_construct_skb_zc(), pass bi->xdp as an argument instead of bi. We can also call xsk_buff_free() outside of the function as well as assign bi->xdp to NULL, which seems to make it closer to its name. Suggested-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Tested-by: Sandeep Penigalapati <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin1-6/+7
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, igc_construct_skb_zc() currently allocates and reserves additional `xdp->data_meta - xdp->data_hard_start`, which is about XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only (+ meta) to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Also, net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match igc_construct_skb(). Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy") Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Nechama Kraus <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31ice: respect metadata on XSK Rx to skbAlexander Lobakin1-4/+10
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match ice_construct_skb(). Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Suggested-by: Jesper Dangaard Brouer <[email protected]> Suggested-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin1-3/+1
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, ice_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31ice: respect metadata in legacy-rx/ice_construct_skb()Alexander Lobakin1-4/+11
In "legacy-rx" mode represented by ice_construct_skb(), we can still use XDP (and XDP metadata), but after XDP_PASS the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. Point net_prefetch() to xdp->data_meta instead of data. This won't change anything when the meta is not here, but will save some cache misses otherwise. Suggested-by: Jesper Dangaard Brouer <[email protected]> Suggested-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31i40e: respect metadata on XSK Rx to skbAlexander Lobakin1-4/+10
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match i40e_construct_skb(). Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Suggested-by: Jesper Dangaard Brouer <[email protected]> Suggested-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-31i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin1-3/+1
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, i40e_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Signed-off-by: Alexander Lobakin <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27igbvf: Remove useless DMA-32 fallback configurationChristophe JAILLET1-16/+6
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. So, if dma_set_mask_and_coherent() succeeds, 'pci_using_dac' is known to be 1. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27igb: Remove useless DMA-32 fallback configurationChristophe JAILLET1-13/+6
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. So, if dma_set_mask_and_coherent() succeeds, 'pci_using_dac' is known to be 1. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27igc: Remove useless DMA-32 fallback configurationChristophe JAILLET1-13/+6
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. So, if dma_set_mask_and_coherent() succeeds, 'pci_using_dac' is known to be 1. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27ice: Remove useless DMA-32 fallback configurationChristophe JAILLET1-2/+0
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27iavf: Remove useless DMA-32 fallback configurationChristophe JAILLET1-6/+3
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27e1000e: Remove useless DMA-32 fallback configurationChristophe JAILLET1-15/+7
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. So, if dma_set_mask_and_coherent() succeeds, 'pci_using_dac' is known to be 1. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27i40e: Remove useless DMA-32 fallback configurationChristophe JAILLET1-6/+3
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27ixgbevf: Remove useless DMA-32 fallback configurationChristophe JAILLET1-14/+6
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. So, if dma_set_mask_and_coherent() succeeds, 'pci_using_dac' is known to be 1. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27ixgbe: Remove useless DMA-32 fallback configurationChristophe JAILLET1-13/+7
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. So, if dma_set_mask_and_coherent() succeeds, 'pci_using_dac' is known to be 1. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27ixgb: Remove useless DMA-32 fallback configurationChristophe JAILLET1-14/+5
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32 bits case will also fail for the same reason. So, if dma_set_mask_and_coherent() succeeds, 'pci_using_dac' is known to be 1. Simplify code and remove some dead code accordingly. [1]: https://lkml.org/lkml/2021/6/7/398 Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-27ice: xsk: Borrow xdp_tx_active logic from i40eMaciej Fijalkowski3-3/+14
One of the things that commit 5574ff7b7b3d ("i40e: optimize AF_XDP Tx completion path") introduced was the @xdp_tx_active field. Its usage from i40e can be adjusted to ice driver and give us positive performance results. If the descriptor that @next_dd points to has been sent by HW (its DD bit is set), then we are sure that at least quarter of the ring is ready to be cleaned. If @xdp_tx_active is 0 which means that related xdp_ring is not used for XDP_{TX, REDIRECT} workloads, then we know how many XSK entries should placed to completion queue, IOW walking through the ring can be skipped. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-27ice: xsk: Improve AF_XDP ZC Tx and use batching APIMaciej Fijalkowski4-101/+190
Apply the logic that was done for regular XDP from commit 9610bd988df9 ("ice: optimize XDP_TX workloads") to the ZC side of the driver. On top of that, introduce batching to Tx that is inspired by i40e's implementation with adjustments to the cleaning logic - take into the account NAPI budget in ice_clean_xdp_irq_zc(). Separating the stats structs onto separate cache lines seemed to improve the performance. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-27ice: xsk: Avoid potential dead AF_XDP Tx processingMaciej Fijalkowski1-0/+2
Commit 9610bd988df9 ("ice: optimize XDP_TX workloads") introduced @next_dd and @next_rs to ice_tx_ring struct. Currently, their state is not restored in ice_clean_tx_ring(), which was not causing any troubles as the XDP rings are gone after we're done with XDP prog on interface. For upcoming usage of mentioned fields in AF_XDP, this might expose us to a potential dead Tx side. Scenario would look like following (based on xdpsock): - two xdpsock instances are spawned in Tx mode - one of them is killed - XDP prog is kept on interface due to the other xdpsock still running * this means that XDP rings stayed in place - xdpsock is launched again on same queue id that was terminated on - @next_dd and @next_rs setting is bogus, therefore transmit side is broken To protect us from the above, restore the initial @next_rs and @next_dd values when cleaning the Tx ring. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-27i40e: xsk: Move tmp desc array from driver to poolMagnus Karlsson3-14/+2
Move desc_array from the driver to the pool. The reason behind this is that we can then reuse this array as a temporary storage for descriptors in all zero-copy drivers that use the batched interface. This will make it easier to add batching to more drivers. i40e is the only driver that has a batched Tx zero-copy implementation, so no need to touch any other driver. Signed-off-by: Magnus Karlsson <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-27ice: Make Tx threshold dependent on ring lengthMaciej Fijalkowski4-9/+12
XDP_TX workloads use a concept of Tx threshold that indicates the interval of setting RS bit on descriptors which in turn tells the HW to generate an interrupt to signal the completion of Tx on HW side. It is currently based on a constant value of 32 which might not work out well for various sizes of ring combined with for example batch size that can be set via SO_BUSY_POLL_BUDGET. Internal tests based on AF_XDP showed that most convenient setup of mentioned threshold is when it is equal to quarter of a ring length. Make use of recently introduced ICE_RING_QUARTER macro and use this value as a substitute for ICE_TX_THRESH. Align also ethtool -G callback so that next_dd/next_rs fields are up to date in terms of the ring size. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-27ice: xsk: Handle SW XDP ring wrap and bump tail more oftenMaciej Fijalkowski2-20/+81
Currently, if ice_clean_rx_irq_zc() processed the whole ring and next_to_use != 0, then ice_alloc_rx_buf_zc() would not refill the whole ring even if the XSK buffer pool would have enough free entries (either from fill ring or the internal recycle mechanism) - it is because ring wrap is not handled. Improve the logic in ice_alloc_rx_buf_zc() to address the problem above. Do not clamp the count of buffers that is passed to xsk_buff_alloc_batch() in case when next_to_use + buffer count >= rx_ring->count, but rather split it and have two calls to the mentioned function - one for the part up until the wrap and one for the part after the wrap. Signed-off-by: Magnus Karlsson <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-27ice: xsk: Force rings to be sized to power of 2Maciej Fijalkowski1-0/+8
With the upcoming introduction of batching to XSK data path, performance wise it will be the best to have the ring descriptor count to be aligned to power of 2. Check if ring sizes that user is going to attach the XSK socket fulfill the condition above. For Tx side, although check is being done against the Tx queue and in the end the socket will be attached to the XDP queue, it is fine since XDP queues get the ring->count setting from Tx queues. Suggested-by: Alexander Lobakin <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-27ice: Remove likely for napi_complete_doneMaciej Fijalkowski1-1/+1
Remove the likely before napi_complete_done as this is the unlikely case when busy-poll is used. Removing this has a positive performance impact for busy-poll and no negative impact to the regular case. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-20i40e: fix unsigned stat widthsJoe Damato3-7/+7
Change i40e_update_vsi_stats and struct i40e_vsi to use u64 fields to match the width of the stats counters in struct i40e_rx_queue_stats. Update debugfs code to use the correct format specifier for u64. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Joe Damato <[email protected]> Reported-by: kernel test robot <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-20i40e: Fix for failed to init adminq while VF resetKaren Sornek3-2/+46
Fix for failed to init adminq: -53 while VF is resetting via MAC address changing procedure. Added sync module to avoid reading deadbeef value in reinit adminq during software reset. Without this patch it is possible to trigger VF reset procedure during reinit adminq. This resulted in an incorrect reading of value from the AQP registers and generated the -53 error. Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Karen Sornek <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-20i40e: Fix queues reservation for XDPSylwester Dziedziuch1-0/+14
When XDP was configured on a system with large number of CPUs and X722 NIC there was a call trace with NULL pointer dereference. i40e 0000:87:00.0: failed to get tracking for 256 queues for VSI 0 err -12 i40e 0000:87:00.0: setup of MAIN VSI failed BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:i40e_xdp+0xea/0x1b0 [i40e] Call Trace: ? i40e_reconfig_rss_queues+0x130/0x130 [i40e] dev_xdp_install+0x61/0xe0 dev_xdp_attach+0x18a/0x4c0 dev_change_xdp_fd+0x1e6/0x220 do_setlink+0x616/0x1030 ? ahci_port_stop+0x80/0x80 ? ata_qc_issue+0x107/0x1e0 ? lock_timer_base+0x61/0x80 ? __mod_timer+0x202/0x380 rtnl_setlink+0xe5/0x170 ? bpf_lsm_binder_transaction+0x10/0x10 ? security_capable+0x36/0x50 rtnetlink_rcv_msg+0x121/0x350 ? rtnl_calcit.isra.0+0x100/0x100 netlink_rcv_skb+0x50/0xf0 netlink_unicast+0x1d3/0x2a0 netlink_sendmsg+0x22a/0x440 sock_sendmsg+0x5e/0x60 __sys_sendto+0xf0/0x160 ? __sys_getsockname+0x7e/0xc0 ? _copy_from_user+0x3c/0x80 ? __sys_setsockopt+0xc8/0x1a0 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f83fa7a39e0 This was caused by PF queue pile fragmentation due to flow director VSI queue being placed right after main VSI. Because of this main VSI was not able to resize its queue allocation for XDP resulting in no queues allocated for main VSI when XDP was turned on. Fix this by always allocating last queue in PF queue pile for a flow director VSI. Fixes: 41c445ff0f48 ("i40e: main driver core") Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action") Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Reviewed-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-20i40e: Fix issue when maximum queues is exceededJedrzej Jagielski3-13/+61
Before this patch VF interface vanished when maximum queue number was exceeded. Driver tried to add next queues even if there was not enough space. PF sent incorrect number of queues to the VF when there were not enough of them. Add an additional condition introduced to check available space in 'qp_pile' before proceeding. This condition makes it impossible to add queues if they number is greater than the number resulting from available space. Also add the search for free space in PF queue pair piles. Without this patch VF interfaces are not seen when available space for queues has been exceeded and following logs appears permanently in dmesg: "Unable to get VF config (-32)". "VF 62 failed opcode 3, retval: -5" "Unable to get VF config due to PF error condition, not retrying" Fixes: 7daa6bf3294e ("i40e: driver core headers") Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Jaroslaw Gawin <[email protected]> Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-20i40e: Increase delay to 1 s after global EMP resetJedrzej Jagielski1-9/+3
Recently simplified i40e_rebuild causes that FW sometimes is not ready after NVM update, the ping does not return. Increase the delay in case of EMP reset. Old delay of 300 ms was introduced for specific cards for 710 series. Now it works for all the cards and delay was increased. Fixes: 1fa51a650e1d ("i40e: Add delay after EMP reset for firmware to recover") Signed-off-by: Arkadiusz Kubalewski <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-13Merge tag 'irq-core-2022-01-13' of ↵Linus Torvalds3-13/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt subsystem: Core: - Provide a new interface for affinity hints to provide a separation between hint and actual affinity change which has become a hidden property of the current interface - Fix up the in tree usage of the affinity hint interfaces Drivers: - No new irqchip drivers! - Fix GICv3 redistributor table reservation with RT across kexec - Fix GICv4.1 redistributor view of the VPE table across kexec - Add support for extra interrupts on spear-shirq - Make obtaining some interrupts optional for the Renesas drivers - Various cleanups and bug fixes" * tag 'irq-core-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) irqchip/renesas-intc-irqpin: Use platform_get_irq_optional() to get the interrupt irqchip/renesas-irqc: Use platform_get_irq_optional() to get the interrupt irqchip/gic-v4: Disable redistributors' view of the VPE table at boot time irqchip/ingenic-tcu: Use correctly sized arguments for bit field irqchip/gic-v2m: Add const to of_device_id irqchip/imx-gpcv2: Mark imx_gpcv2_instance with __ro_after_init irqchip/spear-shirq: Add support for IRQ 0..6 irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve irqchip/gic-v3-its: Give the percpu rdist struct its own flags field net/mlx4: Use irq_update_affinity_hint() net/mlx5: Use irq_set_affinity_and_hint() hinic: Use irq_set_affinity_and_hint() scsi: lpfc: Use irq_set_affinity() mailbox: Use irq_update_affinity_hint() ixgbe: Use irq_update_affinity_hint() be2net: Use irq_update_affinity_hint() enic: Use irq_update_affinity_hint() RDMA/irdma: Use irq_update_affinity_hint() scsi: mpt3sas: Use irq_set_affinity_and_hint() ...
2022-01-08Merge branch 'linus' into irq/core, to fix conflictIngo Molnar28-177/+384
Conflicts: drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c Signed-off-by: Ingo Molnar <[email protected]>
2022-01-07iavf: remove an unneeded variableJason Wang1-3/+1
The variable `ret_code' used for returning is never changed in function `iavf_shutdown_adminq'. So that it can be removed and just return its initial value 0 at the end of `iavf_shutdown_adminq' function. Signed-off-by: Jason Wang <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-07i40e: remove variables set but not usedYang Li1-5/+0
The code that uses variables pe_cntx_size and pe_filt_size has been removed, so they should be removed as well. Eliminate the following clang warnings: drivers/net/ethernet/intel/i40e/i40e_common.c:4139:20: warning: variable 'pe_filt_size' set but not used. drivers/net/ethernet/intel/i40e/i40e_common.c:4139:6: warning: variable 'pe_cntx_size' set but not used. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Yang Li <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-07i40e: Remove non-inclusive languageMateusz Palczewski2-3/+3
Remove non-inclusive language from the driver. Signed-off-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-07i40e: Update FW API versionMateusz Palczewski1-2/+2
Update FW API versions to the newest supported NVM images. Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-01-07i40e: Minimize amount of busy-waiting during AQ sendJedrzej Jagielski3-14/+35
The i40e_asq_send_command will now use a non blocking usleep_range if possible (non-atomic context), instead of busy-waiting udelay. The usleep_range function uses hrtimers to provide better performance and removes the negative impact of busy-waiting in time-critical environments. 1. Rename i40e_asq_send_command to i40e_asq_send_command_atomic and add 5th parameter to inform if called from an atomic context. Call inside usleep_range (if non-atomic) or udelay (if atomic). 2. Change i40e_asq_send_command to invoke i40e_asq_send_command_atomic(..., false). 3. Change two functions: - i40e_aq_set_vsi_uc_promisc_on_vlan - i40e_aq_set_vsi_mc_promisc_on_vlan to explicitly use i40e_asq_send_command_atomic(..., true) instead of i40e_asq_send_command, as they use spinlocks and do some work in an atomic context. All other calls to i40e_asq_send_command remain unchanged. Signed-off-by: Dawid Lukwinski <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>