aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice/ice_sriov.c
AgeCommit message (Collapse)AuthorFilesLines
2024-10-08ice: Fix increasing MSI-X on VFMarcin Szycik1-3/+8
Increasing MSI-X value on a VF leads to invalid memory operations. This is caused by not reallocating some arrays. Reproducer: modprobe ice echo 0 > /sys/bus/pci/devices/$PF_PCI/sriov_drivers_autoprobe echo 1 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs echo 17 > /sys/bus/pci/devices/$VF0_PCI/sriov_vf_msix_count Default MSI-X is 16, so 17 and above triggers this issue. KASAN reports: BUG: KASAN: slab-out-of-bounds in ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] Read of size 8 at addr ffff8888b937d180 by task bash/28433 (...) Call Trace: (...) ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] kasan_report+0xed/0x120 ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_cfg_def+0x3360/0x4770 [ice] ? mutex_unlock+0x83/0xd0 ? __pfx_ice_vsi_cfg_def+0x10/0x10 [ice] ? __pfx_ice_remove_vsi_lkup_fltr+0x10/0x10 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vf_reconfig_vsi+0x114/0x210 [ice] ice_sriov_set_msix_vec_count+0x3d0/0x960 [ice] sriov_vf_msix_count_store+0x21c/0x300 (...) Allocated by task 28201: (...) ice_vsi_cfg_def+0x1c8e/0x4770 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vsi_setup+0x179/0xa30 [ice] ice_sriov_configure+0xcaa/0x1520 [ice] sriov_numvfs_store+0x212/0x390 (...) To fix it, use ice_vsi_rebuild() instead of ice_vf_reconfig_vsi(). This causes the required arrays to be reallocated taking the new queue count into account (ice_vsi_realloc_stat_arrays()). Set req_txq and req_rxq before ice_vsi_rebuild(), so that realloc uses the newly set queue count. Additionally, ice_vsi_rebuild() does not remove VSI filters (ice_fltr_remove_all()), so ice_vf_init_host_cfg() is no longer necessary. Reported-by: Jacob Keller <[email protected]> Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Marcin Szycik <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-09-30ice: Fix improper handling of refcount in ice_sriov_set_msix_vec_count()Gui-Dong Han1-2/+6
This patch addresses an issue with improper reference count handling in the ice_sriov_set_msix_vec_count() function. First, the function calls ice_get_vf_by_id(), which increments the reference count of the vf pointer. If the subsequent call to ice_get_vf_vsi() fails, the function currently returns an error without decrementing the reference count of the vf pointer, leading to a reference count leak. The correct behavior, as implemented in this patch, is to decrement the reference count using ice_put_vf(vf) before returning an error when vsi is NULL. Second, the function calls ice_sriov_get_irqs(), which sets vf->first_vector_idx. If this call returns a negative value, indicating an error, the function returns an error without decrementing the reference count of the vf pointer, resulting in another reference count leak. The patch addresses this by adding a call to ice_put_vf(vf) before returning an error when vf->first_vector_idx < 0. This bug was identified by an experimental static analysis tool developed by our team. The tool specializes in analyzing reference count operations and identifying potential mismanagement of reference counts. In this case, the tool flagged the missing decrement operation as a potential issue, leading to this patch. Fixes: 4035c72dc1ba ("ice: reconfig host after changing MSI-X on VF") Fixes: 4d38cb44bd32 ("ice: manage VFs MSI-X using resource tracking") Cc: [email protected] Signed-off-by: Gui-Dong Han <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-09-06ice: make representor code genericMichal Swiatkowski1-2/+2
Keep the same flow of port representor creation, but instead of general attach function create helpers for specific representor type. Store function pointer for add and remove representor. Type of port representor can be also known based on VSI type, but it is more clean to have it directly saved in port representor structure. Add devlink lock for whole port representor creation and destruction. Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Wojciech Drewek <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-06-28ice: Add get/set hw address for VFs using devlink commandsKarthik Sundaravel1-9/+25
Changing the MAC address of the VFs is currently unsupported via devlink. Add the function handlers to set and get the HW address for the VFs. Signed-off-by: Karthik Sundaravel <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-05-06ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsiMateusz Polchlopek1-1/+1
Refactor struct ice_vsi_cfg_params to be embedded into struct ice_vsi. Prior to that the members of the struct were scattered around ice_vsi, and were copy-pasted for purposes of reinit. Now we have struct handy, and it is easier to have something sticky in the flags field. Suggested-by: Przemek Kitszel <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Reviewed-by: Vaishnavi Tipireddy <[email protected]> Signed-off-by: Mateusz Polchlopek <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2024-04-17ice: Add automatic VF reset on Tx MDD eventsMarcin Szycik1-6/+19
In cases when VF sends malformed packets that are classified as malicious, it can cause Tx queue to freeze as a result of Malicious Driver Detection event. Such malformed packets can appear as a result of a faulty userspace app running on VF. This frozen queue can be stuck for several minutes being unusable. User might prefer to immediately bring the VF back to operational state after such event, which can be done by automatically resetting the VF which caused MDD. This is already implemented for Rx events (mdd-auto-reset-vf flag private flag needs to be set). Extend the VF auto reset to also cover Tx MDD events. When any MDD event occurs on VF (Tx or Rx) and the mdd-auto-reset-vf private flag is set, perform a graceful VF reset to quickly bring it back to operational state. Reviewed-by: Wojciech Drewek <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Co-developed-by: Liang-Min Wang <[email protected]> Signed-off-by: Liang-Min Wang <[email protected]> Signed-off-by: Marcin Szycik <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-04-12ice: store VF relative MSI-X index in q_vector->vf_reg_idxJacob Keller1-3/+4
The ice physical function driver needs to configure the association of queues and interrupts on behalf of its virtual functions. This is done over virtchnl by the VF sending messages during its initialization phase. These messages contain a vector_id which the VF wants to associate with a given queue. This ID is relative to the VF space, where 0 indicates the control IRQ for non-queue interrupts. When programming the mapping, the PF driver currently passes this vector_id directly to the low level functions for programming. This works for SR-IOV, because the hardware uses the VF-based indexing for interrupts. This won't work for Scalable IOV, which uses PF-based indexing for programming its VSIs. To handle this, the driver needs to be able to look up the proper index to use for programming. For typical IRQs, this would be the q_vector->reg_idx field. The q_vector->reg_idx can't be set to a VF relative value, because it is used when the PF needs to control the interrupt, such as when triggering a software interrupt on stopping the Tx queue. Thus, introduce a new q_vector->vf_reg_idx which can store the VF relative index for registers which expect this. Use this in ice_cfg_interrupt to look up the VF index from the q_vector. This allows removing the vector ID parameter of ice_cfg_interrupt. Also notice that this function returns an int, but then is cast to the virtchnl error enumeration, virtchnl_status_code. Update the return type to indicate it does not return an integer error code. We can't use normal error codes here because the return values are passed across the virtchnl interface. This will allow the future Scalable IOV VFs to correctly look up the index needed for programming the VF queues without breaking SR-IOV. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-04-12ice: set vf->num_msix in ice_initialize_vf_entry()Jacob Keller1-5/+0
Commit fe1c5ca2fe76 ("ice: implement num_msix field per VF") updated the driver to allow for per-VF MSI-X configuration. The initial defaults were set in ice_create_vf_entries(). This logic is better placed in ice_initialize_vf_entry(). Indeed, that function already sets vf->num_vf_qs, as well as initializes the allow list via calling ice_vc_set_default_allowlist(). Move this logic into ice_initialize_vf_entry(). This makes the code clear, and ensures that these VF fields will be initialized properly for both SR-IOV VFs and the upcoming Scalable IOV VFs. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-03-25ice: remove eswitch changing queues algorithmMichal Swiatkowski1-3/+0
Changing queues used by eswitch will be done through PF netdev. There is no need to reserve queues if the number of used queues is known. Reviewed-by: Wojciech Drewek <[email protected]> Reviewed-by: Marcin Szycik <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Sujai Buvaneswaran <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-03-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-2/+9
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/page_pool_user.c 0b11b1c5c320 ("netdev: let netlink core handle -EMSGSIZE errors") 429679dcf7d9 ("page_pool: fix netlink dump stop/resume") Signed-off-by: Jakub Kicinski <[email protected]>
2024-03-04ice: remove vf->lan_vsi_num fieldJacob Keller1-1/+0
The lan_vsi_num field of the VF structure is no longer used for any purpose. Remove it. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-03-01ice: reconfig host after changing MSI-X on VFMichal Swiatkowski1-2/+9
During VSI reconfiguration filters and VSI config which is set in ice_vf_init_host_cfg() are lost. Recall the host configuration function to restore them. Without this config VF on which MSI-X amount was changed might had a connection problems. Fixes: 4d38cb44bd32 ("ice: manage VFs MSI-X using resource tracking") Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2024-01-02ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()Jacob Keller1-22/+2
The ice_vf_create_vsi() function and its VF ops helper introduced by commit a4c785e8162e ("ice: convert vf_ops .vsi_rebuild to .create_vsi") are used during an individual VF reset to re-create the VSI. This was done in order to ensure that the VSI gets properly reconfigured within the hardware. This is somewhat heavy handed as we completely release the VSI memory and structure, and then create a new VSI. This can also potentially force a change of the VSI index as we will re-use the first open slot in the VSI array which may not be the same. As part of implementing devlink reload, commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") split VSI setup into smaller functions, introducing both ice_vsi_cfg() and ice_vsi_decfg() which can be used to configure or deconfigure an existing software VSI structure. Rather than completely removing the VSI and adding a new one using the .create_vsi() VF operation, simply use ice_vsi_decfg() to remove the current configuration. Save the VSI type and then call ice_vsi_cfg() to reconfigure the VSI as the same type that it was before. The existing reset logic assumes that all hardware filters will be removed, so also call ice_fltr_remove_all() before re-configuring the VSI. This new operation does not re-create the VSI, so rename it to ice_vf_reconfig_vsi(). The new approach can safely share the exact same flow for both SR-IOV VFs as well as the Scalable IOV VFs being worked on. This uses less code and is a better abstraction over fully deleting the VSI and adding a new one. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Reviewed-by: Petr Oros <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-12-18ice: field get conversionJesse Brandeburg1-2/+1
Refactor the ice driver to use FIELD_GET() for mask and shift reads, which reduces lines of code and adds clarity of intent. This code was generated by the following coccinelle/spatch script and then manually repaired. @get@ constant shift,mask; type T; expression a; @@ -(((T)(a) & mask) >> shift) +FIELD_GET(mask, a) and applied via: spatch --sp-file field_prep.cocci --in-place --dir \ drivers/net/ethernet/intel/ CC: Alexander Lobakin <[email protected]> Cc: Julia Lawall <[email protected]> Reviewed-by: Marcin Szycik <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-12-18ice: field prep conversionJesse Brandeburg1-24/+14
Refactor ice driver to use FIELD_PREP(), which reduces lines of code and adds clarity of intent. This code was generated by the following coccinelle/spatch script and then manually repaired. Several places I changed to OR into a single variable with |= instead of using a multi-line statement with trailing OR operators, as it (subjectively) makes the code clearer. A local variable vmvf_and_timeout was created and used to avoid multiple logical ORs being __le16 converted, which shortened some lines and makes the code cleaner. Also clean up a couple of places where conversions were made to have the code read more clearly/consistently. @prep2@ constant shift,mask; type T; expression a; @@ -(((T)(a) << shift) & mask) +FIELD_PREP(mask, a) @prep@ constant shift,mask; type T; expression a; @@ -((T)((a) << shift) & mask) +FIELD_PREP(mask, a) Cc: Julia Lawall <[email protected]> CC: Alexander Lobakin <[email protected]> Reviewed-by: Marcin Szycik <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-12-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-6/+1
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/stmicro/stmmac/dwmac5.c drivers/net/ethernet/stmicro/stmmac/dwmac5.h drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c drivers/net/ethernet/stmicro/stmmac/hwif.h 37e4b8df27bc ("net: stmmac: fix FPE events losing") c3f3b97238f6 ("net: stmmac: Refactor EST implementation") https://lore.kernel.org/all/[email protected]/ Adjacent changes: net/ipv4/tcp_ao.c 9396c4ee93f9 ("net/tcp: Don't store TCP-AO maclen on reqsk") 7b0f570f879a ("tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().") Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-05ice: change vfs.num_msix_per to vf->num_msixMichal Swiatkowski1-6/+1
vfs::num_msix_per should be only used as default value for vf->num_msix. For other use cases vf->num_msix should be used, as VF can have different MSI-X amount values. Fix incorrect register index calculation. vfs::num_msix_per and pf->sriov_base_vector shouldn't be used after implementation of changing MSI-X amount on VFs. Instead vf->first_vector_idx should be used, as it is storing value for first irq index. Fixes: fe1c5ca2fe76 ("ice: implement num_msix field per VF") Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-11-13ice: reserve number of CP queuesMichal Swiatkowski1-0/+3
Rebuilding CP VSI each time the PR is created drastically increase the time of maximum VFs creation. Add function to reserve number of CP queues to deal with this problem. Use the same function to decrease number of queues in case of removing VFs. Assume that caller of ice_eswitch_reserve_cp_queues() will also call ice_eswitch_attach/detach() correct number of times. Still one by one PR adding is handy for VF resetting routine. Reviewed-by: Wojciech Drewek <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Sujai Buvaneswaran <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-11-13ice: add VF representors one by oneMichal Swiatkowski1-8/+9
Implement adding representors one by one. Always set switchdev environment when first representor is being added and clear environment when last one is being removed. Basic switchdev configuration remains the same. Code related to creating and configuring representor was changed. Instead of setting whole representors in one function handle only one representor in setup function. The same with removing representors. Stop representors when new one is being added or removed. Stop means, disabling napi, stopping traffic and removing slow path rule. It is needed because ::q_id will change after remapping, so each representor will need new rule. When representor are stopped rebuild control plane VSI with one more or one less queue. One more if new representor is being added, one less if representor is being removed. Bridge port is removed during unregister_netdev() call on PR, so there is no need to call it from driver side. After that do remap new queues to correct vector. At the end start all representors (napi enable, start queues, add slow path rule). Reviewed-by: Piotr Raczynski <[email protected]> Reviewed-by: Wojciech Drewek <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Sujai Buvaneswaran <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-10-20ice: manage VFs MSI-X using resource trackingMichal Swiatkowski1-19/+151
Track MSI-X for VFs using bitmap, by setting and clearing bitmap during allocation and freeing. Try to linearize irqs usage for VFs, by freeing them and allocating once again. Do it only for VFs that aren't currently running. Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-20ice: set MSI-X vector count on VFMichal Swiatkowski1-0/+69
Implement ops needed to set MSI-X vector count on VF. sriov_get_vf_total_msix() should return total number of MSI-X that can be used by the VFs. Return the value set by devlink resources API (pf->req_msix.vf). sriov_set_msix_vec_count() will set number of MSI-X on particular VF. Disable VF register mapping, rebuild VSI with new MSI-X and queues values and enable new VF register mapping. For best performance set number of queues equal to number of MSI-X. Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-20ice: add bitmap to track VF MSI-X usageMichal Swiatkowski1-0/+9
Create a bitamp to track MSI-X usage for VFs. The bitmap has the size of total MSI-X amount on device, because at init time the amount of MSI-X used by VFs isn't known. The bitmap is used in follow up patchset to provide a block of continuous block of MSI-X indexes for each created VF. Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-20ice: implement num_msix field per VFMichal Swiatkowski1-4/+9
Store the amount of MSI-X per VF instead of storing it in pf struct. It is used to calculate number of q_vectors (and queues) for VF VSI. This is necessary because with follow up changes the number of MSI-X can be different between VFs. Use it instead of using pf->vf_msix value in all cases. Signed-off-by: Michal Swiatkowski <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-20ice: store VF's pci_dev ptr in ice_vfPrzemek Kitszel1-24/+26
Extend struct ice_vf by vfdev. Calculation of vfdev falls more nicely into ice_create_vf_entries(). Caching of vfdev enables simplification of ice_restore_all_vfs_msi_state(). Reviewed-by: Jesse Brandeburg <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Przemek Kitszel <[email protected]> Co-developed-by: Mateusz Polchlopek <[email protected]> Signed-off-by: Mateusz Polchlopek <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-21Revert "ice: Fix ice VF reset during iavf initialization"Petr Oros1-4/+4
This reverts commit 7255355a0636b4eff08d5e8139c77d98f151c4fc. After this commit we are not able to attach VF to VM: virsh attach-interface v0 hostdev --managed 0000:41:01.0 --mac 52:52:52:52:52:52 error: Failed to attach interface error: Cannot set interface MAC to 52:52:52:52:52:52 for ifname enp65s0f0np0 vf 0: Resource temporarily unavailable ice_check_vf_ready_for_cfg() already contain waiting for reset. New condition in ice_check_vf_ready_for_reset() causing only problems. Fixes: 7255355a0636 ("ice: Fix ice VF reset during iavf initialization") Signed-off-by: Petr Oros <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-06-22ice: clean up freeing SR-IOV VFsPrzemek Kitszel1-3/+2
The check for existing VFs was redundant since very inception of SR-IOV sysfs interface in the kernel, see commit 1789382a72a5 ("PCI: SRIOV control and status via sysfs"). Reviewed-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Przemek Kitszel <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-05-18Merge branch '100GbE' of ↵Jakub Kicinski1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-05-17 (ice, MAINTAINERS) This series contains updates to ice driver and MAINTAINERS file. Paul refactors PHY to link mode reporting and updates some PHY types to report more accurate link modes for ice. Dave removes mutual exclusion policy between LAG and SR-IOV in ice driver. Jesse updates link for Intel Wired LAN in the MAINTAINERS file. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: MAINTAINERS: update Intel Ethernet links ice: Remove LAG+SRIOV mutual exclusion ice: update PHY type to ethtool link mode mapping ice: refactor PHY type to ethtool link mode ice: update ICE_PHY_TYPE_HIGH_MAX_INDEX ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-05-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-4/+4
Conflicts: drivers/net/ethernet/freescale/fec_main.c 6ead9c98cafc ("net: fec: remove the xdp_return_frame when lack of tx BDs") 144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors") Signed-off-by: Jakub Kicinski <[email protected]>
2023-05-17ice: Remove LAG+SRIOV mutual exclusionDave Ertman1-4/+0
There was a change previously to stop SR-IOV and LAG from existing on the same interface. This was to prevent the violation of LACP (Link Aggregation Control Protocol). The method to achieve this was to add a no-op Rx handler onto the netdev when SR-IOV VFs were present, thus blocking bonding, bridging, etc from claiming the interface by adding its own Rx handler. Also, when an interface was added into a aggregate, then the SR-IOV capability was set to false. There are some users that have in house solutions using both SR-IOV and bridging/bonding that this method interferes with (e.g. creating duplicate VFs on the bonded interfaces and failing between them when the interface fails over). It makes more sense to provide the most functionality possible, the restriction on co-existence of these features will be removed. No additional functionality is currently being provided beyond what existed before the co-existence restriction was put into place. It is up to the end user to not implement a solution that would interfere with existing network protocols. Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Signed-off-by: Wojciech Drewek <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-05-16ice: Fix ice VF reset during iavf initializationDawid Wesierski1-4/+4
Fix the current implementation that causes ice_trigger_vf_reset() to start resetting the VF even when the VF-NIC is still initializing. When we reset NIC with ice driver it can interfere with iavf-vf initialization e.g. during consecutive resets induced by ice iavf ice | | |<-----------------| | ice resets vf iavf | reset | start | |<-----------------| | ice resets vf | causing iavf | initialization | error | | iavf reset end This leads to a series of -53 errors (failed to init adminq) from the IAVF. Change the state of the vf_state field to be not active when the IAVF is still initializing. Make sure to wait until receiving the message on the message box to ensure that the vf is ready and initializded. In simple terms we use the ACTIVE flag to make sure that the ice driver knows if the iavf is ready for another reset iavf ice | | | | |<------------- ice resets vf iavf vf_state != ACTIVE reset | start | | | | | iavf | reset-------> vf_state == ACTIVE end ice resets vf | | | | Fixes: c54d209c78b8 ("ice: Wait for VF to be reset/ready before configuration") Signed-off-by: Dawid Wesierski <[email protected]> Signed-off-by: Kamil Maziarz <[email protected]> Acked-by: Jacob Keller <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-05-16ice: add dynamic interrupt allocationPiotr Raczynski1-2/+3
Currently driver can only allocate interrupt vectors during init phase by calling pci_alloc_irq_vectors. Change that and make use of new pci_msix_alloc_irq_at/pci_msix_free_irq API and enable to allocate and free more interrupts after MSIX has been enabled. Since not all platforms supports dynamic allocation, check it with pci_msix_can_alloc_dyn. Extend the tracker to keep track how many interrupts are allocated initially so when all such vectors are already used, additional interrupts are automatically allocated dynamically. Remember each interrupt allocation method to then free appropriately. Since some features may require interrupts allocated dynamically add appropriate VSI flag and take it into account when allocating new interrupt. Reviewed-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-05-16ice: track interrupt vectors with xarrayPiotr Raczynski1-2/+2
Replace custom interrupt tracker with generic xarray data structure. Remove all code responsible for searching for a new entry with xa_alloc, which always tries to allocate at the lowes possible index. As a result driver is always using a contiguous region of the MSIX vector table. New tracker keeps ice_irq_entry entries in xarray as opaque for the rest of the driver hiding the entry details from the caller. Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-05-16ice: add individual interrupt allocationPiotr Raczynski1-1/+1
Currently interrupt allocations, depending on a feature are distributed in batches. Also, after allocation there is a series of operations that distributes per irq settings through that batch of interrupts. Although driver does not yet support dynamic interrupt allocation, keep allocated interrupts in a pool and add allocation abstraction logic to make code more flexible. Keep per interrupt information in the ice_q_vector structure, which yields ice_vsi::base_vector redundant. Also, as a result there are a few functions that can be removed. Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Piotr Raczynski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-05-16ice: remove redundant SRIOV codePiotr Raczynski1-36/+0
Remove redundant code from ice_get_max_valid_res_idx that has no effect. ice_pf::irq_tracker is initialized during driver probe, there is no reason to check it again. Also it is not possible for pf::sriov_base_vector to be lower than the tracker length, remove WARN_ON that will never happen. Get rid of ice_get_max_valid_res_idx helper function completely since it can never return negative value. Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Piotr Raczynski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-4/+4
Conflicts: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 6e9d51b1a5cb ("net/mlx5e: Initialize link speed to zero") 1bffcea42926 ("net/mlx5e: Add devlink hairpin queues parameters") https://lore.kernel.org/all/[email protected]/ https://lore.kernel.org/all/[email protected]/ Adjacent changes: drivers/net/phy/phy.c 323fe43cf9ae ("net: phy: Improved PHY error reporting in state machine") 4203d84032e2 ("net: phy: Ensure state transitions are processed from phy_stop()") Signed-off-by: Jakub Kicinski <[email protected]>
2023-03-21ice: check if VF exists before mode checkMichal Swiatkowski1-4/+4
Setting trust on VF should return EINVAL when there is no VF. Move checking for switchdev mode after checking if VF exists. Fixes: c54d209c78b8 ("ice: Wait for VF to be reset/ready before configuration") Signed-off-by: Michal Swiatkowski <[email protected]> Signed-off-by: Kalyan Kodamagula <[email protected]> Tested-by: Sujai Buvaneswaran <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: move ice_is_malicious_vf() to ice_virtchnl.cJacob Keller1-45/+0
The ice_is_malicious_vf() function is currently implemented in ice_sriov.c This function is not Single Root specific, and a future change is going to refactor the ice_vc_process_vf_msg() function to call this instead of calling it before ice_vc_process_vf_msg() in the main loop of __ice_clean_ctrlq. To make that change easier to review, first move this function into ice_virtchnl.c but leave the call in __ice_clean_ctrlq() alone. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: print message if ice_mbx_vf_state_handler returns an errorJacob Keller1-1/+2
If ice_mbx_vf_state_handler() returns an error, the ice_is_malicious_vf() function just exits without printing anything. Instead, use dev_warn_ratelimited to print a warning that we were unable to check the status for this VF. The _ratelimited variant is used to avoid potentially spamming the log if this function is failing consistently for every single mailbox message. Also we can drop the "goto" as it simply skips over a report_malvf check. That variable should always be false if ice_mbx_vf_state_handler returns non-zero. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: pass mbxdata to ice_is_malicious_vf()Jacob Keller1-11/+3
The ice_is_malicious_vf() function takes information about the current state of the mailbox during a single interrupt. This information includes the number of messages processed so far, as well as the number of pending messages not yet processed. A future refactor is going to make ice_vc_process_vf_msg() call ice_is_malicious_vf() instead of having it called separately in ice_main.c This change will require passing all the necessary arguments into ice_vc_process_vf_msg(). To make this simpler, have the main loop fill in the struct ice_mbx_data and pass that rather than passing in the num_msg_proc and num_msg_pending. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: remove unnecessary &array[0] and just use arrayJacob Keller1-1/+1
In ice_is_malicious_vf we print the VF MAC address using %pM by passing the address of the first element of vf->dev_lan_addr. This is equivalent to just passing vf->dev_lan_addr, so do that. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: always report VF overflowing mailbox even without PF VSIJacob Keller1-4/+4
In ice_is_malicious_vf we report a message warning the system administrator when a VF is potentially spamming the PF with asynchronous messages that could overflow the PF mailbox. The specific message was requested by our customer support team to include the VF and PF MAC address. In some cases we may not be able to locate the PF VSI to obtain the MAC address for the PF. The current implementation discards the message entirely in this case. Fix this to instead print a zero address in that case so that we always print something here. Note that dev_warn will also include the PCI device information allowing another mechanism for determining on which PF the potentially malicious VF belongs. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: initialize mailbox snapshot earlier in PF initJacob Keller1-2/+0
Now that we no longer depend on the number of VFs being allocated, we can move the ice_mbx_init_snapshot function earlier. This will be required by Scalable IOV as we will not be calling ice_sriov_configure for Scalable VFs. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: merge ice_mbx_report_malvf with ice_mbx_vf_state_handlerJacob Keller1-22/+12
The ice_mbx_report_malvf function is used to update the ice_mbx_vf_info.malicious member after we detect a malicious VF. This is done by calling ice_mbx_report_malvf after ice_mbx_vf_state_handler sets its "is_malvf" return parameter true. Instead of requiring two steps, directly update the malicious bit in the state handler, and remove the need for separately calling ice_mbx_report_malvf. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: remove ice_mbx_deinit_snapshotJacob Keller1-4/+1
The ice_mbx_deinit_snapshot function's only remaining job is to clear the previous snapshot data. This snapshot data is initialized when SR-IOV adds VFs, so it is not necessary to clear this data when removing VFs. Since no allocation occurs we no longer need to free anything and we can safely remove this function. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: move VF overflow message count into struct ice_mbx_vf_infoJacob Keller1-6/+3
The ice driver has some logic in ice_vf_mbx.c used to detect potentially malicious VF behavior with regards to overflowing the PF mailbox. This logic currently stores message counts in struct ice_mbx_vf_counter.vf_cntr as an array. This array is allocated during initialization with ice_mbx_init_snapshot. This logic makes sense for SR-IOV where all VFs are allocated at once up front. However, in the future with Scalable IOV this logic will not work. VFs can be added and removed dynamically. We could try to keep the vf_cntr array for the maximum possible number of VFs, but this is a waste of memory. Use the recently introduced struct ice_mbx_vf_info structure to store the message count. Pass a pointer to the mbx_info for a VF instead of using its VF ID. Replace the array of VF message counts with a linked list that tracks all currently active mailbox tracking info structures. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: track malicious VFs in new ice_mbx_vf_info structureJacob Keller1-4/+3
Currently the PF tracks malicious VFs in a malvfs bitmap which is used by the ice_mbx_clear_malvf and ice_mbx_report_malvf functions. This bitmap is used to ensure that we only report a VF as malicious once rather than continuously spamming the event log. This mechanism of storage for the malicious indication works well enough for SR-IOV. However, it will not work with Scalable IOV. This is because Scalable IOV VFs can be allocated dynamically and might change VF ID when their underlying VSI changes. To support this, the mailbox overflow logic will need to be refactored. First, introduce a new ice_mbx_vf_info structure which will be used to store data about a VF. Embed this structure in the struct ice_vf, and ensure it gets initialized when a new VF is created. For now this only stores the malicious indicator bit. Pass a pointer to the VF's mbx_info structure instead of using a bitmap to keep track of these bits. A future change will extend this structure and the rest of the logic associated with the overflow detection. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-03-13ice: convert ice_mbx_clear_malvf to void and use WARNJacob Keller1-4/+2
The ice_mbx_clear_malvf function checks for a few error conditions before clearing the appropriate data. These error conditions are really warnings that should never occur in a properly initialized driver. Every caller of ice_mbx_clear_malvf just prints a dev_dbg message on failure which will generally be ignored. Convert this function to void and switch the error return values to WARN_ON. This will make any potentially misconfiguration more visible and makes future refactors that involve changing how we store the malicious VF data easier. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-02-06ice: remove unnecessary virtchnl_ether_addr struct useJacob Keller1-8/+8
The dev_lan_addr and hw_lan_addr members of ice_vf are used only to store the MAC address for the VF. They are defined using virtchnl_ether_addr, but only the .addr sub-member is actually used. Drop the use of virtchnl_ether_addr and just use a u8 array of length [ETH_ALEN]. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-02-06ice: introduce .irq_close VF operationJacob Keller1-0/+1
The Scalable IOV implementation will require notifying the VDCM driver when an IRQ must be closed. This allows the VDCM to handle releasing stale IRQ context values and properly reconfigure. To handle this, introduce a new optional .irq_close callback to the VF operations structure. This will be implemented by Scalable IOV to handle the shutdown of the IRQ context. Since the SR-IOV implementation does not need this, we must check that its non-NULL before calling it. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-02-06ice: introduce clear_reset_state operationJacob Keller1-0/+16
When hardware is reset, the VF relies on the VFGEN_RSTAT register to detect when the VF is finished resetting. This is a tri-state register where 0 indicates a reset is in progress, 1 indicates the hardware is done resetting, and 2 indicates that the software is done resetting. Currently the PF driver relies on the device hardware resetting VFGEN_RSTAT when a global reset occurs. This works ok, but it does mean that the VF might not immediately notice a reset when the driver first detects that the global reset is occurring. This is also problematic for Scalable IOV, because there is no read/write equivalent VFGEN_RSTAT register for the Scalable VSI type. Instead, the Scalable IOV VFs will need to emulate this register. To support this, introduce a new VF operation, clear_reset_state, which is called when the PF driver first detects a global reset. The Single Root IOV implementation can just write to VFGEN_RSTAT to ensure it's cleared immediately, without waiting for the actual hardware reset to begin. The Scalable IOV implementation will use this as part of its tracking of the reset status to allow properly reporting the emulated VFGEN_RSTAT to the VF driver. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Paul Menzel <[email protected]> Tested-by: Marek Szlosek <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>