aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice
AgeCommit message (Collapse)AuthorFilesLines
2023-02-06ice: Add more usage of existing function ice_get_vf_vsi(vf)Brett Creeley1-4/+4
Extend the usage of function ice_get_vf_vsi(vf) in multiple places instead of VF's VSI by using a long string of dereferences (i.e. vf->pf->vsi[vf->lan_vsi_idx]). Signed-off-by: Brett Creeley <[email protected]> Signed-off-by: Kalyan Kodamagula <[email protected]> Tested-by: Piotr Tyda <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-02-06net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.hVladimir Oltean1-0/+1
Since mqprio is a scheduler and not a classifier, move its offload structure to pkt_sched.h, where struct tc_taprio_qopt_offload also lies. Also update some header inclusions in drivers that access this structure, to the best of my abilities. Cc: Igor Russkikh <[email protected]> Cc: Yisen Zhuang <[email protected]> Cc: Salil Mehta <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: Thomas Petazzoni <[email protected]> Cc: Saeed Mahameed <[email protected]> Cc: Leon Romanovsky <[email protected]> Cc: Horatiu Vultur <[email protected]> Cc: Lars Povlsen <[email protected]> Cc: Steen Hegelund <[email protected]> Cc: Daniel Machon <[email protected]> Cc: [email protected] Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-02-03ice: implement devlink reinit actionMichal Swiatkowski1-22/+81
Call ice_unload() and ice_load() in driver reinit flow. Block reinit when switchdev, ADQ or SRIOV is active. In reload path we don't want to rebuild all features. Ask user to remove them instead of quitely removing it in reload path. Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: update VSI instead of init in some caseMichal Swiatkowski3-9/+13
ice_vsi_cfg() is called from different contexts: 1) VSI exsist in HW, but it is reconfigured, because of changing queues for example -> update instead of init should be used 2) VSI doesn't exsist, because rest has happened -> init command should be sent To support both cases pass boolean value which will store information what type of command has to be sent to HW. Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: move VSI delete outside deconfigMichal Swiatkowski3-20/+14
In deconfig VSI shouldn't be deleted from hw. Rewrite VSI delete function to reflect that sometimes it is only needed to remove VSI from hw without freeing the memory: ice_vsi_delete() -> delete from HW and free memory ice_vsi_delete_from_hw() -> delete only from HW Value returned from ice_vsi_free() is never used. Change return type to void. Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: sync netdev filters after clearing VSIMichal Swiatkowski1-0/+5
In driver reload path the netdev isn't removed, but VSI is. Remove filters on netdev right after removing them on VSI. Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: split probe into smaller functionsMichal Swiatkowski3-375/+546
Part of code from probe can be reused in reload flow. Move this code to separate function. Create unroll functions for each part of initialization, like: ice_init_dev() and ice_deinit_dev(). It simplifies unrolling and can be used in remove flow. Avoid freeing port info as it could be reused in reload path. Will be freed in remove path since is allocated via devm_kzalloc(). Also clean the remove path to reflect the init steps. Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: stop hard coding the ICE_VSI_CTRL locationJacob Keller1-19/+15
When allocating the ICE_VSI_CTRL, the allocated struct ice_vsi pointer is stored into the PF's pf->vsi array at a fixed location. This was historically done on the basis that it could provide an O(1) lookup for the special control VSI. Since we store the ctrl_vsi_idx, we already have O(1) lookup regardless of where in the array we store this VSI. Simplify the logic in ice_vsi_alloc by using the same method of storing the control VSI as other types of VSIs. Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: split ice_vsi_setup into smaller functionsMichal Swiatkowski4-483/+414
Main goal is to reuse the same functions in VSI config and rebuild paths. To do this split ice_vsi_setup into smaller pieces and reuse it during rebuild. ice_vsi_alloc() should only alloc memory, not set the default values for VSI. Move setting defaults to separate function. This will allow config of already allocated VSI, for example in reload path. The path is mostly moving code around without introducing new functionality. Functions ice_vsi_cfg() and ice_vsi_decfg() were added, but they are using code that already exist. Use flag to pass information about VSI initialization during rebuild instead of using boolean value. Co-developed-by: Jacob Keller <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: cleanup in VSI config/deconfig codeMichal Swiatkowski5-44/+26
Do few small cleanups: 1) Rename the function to reflect that it doesn't configure all things related to VSI. ice_vsi_cfg_lan() better fits to what function is doing. ice_vsi_cfg() can be use to name function that will configure whole VSI. 2) Remove unused ethtype field from VSI. There is no need to set ethtype here, because it is never used. 3) Remove unnecessary check for ICE_VSI_CHNL. There is check for ICE_VSI_CHNL in ice_vsi_get_qs, so there is no need to check it before calling the function. 4) Simplify ice_vsi_alloc() call. There is no need to check the type of VSI before calling ice_vsi_alloc(). For ICE_VSI_CHNL vf is always NULL (ice_vsi_setup() is called with vf=NULL). For ICE_VSI_VF or ICE_VSI_CTRL ch is always NULL and for other VSI types ch and vf are always NULL. 5) Remove unnecessary call to ice_vsi_dis_irq(). ice_vsi_dis_irq() will be called in ice_vsi_close() flow (ice_vsi_close() -> ice_vsi_down() -> ice_vsi_dis_irq()). Remove unnecessary call. 6) Don't remove specific filters in release. All hw filters are removed in ice_fltr_remove_alli(), which is always called in VSI release flow. There is no need to remove only ethertype filters before calling ice_fltr_remove_all(). 7) Rename ice_vsi_clear() to ice_vsi_free(). As ice_vsi_clear() only free memory allocated in ice_vsi_alloc() rename it to ice_vsi_free() which better shows what function is doing. 8) Free coalesce param in rebuild. There is potential memory leak if configuration of VSI lan fails. Free coalesce to avoid it. Signed-off-by: Michal Swiatkowski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: alloc id for RDMA using xa_arrayMichal Swiatkowski1-5/+6
Use xa_array instead of deprecated ida to alloc id for RDMA aux driver. Signed-off-by: Michal Swiatkowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-02-03ice: move RDMA init to ice_idc.cMichal Swiatkowski3-25/+57
Simplify probe flow by moving all RDMA related code to ice_init_rdma(). Unroll irq allocation if RDMA initialization fails. Implement ice_deinit_rdma() and use it in remove flow. Signed-off-by: Michal Swiatkowski <[email protected]> Acked-by: Dave Ertman <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-02-02drivers: net: turn on XDP featuresMarek Majtyka1-0/+5
A summary of the flags being set for various drivers is given below. Note that XDP_F_REDIRECT_TARGET and XDP_F_FRAG_TARGET are features that can be turned off and on at runtime. This means that these flags may be set and unset under RTNL lock protection by the driver. Hence, READ_ONCE must be used by code loading the flag value. Also, these flags are not used for synchronization against the availability of XDP resources on a device. It is merely a hint, and hence the read may race with the actual teardown of XDP resources on the device. This may change in the future, e.g. operations taking a reference on the XDP resources of the driver, and in turn inhibiting turning off this flag. However, for now, it can only be used as a hint to check whether device supports becoming a redirection target. Turn 'hw-offload' feature flag on for: - netronome (nfp) - netdevsim. Turn 'native' and 'zerocopy' features flags on for: - intel (i40e, ice, ixgbe, igc) - mellanox (mlx5). - stmmac - netronome (nfp) Turn 'native' features flags on for: - amazon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2, enetc) - funeth - intel (igb) - marvell (mvneta, mvpp2, octeontx2) - mellanox (mlx4) - mtk_eth_soc - qlogic (qede) - sfc - socionext (netsec) - ti (cpsw) - tap - tsnep - veth - xen - virtio_net. Turn 'basic' (tx, pass, aborted and drop) features flags on for: - netronome (nfp) - cavium (thunder) - hyperv. Turn 'redirect_target' feature flag on for: - amanzon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2) - intel (i40e, ice, igb, ixgbe) - ti (cpsw) - marvell (mvneta, mvpp2) - sfc - socionext (netsec) - qlogic (qede) - mellanox (mlx5) - tap - veth - virtio_net - xen Reviewed-by: Gerhard Engleder <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Co-developed-by: Kumar Kartikeya Dwivedi <[email protected]> Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Co-developed-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Marek Majtyka <[email protected]> Link: https://lore.kernel.org/r/3eca9fafb308462f7edb1f58e451d59209aa07eb.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <[email protected]>
2023-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski5-20/+44
net/core/gro.c 7d2c89b32587 ("skb: Do mix page pool and page referenced frags in GRO") b1a78b9b9886 ("net: add support for ipv4 big tcp") https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2023-02-01ice: xsk: Do not convert to buff to frame for XDP_TXMaciej Fijalkowski4-94/+117
Let us store pointer to xdp_buff that came from xsk_buff_pool on tx_buf so that it will be possible to recycle it via xsk_buff_free() on Tx cleaning side. This way it is not necessary to do expensive copy to another xdp_buff backed by a newly allocated page. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Remove next_{dd,rs} fields from ice_tx_ringMaciej Fijalkowski4-8/+0
Now that both ZC and standard XDP data paths stopped using Tx logic based on next_dd and next_rs fields, we can safely remove these fields and shrink Tx ring structure. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Add support for XDP multi-buffer on Tx sideMaciej Fijalkowski5-96/+199
Similarly as for Rx side in previous patch, logic on XDP Tx in ice driver needs to be adjusted for multi-buffer support. Specifically, the way how HW Tx descriptors are produced and cleaned. Currently, XDP_TX works on strict ring boundaries, meaning it sets RS bit (on producer side) / looks up DD bit (on consumer/cleaning side) every quarter of the ring. It means that if for example multi buffer frame would span across the ring quarter boundary (say that frame consists of 4 frames and we start from 62 descriptor where ring is sized to 256 entries), RS bit would be produced in the middle of multi buffer frame, which would be a broken behavior as it needs to be set on the last descriptor of the frame. To make it work, set RS bit at the last descriptor from the batch of frames that XDP_TX action was used on and make the first entry remember the index of last descriptor with RS bit set. This way, cleaning side can take the index of descriptor with RS bit, look up DD bit's presence and clean from first entry to last. In order to clean up the code base introduce the common ice_set_rs_bit() which will return index of descriptor that got RS bit produced on so that standard driver can store this within proper ice_tx_buf and ZC driver can simply ignore return value. Co-developed-by: Martyna Szapar-Mudlaw <[email protected]> Signed-off-by: Martyna Szapar-Mudlaw <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Add support for XDP multi-buffer on Rx sideMaciej Fijalkowski6-98/+200
Ice driver needs to be a bit reworked on Rx data path in order to support multi-buffer XDP. For skb path, it currently works in a way that Rx ring carries pointer to skb so if driver didn't manage to combine fragmented frame at current NAPI instance, it can restore the state on next instance and keep looking for last fragment (so descriptor with EOP bit set). What needs to be achieved is that xdp_buff needs to be combined in such way (linear + frags part) in the first place. Then skb will be ready to go in case of XDP_PASS or BPF program being not present on interface. If BPF program is there, it would work on multi-buffer XDP. At this point xdp_buff resides directly on Rx ring, so given the fact that skb will be built straight from xdp_buff, there will be no further need to carry skb on Rx ring. Besides removing skb pointer from Rx ring, lots of members have been moved around within ice_rx_ring. First and foremost reason was to place rx_buf with xdp_buff on the same cacheline. This means that once we touch rx_buf (which is a preceding step before touching xdp_buff), xdp_buff will already be hot in cache. Second thing was that xdp_rxq is used rather rarely and it occupies a separate cacheline, so maybe it is better to have it at the end of ice_rx_ring. Other change that affects ice_rx_ring is the introduction of ice_rx_ring::first_desc. Its purpose is twofold - first is to propagate rx_buf->act to all the parts of current xdp_buff after running XDP program, so that ice_put_rx_buf() that got moved out of the main Rx processing loop will be able to tak an appriopriate action on each buffer. Second is for ice_construct_skb(). ice_construct_skb() has a copybreak mechanism which had an explicit impact on xdp_buff->skb conversion in the new approach when legacy Rx flag is toggled. It works in a way that linear part is 256 bytes long, if frame is bigger than that, remaining bytes are going as a frag to skb_shared_info. This means while memcpying frags from xdp_buff to newly allocated skb, care needs to be taken when picking the destination frag array entry. Upon the time ice_construct_skb() is called, when dealing with fragmented frame, current rx_buf points to the *last* fragment, but copybreak needs to be done against the first one. That's where ice_rx_ring::first_desc helps. When frame building spans across NAPI polls (DD bit is not set on current descriptor and xdp->data is not NULL) with current Rx buffer handling state there might be some problems. Since calls to ice_put_rx_buf() were pulled out of the main Rx processing loop and were scoped from cached_ntc to current ntc, remember that now mentioned function relies on rx_buf->act, which is set within ice_run_xdp(). ice_run_xdp() is called when EOP bit was found, so currently we could put Rx buffer with rx_buf->act being *uninitialized*. To address this, change scoping to rely on first_desc on both boundaries instead. This also implies that cleaned_count which is used as an input to ice_alloc_rx_buffers() and tells how many new buffers should be refilled has to be adjusted. If it stayed as is, what could happen is a case where ntc would go over ntu. Therefore, remove cleaned_count altogether and use against allocing routine newly introduced ICE_RX_DESC_UNUSED() macro which is an equivalent of ICE_DESC_UNUSED() dedicated for Rx side and based on struct ice_rx_ring::first_desc instead of next_to_clean. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Use xdp->frame_sz instead of recalculating truesizeMaciej Fijalkowski1-24/+9
SKB path calculates truesize on three different functions, which could be avoided as xdp_buff carries the already calculated truesize under xdp_buff::frame_sz. If ice_add_rx_frag() is adjusted to take the xdp_buff as an input just like functions responsible for creating sk_buff initially, codebase could be simplified by removing these redundant recalculations and rely on xdp_buff::frame_sz instead. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Do not call ice_finalize_xdp_rx() unnecessarilyMaciej Fijalkowski1-1/+1
Currently ice_finalize_xdp_rx() is called only when xdp_prog is present on VSI, which is a good thing. However, this optimization can be enhanced and check only if any of the XDP_TX/XDP_REDIRECT took place in current Rx processing. Non-zero value of @xdp_xmit indicates that xdp_prog is present on VSI. This way XDP_DROP-based workloads will not suffer from unnecessary calls to external function. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Use ice_max_xdp_frame_size() in ice_xdp_setup_prog()Maciej Fijalkowski1-14/+14
This should have been used in there from day 1, let us address that before introducing XDP multi-buffer support for Rx side. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Centrallize Rx buffer recyclingMaciej Fijalkowski2-38/+44
Currently calls to ice_put_rx_buf() are sprinkled through ice_clean_rx_irq() - first place is for explicit flow director's descriptor handling, second is after running XDP prog and the last one is after taking care of skb. 1st callsite was actually only for ntc bump purpose, as Rx buffer to be recycled is not even passed to a function. It is possible to walk through Rx buffers processed in particular NAPI cycle by caching ntc from beginning of the ice_clean_rx_irq(). To do so, let us store XDP verdict inside ice_rx_buf, so action we need to take on will be known. For XDP prog absence, just store ICE_XDP_PASS as a verdict. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Inline eop checkMaciej Fijalkowski2-21/+22
This might be in future used by ZC driver and might potentially yield a minor performance boost. While at it, constify arguments that ice_is_non_eop() takes, since they are pointers and this will help compiler while generating asm. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Pull out next_to_clean bump out of ice_put_rx_buf()Maciej Fijalkowski1-13/+16
Plan is to move ice_put_rx_buf() to the end of ice_clean_rx_irq() so in order to keep the ability of walking through HW Rx descriptors, pull out next_to_clean handling out of ice_put_rx_buf(). Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Store page count inside ice_rx_bufMaciej Fijalkowski2-17/+12
This will allow us to avoid carrying additional auxiliary array of page counts when dealing with XDP multi buffer support. Previously combining fragmented frame to skb was not affected in the same way as XDP would be as whole frame is needed to be in place before executing XDP prog. Therefore, when going through HW Rx descriptors one-by-one, calls to ice_put_rx_buf() need to be taken *after* running XDP prog on a potentially multi buffered frame, so some additional storage of page count is needed. By adding page count to rx buf, it will make it easier to walk through processed entries at the end of rx cleaning routine and decide whether or not buffers should be recycled. While at it, bump ice_rx_buf::pagecnt_bias from u16 up to u32. It was proven many times that calculations on variables smaller than standard register size are harmful. This was also the case during experiments with embedding page count to ice_rx_buf - when this was added as u16 it had a performance impact. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Add xdp_buff to ice_rx_ring structMaciej Fijalkowski3-16/+25
In preparation for XDP multi-buffer support, let's store xdp_buff on Rx ring struct. This will allow us to combine fragmented frames across separate NAPI cycles in the same way as currently skb fragments are handled. This means that skb pointer on Rx ring will become redundant and will be removed. For now it is kept and layout of Rx ring struct was not inspected, some member movement will be needed later on so that will be the time to take care of it. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-02-01ice: Prepare legacy-rx for upcoming XDP multi-buffer supportMaciej Fijalkowski5-23/+17
Rx path is going to be modified in a way that fragmented frame will be gathered within xdp_buff in the first place. This approach implies that underlying buffer has to provide tailroom for skb_shared_info. This is currently the case when ring uses build_skb but not when legacy-rx knob is turned on. This case configures 2k Rx buffers and has no way to provide either headroom or tailroom - FWIW it currently has XDP_PACKET_HEADROOM which is broken and in here it is removed. 2k Rx buffers were used so driver in this setting was able to support 9k MTU as it can chain up to 5 Rx buffers. With offset configuring HW writing 2k of a data was passing the half of the page which broke the assumption of our internal page recycling tricks. Now if above got fixed and legacy-rx path would be left as is, when referring to skb_shared_info via xdp_get_shared_info_from_buff(), packet's content would be corrupted again. Hence size of Rx buffer needs to be lowered and therefore supported MTU. This operation will allow us to keep the unified data path and with 8k MTU users (if any of legacy-rx) would still be good to go. However, tendency is to drop the support for this code path at some point. Add ICE_RXBUF_1664 as vsi::rx_buf_len and ICE_MAX_FRAME_LEGACY_RX (8320) as vsi::max_frame for legacy-rx. For bigger page sizes configure 3k Rx buffers, not 2k. Since headroom support is removed, disable data_meta support on legacy-rx. When preparing XDP buff, rely on ice_rx_ring::rx_offset setting when deciding whether to support data_meta or not. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-01-30ice: Remove redundant pci_enable_pcie_error_reporting()Bjorn Helgaas1-3/+0
pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Also remove the corresponding pci_disable_pcie_error_reporting() from the driver .remove() path. Note that this doesn't control interrupt generation by the Root Port; that is controlled by the AER Root Error Command register, which is managed by the AER service driver. Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Cc: [email protected] Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-30devlink: remove devlink featuresJiri Pirko1-1/+0
Devlink features were introduced to disallow devlink reload calls of userspace before the devlink was fully initialized. The reason for this workaround was the fact that devlink reload was originally called without devlink instance lock held. However, with recent changes that converted devlink reload to be performed under devlink instance lock, this is redundant so remove devlink features entirely. Note that mlx5 used this to enable devlink reload conditionally only when device didn't act as multi port slave. Move the multi port check into mlx5_devlink_reload_down() callback alongside with the other checks preventing the device from reload in certain states. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-11/+17
Conflicts: drivers/net/ethernet/intel/ice/ice_main.c 418e53401e47 ("ice: move devlink port creation/deletion") 643ef23bd9dd ("ice: Introduce local var for readability") https://lore.kernel.org/all/[email protected]/ https://lore.kernel.org/all/[email protected]/ drivers/net/ethernet/engleder/tsnep_main.c 3d53aaef4332 ("tsnep: Fix TX queue stop/wake for multiple queues") 25faa6a4c5ca ("tsnep: Replace TX spin_lock with __netif_tx_lock") https://lore.kernel.org/all/[email protected]/ net/netfilter/nf_conntrack_proto_sctp.c 13bd9b31a969 ("Revert "netfilter: conntrack: add sctp DATA_SENT state"") a44b7651489f ("netfilter: conntrack: unify established states for SCTP paths") f71cb8f45d09 ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets") https://lore.kernel.org/all/[email protected]/ https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27ice: Prevent set_channel from changing queues while RDMA activeDave Ertman5-19/+43
The PF controls the set of queues that the RDMA auxiliary_driver requests resources from. The set_channel command will alter that pool and trigger a reconfiguration of the VSI, which breaks RDMA functionality. Prevent set_channel from executing when RDMA driver bound to auxiliary device. Adding a locked variable to pass down the call chain to avoid double locking the device_lock. Fixes: 348048e724a0 ("ice: Implement iidc operations") Signed-off-by: Dave Ertman <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-27ice: remove pointless calls to devlink_param_driverinit_value_set()Jiri Pirko1-18/+2
devlink_param_driverinit_value_set() call makes sense only for "driverinit" params. However here, both params are "runtime". devlink_param_driverinit_value_set() returns -EOPNOTSUPP in such case and does not do anything. So remove the pointless calls to devlink_param_driverinit_value_set() entirely. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-24ice: move devlink port creation/deletionPaul M Stillwell Jr2-11/+17
Commit a286ba738714 ("ice: reorder PF/representor devlink port register/unregister flows") moved the code to create and destroy the devlink PF port. This was fine, but created a corner case issue in the case of ice_register_netdev() failing. In that case, the driver would end up calling ice_devlink_destroy_pf_port() twice. Additionally, it makes no sense to tie creation of the devlink PF port to the creation of the netdev so separate out the code to create/destroy the devlink PF port from the netdev code. This makes it a cleaner interface. Fixes: a286ba738714 ("ice: reorder PF/representor devlink port register/unregister flows") Signed-off-by: Paul M Stillwell Jr <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-20ice: use GNSS subsystem instead of TTYArkadiusz Kubalewski4-257/+146
Previously support for GNSS was implemented as a TTY driver, it allowed to access GNSS receiver on /dev/ttyGNSS_<bus><func>. Use generic GNSS subsystem API instead of implementing own TTY driver. The receiver is accessible on /dev/gnss<id>. In case of multiple receivers in the OS, correct device can be found by enumerating either: - /sys/class/net/<eth port>/device/gnss/ - /sys/class/gnss/gnss<id>/device/ Using GNSS subsystem is superior to implementing own TTY driver, as the GNSS subsystem was designed solely for this purpose. It also implements TTY driver but in a common and defined way. From user perspective, there is no difference in communicating with a device, except new path to the device shall be used. The device will provide same information to the userspace as the old one, and can be used in the same way, i.e.: old # gpsmon /dev/ttyGNSS_2100_0 new # gpsmon /dev/gnss0 There is no other impact on userspace tools. User expecting onboard GNSS receiver support is required to enable CONFIG_GNSS=y/m in kernel config. Reviewed-by: Alexander Lobakin <[email protected]> Signed-off-by: Karol Kolacinski <[email protected]> Signed-off-by: Michal Michalik <[email protected]> Signed-off-by: Arkadiusz Kubalewski <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-19ice: Remove excess spaceTony Nguyen1-1/+1
smatch reports inconsistent indenting due to an extra space; remove it to resolve the issue. smatch warnings: drivers/net/ethernet/intel/ice/ice_lib.c:1673 ice_vsi_alloc_ring_stats() warn: inconsistent indenting Reported-by: kernel test robot <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: Introduce local var for readabilityTony Nguyen1-3/+6
Based on previous feedback[1], introduce a local var to make things more readable. [1] https://lore.kernel.org/netdev/20220315203218.607f612b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/ Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
2023-01-19ice: Match parameter name for ice_cfg_phy_fc()Tony Nguyen1-1/+1
The parameter name in the function declaration and definition do not match; adjust the naming for consistency and to avoid confusion. Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: Explicitly return 0Tony Nguyen2-2/+2
Previous checks, and goto, will catch all errors meaning these returns will only return 0; explicitly return 0 for these cases. Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
2023-01-19ice: Reduce scope of variablesTony Nguyen2-5/+10
There are some places where the scope of a variable can be reduced, so do that. Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
2023-01-19ice: Move support DDP code out of ice_flex_pipe.cSergey Temerkhanov6-2300/+2344
Currently, ice_flex_pipe.c includes the DDP loading functions and has grown large. Although flexible processing support code is related to DDP loading, these parts are distinct. Move the DDP loading functionality from ice_flex_pipe.c to a separate file. Signed-off-by: Sergey Temerkhanov <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: Remove cppcheck suppressionsTony Nguyen4-8/+0
The use of suppressions for cppcheck in the kernel does not look to be standard as the ice driver is the only one doing it. Remove the comments/suppressions. Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: combine cases in ice_ksettings_find_adv_link_speed()Przemek Kitszel1-12/+8
Combine if statements setting the same link speed together. Suggested-by: Tony Nguyen <[email protected]> Signed-off-by: Przemek Kitszel <[email protected]> Acked-by: Anirudh Venkataramanan <[email protected]> Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: Add support for 100G KR2/CR2/SR2 link reportingAnirudh Venkataramanan1-9/+32
Commit 2736d94f351b ("ethtool: Added support for 50Gbps per lane link modes") in v5.1 added (among other things) support for 100G CR2/KR2/SR2 link modes. Advertise these link modes if the firmware reports the corresponding PHY types. Signed-off-by: Anirudh Venkataramanan <[email protected]> Signed-off-by: Przemek Kitszel <[email protected]> Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: add missing checks for PF vsi typeJesse Brandeburg1-9/+8
There were a few places we had missed checking the VSI type to make sure it was definitely a PF VSI, before calling setup functions intended only for the PF VSI. This doesn't fix any explicit bugs but cleans up the code in a few places and removes one explicit != vsi->type check that can be superseded by this code (it's a super set) Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: remove redundant non-null check in ice_setup_pf_sw()Anirudh Venkataramanan1-7/+5
Remove a redundant null check, as vsi could not be null at this point. Signed-off-by: Anirudh Venkataramanan <[email protected]> Signed-off-by: Przemek Kitszel <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: restrict PTP HW clock freq adjustments to 100, 000, 000 PPBSiddaraju DH1-1/+1
The PHY provides only 39b timestamp. With current timing implementation, we discard lower 7b, leaving 32b timestamp. The driver reconstructs the full 64b timestamp by correlating the 32b timestamp with cached_time for performance. The reconstruction algorithm does both forward & backward interpolation. The 32b timeval has overflow duration of 2^32 counts ~= 4.23 second. Due to interpolation in both direction, its now ~= 2.125 second IIRC, going with at least half a duration, the cached_time is updated with periodic thread of 1 second (worst-case) periodicity. But the 1 second periodicity is based on System-timer. With PPB adjustments, if the 1588 timers increments at say double the rate, (2s in-place of 1s), the Nyquist rate/half duration sampling/update of cached_time with 1 second periodic thread will lead to incorrect interpolations. Hence we should restrict the PPB adjustments to at least half duration of cached_time update which translates to 500,000,000 PPB. Since the periodicity of the cached-time system thread can vary, it is good to have some buffer time and considering practicality of PPB adjustments, limiting the max_adj to 100,000,000. Signed-off-by: Siddaraju DH <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: Support drop actionAmritha Nambiar2-20/+40
Currently the drop action is supported only in switchdev mode. Add support for offloading receive filters with action drop in ADQ/non-ADQ modes. This is in addition to other actions such as forwarding to a VSI (ADQ) or a queue (ADQ/non-ADQ). Also renamed 'ch_vsi' to 'dest_vsi' as it is valid for multiple actions such as forward to vsi/queue which may/may not create a channel vsi. Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Amritha Nambiar <[email protected]> Tested-by: Bharathi Sreenivas <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: Handle LLDP MIB Pending changeAnatolii Gerasymenko3-15/+91
If the number of Traffic Classes (TC) is decreased, the FW will no longer remove TC nodes, but will send a pending change notification. This will allow RDMA to destroy corresponding Control QP markers. After RDMA finishes outstanding operations, the ice driver will send an execute MIB Pending change admin queue command to FW to finish DCB configuration change. The FW will buffer all incoming Pending changes, so there can be only one active Pending change. RDMA driver guarantees to remove Control QP markers within 5000 ms. Hence, LLDP response timeout txTTL (default 30 sec) will be met. In the case of a Pending change, LLDP MIB Change Event (opcode 0x0A01) will contain the whole new MIB. But Get LLDP MIB (opcode 0x0A00) AQ call would still return an old MIB, as the Pending change hasn't been applied yet. Add ice_get_dcb_cfg_from_mib_change() function to retrieve DCBX config from LLDP MIB Change Event's buffer for Pending changes. Co-developed-by: Dave Ertman <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Signed-off-by: Anatolii Gerasymenko <[email protected]> Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-19ice: Add 'Execute Pending LLDP MIB' Admin Queue commandTsotne Chakhvadze4-2/+33
In DCB Willing Mode (FW managed LLDP), when the link partner changes configuration which requires fewer TCs, the TCs that are no longer needed are suspended by EMP FW, removed, and never resumed. This occurs before a MIB change event is indicated to SW. The permanent suspension and removal of these TC nodes in the scheduler prevents RDMA from being able to destroy QPs associated with this TC, requiring a CORE reset to recover. A new DCBX configuration change flow is defined to allow SW driver and other SW components (RDMA) to properly adjust to the configuration changes before they are taking effect in HW. This flow includes a two-way handshake between EMP FW<->LAN SW<->RDMA SW. List of changes: - Add 'Execute Pending LLDP MIB' AQC. - Add 'Pending Event Enable' bit. - Add additional logic to ignore Pending Event Enable' request while 'LLDP MIB Chnage' event is disabled. - Add 'Execute Pending LLDP MIB' AQC sending function to FW, which is needed to take place MIB Event change. Signed-off-by: Tsotne Chakhvadze <[email protected]> Co-developed-by: Karen Sornek <[email protected]> Signed-off-by: Karen Sornek <[email protected]> Co-developed-by: Dave Ertman <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Co-developed-by: Anatolii Gerasymenko <[email protected]> Signed-off-by: Anatolii Gerasymenko <[email protected]> Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2023-01-09ice: Add check for kzallocJiasheng Jiang1-9/+14
Add the check for the return value of kzalloc in order to avoid NULL pointer dereference. Moreover, use the goto-label to share the clean code. Fixes: d6b98c8d242a ("ice: add write functionality for GNSS TTY") Signed-off-by: Jiasheng Jiang <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>