path: root/drivers/net
Age         Commit message  (Author; files changed, lines -deleted/+added)

2023-03-30  wifi: iwlwifi: mvm: add support for post_channel_switch in MLD mode  (Miri Korenblit; 3 files changed, -4/+9)
Adjust the existing iwl_mvm_post_channel_switch() to the new MLD API and use it in the new MLD ieee80211_ops.

Signed-off-by: Miri Korenblit <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.fa3992f7dfd2.Ie298a9b1522e956d7b699f0432795548bc6e47f9@changeid
Signed-off-by: Johannes Berg <[email protected]>

2023-03-30  wifi: iwlwifi: mvm: unite sta_modify_disable_tx flows  (Miri Korenblit; 5 files changed, -23/+47)
These flows are the same in both the MLD API and the current API, except for the commands that are sent during these flows. Instead of checking which API we use each time before calling these flows and then calling the correct function, always call the old one, which in turn will call the new one if we're using the MLD API.

Signed-off-by: Miri Korenblit <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.5692d8dea9be.Ib1882b2c2f0b0603abc4b7d4a0ecc45cd1fbf9a7@changeid
Signed-off-by: Johannes Berg <[email protected]>

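For illustration, the unified entry point described above amounts to a thin dispatcher; a minimal sketch, assuming a flag like the one added later in this series (the flag and helper names are not the driver's exact symbols):

  void iwl_mvm_sta_modify_disable_tx(struct iwl_mvm *mvm,
                                     struct iwl_mvm_sta *mvmsta, bool disable)
  {
      /* Callers keep using the old entry point; the MLD decision is
       * made in exactly one place.
       */
      if (mvm->mld_api_is_used)
          iwl_mvm_mld_sta_modify_disable_tx(mvm, mvmsta, disable);
      else
          iwl_mvm_legacy_sta_modify_disable_tx(mvm, mvmsta, disable);
  }
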
2023-03-30  wifi: iwlwifi: mvm: add cancel/remain_on_channel for MLD mode  (Miri Korenblit; 7 files changed, -6/+142)
Add an MLD version of the remain_on_channel and cancel_remain_on_channel callbacks.

Signed-off-by: Miri Korenblit <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.b51813dbebd4.Ia25bbd63d3138e4759237ce2be0cd0436fe01c0a@changeid
Signed-off-by: Johannes Berg <[email protected]>

2023-03-30  wifi: iwlwifi: mvm: refactor iwl_mvm_roc()  (Miri Korenblit; 1 file changed, -44/+56)
This flow is almost the same for both MLD and non-MLD modes, except for some function calls. Therefore there is no reason to add an MLD version of this flow. Instead, put the parts that are unique to each mode in helper functions, and in the next patch each version of this flow will call the common part with pointers to its specific helper functions.

Signed-off-by: Miri Korenblit <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.61bc077a7f3c.Ia3aa81d3293792bf8f80528dbc67a711ce334b32@changeid
Signed-off-by: Johannes Berg <[email protected]>

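A sketch of the helper-pointer arrangement this describes (the ops struct and helper names are assumptions for illustration):

  /* Hooks for the mode-specific parts of the ROC flow. */
  struct iwl_mvm_roc_ops {
      int (*add_aux_sta_for_hs20)(struct iwl_mvm *mvm, u32 lmac_id);
      int (*switch_phy_ctxt)(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
                             struct iwl_mvm_phy_ctxt *new_phy_ctxt);
  };

  int iwl_mvm_roc_common(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
                         struct ieee80211_channel *channel, int duration,
                         enum ieee80211_roc_type type,
                         const struct iwl_mvm_roc_ops *ops);

  /* Each mode's ieee80211_ops callback wraps the common flow: */
  static int iwl_mvm_roc(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
                         struct ieee80211_channel *channel, int duration,
                         enum ieee80211_roc_type type)
  {
      static const struct iwl_mvm_roc_ops ops = {
          .add_aux_sta_for_hs20 = iwl_mvm_add_aux_sta_for_hs20,
          .switch_phy_ctxt = iwl_mvm_roc_switch_binding,
      };

      return iwl_mvm_roc_common(hw, vif, channel, duration, type, &ops);
  }
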
2023-03-30  wifi: iwlwifi: mvm: add some new MLD ops  (Miri Korenblit; 5 files changed, -80/+446)
Add MLD versions of the bss_info_changed, switch_vif_chanctx, config_iface_filter and conf_tx() callbacks.

Signed-off-by: Miri Korenblit <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.9c83c253d610.Ibf2006be9ece87896c17cb43dfe3654ac73d81ff@changeid
Signed-off-by: Johannes Berg <[email protected]>

2023-03-30  wifi: iwlwifi: mvm: add sta handling flows for MLD mode  (Miri Korenblit; 6 files changed, -25/+333)
In MLD mode we have a new STA command. As a result, the flows for adding/updating/removing a station and handling its state also change. Add these flows.

Signed-off-by: Miri Korenblit <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.b5548cfd8fe3.I70f9c8f3c95e18d5c9af0a5681e0830893509531@changeid
Signed-off-by: Johannes Berg <[email protected]>

2023-03-30  wifi: iwlwifi: mvm: add an indication that the new MLD API is used  (Miri Korenblit; 2 files changed, -0/+4)
We can't mix the new MLD API and the old API, i.e. we can't send one of the new commands and then one of the old ones, as this will cause a FW assert. So we need an indication of which API should be used. We use the new API if:

1. The FW supports it
2. We registered to mac80211 with the new MLD ops

Add an indication which will only be true if both conditions are true.

Signed-off-by: Miri Korenblit <[email protected]>
Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.5756b0907403.I0adce36d1783cce23d0e080e3c4a8953db33b515@changeid
Signed-off-by: Johannes Berg <[email protected]>

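As a sketch, the indication reduces to a single condition; the capability flag and field names below are assumptions, not the driver's exact symbols:

  static bool iwl_mvm_mld_api_in_use(const struct iwl_mvm *mvm)
  {
      /* Both conditions must hold; mixing the two APIs asserts the FW. */
      return fw_has_capa(&mvm->fw->ucode_capa,
                         IWL_UCODE_TLV_CAPA_MLD_API_SUPPORT) &&  /* 1. */
             mvm->registered_with_mld_ops;                       /* 2. */
  }
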
2023-03-30  wifi: iwlwifi: mvm: sta preparation for MLO  (Gregory Greenman; 13 files changed, -123/+161)
Split iwl_mvm_sta into general and link specific parts. As a first step, all link dependent parameters reside in deflink. The change was done mostly using the spatch below with some manual adjustments.

  @iwl_mvm_sta@
  struct iwl_mvm_sta *s;
  identifier var = {sta_id, lq_sta, avg_energy};
  @@
  (
  s->
  - var
  + deflink.var
  )

Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.34eace06d583.I1f8c5e919a71b21030460fbdd220d42401b688b1@changeid
Signed-off-by: Johannes Berg <[email protected]>

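The resulting layout follows the pattern below; a condensed sketch, with the link-struct name and field types assumed from the fields named in the spatch:

  /* Link-specific part of a station; MLO code gets one per link. */
  struct iwl_mvm_link_sta {
      u32 sta_id;
      struct iwl_lq_sta lq_sta;
      u32 avg_energy;
  };

  struct iwl_mvm_sta {
      /* ... general, link-independent fields ... */

      /* default link; non-MLO code keeps working through deflink */
      struct iwl_mvm_link_sta deflink;
  };

  /* accesses change from mvm_sta->sta_id to: */
  static u32 iwl_mvm_sta_fw_id(const struct iwl_mvm_sta *mvm_sta)
  {
      return mvm_sta->deflink.sta_id;
  }
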
2023-03-30  wifi: iwlwifi: mvm: vif preparation for MLO  (Gregory Greenman; 23 files changed, -308/+355)
In MLO, some fields of iwl_mvm_vif should be defined in the context of a link. Define a separate structure for these fields and add a deflink object to hold it as part of iwl_mvm_vif. Non-MLO legacy code will use only the deflink object while MLO related code will use the corresponding link from the link array. It follows the strategy applied in mac80211 for introducing MLO changes.

The below spatch takes care of updating all driver code to access fields separated into the MLD specific data structure via deflink (it needs to convert all references to the fields listed in var to deflink.var and also to take care of calls like iwl_mvm_vif_from_mac80211(vif)->field).

  @iwl_mld_vif@
  struct iwl_mvm_vif *v;
  struct ieee80211_vif *vv;
  identifier fn;
  identifier var = {bssid, ap_sta_id, bcast_sta, mcast_sta, beacon_stats, smps_requests, probe_resp_data, he_ru_2mhz_block, cab_queue, phy_ctxt, queue_params};
  @@
  (
  v->
  - var
  + deflink.var
  |
  fn(vv)->
  - var
  + deflink.var
  )

Signed-off-by: Gregory Greenman <[email protected]>
Link: https://lore.kernel.org/r/20230328104948.4896576f0a9f.Ifaf0187c96b9fe52b24bd629331165831a877691@changeid
Signed-off-by: Johannes Berg <[email protected]>

2023-03-30  wifi: iwlwifi: mvm: Use 64-bit division helper in iwl_mvm_get_crosstimestamp_fw()  (Nathan Chancellor; 1 file changed, -1/+1)
There is a 64-bit division in iwl_mvm_get_crosstimestamp_fw(), which results in a link failure when building 32-bit architectures with clang:

  ld.lld: error: undefined symbol: __udivdi3
  >>> referenced by ptp.c
  >>> drivers/net/wireless/intel/iwlwifi/mvm/ptp.o:(iwl_mvm_phc_get_crosstimestamp) in archive vmlinux.a

GCC has optimizations for division by a constant that clang does not implement, so this issue is not visible when building with GCC.

Use the 64-bit division helper div_u64(), which takes a u64 dividend and u32 divisor, which matches this situation and prevents the emission of a libcall for the division.

Fixes: 21fb8da6ebe4 ("wifi: iwlwifi: mvm: read synced time from firmware if supported")
Reported-by: Arnd Bergmann <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/1826
Reported-by: "kernelci.org bot" <[email protected]>
Link: https://lore.kernel.org/[email protected]/
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>

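For reference, a minimal example of the helper in use (the variable names are illustrative; div_u64() takes exactly this u64/u32 pair):

  #include <linux/math64.h>

  /* On 32-bit kernels a plain u64-by-u32 '/' can emit a call to the
   * libgcc helper __udivdi3, which the kernel does not provide.
   * div_u64() avoids emitting that libcall.
   */
  static u64 ns_per_sample(u64 total_ns, u32 samples)
  {
      return div_u64(total_ns, samples);
  }
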
2023-03-29  Merge tag 'mlx5-updates-2023-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux  (Jakub Kicinski; 10 files changed, -363/+464)
Saeed Mahameed says:
====================
mlx5-updates-2023-03-28

Dragos Tatulea says:
====================
net/mlx5e: RX, Drop page_cache and fully use page_pool

For page allocation on the rx path, the mlx5e driver has been using an internal page cache in tandem with the page pool. The internal page cache uses a queue for page recycling which has the issue of head of queue blocking.

This patch series drops the internal page_cache altogether and uses the page_pool to implement everything that was done by the page_cache before:
* Let the page_pool handle dma mapping and unmapping.
* Use fragmented pages with fragment counter instead of tracking via page ref.
* Enable skb recycling.

The patch series has the following effects on the rx path:

* Improved performance for the cases when there was low page recycling due to head of queue blocking in the internal page_cache. The test for this was running a single iperf TCP stream to a rx queue which is bound on the same cpu as the application.

  |-------------+--------+--------+------+---------|
  | rq type     | before | after  | unit | diff    |
  |-------------+--------+--------+------+---------|
  | striding rq | 30.1   | 31.4   | Gbps | 4.14 %  |
  | legacy rq   | 30.2   | 33.0   | Gbps | 8.48 %  |
  |-------------+--------+--------+------+---------|

* Small XDP performance degradation. The test is an XDP drop program running on a single rx queue with small packets incoming; it looks like this:

  |-------------+----------+----------+------+---------|
  | rq type     | before   | after    | unit | diff    |
  |-------------+----------+----------+------+---------|
  | striding rq | 19725449 | 18544617 | pps  | -6.37 % |
  | legacy rq   | 19879931 | 18631841 | pps  | -6.70 % |
  |-------------+----------+----------+------+---------|

  This will be handled in a different patch series by adding support for multi-packet per page.

* For other cases the performance is roughly the same.

The above numbers were obtained on the following system:
24 core Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
32 GB RAM
ConnectX-7 single port

The breakdown on the patch series is the following:
* Preparations for introducing the mlx5e_frag_page struct.
* Delete the mlx5e_page_cache struct.
* Enable dma mapping from page_pool.
* Enable skb recycling and fragment counting.
* Do deferred release of pages (just before alloc) to ensure better page_pool cache utilization.
====================

* tag 'mlx5-updates-2023-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats
  net/mlx5e: RX, Break the wqe bulk refill in smaller chunks
  net/mlx5e: RX, Increase WQE bulk size for legacy rq
  net/mlx5e: RX, Split off release path for xsk buffers for legacy rq
  net/mlx5e: RX, Defer page release in legacy rq for better recycling
  net/mlx5e: RX, Change wqe last_in_page field from bool to bit flags
  net/mlx5e: RX, Defer page release in striding rq for better recycling
  net/mlx5e: RX, Rename xdp_xmit_bitmap to a more generic name
  net/mlx5e: RX, Enable skb page recycling through the page_pool
  net/mlx5e: RX, Enable dma map and sync from page_pool allocator
  net/mlx5e: RX, Remove internal page_cache
  net/mlx5e: RX, Store SHAMPO header pages in array
  net/mlx5e: RX, Remove alloc unit layout constraint for striding rq
  net/mlx5e: RX, Remove alloc unit layout constraint for legacy rq
  net/mlx5e: RX, Remove mlx5e_alloc_unit argument in page allocation
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  bnxt_en: Add missing 200G link speed reporting  (Michael Chan; 2 files changed, -0/+3)
bnxt_fw_to_ethtool_speed() is missing the case statement for 200G link speed reported by firmware. As a result, ethtool will report unknown speed when the firmware reports 200G link speed.

Fixes: 532262ba3b84 ("bnxt_en: ethtool: support PAM4 link speeds up to 200G")
Signed-off-by: Michael Chan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

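A condensed sketch of the mapping in question; SPEED_200000 and SPEED_UNKNOWN are standard ethtool constants, while the firmware-side speed macros are assumptions modeled on the driver's naming:

  #include <linux/ethtool.h>

  static u32 fw_to_ethtool_speed(u16 fw_link_speed)
  {
      switch (fw_link_speed) {
      case BNXT_LINK_SPEED_100GB:
          return SPEED_100000;
      case BNXT_LINK_SPEED_200GB:  /* previously missing */
          return SPEED_200000;
      default:
          return SPEED_UNKNOWN;
      }
  }
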
2023-03-29  bnxt_en: Fix typo in PCI id to device description string mapping  (Kalesh AP; 1 file changed, -4/+4)
Fix 57502 and 57508 NPAR description string entries. The typos caused these devices to not match up with lspci output.

Fixes: 49c98421e6ab ("bnxt_en: Add PCI IDs for 57500 series NPAR devices.")
Reviewed-by: Pavan Chebbi <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  bnxt_en: Fix reporting of test result in ethtool selftest  (Kalesh AP; 1 file changed, -0/+1)
When the selftest command fails because bnxt_close_nic() fails, the driver does not report the failure by updating "test->flags".

Fixes: eb51365846bc ("bnxt_en: Add basic ethtool -t selftest support.")
Reviewed-by: Pavan Chebbi <[email protected]>
Reviewed-by: Somnath Kotur <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  i40e: fix registers dump after run ethtool adapter self test  (Radoslaw Tyl; 2 files changed, -6/+7)
Fix an invalid register dump from 'ethtool -d ethX' after running an adapter self test with 'ethtool -t ethY'; the self test left the dump displaying invalid data. The problem was caused by overwriting i40e_reg_list[].elements, which is shared between the ethtool self test and the dump.

Fixes: 22dd9ae8afcc ("i40e: Rework register diagnostic")
Signed-off-by: Radoslaw Tyl <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue  (Jakub Kicinski; 5 files changed, -8/+102)
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-28 (ice)

This series contains updates to the ice driver only.

Jesse fixes mismatched header documentation reported when building with W=1.

Brett restricts setting of VSI context to only applicable fields for the given ICE_AQ_VSI_PROP_Q_OPT_VALID bit.

Junfeng adds a check when adding Flow Director filters that conflict with existing filter rules.

Jakob Koschel adds an interim variable for iterating to prevent possible misuse after looping.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg()
  ice: add profile conflict check for AVF FDIR
  ice: Fix ice_cfg_rdma_fltr() to only update relevant fields
  ice: fix W=1 headers mismatch
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  Merge tag 'ieee802154-for-net-2023-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan  (Jakub Kicinski; 1 file changed, -2/+1)
Stefan Schmidt says:
====================
ieee802154 for net 2023-03-29

Two small fixes this time.

Dongliang Mu removed an unnecessary null pointer check.

Harshit Mogalapalli fixed an unsigned-versus-signed integer comparison introduced by a recent fix in the ca8210 driver.

* tag 'ieee802154-for-net-2023-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan:
  net: ieee802154: remove an unnecessary null pointer check
  ca8210: Fix unsigned mac_len comparison with zero in ca8210_skb_tx()
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  net: ena: removed unused tx_bytes variable  (Simon Horman; 1 file changed, -2/+0)
clang 16.0.0 with W=1 reports:

  drivers/net/ethernet/amazon/ena/ena_netdev.c:1901:6: error: variable 'tx_bytes' set but not used [-Werror,-Wunused-but-set-variable]
          u32 tx_bytes = 0;

The variable is not used so remove it.

Signed-off-by: Simon Horman <[email protected]>
Acked-by: Shay Agroskin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  octeon_ep: unlock the correct lock on error path  (Dan Carpenter; 1 file changed, -1/+1)
The h and the f letters are swapped so it unlocks the wrong lock.

Fixes: 577f0d1b1c5f ("octeon_ep: add separate mailbox command and response queues")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  bnx2x: use the right build_skb() helper  (Jakub Kicinski; 1 file changed, -2/+14)
build_skb() no longer accepts slab buffers. Since slab use is fairly uncommon we prefer the drivers to call a separate slab_build_skb() function appropriately.

bnx2x uses the old semantics where a size of 0 meant a buffer from slab. It sets the fp->rx_frag_size to 0 for MTUs which don't fit in a page. It needs to call slab_build_skb().

This fixes the WARN_ONCE() of incorrect API use seen with bnx2x.

Reported-by: Thomas Voegtle <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Fixes: ce098da1497c ("skbuff: Introduce slab_build_skb()")
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

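The shape of the fix, as a sketch condensed from the surrounding receive path (the helper name wrapping it is illustrative):

  static struct sk_buff *bnx2x_frag_to_skb(struct bnx2x_fastpath *fp,
                                           void *data)
  {
      /* fp->rx_frag_size == 0 encodes "buffer came from the slab",
       * which build_skb() no longer accepts.
       */
      if (fp->rx_frag_size)
          return build_skb(data, fp->rx_frag_size);

      return slab_build_skb(data);
  }
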
2023-03-29  net: ipa: compute DMA pool size properly  (Alex Elder; 1 file changed, -1/+1)
In gsi_trans_pool_init_dma(), the total size of a pool of memory used for DMA transactions is calculated. However, the calculation is done incorrectly.

For 4KB pages, this total size is currently always more than one page, and as a result, the calculation produces a positive (though incorrect) total size. The code still works in this case; we just end up with fewer DMA pool entries than we intended.

Bjorn Andersson tested booting a kernel with 16KB pages, and hit a null pointer dereference in sg_alloc_append_table_from_pages(), descending from gsi_trans_pool_init_dma(). The cause of this was that a 16KB total size was going to be allocated, and with 16KB pages the order of that allocation is 0. The total_size calculation yielded 0, which eventually led to the crash.

Correcting the total_size calculation fixes the problem.

Reported-by: Bjorn Andersson <[email protected]>
Tested-by: Bjorn Andersson <[email protected]>
Fixes: 9dd441e4ed57 ("soc: qcom: ipa: GSI transactions")
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  net: wwan: iosm: fixes 7560 modem crash  (M Chetan Kumar; 1 file changed, -0/+7)
ModemManager/apps probing the wwan0xmmrpc0 port on the 7560 modem results in a modem crash. The 7560 modem FW uses the MBIM interface for control command communication, whereas the 7360 uses the Intel RPC interface, so disable the wwan0xmmrpc0 port for the 7560.

Fixes: d08b0f8f46e4 ("net: wwan: iosm: add rpc interface for xmm modems")
Reported-and-tested-by: Martin <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217200
Signed-off-by: M Chetan Kumar <[email protected]>
Signed-off-by: Shane Parslow <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2023-03-29  net: hns3: support wake on lan configuration and query  (Hao Lan; 9 files changed, -0/+202)
The HNS3 driver supports Wake-on-LAN, which can wake up the server from power off state to power on state by magic packet or magic security packet.

ChangeLog:
v1->v2: Deleted the debugfs function that overlaps with the ethtool function, as suggested by Andrew Lunn.
v2->v3: Return the wol configuration stored in the driver, suggested by Alexander H Duyck.
v3->v4: Add a helper to go from netdev to the local struct, suggested by Simon Horman and Jakub Kicinski.

Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Hao Lan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

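A sketch of the standard ethtool hooks this wires up; the private-struct field and the flow inside the handlers are assumptions for illustration:

  #include <linux/ethtool.h>
  #include <linux/netdevice.h>

  static void hns3_get_wol(struct net_device *netdev,
                           struct ethtool_wolinfo *wol)
  {
      struct hns3_nic_priv *priv = netdev_priv(netdev);

      wol->supported = WAKE_MAGIC | WAKE_MAGICSECURE;
      wol->wolopts = priv->wol_opts;  /* report what the driver stored */
  }

  static int hns3_set_wol(struct net_device *netdev,
                          struct ethtool_wolinfo *wol)
  {
      struct hns3_nic_priv *priv = netdev_priv(netdev);

      if (wol->wolopts & ~(WAKE_MAGIC | WAKE_MAGICSECURE))
          return -EOPNOTSUPP;

      priv->wol_opts = wol->wolopts;  /* real code pushes this to FW */
      return 0;
  }

  static const struct ethtool_ops hns3_ethtool_ops = {
      .get_wol = hns3_get_wol,
      .set_wol = hns3_set_wol,
  };
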
2023-03-29  sfc: add offloading of 'foreign' TC (decap) rules  (Edward Cree; 4 files changed, -17/+366)
A 'foreign' rule is one for which the net_dev is not the sfc netdevice or any of its representors. The driver registers indirect flow blocks for tunnel netdevs so that it can offload decap rules. For example:

  tc filter add dev vxlan0 parent ffff: protocol ipv4 flower \
      enc_src_ip 10.1.0.2 enc_dst_ip 10.1.0.1 \
      enc_key_id 1000 enc_dst_port 4789 \
      action tunnel_key unset \
      action mirred egress redirect dev $REPRESENTOR

When notified of a rule like this, register an encap match on the IP and dport tuple (creating an Outer Rule table entry) and insert an MAE action rule to perform the decapsulation and deliver to the representee.

Moved efx_tc_delete_rule() below efx_tc_flower_release_encap_match() to avoid the need for a forward declaration.

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2023-03-29  sfc: add code to register and unregister encap matches  (Edward Cree; 2 files changed, -0/+174)
Add a hashtable to detect duplicate and conflicting matches. If a match is not a duplicate, call MAE functions to add/remove it from the Outer Rule table. Calling code is not added yet, so mark the new functions as unused.

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2023-03-29  sfc: add functions to insert encap matches into the MAE  (Edward Cree; 3 files changed, -0/+111)
An encap match corresponds to an entry in the exact-match Outer Rule table; the lookup response includes the encap type (protocol) allowing the hardware to continue parsing into the inner headers.

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2023-03-29  sfc: handle enc keys in efx_tc_flower_parse_match()  (Edward Cree; 1 file changed, -0/+61)
Translate the fields from flow dissector into struct efx_tc_match. In efx_tc_flower_replace(), reject filters that match on them, because only 'foreign' filters (i.e. those for which the ingress dev is not the sfc netdev or any of its representors, e.g. a tunnel netdev) can use them.

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

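A sketch of how such enc keys are pulled out of the flow dissector; the flow_rule_match_enc_*() helpers are the generic kernel API, while the efx match-field names are assumptions:

  #include <net/flow_offload.h>

  static void efx_tc_parse_enc_keys(const struct flow_rule *rule,
                                    struct efx_tc_match *match)
  {
      if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_KEYID)) {
          struct flow_match_enc_keyid fm;

          flow_rule_match_enc_keyid(rule, &fm);
          match->value.enc_keyid = fm.key->keyid;  /* e.g. VXLAN VNI */
          match->mask.enc_keyid = fm.mask->keyid;
      }

      if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_PORTS)) {
          struct flow_match_ports fm;

          flow_rule_match_enc_ports(rule, &fm);
          match->value.enc_dport = fm.key->dst;
          match->mask.enc_dport = fm.mask->dst;
      }
  }
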
2023-03-29  sfc: add notion of match on enc keys to MAE machinery  (Edward Cree; 3 files changed, -2/+122)
Extend the MAE caps check to validate that the hardware supports these outer-header matches where used by the driver. Extend efx_mae_populate_match_criteria() to fill in the outer rule ID and VNI match fields. Nothing yet populates these match fields, nor creates outer rules.

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2023-03-29  sfc: document TC-to-EF100-MAE action translation concepts  (Edward Cree; 1 file changed, -1/+25)
Includes an explanation of the lifetime of the 'cursor' action-set `act`.

Signed-off-by: Edward Cree <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2023-03-29  macvlan: Add netlink attribute for broadcast cutoff  (Herbert Xu; 1 file changed, -2/+29)
Make the broadcast cutoff configurable through netlink. Note that macvlan is weird because there is no central device for us to configure (the lowerdev could be anything). So all the options are duplicated over what could be thousands of child devices.

IFLA_MACVLAN_BC_QUEUE_LEN took the approach of taking the maximum of all child device settings. This is unnecessary as we could simply store the option in the port device and take the last child device that gets updated as the value to use.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

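A sketch of the netlink plumbing this implies; the attribute follows the pattern of IFLA_MACVLAN_BC_QUEUE_LEN, and the port field name (and storing it on the shared port rather than per child) are assumptions from the text above:

  #include <linux/if_link.h>
  #include <linux/if_macvlan.h>
  #include <net/rtnetlink.h>

  static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
      [IFLA_MACVLAN_BC_CUTOFF] = { .type = NLA_S32 },
  };

  static int macvlan_changelink(struct net_device *dev, struct nlattr *tb[],
                                struct nlattr *data[],
                                struct netlink_ext_ack *extack)
  {
      struct macvlan_dev *vlan = netdev_priv(dev);

      if (data && data[IFLA_MACVLAN_BC_CUTOFF])
          /* stored once on the shared port, not per child device */
          vlan->port->bc_cutoff =
              nla_get_s32(data[IFLA_MACVLAN_BC_CUTOFF]);
      return 0;
  }
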
2023-03-29  macvlan: Skip broadcast queue if multicast with single receiver  (Herbert Xu; 1 file changed, -28/+46)
As it stands all broadcast and multicast packets are queued and processed in a work queue. This is so that we don't overwhelm the receive softirq path by generating thousands of packets or more (see commit 412ca1550cbe "macvlan: Move broadcasts into a work queue"). As such all multicast packets will be delayed, even if they will be received by a single macvlan device.

As using a workqueue is not free in terms of latency, we should avoid this where possible. This patch adds a new filter to determine which addresses should be delayed and which ones won't. This is done using a crude counter of how many times an address has been added to the macvlan port (ha->synced).

For now if an address has been added more than once, then it will be considered to be broadcast. This could be tuned further by making this threshold configurable.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

2023-03-29  ipv6: Remove in6addr_any alternatives.  (Kuniyuki Iwashima; 1 file changed, -3/+2)
Some code defines the IPv6 wildcard address as a local variable and uses it with memcmp() or ipv6_addr_equal(). Let's use in6addr_any and ipv6_addr_any() instead.

Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

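A minimal before/after of the pattern being removed (function names are illustrative):

  #include <net/ipv6.h>

  /* Before: a local wildcard address compared with memcmp(). */
  static bool is_any_old(const struct in6_addr *a)
  {
      struct in6_addr zero = {};

      return !memcmp(a, &zero, sizeof(zero));
  }

  /* After: the shared constant and helper say the same thing directly. */
  static bool is_any_new(const struct in6_addr *a)
  {
      return ipv6_addr_any(a);  /* or ipv6_addr_equal(a, &in6addr_any) */
  }
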
2023-03-28  Merge tag 'mlx5-updates-2023-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux  (Jakub Kicinski; 8 files changed, -222/+344)
Saeed Mahameed says:
====================
mlx5-updates-2023-03-20

mlx5 dynamic msix

This patch series adds support for dynamic msix vectors allocation in mlx5.

Eli Cohen Says:
================
The following series of patches modifies mlx5_core to work with the dynamic MSIX API. Currently, mlx5_core allocates all the interrupt vectors it needs and distributes them amongst the consumers. With the introduction of dynamic MSIX support, which allows for allocation of interrupts more than once, we now allocate vectors as we need them. This allows other drivers running on top of mlx5_core to allocate interrupt vectors for their own use. An example for this is mlx5_vdpa, which uses these vectors to propagate interrupts directly from the hardware to the vCPU [1].

As a preparation for using this series, a use after free issue is fixed in lib/cpu_rmap.c and the allocator for rmap entries has been modified. A complementary API for irq_cpu_rmap_add() has also been introduced.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/patch/?id=0f2bf1fcae96a83b8c5581854713c9fc3407556e
================

* tag 'mlx5-updates-2023-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Provide external API for allocating vectors
  net/mlx5: Use one completion vector if eth is disabled
  net/mlx5: Refactor calculation of required completion vectors
  net/mlx5: Move devlink registration before mlx5_load
  net/mlx5: Use dynamic msix vectors allocation
  net/mlx5: Refactor completion irq request/release code
  net/mlx5: Improve naming of pci function vectors
  net/mlx5: Use newer affinity descriptor
  net/mlx5: Modify struct mlx5_irq to use struct msi_map
  net/mlx5: Fix wrong comment
  net/mlx5e: Coding style fix, add empty line
  lib: cpu_rmap: Add irq_cpu_rmap_remove to complement irq_cpu_rmap_add
  lib: cpu_rmap: Use allocator for rmap entries
  lib: cpu_rmap: Avoid use after free on rmap->obj array entries
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-28  net: ethernet: 8390: axnet_cs: remove unused xfer_count variable  (Tom Rix; 1 file changed, -3/+0)
clang with W=1 reports:

  drivers/net/ethernet/8390/axnet_cs.c:653:9: error: variable 'xfer_count' set but not used [-Werror,-Wunused-but-set-variable]
      int xfer_count = count;
          ^

This variable is not used so remove it.

Signed-off-by: Tom Rix <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-28  net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links  (Felix Fietkau; 1 file changed, -2/+0)
Using the QDMA tx scheduler to throttle tx to line speed works fine for switch ports, but apparently caused a regression on non-switch ports. Based on a number of tests, it seems that this throttling can be safely dropped without re-introducing the issues on switch ports that the tx scheduling changes resolved.

Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/
Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues")
Reported-by: Frank Wunderlich <[email protected]>
Reported-by: Daniel Golle <[email protected]>
Tested-by: Daniel Golle <[email protected]>
Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-29  driver core: class: mark the struct class for sysfs callbacks as constant  (Greg Kroah-Hartman; 1 file changed, -9/+9)
struct class should never be modified in a sysfs callback as there is nothing in the structure to modify, and frankly, the structure is almost never used in a sysfs callback, so mark it as constant to allow struct class to be moved to read-only memory.

While we are touching all class sysfs callbacks also mark the attribute as constant as it can not be modified. The bonding code still uses this structure so it can not be removed from the function callbacks.

Cc: "David S. Miller" <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Miquel Raynal <[email protected]>
Cc: Namjae Jeon <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Russ Weight <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steve French <[email protected]>
Cc: Vignesh Raghavendra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Luis Chamberlain <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

2023-03-28  Revert "sh_eth: remove open coded netif_running()"  (Wolfram Sang; 2 files changed, -1/+6)
This reverts commit ce1fdb065695f49ef6f126d35c1abbfe645d62d5. It turned out this actually introduces a race condition. netif_running() is not a suitable check for get_stats.

Reported-by: Sergey Shtylyov <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Sergey Shtylyov <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-28  net/mlx5e: Fix build break on 32bit  (Saeed Mahameed; 1 file changed, -2/+2)
The cited commit caused the following build break in mlx5 due to a change in size of MAX_SKB_FRAGS:

  error: format '%lu' expects argument of type 'long unsigned int', but argument 7 has type 'unsigned int' [-Werror=format=]

Fix this by explicit casting.

Fixes: 3948b05950fd ("net: introduce a config option to tweak MAX_SKB_FRAGS")
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

2023-03-28  net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats  (Dragos Tatulea; 4 files changed, -59/+25)
The recycle parameter used during page release is no longer necessary: the page pool can detect when the page cannot be recycled to the cache or ring without any outside hint. The page pool will also take care of cleaning up after itself once all the inflight pages have been released. So no need to explicitly release pages to the system.

Remove the internal page_cache stats as the mlx5e_page_cache struct no longer exists. Delete the documentation entries along with the stats.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Break the wqe bulk refill in smaller chunks  (Dragos Tatulea; 3 files changed, -3/+34)
To avoid overflowing the page pool's cache, don't release the whole bulk, which is usually larger than the cache refill size. Group release+alloc instead into cache refill units that allow releasing to the cache and then allocating from the cache.

A refill_unit variable is added as an iteration unit over the wqe_bulk when doing release+alloc.

For a single ring, single core, default MTU (1500) TCP stream test the number of pages allocated from the cache directly (rx_pp_recycle_cached) increases from 0% to 52%:

  +---------------------------------------------+
  | Page Pool stats (/sec)  | Before  | After   |
  +-------------------------+---------+---------+
  |rx_pp_alloc_fast         | 2145422 | 2193802 |
  |rx_pp_alloc_slow         | 2       | 0       |
  |rx_pp_alloc_empty        | 2       | 0       |
  |rx_pp_alloc_refill       | 34059   | 16634   |
  |rx_pp_alloc_waive        | 0       | 0       |
  |rx_pp_recycle_cached     | 0       | 1145818 |
  |rx_pp_recycle_cache_full | 0       | 0       |
  |rx_pp_recycle_ring       | 2179361 | 1064616 |
  |rx_pp_recycle_ring_full  | 121     | 0       |
  +---------------------------------------------+

With this patch, the performance for legacy rq for the above test is back to baseline.

Signed-off-by: Dragos Tatulea <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Increase WQE bulk size for legacy rq  (Dragos Tatulea; 2 files changed, -5/+44)
Deferred page release was added to legacy rq but its desired effect (driver releases last fragment to page pool cache) is not yet visible due to the WQE bulks being too small.

This patch increases the WQE bulk size to span 512 KB and clips it to one quarter of the rx queue size.

Signed-off-by: Dragos Tatulea <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

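The sizing rule, as a sketch (the macro and function names are assumptions):

  #include <linux/minmax.h>

  /* Bulk refill spans 512 KB worth of WQEs, clipped to a quarter of
   * the rx queue so a single refill can't churn most of the ring.
   */
  #define MLX5E_WQE_BULK_SPAN_BYTES  (512 * 1024)

  static u32 wqe_bulk_size(u32 wqe_bytes, u32 rq_size)
  {
      return min(MLX5E_WQE_BULK_SPAN_BYTES / wqe_bytes, rq_size / 4);
  }
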
2023-03-28  net/mlx5e: RX, Split off release path for xsk buffers for legacy rq  (Dragos Tatulea; 1 file changed, -15/+35)
Don't mix xsk buffer releases with page releases anymore. This is needed for handling of deferred page release.

Add a new bulk free function for xsk buffers from wqe frags.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Defer page release in legacy rq for better recycling  (Dragos Tatulea; 4 files changed, -23/+66)
Currently, fragmented pages from the page pool can be released in two ways:

1) In the mlx5e driver when trimming off the unused fragments AND the associated skb fragments have been released. This path allows recycling of pages to the page pool cache (allow_direct == true).
2) On the skb release path (last fragment release), which will always release pages to the page pool ring (allow_direct == false).

Whichever is releasing the last fragment will be decisive on where the page gets released: the cache or the ring. So we obviously want to maximize for doing the release from 1.

This patch does that by deferring the release of page fragments right before requesting new ones from the page pool. A flag is added to make sure that there's no release before first alloc and that XDP_TX fragments are not released prematurely.

This is a preparation patch that doesn't unlock the performance improvements yet. A followup patch will do that.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Change wqe last_in_page field from bool to bit flags  (Dragos Tatulea; 3 files changed, -4/+8)
Change the bool flag to a bitfield as we'll use it in a downstream patch in the series to add signaling about skipping a fragment release.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Defer page release in striding rq for better recycling  (Dragos Tatulea; 4 files changed, -10/+25)
Currently, for striding RQ, fragmented pages from the page pool can get released in two ways:

1) In the mlx5e driver when trimming off the unused fragments AND the associated skb fragments have been released. This path allows recycling of pages to the page pool cache (allow_direct == true).
2) On the skb release path (last fragment release), which will always release pages to the page pool ring (allow_direct == false).

Whichever is releasing the last fragment will be decisive on where the page gets released: the cache or the ring. So we obviously want to maximize for doing the release from 1.

This patch does that by deferring the release of page fragments right before requesting new ones from the page pool.

Extra care needs to be taken for the corner cases:
* On first call, make sure that release is not called. The skip_release_bitmap is used for this purpose.
* On rq shutdown, make sure that all wqes that were not in the linked list are released.

For a single ring, single core, default MTU (1500) TCP stream test the number of pages allocated from the cache directly (rx_pp_recycle_cached) increases from 31 % to 98 %:

  +----------------------------------------------+
  | Page Pool stats (/sec)  | Before  | After    |
  +-------------------------+---------+----------+
  |rx_pp_alloc_fast         | 2137754 | 2261033  |
  |rx_pp_alloc_slow         | 47      | 9        |
  |rx_pp_alloc_empty        | 47      | 9        |
  |rx_pp_alloc_refill       | 23230   | 819      |
  |rx_pp_alloc_waive        | 0       | 0        |
  |rx_pp_recycle_cached     | 672182  | 2209015  |
  |rx_pp_recycle_cache_full | 1789    | 0        |
  |rx_pp_recycle_ring       | 1485848 | 52259    |
  |rx_pp_recycle_ring_full  | 3003    | 584      |
  +----------------------------------------------+

With this patch, the performance in striding rq for the above test is back to baseline.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Rename xdp_xmit_bitmap to a more generic name  (Dragos Tatulea; 3 files changed, -9/+9)
The xdp_xmit_bitmap currently serves only one purpose: to avoid releasing pages that are still in use due to XDP TX.

A following patch will use this bitmap in a slightly different context but for the same purpose. So rename the bitmap to a more generic name that reflects the purpose, not the context.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Enable skb page recycling through the page_pool  (Dragos Tatulea; 5 files changed, -88/+121)
Start using the page_pool skb recycling api to recycle all pages back to the page pool and stop using atomic page reference counting.

The mlx5e driver used to manage in-flight pages using page refcounting: for each fragment there were 2 atomic write operations happening (one for building the skb and one on skb release).

The page_pool api introduced a method to track page fragments more optimally:
* The page's pp_fragment_count is set to a large bias on page alloc (1 x atomic write operation).
* The driver tracks the actual page fragments in a non atomic variable.
* When the skb is recycled, pp_fragment_count is decremented (atomic write operation).
* When page is released in the driver, the unused number of fragments (relative to the bias) is deducted from pp_fragment_count (atomic write operation).
* Last page defragmentation will only be an atomic read.

So in total there are `number of fragments + 1` atomic write ops. As opposed to previously: `2 * frags` atomic write ops.

Pages are wrapped in a mlx5e_frag_page structure which also contains the number of fragments. This makes it easy to count the fragments in the driver.

This change brings performance improvements for the case when the old rx page_cache had low recycling rates due to head of queue blocking. For a iperf3 TCP test with a single stream, on a single core (iperf and receive queue running on same core), the following improvements can be noticed:

* Striding rq:
  - before (net-next baseline): bitrate = 30.1 Gbits/sec
  - after                     : bitrate = 31.4 Gbits/sec (diff: 4.14 %)
* Legacy rq:
  - before (net-next baseline): bitrate = 30.2 Gbits/sec
  - after                     : bitrate = 33.0 Gbits/sec (diff: 8.48 %)

There are 2 temporary performance degradations introduced:

1) TCP streams that had a good recycling rate with the old page_cache have a degradation for both striding and linear rq. This is due to very low page pool cache recycling: the pages are released during skb recycle which will release pages to the page pool ring for safety. The following patches in this series will tackle this problem by deferring the page release in the driver to increase the chance of having pages recycled to the cache.

2) XDP performance is now lower (4-5 %) due to the higher number of atomic operations used for fragment management. But this opens the door for supporting multiple packets per page in XDP, which will bring a big gain.

Otherwise, performance is similar to baseline.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

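A condensed sketch of the bias scheme described above, using the generic page_pool fragment helpers (the bias value, struct layout, and function names are illustrative):

  #include <net/page_pool.h>

  #define MLX5E_FRAG_BIAS  (1U << 16)  /* large bias; illustrative value */

  struct mlx5e_frag_page {
      struct page *page;
      u32 frags;               /* fragments handed out; non-atomic */
  };

  static int mlx5e_frag_page_alloc(struct page_pool *pool,
                                   struct mlx5e_frag_page *fp)
  {
      fp->page = page_pool_dev_alloc_pages(pool);
      if (!fp->page)
          return -ENOMEM;

      /* one atomic write sets a large bias instead of one ref per frag */
      page_pool_fragment_page(fp->page, MLX5E_FRAG_BIAS);
      fp->frags = 0;
      return 0;
  }

  static void mlx5e_frag_page_release(struct page_pool *pool,
                                      struct mlx5e_frag_page *fp)
  {
      u32 drain = MLX5E_FRAG_BIAS - fp->frags;

      /* deduct the unused part of the bias; whoever drops the count to
       * zero recycles the page back to the pool
       */
      if (page_pool_defrag_page(fp->page, drain) == 0)
          page_pool_put_defragged_page(pool, fp->page, -1, true);
  }
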
2023-03-28  net/mlx5e: RX, Enable dma map and sync from page_pool allocator  (Dragos Tatulea; 4 files changed, -27/+4)
Remove driver dma mapping and unmapping of pages. Let the page_pool api do it.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

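A sketch of the page_pool setup that hands DMA mapping and syncing to the allocator (the wrapper name and parameter values are illustrative):

  #include <net/page_pool.h>

  static struct page_pool *rq_page_pool_create(struct device *dev,
                                               u32 pool_size)
  {
      struct page_pool_params pp_params = {
          /* page_pool maps pages and syncs them for the device */
          .flags     = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
          .order     = 0,
          .pool_size = pool_size,
          .nid       = NUMA_NO_NODE,
          .dev       = dev,             /* device doing the DMA */
          .dma_dir   = DMA_FROM_DEVICE,
          .max_len   = PAGE_SIZE,       /* sync at most one page */
      };

      return page_pool_create(&pp_params);
  }
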
2023-03-28  net/mlx5e: RX, Remove internal page_cache  (Dragos Tatulea; 3 files changed, -72/+0)
This patch removes the internal rx page_cache and uses the generic page_pool api only. It used to be that the page_pool couldn't handle all the mlx5 driver usecases, but with the introduction of skb recycling and page fragmentation in the page_pool, the full switch can now be made.

Some benefits of this transition:
* Better page recycling in the cases when the page_cache was suffering from head of queue blocking. The page_pool doesn't have this issue.
* DMA mapping/unmapping can be managed by the page_pool.
* mlx5e_rq size reduced by more than 50% due to the page_cache array being deleted.

This patch only removes the page_cache. Downstream patches will enable the required page_pool features and will add further fine-tuning.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>

2023-03-28  net/mlx5e: RX, Store SHAMPO header pages in array  (Dragos Tatulea; 3 files changed, -26/+45)
Save allocated SHAMPO header pages to an array to which the mlx5e_dma_info page will point.

This change is a preparation for introducing the mlx5e_frag_page structure in a downstream patch. There's no new functionality introduced.

Signed-off-by: Dragos Tatulea <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>