path: root/drivers/net/ethernet
Age | Commit message | Author | Files | Lines
2019-04-27net: ethernet: ti: ale: fix mcast super settingGrygorii Strashko1-1/+1
Use the correct define ALE_SUPER for the ALE Multicast Address Table Entry Supervisory Packet (SUPER) bit setting instead of ALE_BLOCKED. No issues were observed until now as it has never been set, but it's going to be used by the new CPSW switch driver. Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-27net: ethernet: ti: cpsw: drop cpsw_tx_packet_submit()Grygorii Strashko1-12/+3
Drop unnecessary wrapper function cpsw_tx_packet_submit() which is used only in one place. Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-27net: ethernet: ti: cpsw: use devm_alloc_etherdev_mqs()Grygorii Strashko1-12/+6
Use devm_alloc_etherdev_mqs() and simplify code. Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
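For illustration, a minimal probe() sketch of the devm_alloc_etherdev_mqs() pattern; the private struct and queue counts below are placeholders, not the actual cpsw values:

    #include <linux/etherdevice.h>
    #include <linux/platform_device.h>

    struct my_priv { int dummy; };  /* placeholder private data */

    static int my_probe(struct platform_device *pdev)
    {
        struct net_device *ndev;

        /* Device-managed allocation: the netdev is freed automatically when
         * the driver detaches, so error paths and remove() need no explicit
         * free_netdev().
         */
        ndev = devm_alloc_etherdev_mqs(&pdev->dev, sizeof(struct my_priv),
                                       8 /* tx queues */, 8 /* rx queues */);
        if (!ndev)
            return -ENOMEM;

        SET_NETDEV_DEV(ndev, &pdev->dev);
        return register_netdev(ndev);
    }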
2019-04-27net: ethernet: ti: cpsw: drop pinctrl_pm_select_default_state callGrygorii Strashko1-3/+0
Drop the pinctrl_pm_select_default_state call from probe, as the default pinctrl state is set by the DD core. Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-27net: ethernet: ti: cpsw: use local var dev in probeGrygorii Strashko1-32/+33
Use local variable struct device *dev in probe to simplify code. Signed-off-by: Grygorii Strashko <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-27net: ethernet: ti: cpsw: update cpsw_split_res() to accept cpsw_commonGrygorii Strashko1-8/+6
Update cpsw_split_res() to accept struct cpsw_common instead of struct net_device to simplify code. Signed-off-by: Grygorii Strashko <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-27net: ethernet: ti: cpsw: drop CONFIG_TI_CPSW_ALE config optionGrygorii Strashko3-28/+2
None of the TI CPSW/NETCP drivers can work without ALE, hence simplify the build of those drivers by always linking cpsw_ale, and drop the CONFIG_TI_CPSW_ALE config option. Signed-off-by: Grygorii Strashko <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-27net: ethernet: ti: cpsw: drop TI_DAVINCI_CPDMA config optionGrygorii Strashko3-43/+3
Neither the CPSW nor the EMAC driver can work without CPDMA, hence simplify the build of those drivers by always linking davinci_cpdma, and drop the TI_DAVINCI_CPDMA config option. Note: the davinci_emac driver module was renamed to "ti_davinci_emac" to make the build work. Signed-off-by: Grygorii Strashko <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-27net: ethernet: ti: convert to SPDX license identifiersGrygorii Strashko18-180/+18
Replace textual license with SPDX-License-Identifier. Signed-off-by: Grygorii Strashko <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: remove reset after command send failedWeihang Li1-10/+0
It's meaningless to trigger a reset when sending a command to the IMP fails, because the failure is usually caused by lack of authority, an illegal command, and so on. When that happens, we just need to return the status code for further debugging. Signed-off-by: Weihang Li <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: prevent double free in hns3_put_ring_config()Huazhong Tan1-5/+5
This patch adds a check to hns3_put_ring_config() to prevent a double free and, for better readability, moves the NULL assignment of priv->ring_data into hns3_put_ring_config(). Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: extend the loopback state acquisition timeliuzhongzhu1-2/+2
Test results show that the maximum time for the hardware to return the MAC link state is 500 ms. The software needs to set the timeout to twice the maximum hardware return time (1000 ms). Without this change, the loopback test fails intermittently. Signed-off-by: liuzhongzhu <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: fix pause configure fail problemHuazhong Tan1-1/+4
When configuring pause, the current implementation returns directly after setting up PFC without setting up BP, which is not sufficient. This patch fixes it by only returning when setting up PFC fails. Fixes: 44e59e375bf7 ("net: hns3: do not return GE PFC setting err when initializing") Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: not reset TQP in the DOWN while VF resettingHuazhong Tan1-2/+4
Since the hardware does not handle mailboxes and the hardware reset includes TQP reset, it is unnecessary to reset the TQPs in hclgevf_ae_stop() while doing a VF reset. It is also unnecessary to reset the remaining TQPs when one reset fails. Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: use a reserved byte to identify need_resp flagHuazhong Tan3-5/+9
This patch uses a reserved byte in hclge_mbx_vf_to_pf_cmd to save the need_resp flag, so when the PF receives the mailbox message, it can use the flag to decide whether to send a response to the VF. For hclge_set_vf_uc_mac_addr(), it should use the mbx_need_resp flag to decide whether to send a response to the VF. Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: use atomic_t replace u32 for arq's countHuazhong Tan3-5/+6
Since both the irq handler and the mailbox task update the arq's count, it should use atomic_t instead of u32; otherwise its value may eventually become incorrect. Fixes: 07a0556a3a73 ("net: hns3: Changes to support ARQ(Asynchronous Receive Queue)") Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
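As an illustration of the pattern (the struct and function names below are assumed, not the actual hns3 code):

    #include <linux/atomic.h>

    struct arq_ring {
        atomic_t count;   /* was a plain u32 */
    };

    /* Called from the irq handler: a plain 'count++' on a u32 can race with
     * the mailbox task and lose an update; atomic_inc() cannot.
     */
    static void arq_msg_queued(struct arq_ring *arq)
    {
        atomic_inc(&arq->count);
    }

    /* Called from the mailbox task after handling one message. */
    static void arq_msg_handled(struct arq_ring *arq)
    {
        atomic_dec(&arq->count);
    }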
2019-04-26net: hns3: stop sending keep alive msg when VF command queue needs reinitHuazhong Tan1-1/+1
HCLGEVF_STATE_CMD_DISABLE is more suitable than HCLGEVF_STATE_RST_HANDLING for stopping the keep alive msg, since HCLGEVF_STATE_RST_HANDLING is only set while the reset task is running. Fixes: c59a85c07e77 ("net: hns3: stop sending keep alive msg to PF when VF is resetting") Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: handle the BD info on the last BD of the packetYunsheng Lin1-18/+20
The BD info handled in hns3_handle_bdinfo is only valid on the last BD of the current packet, but currently it may be handled based on the first BD when the packet has more than two BDs, which may cause an RX error. This patch fixes it by using the last BD of the current packet in hns3_handle_bdinfo. Also, hns3_set_rx_skb_rss_type already uses the RSS hash value from the last BD of the current packet, so remove the duplicate last-BD calculation from hns3_set_rx_skb_rss_type and call it from hns3_handle_bdinfo. Fixes: e55970950556 ("net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll") Signed-off-by: Yunsheng Lin <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: fix for TX clean num when cleaning TX BDYunsheng Lin1-1/+2
hns3_desc_unused() returns how many BDs have been cleaned but have not yet had a new buffer attached. The HNS3_RING_RX_RING_FBDNUM_REG register returns how many BDs need a new buffer attached or need to be cleaned, so the remaining number of BDs that need to be cleaned is HNS3_RING_RX_RING_FBDNUM_REG - hns3_desc_unused(). Also, a new buffer cannot be attached to a pending BD while the last BD has not been handled, because memcpy has not been done on the first pending BD. This patch fixes it by subtracting the pending BD num from unused_count after 'HNS3_RING_RX_RING_FBDNUM_REG - unused_count' is used to calculate the number of BDs that need to be cleaned. Fixes: e55970950556 ("net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll") Signed-off-by: Yunsheng Lin <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26net: hns3: fix data race between ring->next_to_cleanYunsheng Lin2-5/+16
hns3_clean_tx_ring calls hns3_nic_reclaim_one_desc to clean buffers and set ring->next_to_clean, then hns3_nic_net_xmit reuses the cleaned buffers. But there are no memory barriers when the buffers get recycled, so the recycled buffers can be corrupted. This patch uses smp_store_release to update ring->next_to_clean and smp_load_acquire to read ring->next_to_clean to properly hand off buffers from hns3_clean_tx_ring to hns3_nic_net_xmit. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <[email protected]> Signed-off-by: Peng Li <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
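The hand-off described above is the usual release/acquire pairing; a minimal sketch (field and helper names are taken loosely from the message, not from the actual patch):

    /* Consumer side (hns3_clean_tx_ring): make all buffer-recycling writes
     * visible before publishing the new next_to_clean.
     */
    /* ... recycle TX buffers up to index ntc ... */
    smp_store_release(&ring->next_to_clean, ntc);

    /* Producer side (hns3_nic_net_xmit): the acquire read pairs with the
     * release above, so every write done before the release is visible here
     * before the cleaned descriptors are reused.
     */
    ntc = smp_load_acquire(&ring->next_to_clean);
    if (ring_space(ring->next_to_use, ntc) < num_needed)  /* ring_space() is a placeholder helper */
        return NETDEV_TX_BUSY;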
2019-04-26nfp: implement PCI driver shutdown callbackDirk van der Merwe2-5/+30
The device may be shut down without the hardware being reinitialized, in which case we want to ensure we clean up properly. This is especially important for kexec with traffic flowing. The shutdown procedure resembles the remove procedure, so we can reuse those common tasks. Signed-off-by: Dirk van der Merwe <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26cnic: Refactor code and mark expected switch fall-throughGustavo A. R. Silva1-7/+6
In preparation to enabling -Wimplicit-fallthrough, refactor code a bit and mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/net/ethernet/broadcom/cnic.c: In function ‘cnic_cm_process_kcqe’: drivers/net/ethernet/broadcom/cnic.c:4044:11: warning: this statement may fall through [-Wimplicit-fallthrough=] opcode = L4_KCQE_OPCODE_VALUE_CLOSE_COMP; drivers/net/ethernet/broadcom/cnic.c:4050:2: note: here case L4_KCQE_OPCODE_VALUE_RESET_RECEIVED: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comment is modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: David S. Miller <[email protected]>
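The annotation GCC looks for at -Wimplicit-fallthrough=3 is a comment placed immediately before the next case label; a generic sketch of the shape of such a fix, paraphrasing the warning excerpt above (the preceding case label and the common handling are assumed, not copied from cnic.c):

    switch (opcode) {
    case L4_KCQE_OPCODE_VALUE_CLOSE_RECEIVED:   /* preceding case label is assumed */
        /* map to the matching completion opcode */
        opcode = L4_KCQE_OPCODE_VALUE_CLOSE_COMP;
        /* fall through */
    case L4_KCQE_OPCODE_VALUE_RESET_RECEIVED:
        /* ... common handling ... */
        break;
    default:
        break;
    }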
2019-04-26cxgb4/cxgb4vf_main: Mark expected switch fall-throughGustavo A. R. Silva1-1/+1
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c: In function ‘fwevtq_handler’: drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c:520:7: warning: this statement may fall through [-Wimplicit-fallthrough=] cpl = (void *)p; ~~~~^~~~~~~~~~~ drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c:524:2: note: here case CPL_SGE_EGR_UPDATE: { ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comment is modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-26amd-xgbe: Mark expected switch fall-throughsGustavo A. R. Silva1-3/+3
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warnings: In file included from drivers/net/ethernet/amd/xgbe/xgbe-drv.c:129: drivers/net/ethernet/amd/xgbe/xgbe-drv.c: In function ‘xgbe_set_hwtstamp_settings’: drivers/net/ethernet/amd/xgbe/xgbe-common.h:1392:9: warning: this statement may fall through [-Wimplicit-fallthrough=] (_var) |= (((_val) & ((0x1 << (_width)) - 1)) << (_index)); \ ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-common.h:1419:2: note: in expansion of macro ‘SET_BITS’ SET_BITS((_var), \ ^~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1614:3: note: in expansion of macro ‘XGMAC_SET_BITS’ XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSVER2ENA, 1); ^~~~~~~~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1616:2: note: here case HWTSTAMP_FILTER_PTP_V1_L4_EVENT: ^~~~ In file included from drivers/net/ethernet/amd/xgbe/xgbe-drv.c:129: drivers/net/ethernet/amd/xgbe/xgbe-common.h:1392:9: warning: this statement may fall through [-Wimplicit-fallthrough=] (_var) |= (((_val) & ((0x1 << (_width)) - 1)) << (_index)); \ ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-common.h:1419:2: note: in expansion of macro ‘SET_BITS’ SET_BITS((_var), \ ^~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1625:3: note: in expansion of macro ‘XGMAC_SET_BITS’ XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSVER2ENA, 1); ^~~~~~~~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1627:2: note: here case HWTSTAMP_FILTER_PTP_V1_L4_SYNC: ^~~~ In file included from drivers/net/ethernet/amd/xgbe/xgbe-drv.c:129: drivers/net/ethernet/amd/xgbe/xgbe-common.h:1392:9: warning: this statement may fall through [-Wimplicit-fallthrough=] (_var) |= (((_val) & ((0x1 << (_width)) - 1)) << (_index)); \ ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-common.h:1419:2: note: in expansion of macro ‘SET_BITS’ SET_BITS((_var), \ ^~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1636:3: note: in expansion of macro ‘XGMAC_SET_BITS’ XGMAC_SET_BITS(mac_tscr, MAC_TSCR, TSVER2ENA, 1); ^~~~~~~~~~~~~~ drivers/net/ethernet/amd/xgbe/xgbe-drv.c:1638:2: note: here case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comments are modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller16-31/+54
Two easy cases of overlapping changes. Signed-off-by: David S. Miller <[email protected]>
2019-04-24net: mvneta: Switch to using devm_alloc_etherdev_mqsRosen Penev1-8/+4
It allows some of the code to be simplified. Tested on Turris Omnia. Signed-off-by: Rosen Penev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-23mlxsw: spectrum_router: Prevent ipv6 gateway with v4 route via replace and appendDavid Ahern1-0/+2
mlxsw currently does not support v6 gateways with v4 routes. Commit 19a9d136f198 ("ipv4: Flag fib_info with a fib_nh using IPv6 gateway") prevents a route from being added, but nothing stops the replace or append. Add a catch for them too.
$ ip ro add 172.16.2.0/24 via 10.99.1.2
$ ip ro replace 172.16.2.0/24 via inet6 fe80::202:ff:fe00:b dev swp1s0
Error: mlxsw_spectrum: IPv6 gateway with IPv4 route is not supported.
$ ip ro append 172.16.2.0/24 via inet6 fe80::202:ff:fe00:b dev swp1s0
Error: mlxsw_spectrum: IPv6 gateway with IPv4 route is not supported.
Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-23net: socionext: replace napi_alloc_frag with the netdev variant on initIlias Apalodimas1-4/+7
The netdev variant is usable in any context since it disables interrupts. The napi variant of the call should only be used within softirq context. Replace napi_alloc_frag on driver init with the correct netdev_alloc_frag call. Changes since v1: - Adjusted commit message Acked-by: Ard Biesheuvel <[email protected]> Acked-by: Jassi Brar <[email protected]> Fixes: 4acb20b46214 ("net: socionext: different approach on DMA") Signed-off-by: Ilias Apalodimas <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-23net: atheros: fix spelling mistake "underun" -> "underrun"Colin Ian King4-5/+5
There are spelling mistakes in structure elements, fix these. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-23ravb: Avoid unsupported internal delay mode for R-Car E3/D3Simon Horman1-2/+13
According to the R-Car Gen3 Hardware Manual Rev 1.50 of Nov 30, 2018, the TX clock internal delay mode isn't supported on R-Car E3 (r8a77990) or D3 (r8a77995). And by extension it is also not supported by RZ/G2E (r9a774c0). This matches all ES versions of the affected SoCs as it is not clear if this problem will be resolved in newer chips. This can be revisited, as necessary. This patch does not error-out if PHY_INTERFACE_MODE_RGMII_ID or PHY_INTERFACE_MODE_RGMII_TXID are used on SoCs where TX clock delay mode is not supported as there is a risk of introducing a regression when used in conjunction with older DT blobs present in the field. Rather, a warning is logged in such cases. Based on work by Kazuya Mizuguchi. Signed-off-by: Simon Horman <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Sergei Shtylyov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-23Merge tag 'mlx5-updates-2019-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller14-202/+399
Saeed Mahameed says: ==================== mlx5-updates-2019-04-22 This series includes updates to the mlx5e driver RX data path and some significant XDP RX/TX improvements to overcome/mitigate HW and PCIE bottlenecks. From Tariq: 1) Some enhancements in rq->flags 2) Stabilize RX packet rate (on Striding RQ) with multiple outstanding UMR posts. In this patch, we add support for multiple outstanding UMR posts, to allow faster gap closure between consuming MPWQEs and reposting them back into the WQ. Performance test: As expected, huge improvement in large-scale (48 cores). xdp_redirect_map, 64B UDP multi-stream. Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. Before: unstable, 7 to 30 Mpps. After: stable, at 70.5 Mpps. From Shay: 3) XDP, Inline small packets into the TX MPWQE in XDP xmit flow. Upon high packet rate with multiple-CPU TX workloads, much of the HCA's resources are spent on prefetching TX descriptors, thus affecting transmission rates. This patch comes to mitigate this problem by moving some workload to the CPU and reducing the HW data prefetch overhead for small packets (<= 256B). When forwarding packets with XDP, a packet that is smaller than a certain size (set to ~256 bytes) would be sent inline within its WQE TX descriptor (mem-copied), when the hardware TX queue is congested beyond a pre-defined watermark. Performance: Tested packet rate for UDP 64Byte multi-stream over two dual port ConnectX-5 100Gbps NICs. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz * Tested with hyper-threading disabled
XDP_TX:
         | before  | after   |
24 rings | 51Mpps  | 116Mpps | +126%
1 ring   | 12Mpps  | 12Mpps  | same
XDP_REDIRECT: ** Below is the transmit rate, not the redirection rate, which might be larger and is not affected by this patch.
         | before  | after   |
32 rings | 64Mpps  | 92Mpps  | +43%
1 ring   | 6.4Mpps | 6.4Mpps | same
As we can see, the feature significantly improves scaling without hurting single ring performance. From Maxim: 4) Some trivial refactoring and code improvements prior to a larger series to support AF_XDP. ==================== Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-23net/mlx5e: Use #define for the WQE wait timeout constantMaxim Mikityanskiy1-3/+7
Create a #define for the timeout of mlx5e_wait_for_min_rx_wqes to clarify the meaning of a magic number. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: Remove unused rx_page_reuse statMaxim Mikityanskiy2-5/+0
Remove the no longer used page_reuse stat of RQs. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: Take HW interrupt trigger into a functionMaxim Mikityanskiy3-9/+13
mlx5e_trigger_irq posts a NOP to the ICO SQ just to trigger an IRQ and enter the NAPI poll on the right CPU according to the affinity. Use it in mlx5e_activate_rq. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: Remove unused parameterMaxim Mikityanskiy3-11/+9
mdev is unused in mlx5e_rx_is_linear_skb. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: Add an underflow warning commentMaxim Mikityanskiy1-2/+5
mlx5e_mpwqe_get_log_rq_size calculates the number of WQEs (N) based on the requested number of frames in the RQ (F) and the number of packets per WQE (P). It ensures that N is not less than the minimum number of WQEs in an RQ (N_min). Arithmetically, it means that F / P >= N_min should be true. This function deals with logarithms, so it should check that log(F) - log(P) >= log(N_min). However, if F < P, this expression will cause an unsigned underflow. Check log(F) >= log(P) + log(N_min) instead. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
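A minimal sketch of the reordered check in log space (variable names are placeholders, not the mlx5e fields):

    /* log_f = log2(requested frames), log_p = log2(packets per WQE),
     * log_nmin = log2(minimum WQEs in an RQ), all unsigned.
     */
    static bool rq_has_enough_wqes(unsigned int log_f, unsigned int log_p,
                                   unsigned int log_nmin)
    {
        /* Writing this as (log_f - log_p >= log_nmin) would wrap around to a
         * huge value when log_f < log_p, making the check pass spuriously;
         * keeping the subtraction off the left side avoids the unsigned
         * underflow.
         */
        return log_f >= log_p + log_nmin;
    }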
2019-04-23net/mlx5e: Move parameter calculation functions to en/params.cMaxim Mikityanskiy4-99/+128
This commit moves the parameter calculation functions to a separate file for better modularity and code sharing with future features. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: Report mlx5e_xdp_set errorsMaxim Mikityanskiy1-1/+1
If the channels fail to reopen after setting an XDP program, return the error code instead of 0. A proper fix is still needed, as now any error while reopening the channels brings the interface down. This patch only adds error reporting. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: Remove unused parameterMaxim Mikityanskiy1-2/+1
params is unused in mlx5e_init_di_list. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flowShay Agroskin6-15/+82
Upon high packet rate with multiple-CPU TX workloads, much of the HCA's resources are spent on prefetching TX descriptors, thus affecting transmission rates. This patch comes to mitigate this problem by moving some workload to the CPU and reducing the HW data prefetch overhead for small packets (<= 256B). When forwarding packets with XDP, a packet that is smaller than a certain size (set to ~256 bytes) would be sent inline within its WQE TX descriptor (mem-copied), when the hardware TX queue is congested beyond a pre-defined watermark. This is added to better utilize the HW resources (which now make one less packet-data prefetch) and allow better scalability, at the expense of CPU usage (which now 'memcpy's the packet into the WQE). To load balance between HW and CPU and get the maximum packet rate, we use watermarks to detect how congested the HW is and move the workload back and forth between HW and CPU. Performance: Tested packet rate for UDP 64Byte multi-stream over two dual port ConnectX-5 100Gbps NICs. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz * Tested with hyper-threading disabled
XDP_TX:
         | before  | after   |
24 rings | 51Mpps  | 116Mpps | +126%
1 ring   | 12Mpps  | 12Mpps  | same
XDP_REDIRECT: ** Below is the transmit rate, not the redirection rate, which might be larger and is not affected by this patch.
         | before  | after   |
32 rings | 64Mpps  | 92Mpps  | +43%
1 ring   | 6.4Mpps | 6.4Mpps | same
As we can see, the feature significantly improves scaling without hurting single ring performance. Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
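A rough sketch of the watermark decision described above (the threshold, helper names, and congestion test are invented for illustration, not taken from the driver):

    #define MY_XDP_INLINE_MAX  256  /* assumed "small packet" bound */

    static void my_xdp_xmit(struct my_xdpsq *sq, void *data, u16 len,
                            dma_addr_t dma_addr)
    {
        /* Inline (memcpy into the WQE) only small packets, and only while
         * the SQ is congested past the watermark; otherwise post a normal
         * gather descriptor so the CPU is not burdened with the copy.
         */
        if (len <= MY_XDP_INLINE_MAX && my_sq_is_congested(sq))
            my_xmit_inline(sq, data, len);
        else
            my_xmit_dma(sq, dma_addr, len);
    }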
2019-04-23net/mlx5e: XDP, Add TX MPWQE session counterShay Agroskin3-0/+11
This counter tracks how many TX MPWQE sessions are started in XDP SQ in XDP TX/REDIRECT flow. It counts per-channel and global stats. Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: XDP, Enhance RQ indication for XDP redirect flushTariq Toukan2-4/+4
The XDP redirect flush indication belongs to the receive queue, not to its XDP send queue. For this, use a new bit on rq->flags. Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Shay Agroskin <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23net/mlx5e: XDP, Fix shifted flag index in RQ bitmapTariq Toukan1-1/+1
Values in enum mlx5e_rq_flag are used as bit indices. The intention was to use them with no BIT(i) wrapping. There is no functional bug fix here, as the same (shifted) flag bit is used for all set, test, and clear operations. Fixes: 121e89275471 ("net/mlx5e: Refactor RQ XDP_TX indication") Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Shay Agroskin <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
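The distinction is that the kernel bitops helpers take a bit index, while an open-coded mask needs BIT(); a generic sketch (the enum members and flags word below are illustrative, not copied from the driver):

    #include <linux/bitops.h>

    enum my_rq_flag {
        MY_RQ_FLAG_XDP_XMIT,        /* value 0 = bit index 0 */
        MY_RQ_FLAG_XDP_REDIRECT,    /* value 1 = bit index 1 */
    };

    static void example(unsigned long *flags)
    {
        /* __set_bit()/test_bit() shift internally, so they take the plain
         * index...
         */
        __set_bit(MY_RQ_FLAG_XDP_REDIRECT, flags);

        /* ...whereas an open-coded mask test needs BIT(index); using the raw
         * enum value as a mask here would test bit 0 instead of bit 1.
         */
        if (*flags & BIT(MY_RQ_FLAG_XDP_REDIRECT))
            ;   /* flush the redirect queue */
    }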
2019-04-23net/mlx5e: RX, Support multiple outstanding UMR postsTariq Toukan4-52/+136
The buffers mapping of the Multi-Packet WQEs (of Striding RQ) is done via UMR posts, one UMR WQE per an RX MPWQE. A single MPWQE is capable of serving many incoming packets, usually larger than the budget of a single napi cycle. Hence, posting a single UMR WQE per napi cycle (and handling its completion in the next cycle) works fine in many common cases, but not always. When an XDP program is loaded, every MPWQE is capable of serving less packets, to satisfy the packet-per-page requirement. Thus, for the same number of packets more MPWQEs (and UMR posts) are needed (twice as much for the default MTU), giving less latency room for the UMR completions. In this patch, we add support for multiple outstanding UMR posts, to allow faster gap closure between consuming MPWQEs and reposting them back into the WQ. For better SW and HW locality, we combine the UMR posts in bulks of (at least) two. This is expected to improve packet rate in high CPU scale. Performance test: As expected, huge improvement in large-scale (48 cores). xdp_redirect_map, 64B UDP multi-stream. Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. Before: Unstable, 7 to 30 Mpps After: Stable, at 70.5 Mpps No degradation in other tested scenarios. Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-04-23Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxSaeed Mahameed2-7/+10
2019-04-23net: phy: marvell: add new default led configure for m88e151xJian Shen1-0/+3
The default m88e151x LED configuration is 0x1177, which uses LED[0] for the 1000M link, LED[1] for the 100M link, and LED[2] for activity. But some boards, which use LED[0] for link and LED[1] for activity, prefer 0x1040. To be compatible with this case, this patch defines a new dev_flag and sets it before connecting the PHY in the HNS3 driver. When the PHY is initialized, the new LED configuration is used if this dev_flag is set. Signed-off-by: Jian Shen <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-23net: pass net_device argument to the eth_get_headlenStanislav Fomichev13-13/+16
Update all users of eth_get_headlen to pass network device, fetch network namespace from it and pass it down to the flow dissector. This commit is a noop until administrator inserts BPF flow dissector program. Cc: Maxim Krasnyansky <[email protected]> Cc: Saeed Mahameed <[email protected]> Cc: Jeff Kirsher <[email protected]> Cc: [email protected] Cc: Yisen Zhuang <[email protected]> Cc: Salil Mehta <[email protected]> Cc: Michael Chan <[email protected]> Cc: Igor Russkikh <[email protected]> Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
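After this change a typical call site passes the netdev as the first argument; a hedged before/after sketch (the buffer expression and length constant are illustrative, not necessarily the exact driver code):

    /* before: no way to reach the netns / BPF flow dissector */
    pull_len = eth_get_headlen(va, HNS3_RX_HEAD_SIZE);

    /* after: the device lets eth_get_headlen() resolve the network namespace
     * and consult an attached BPF flow dissector program.
     */
    pull_len = eth_get_headlen(netdev, va, HNS3_RX_HEAD_SIZE);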
2019-04-22net: systemport: Remove need for DMA descriptorFlorian Fainelli2-58/+8
All we do is write the length/status and address bits to a DMA descriptor only to write its contents into on-chip registers right after; eliminate this unnecessary step. Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-22mlxsw: spectrum_buffers: Adjust CPU port shared buffer egress quotasIdo Schimmel1-7/+35
Switch the CPU port to use the new dedicated egress pool instead of the previously used egress pool, which was shared with normal front panel ports. Add per-port quotas for the amount of traffic that can be buffered for the CPU port and also adjust the per-{port, TC} quotas. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Jiri Pirko <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-22mlxsw: spectrum_buffers: Allow skipping ingress port quota configurationIdo Schimmel1-2/+8
The CPU port is used to transmit traffic that is trapped to the host CPU. It is therefore irrelevant to define ingress quota for it. Add a 'skip_ingress' argument to the function tasked with configuring per-port quotas, so that ingress quotas could be skipped in case the passed local port is the CPU port. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>