aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2021-06-09ice: parameterize functions responsible for Tx ring managementMaciej Fijalkowski1-8/+10
Commit ae15e0ba1b33 ("ice: Change number of XDP Tx queues to match number of Rx queues") tried to address the incorrect setting of XDP queue count that was based on the Tx queue count, whereas in theory we should provide the XDP queue per Rx queue. However, the routines that setup and destroy the set of Tx resources are still based on the vsi->num_txq. Ice supports the asynchronous Tx/Rx queue count, so for a setup where vsi->num_txq > vsi->num_rxq, ice_vsi_stop_tx_rings and ice_vsi_cfg_txqs will be accessing the vsi->xdp_rings out of the bounds. Parameterize two mentioned functions so they get the size of Tx resources array as the input. Fixes: ae15e0ba1b33 ("ice: Change number of XDP Tx queues to match number of Rx queues") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-09ice: add ndo_bpf callback for safe mode netdev opsMaciej Fijalkowski1-0/+15
ice driver requires a programmable pipeline firmware package in order to have a support for advanced features. Otherwise, driver falls back to so called 'safe mode'. For that mode, ndo_bpf callback is not exposed and when user tries to load XDP program, the following happens: $ sudo ./xdp1 enp179s0f1 libbpf: Kernel error message: Underlying driver does not support XDP in native mode link set xdp fd failed which is sort of confusing, as there is a native XDP support, but not in the current mode. Improve the user experience by providing the specific ndo_bpf callback dedicated for safe mode which will make use of extack to explicitly let the user know that the DDP package is missing and that's the reason that the XDP can't be loaded onto interface currently. Cc: Jamal Hadi Salim <[email protected]> Fixes: efc2214b6047 ("ice: Add support for XDP") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-08net: lantiq: disable interrupt before sheduling NAPIAleksander Jan Bajkowski1-1/+1
This patch fixes TX hangs with threaded NAPI enabled. The scheduled NAPI seems to be executed in parallel with the interrupt on second thread. Sometimes it happens that ltq_dma_disable_irq() is executed after xrx200_tx_housekeeping(). The symptom is that TX interrupts are disabled in the DMA controller. As a result, the TX hangs after a few seconds of the iperf test. Scheduling NAPI after disabling interrupts fixes this issue. Tested on Lantiq xRX200 (BT Home Hub 5A). Fixes: 9423361da523 ("net: lantiq: Disable IRQs only if NAPI gets scheduled ") Signed-off-by: Aleksander Jan Bajkowski <[email protected]> Acked-by: Hauke Mehrtens <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: stmmac: explicitly deassert GMAC_AHB_RESETMatthew Hagan2-0/+13
We are currently assuming that GMAC_AHB_RESET will already be deasserted by the bootloader. However if this has not been done, probing of the GMAC will fail. To remedy this we must ensure GMAC_AHB_RESET has been deasserted prior to probing. v2 changes: - remove NULL condition check for stmmac_ahb_rst in stmmac_main.c - unwrap dev_err() message in stmmac_main.c - add PTR_ERR() around plat->stmmac_ahb_rst in stmmac_platform.c v3 changes: - add error pointer to dev_err() output - add reset_control_assert(stmmac_ahb_rst) in stmmac_dvr_remove - revert PTR_ERR() around plat->stmmac_ahb_rst since this is performed on the returned value of ret by the calling function Signed-off-by: Matthew Hagan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: fix DMA mapping function issues in XDPShay Agroskin1-26/+28
This patch fixes several bugs found when (DMA/LLQ) mapping a packet for transmission. The mapping procedure makes the transmitted packet accessible by the device. When using LLQ, this requires copying the packet's header to push header (which would be passed to LLQ) and creating DMA mapping for the payload (if the packet doesn't fit the maximum push length). When not using LLQ, we map the whole packet with DMA. The following bugs are fixed in the code: 1. Add support for non-LLQ machines: The ena_xdp_tx_map_frame() function assumed that LLQ is supported, and never mapped the whole packet using DMA. On some instances, which don't support LLQ, this causes loss of traffic. 2. Wrong DMA buffer length passed to device: When using LLQ, the first 'tx_max_header_size' bytes of the packet would be copied to push header. The rest of the packet would be copied to a DMA'd buffer. 3. Freeing the XDP buffer twice in case of a mapping error: In case a buffer DMA mapping fails, the function uses xdp_return_frame_rx_napi() to free the RX buffer and returns from the function with an error. XDP frames that fail to xmit get freed by the kernel and so there is no need for this call. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08sh_eth: Use devm_platform_get_and_ioremap_resource()Yang Yingliang1-4/+1
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <[email protected]> Reviewed-by: Sergei Shtylyov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: nixge: simplify code with devm platform functionsYang Yingliang1-6/+2
Use devm_platform_get_and_ioremap_resource() and devm_platform_ioremap_resource_byname to simplify code. Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: dsa: felix: re-enable TX flow control in ocelot_port_flush()Vladimir Oltean1-0/+5
Because flow control is set up statically in ocelot_init_port(), and not in phylink_mac_link_up(), what happens is that after the blamed commit, the flow control remains disabled after the port flushing procedure. Fixes: eb4733d7cffc ("net: dsa: felix: implement port flushing on .phylink_mac_link_down") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08ethernet/qlogic: Use list_for_each_entry() to simplify code in qlcnic_hw.cWang Hai1-6/+2
Convert list_for_each() to list_for_each_entry() where applicable. This simplifies the code. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: stmmac: fix NPD with phylink_set_pcs if there is no MDIO busVladimir Oltean1-5/+2
priv->plat->mdio_bus_data is optional, some platforms may not set it, however we proceed to look straight at priv->plat->mdio_bus_data->has_xpcs. Since the xpcs is instantiated based on the has_xpcs property, we can avoid looking at the priv->plat->mdio_bus_data structure altogether and just check for the presence of the xpcs pointer. Fixes: 11059740e616 ("net: pcs: xpcs: convert to phylink_pcs_ops") Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: qede: Use list_for_each_entry() to simplify codeWang Hai1-4/+2
Convert list_for_each() to list_for_each_entry() where applicable. This simplifies the code. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: hns3: add error handling compatibility during initializationJiaran Zhang3-11/+34
During initialization, the driver logs and clears the hw errors that already occurred. For device supports imp-handle ras capability, it needs handle different error status, otherwise it may cause wrong reset. So fix it by adding a new processing branch. Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: hns3: update error recovery module and typeJiaran Zhang3-5/+74
Update error recovery module and type for RoCE. The enumeration values of module names and error types are not sorted in sequence. If use the current printing mode, they cannot be correctly printed. Use the index mode, If mod_id and type_id match the enumerated value, display the corresponding information. Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Weihang Li <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: hns3: add support for imp-handle ras capabilityJiaran Zhang5-1/+11
IMP(Intelligent Management Processor) firmware add a new feature to handle and consolidate RAS information for new devices, NIC driver only needs to query the reported RAS information. NIC driver adds support for this feature. Driver queries device capability to check whether IMP support this feature, If yes, execute the new RAS processing branch. In order to add a method to check whether PF supports imp-handle RAS feature, add dumping this info in debugfs. Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: hns3: add the RAS compatibility adaptation solutionJiaran Zhang5-39/+409
To adapt to hardware modification and ensure that the driver is compatible with the original error handling content, we need to add the RAS compatibility adaptation solution. Add a processing branch to the driver during error handling. In the new processing branch, NIC fault information is integrated by the IMP. An interaction command is added between the driver and IMP to query and clear the fault source and interrupt source. The IMP integrates error information and reports the highest reset level to the driver. Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: hns3: add support for handling all errors through MSI-XYufeng Mo3-23/+41
Currently, hardware errors can be reported through AER or MSI-X mode. However, the AER mode is intended to handle only bus errors, but not hardware errors. On the other hand, virtual machines cannot handle AER errors. When an AER error is reported, virtual machines will be suspended. So add support for handling all these hardware errors through MSI-X mode which depends on a newer version of firmware, and reserve the handler of the AER mode for compatibility. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: re-organize code to improve readabilityShay Agroskin3-9/+14
Restructure some ethtool to a switch-case blocks to make it more uniform with other similar functions. Also restructure variable declaration to create reversed x-mas tree. Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: Use dev_alloc() in RX buffer allocationShay Agroskin1-22/+36
Use dev_alloc() when allocating RX buffers instead of specifying the allocation flags explicitly. This result in same behaviour with less code. Also move the page allocation and its DMA mapping into a function. This creates a logical block, which may help understanding the code. Signed-off-by: Shay Agroskin <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: aggregate doorbell common operations into a functionShay Agroskin1-20/+18
The ena_ring_tx_doorbell() is introduced to call the doorbell and increase the driver's corresponding stat. Signed-off-by: Ido Segev <[email protected]> Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: Remove module param and change message severityShay Agroskin1-5/+2
Remove the module param 'debug' which allows to specify the message level of the driver. This value can be specified using ethtool command. Also reduce the message level of LLQ support to be a warning since it is not an indication of an error. Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: add jiffies of last napi call to statsShay Agroskin2-8/+21
There are instances when we want to know when the last napi was called for debugging. On stuck / heavy loaded CPUs, the ena napi handler might not be called for a long period of time. This stat can help us to determine how much time passed since the last execution of napi. Signed-off-by: Sameeh Jubran <[email protected]> Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: use build_skb() in RX pathShay Agroskin1-25/+41
This patch converts the RX path to use build_skb() for packets larger than copybreak (set to 256 by default). This function makes the first descriptor's page to be the linear part of the sk_buff struct buffer. Also remove the SKB description from the README since most of it no longer relevant and the parts that are left don't add information. Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: Improve error logging in driverShay Agroskin1-5/+23
Add prints to improve logging of driver's errors. Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: Remove unused codeShay Agroskin2-13/+0
The ENA_DEFAULT_MIN_RX_BUFF_ALLOC_SIZE macro, ena_xdp_queues_present() function and SUSPEND_RESUME enums aren't used in the driver, and so not needed. Signed-off-by: Arthur Kiyanovski <[email protected]> Signed-off-by: Gal Pressman <[email protected]> Signed-off-by: Sameeh Jubran <[email protected]> Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: ena: optimize data access in fast-path codeShay Agroskin3-17/+19
This tweaks several small places to improve the data access in fast path: * Remove duplicates of first_interrupt flag and surround it with WRITE/READ_ONCE macros: The flag is used to detect HW disorders in its interrupt communication with the driver. The flag is set when an interrupt is received and used in the health check function (ena_timer_service()) to help it find irregularities. * Reorder some fields in ena_napi struct to take better advantage of cache access pattern. * Move XDP TX queue number to a variable to save its calculation for every packet. * Use likely in a condition to improve branch prediction The 'first_interrupt' and 'interrupt_masked' flags were moved to reside in the same cache line as the first fields of 'napi' struct. This placement ensures that all memory accessed during upper-half handler reside in the same cacheline (napi_schedule_irqoff() only accesses 'state' and 'poll_list' fields which are at the beginning of napi struct). Signed-off-by: Sameeh Jubran <[email protected]> Signed-off-by: Shay Agroskin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08mlxsw: thermal: Read module temperature thresholds using MTMP registerMykola Kostenok1-17/+30
mlxsw_thermal_module_trips_update() is used to update the trip points of the module's thermal zone. Currently, this is done by querying the thresholds from the module's EEPROM via MCIA register. This data does not pass validation and in some cases can be unreliable. For example, due to some problem with transceiver module. Previous patch made it possible to read module's temperature and thresholds via MTMP register. Therefore, extend mlxsw_thermal_module_trips_update() to use the thresholds queried from MTMP, if valid. This is both more reliable and more efficient than current method, as temperature and thresholds are queried in one transaction instead of three. This is significant when working over a slow bus such as I2C. Signed-off-by: Mykola Kostenok <[email protected]> Acked-by: Vadim Pasternak <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08mlxsw: thermal: Add function for reading module temperature and thresholdsMykola Kostenok1-15/+35
Provide new function mlxsw_thermal_module_temp_and_thresholds_get() for reading temperature and temperature thresholds by a single operation. The motivation is to reduce the number of transactions with the device which is important when operating over a slow bus such as I2C. Currently, the sole caller of the function is only using it to read the module's temperature. The next patch will also use it to query the module's temperature thresholds. Signed-off-by: Mykola Kostenok <[email protected]> Acked-by: Vadim Pasternak <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08mlxsw: core_env: Read module temperature thresholds using MTMP registerMykola Kostenok1-2/+11
Currently, module temperature thresholds are obtained from Management Cable Info Access (MCIA) register by specifying the thresholds offsets within module EEPROM layout. This data does not pass validation and in some cases can be unreliable. For example, due to some problem with the module. Add support for a new feature provided by Management Temperature (MTMP) register for sanitization of temperature thresholds values. Extend mlxsw_env_module_temp_thresholds_get() to get temperature thresholds through MTMP field 'max_operational_temperature' - if it is not zero, feature is supported. Otherwise fallback to old method and get the thresholds through MCIA. Signed-off-by: Mykola Kostenok <[email protected]> Acked-by: Vadim Pasternak <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08mlxsw: reg: Extend MTMP register with new threshold fieldMykola Kostenok4-8/+26
Extend Management Temperature (MTMP) register with new field specifying the maximum temperature threshold. Extend mlxsw_reg_mtmp_unpack() function with two extra arguments, providing high and maximum temperature thresholds. For modules, these thresholds correspond to critical and emergency thresholds that are read from the module's EEPROM. Signed-off-by: Mykola Kostenok <[email protected]> Acked-by: Vadim Pasternak <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08mlxsw: spectrum_router: Remove abort mechanismAmit Cohen2-125/+5
The abort mechanism was introduced in commit 8e05fd7166c6 ("fib: hook IPv4 fib for hardware offload") with the purpose of falling back to software-based routing in case of a route programming error in hardware. The process is irreversible and requires users to reload the offloading driver or reboot the machine. While this approach might make sense in theory, it makes very little sense in practice. In the case of high speed ASICs such as the Spectrum ASIC, the abort mechanism effectively kills the machine upon a non-fatal error such as a route programming error. Such an extreme policy does not belong in the kernel, especially when user space can simply try to reprogram the route following the RTM_NEWROUTE failure notification. Therefore, remove the abort mechanism. Signed-off-by: Amit Cohen <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: stmmac: enable Intel mGbE 2.5Gbps link speedVoon Weifeng4-1/+68
The Intel mGbE supports 2.5Gbps link speed by increasing the clock rate by 2.5 times of the original rate. In this mode, the serdes/PHY operates at a serial baud rate of 3.125 Gbps and the PCS data path and GMII interface of the MAC operate at 312.5 MHz instead of 125 MHz. For Intel mGbE, the overclocking of 2.5 times clock rate to support 2.5G is only able to be configured in the BIOS during boot time. Kernel driver has no access to modify the clock rate for 1Gbps/2.5G mode. The way to determined the current 1G/2.5G mode is by reading a dedicated adhoc register through mdio bus. In short, after the system boot up, it is either in 1G mode or 2.5G mode which not able to be changed on the fly. Compared to 1G mode, the 2.5G mode selects the 2500BASEX as PHY interface and disables the xpcs_an_inband. This is to cater for some PHYs that only supports 2500BASEX PHY interface with no autonegotiation. v2: remove MAC supported link speed masking v3: Restructure to introduce intel_speed_mode_2500() to read serdes registers for max speed supported and select the appropritate configuration. Use max_speed to determine the supported link speed mask. Signed-off-by: Voon Weifeng <[email protected]> Signed-off-by: Michael Sit Wei Hong <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-08net: stmmac: split xPCS setup from mdio registerVoon Weifeng3-29/+45
This patch is a preparation patch for the enabling of Intel mGbE 2.5Gbps link speed. The Intel mGbR link speed configuration (1G/2.5G) is depends on a mdio ADHOC register which can be configured in the bios menu. As PHY interface might be different for 1G and 2.5G, the mdio bus need be ready to check the link speed and select the PHY interface before probing the xPCS. Signed-off-by: Voon Weifeng <[email protected]> Signed-off-by: Michael Sit Wei Hong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07mvneta: recycle buffersMatteo Croce1-4/+7
Use the new recycling API for page_pool. In a drop rate test, the packet rate increased by 10%, from 296 Kpps to 326 Kpps. perf top on a stock system shows: Overhead Shared Object Symbol 23.66% [kernel] [k] __pi___inval_dcache_area 22.85% [mvneta] [k] mvneta_rx_swbm 7.54% [kernel] [k] kmem_cache_alloc 6.49% [kernel] [k] eth_type_trans 3.94% [kernel] [k] dev_gro_receive 3.91% [kernel] [k] __netif_receive_skb_core 3.91% [kernel] [k] kmem_cache_free 3.76% [kernel] [k] page_pool_release_page 3.56% [kernel] [k] free_unref_page 2.40% [kernel] [k] build_skb 1.49% [kernel] [k] skb_release_data 1.45% [kernel] [k] __alloc_pages_bulk 1.30% [kernel] [k] page_frag_free And this is the same output with recycling enabled: Overhead Shared Object Symbol 26.41% [kernel] [k] __pi___inval_dcache_area 25.00% [mvneta] [k] mvneta_rx_swbm 8.14% [kernel] [k] kmem_cache_alloc 6.84% [kernel] [k] eth_type_trans 4.44% [kernel] [k] __netif_receive_skb_core 4.38% [kernel] [k] kmem_cache_free 4.16% [kernel] [k] dev_gro_receive 3.21% [kernel] [k] page_pool_put_page 2.41% [kernel] [k] build_skb 1.82% [kernel] [k] skb_release_data 1.61% [kernel] [k] napi_gro_receive 1.25% [kernel] [k] page_pool_refill_alloc_cache 1.16% [kernel] [k] __netif_receive_skb_list_core We can see that page_pool_release_page(), free_unref_page() and __alloc_pages_bulk() are no longer on top of the list when receiving traffic. The test was done with mausezahn on the TX side with 64 byte raw ethernet frames. Signed-off-by: Matteo Croce <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07mvpp2: recycle buffersMatteo Croce1-1/+1
Use the new recycling API for page_pool. In a drop rate test, the packet rate is almost doubled, from 1110 Kpps to 2128 Kpps. perf top on a stock system shows: Overhead Shared Object Symbol 34.88% [kernel] [k] page_pool_release_page 8.06% [kernel] [k] free_unref_page 6.42% [mvpp2] [k] mvpp2_rx 6.07% [kernel] [k] eth_type_trans 5.18% [kernel] [k] __netif_receive_skb_core 4.95% [kernel] [k] build_skb 4.88% [kernel] [k] kmem_cache_free 3.97% [kernel] [k] kmem_cache_alloc 3.45% [kernel] [k] dev_gro_receive 2.73% [kernel] [k] page_frag_free 2.07% [kernel] [k] __alloc_pages_bulk 1.99% [kernel] [k] arch_local_irq_save 1.84% [kernel] [k] skb_release_data 1.20% [kernel] [k] netif_receive_skb_list_internal With packet rate stable at 1100 Kpps: tx: 0 bps 0 pps rx: 532.7 Mbps 1110 Kpps tx: 0 bps 0 pps rx: 532.6 Mbps 1110 Kpps tx: 0 bps 0 pps rx: 532.4 Mbps 1109 Kpps tx: 0 bps 0 pps rx: 532.1 Mbps 1109 Kpps tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps And this is the same output with recycling enabled: Overhead Shared Object Symbol 12.91% [kernel] [k] eth_type_trans 12.54% [mvpp2] [k] mvpp2_rx 9.67% [kernel] [k] build_skb 9.63% [kernel] [k] __netif_receive_skb_core 8.44% [kernel] [k] page_pool_put_page 8.07% [kernel] [k] kmem_cache_free 7.79% [kernel] [k] kmem_cache_alloc 6.86% [kernel] [k] dev_gro_receive 3.19% [kernel] [k] skb_release_data 2.41% [kernel] [k] netif_receive_skb_list_internal 2.18% [kernel] [k] page_pool_refill_alloc_cache 1.76% [kernel] [k] napi_gro_receive 1.61% [kernel] [k] kfree_skb 1.20% [kernel] [k] dma_sync_single_for_device 1.16% [mvpp2] [k] mvpp2_poll 1.12% [mvpp2] [k] mvpp2_read With packet rate above 2100 Kpps: tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1021 Mbps 2127 Kpps tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1022 Mbps 2128 Kpps tx: 0 bps 0 pps rx: 1022 Mbps 2129 Kpps The major performance increase is explained by the fact that the most CPU consuming functions (page_pool_release_page, page_frag_free and free_unref_page) are no longer called on a per packet basis. The test was done by sending to the macchiatobin 64 byte ethernet frames with an invalid ethertype, so the packets are dropped early in the RX path. Signed-off-by: Matteo Croce <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07skbuff: add a parameter to __skb_frag_unrefMatteo Croce2-2/+2
This is a prerequisite patch, the next one is enabling recycling of skbs and fragments. Add an extra argument on __skb_frag_unref() to handle recycling, and update the current users of the function with that. Signed-off-by: Matteo Croce <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: moxa: Use devm_platform_get_and_ioremap_resource()Yang Yingliang1-3/+2
Use devm_platform_get_and_ioremap_resource() to simplify code and avoid a null-ptr-deref by checking 'res' in it. Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: micrel: check return value after calling platform_get_resource()Yang Yingliang1-0/+4
It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: mvpp2: check return value after calling platform_get_resource()Yang Yingliang1-0/+4
It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: ethernet: bgmac: Use devm_platform_ioremap_resource_bynameYang Yingliang1-14/+7
Use the devm_platform_ioremap_resource_byname() helper instead of calling platform_get_resource_byname() and devm_ioremap_resource() separately. Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: enetc: Use devm_platform_get_and_ioremap_resource()Yang Yingliang1-3/+1
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: macb: Use devm_platform_get_and_ioremap_resource()Yang Yingliang1-2/+1
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <[email protected]> Acked-by: Nicolas Ferre <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: bcmgenet: check return value after calling platform_get_resource()Yang Yingliang1-0/+4
It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Signed-off-by: Yang Yingliang <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: tulip: Remove the repeated declarationShaokun Zhang1-1/+0
Function 'pnic2_lnk_change' is declared twice, so remove the repeated declaration. Cc: "David S. Miller" <[email protected]> Cc: Jakub Kicinski <[email protected]> Signed-off-by: Shaokun Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: hns3: remove now redundant logic related to HNAE3_UNKNOWN_RESETYufeng Mo2-23/+0
Earlier patches have decoupled the MSI-X conveyed error handling and recovery logic. This earlier concept code is no longer required. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Salil Mehta <[email protected]> Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: hns3: add scheduling logic for error handling taskJiaran Zhang1-14/+6
Error handling & recovery is done in context of reset task which gets scheduled from misc interrupt handler in existing code. But since error handling has been moved to new task, it should get scheduled instead of the reset task from the interrupt handler. Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Salil Mehta <[email protected]> Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: hns3: add a separate error handling taskJiaran Zhang3-2/+41
Error handling and recovery logic are intertwined. Error handling (i.e. error identification, clearing error sources and initiation of recovery) is done in context of reset task. If certain hardware errors get delivered during driver init time, which can cause driver init/loading to fail. Introduce a separate error handling task to ensure below: 1. Reset logic remains independent of the error handling logic. 2. Add the hclge_errhand_task_schedule to schedule error recovery tasks, This will ensure that common misellaneous MSI-X interrupt are re-enabled quickly. Signed-off-by: Jiaran Zhang <[email protected]> Signed-off-by: Salil Mehta <[email protected]> Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07qed: Fix duplicate included linux/kernel.hJiapeng Chong1-1/+0
Clean up the following includecheck warning: ./drivers/net/ethernet/qlogic/qed/qed_nvmetcp_fw_funcs.h: linux/kernel.h is included more than once. No functional change. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07Merge branch '100GbE' of ↵David S. Miller19-151/+446
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-06-07 This series contains updates to virtchnl header file and ice driver. Brett adds capability bits to virtchnl to specify whether a primary or secondary MAC address is being requested and adds the implementation to ice. He also adds storing of VF MAC address so that it will be preserved across reboots of VM and refactors VF queue configuration to remove the expectation that configuration be done all at once. Krzysztof refactors ice_setup_rx_ctx() to remove configuration not related to Rx context into a new function, ice_vsi_cfg_rxq(). Liwei Song extends the wait time for the global config timeout. Salil Mehta refactors code in ice_vsi_set_num_qs() to remove an unnecessary call when the user has requested specific number of Rx or Tx queues. Jesse converts define macros to static inlines for NOP configurations. Jake adds messaging when devlink fails to read device capabilities and when pldmfw cannot find the requested firmware. Adds a wait for reset completion when reporting devlink info and reinitializes NVM during rebuild to ensure values are current. Ani adds detection and reporting of modules exceeding supported power levels and changes an error message to a debug message. Paul fixes a clang warning for deadcode.DeadStores. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-06-07net: gemini: Use devm_platform_get_and_ioremap_resource()Yang Yingliang1-25/+9
Use devm_platform_get_and_ioremap_resource() to simplify code. Signed-off-by: Yang Yingliang <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-07mlxsw: core: Set thermal zone polling delay argument to real value at initMykola Kostenok1-2/+4
Thermal polling delay argument for modules and gearboxes thermal zones used to be initialized with zero value, while actual delay was used to be set by mlxsw_thermal_set_mode() by thermal operation callback set_mode(). After operations set_mode()/get_mode() have been removed by cited commits, modules and gearboxes thermal zones always have polling time set to zero and do not perform temperature monitoring. Set non-zero "polling_delay" in thermal_zone_device_register() routine, thus, the relevant thermal zones will perform thermal monitoring. Cc: Andrzej Pietrasiewicz <[email protected]> Fixes: 5d7bd8aa7c35 ("thermal: Simplify or eliminate unnecessary set_mode() methods") Fixes: 1ee14820fd8e ("thermal: remove get_mode() operation of drivers") Signed-off-by: Mykola Kostenok <[email protected]> Acked-by: Vadim Pasternak <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>