aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2020-09-25ice: fix memory leak if register_netdev_failsJacob Keller3-19/+6
The ice_setup_pf_sw function can cause a memory leak if register_netdev fails, due to accidentally failing to free the VSI rings. Fix the memory leak by using ice_vsi_release, ensuring we actually go through the full teardown process. This should be safe even if the netdevice is not registered because we will have set the netdev pointer to NULL, ensuring ice_vsi_release won't call unregister_netdev. An alternative fix would be moving management of the PF VSI netdev into the main VSI setup code. This is complicated and likely requires significant refactor in how we manage VSIs Fixes: 3a858ba392c3 ("ice: Add support for VSI allocation and deallocation") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-09-25ice: Fix call trace on suspendAnirudh Venkataramanan1-0/+1
It appears that the ice_suspend flow is missing a call to pci_save_state and this is triggering the message "State of device not saved by ice_suspend" and a call trace. Fix it. Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-09-25iavf: Fix incorrect adapter get in iavf_resumeSylwester Dziedziuch1-2/+2
When calling iavf_resume there was a crash because wrong function was used to get iavf_adapter and net_device pointers. Changed how iavf_resume is getting iavf_adapter and net_device pointers from pci_dev. Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Sylwester Dziedziuch <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-09-25sgiseeq: convert to dma_alloc_noncoherentChristoph Hellwig1-10/+18
Use the new non-coherent DMA API including proper ownership transfers. This includes adding additional calls to dma_sync_desc_dev as the old syncing was rather ad-hoc. Thanks to Thomas Bogendoerfer for debugging the ownership transfer issues. Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Thomas Bogendoerfer <[email protected]>
2020-09-25lib82596: convert to dma_alloc_noncoherentChristoph Hellwig3-63/+80
Use the new non-coherent DMA API including proper ownership transfers. This includes moving the DMA helpers to lib82596 based of an ifdef to avoid include order problems. Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Thomas Bogendoerfer <[email protected]> (SNI part)
2020-09-25lib82596: move DMA allocation into the callers of i82596_probeChristoph Hellwig3-39/+40
This allows us to get rid of the LIB82596_DMA_ATTR defined and prepare for untangling the coherent vs non-coherent DMA allocation API. Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Thomas Bogendoerfer <[email protected]> (SNI part)
2020-09-25net/au1000-eth: stop using DMA_ATTR_NON_CONSISTENTChristoph Hellwig1-9/+6
The au1000-eth driver contains none of the manual cache synchronization required for using DMA_ATTR_NON_CONSISTENT. From what I can tell it can be used on both dma coherent and non-coherent DMA platforms, but I suspect it has been buggy on the non-coherent platforms all along. Signed-off-by: Christoph Hellwig <[email protected]>
2020-09-24net: hns3: rename macro of pci device id of vfGuangbin Huang3-8/+9
VF devices do not have speed division, its speed is depended on its PF. So macro name of PCI device id of VF is incorrent to have 100G info, it should be renamed by removing 100G info. Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net: hns3: add support for 200G deviceGuangbin Huang5-11/+55
The 200G device has a new device id 0xA228, so adds this device id to pci table, then the driver can probe it. As speed_ability queried from firmware has only 8 bits and already be used up, so firmware adds extra speed_ability_ext to indicate more speed abilities to support 200G and driver needs to parse it. Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net: hns3: add debugfs of dumping pf interrupt resourcesYufeng Mo2-0/+13
The pf's interrupt resources will be changed with the number of enabled pf. Dumping this resource information will be helpful for debugging. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net: hns3: add a hardware error detect typeYufeng Mo3-0/+5
In hns3_process_hw_error(), the hardware error detection of the ROCEE AXI RESP error type is added. When this error occurs, the client needs to be notified of this error and take corresponding operation. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net: hns3: remove unnecessary variable initializationYufeng Mo6-9/+9
If a variable is assigned a value before it is used, it's no need to assign an initial value to the variable. So remove these redundant operations. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net: hns3: refactor the function for dumping tc information in debugfsYufeng Mo1-10/+5
Remove some unnecessary parameters of hclge_title_idx_print(), and rename this function for readability. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net/fsl: quieten expected MDIO access failuresJamie Iles1-1/+1
MDIO reads can happen during PHY probing, and printing an error with dev_err can result in a large number of error messages during device probe. On a platform with a serial console this can result in excessively long boot times in a way that looks like an infinite loop when multiple busses are present. Since 0f183fd151c (net/fsl: enable extended scanning in xgmac_mdio) we perform more scanning so there are potentially more failures. Reduce the logging level to dev_dbg which is consistent with the Freescale enetc driver. Cc: Jeremy Linton <[email protected]> Signed-off-by: Jamie Iles <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net/ethernet/broadcom: fix spelling typoWang Qing1-8/+8
Modify the comment typo: "compliment" -> "complement". Signed-off-by: Wang Qing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24hinic: fix wrong return value of mac-set cmdLuo bin2-13/+5
It should also be regarded as an error when hw return status=4 for PF's setting mac cmd. Only if PF return status=4 to VF should this cmd be taken special treatment. Fixes: 7dd29ee12865 ("hinic: add sriov feature support") Signed-off-by: Luo bin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24Merge tag 'mlx5-updates-2020-09-22' of ↵David S. Miller21-1658/+2339
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-09-22 This series includes mlx5 updates 1) Add support for Connection Tracking offload in NIC mode. Supporting CT offload in NIC mode on Mellanox cards is useful for scenarios where the dual port NIC serves as a gateway between 2 networks and forwards traffic between these networks. Since the traffic is not terminated on the host in this case, no use of SRIOV VFs and/or switchdev mode is required. Today Mellanox NIC cards already support offloading of packet forwarding between physical ports without going to the host so combining it with CT offloading allows users to create a gateway with forwarding and CT (Including NAT) offloading capabilities in non-switchdev mode. To support connection tracking in non-Switchdev mode (Single NIC mode), we need to make use of the current Connection tracking infrastructure implemented on top of E-Switch and the mlx5 generic flow table chains APIs, to make it work on non-Eswitch steering domain e.g. NIC RX domain, the following was performed: 1.1) Refactor current flow steering chains infrastructure and updates TC nic mode implementation to use flow table chains. 1.2) Refactor current Connection Tracking (CT) infrastructure to not assume E-switch backend, and make the CT layer agnostic to underlying steering mode (E-Switch/NIC) 1.3) Plumbing to support CT offload in NIC mode. 2) Trivial code cleanups. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-24dpaa2-mac: add PCS support through the Lynx moduleIoana Ciornei3-1/+91
Include PCS support in the dpaa2-eth driver by integrating it with the new Lynx PCS module. There is not much to talk about in terms of changes needed in the dpaa2-eth driver since the only steps necessary are to find the MDIO device representing the PCS, register it to the Lynx PCS module and then let phylink know if its existence also. After this, the PCS callbacks will be treated directly by Lynx, without interraction from dpaa2-eth's part. Signed-off-by: Ioana Ciornei <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24net: mscc: ocelot: always pass skb clone to ocelot_port_add_txtstamp_skbVladimir Oltean2-23/+29
Currently, ocelot switchdev passes the skb directly to the function that enqueues it to the list of skb's awaiting a TX timestamp. Whereas the felix DSA driver first clones the skb, then passes the clone to this queue. This matters because in the case of felix, the common IRQ handler, which is ocelot_get_txtstamp(), currently clones the clone, and frees the original clone. This is useless and can be simplified by using skb_complete_tx_timestamp() instead of skb_tstamp_tx(). Signed-off-by: Vladimir Oltean <[email protected]> Acked-by: Richard Cochran <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23net: stmmac: removed enabling eee in EEE set callbackVoon Weifeng1-11/+4
EEE should be only be enabled during stmmac_mac_link_up() when the link are up and being set up properly. set_eee should only do settings configuration and disabling the eee. Without this fix, turning on EEE using ethtool will return "Operation not supported". This is due to the driver is in a dead loop waiting for eee to be advertised in the for eee to be activated but the driver will only configure the EEE advertisement after the eee is activated. Ethtool should only return "Operation not supported" if there is no EEE capbility in the MAC controller. Fixes: 8a7493e58ad6 ("net: stmmac: Fix a race in EEE enable callback") Signed-off-by: Voon Weifeng <[email protected]> Acked-by: Mark Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23net: lantiq: Add locking for TX DMA channelHauke Mehrtens1-0/+2
The TX DMA channel data is accessed by the xrx200_start_xmit() and the xrx200_tx_housekeeping() function from different threads. Make sure the accesses are synchronized by acquiring the netif_tx_lock() in the xrx200_tx_housekeeping() function too. This lock is acquired by the kernel before calling xrx200_start_xmit(). Signed-off-by: Hauke Mehrtens <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23octeontx2-pf: Support to change VLAN based RSS hash options via ethtoolGeorge Cherian2-1/+8
Add support to control rx-flow-hash based on VLAN. By default VLAN plus 4-tuple based hashing is enabled. Changes can be done runtime using ethtool To enable 2-tuple plus VLAN based flow distribution # ethtool -N <intf> rx-flow-hash <prot> sdv To enable 4-tuple plus VLAN based flow distribution # ethtool -N <intf> rx-flow-hash <prot> sdfnv Signed-off-by: George Cherian <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23octeontx2-af: Add support for VLAN based RSS hashingGeorge Cherian2-0/+9
Added support for PF/VF drivers to choose RSS flow key algorithm with VLAN tag included in hashing input data. Only CTAG is considered. Signed-off-by: George Cherian <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23Revert "ravb: Fixed to be able to unload modules"Geert Uytterhoeven1-55/+55
This reverts commit 1838d6c62f57836639bd3d83e7855e0ee4f6defc. This commit moved the ravb_mdio_init() call (and thus the of_mdiobus_register() call) from the ravb_probe() to the ravb_open() call. This causes a regression during system resume (s2idle/s2ram), as new PHY devices cannot be bound while suspended. During boot, the Micrel PHY is detected like this: Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached PHY driver [Micrel KSZ9031 Gigabit PHY] (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=228) ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off During system suspend, (A) defer_all_probes is set to true, and (B) usermodehelper_disabled is set to UMH_DISABLED, to avoid drivers being probed while suspended. A. If CONFIG_MODULES=n, phy_device_register() calling device_add() merely adds the device, but does not probe it yet, as really_probe() returns early due to defer_all_probes being set: dpm_resume+0x128/0x4f8 device_resume+0xcc/0x1b0 dpm_run_callback+0x74/0x340 ravb_resume+0x190/0x1b8 ravb_open+0x84/0x770 of_mdiobus_register+0x1e0/0x468 of_mdiobus_register_phy+0x1b8/0x250 of_mdiobus_phy_device_register+0x178/0x1e8 phy_device_register+0x114/0x1b8 device_add+0x3d4/0x798 bus_probe_device+0x98/0xa0 device_initial_probe+0x10/0x18 __device_attach+0xe4/0x140 bus_for_each_drv+0x64/0xc8 __device_attach_driver+0xb8/0xe0 driver_probe_device.part.11+0xc4/0xd8 really_probe+0x32c/0x3b8 Later, phy_attach_direct() notices no PHY driver has been bound, and falls back to the Generic PHY, leading to degraded operation: Generic PHY e6800000.ethernet-ffffffff:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=POLL) ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off B. If CONFIG_MODULES=y, request_module() returns early with -EBUSY due to UMH_DISABLED, and MDIO initialization fails completely: mdio_bus e6800000.ethernet-ffffffff:00: error -16 loading PHY driver module for ID 0x00221622 ravb e6800000.ethernet eth0: failed to initialize MDIO PM: dpm_run_callback(): ravb_resume+0x0/0x1b8 returns -16 PM: Device e6800000.ethernet failed to resume: error -16 Ignoring -EBUSY in phy_request_driver_module(), like was done for -ENOENT in commit 21e194425abd65b5 ("net: phy: fix issue with loading PHY driver w/o initramfs"), would makes it fall back to the Generic PHY, like in the CONFIG_MODULES=n case. Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: [email protected] Reviewed-by: Sergei Shtylyov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23octeontx2-pf: Add tracepoints for PF/VF mailboxSubbaraya Sundeep3-0/+10
With tracepoints support present in the mailbox code this patch adds tracepoints in PF and VF drivers at places where mailbox messages are allocated, sent and at message interrupts. Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23octeontx2-af: Introduce tracepoints for mailboxSubbaraya Sundeep6-2/+136
Added tracepoints in mailbox code so that the mailbox operations like message allocation, sending message and message interrupts are traced. Also the mailbox errors occurred like timeout or wrong responses are traced. These will help in debugging mailbox issues. Here's an example output showing one of the mailbox messages sent by PF to AF and AF responding to it: ~# mount -t tracefs none /sys/kernel/tracing/ ~# echo 1 > /sys/kernel/tracing/events/rvu/enable ~# ifconfig eth0 up ~# cat /sys/kernel/tracing/trace ~# cat /sys/kernel/tracing/trace tracer: nop _-----=> irqs-off / _----=> need-resched | / _---=> hardirq/softirq || / _--=> preempt-depth ||| / delay TASK-PID CPU# |||| TIMESTAMP FUNCTION | | | |||| | | ifconfig-2382 [002] .... 756.161892: otx2_msg_alloc: [0002:02:00.0] msg:(0x400) size:40 ifconfig-2382 [002] ...1 756.161895: otx2_msg_send: [0002:02:00.0] sent 1 msg(s) of size:48 <idle>-0 [000] d.h1 756.161902: otx2_msg_interrupt: [0002:01:00.0] mbox interrupt PF(s) to AF (0x2) kworker/u49:0-1165 [000] .... 756.162049: otx2_msg_process: [0002:01:00.0] msg:(0x400) error:0 kworker/u49:0-1165 [000] ...1 756.162051: otx2_msg_send: [0002:01:00.0] sent 1 msg(s) of size:32 kworker/u49:0-1165 [000] d.h. 756.162056: otx2_msg_interrupt: [0002:02:00.0] mbox interrupt AF to PF (0x1) Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23net: allwinner: remove redundant irqsave and irqrestore in hardIRQBarry Song1-4/+2
The comment "holders of db->lock must always block IRQs" and related code to do irqsave and irqrestore don't make sense since we are in a IRQ-disabled hardIRQ context. Cc: Maxime Ripard <[email protected]> Cc: Chen-Yu Tsai <[email protected]> Signed-off-by: Barry Song <[email protected]> Acked-by: Maxime Ripard <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23net: hns3: Constify static structsRikard Falkeborn2-18/+18
A number of static variables were not modified. Make them const to allow the compiler to put them in read-only memory. In order to do so, constify a couple of input pointers as well as some local pointers. This moves about 35Kb to read-only memory as seen by the output of the size command. Before: text data bss dec hex filename 404938 111534 640 517112 7e3f8 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge.ko After: text data bss dec hex filename 439499 76974 640 517113 7e3f9 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge.ko Signed-off-by: Rikard Falkeborn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23net/mlx5: remove unreachable returnPavel Machek (CIP)1-2/+0
The last return statement is unreachable code. I'm not sure if it will provoke any warnings, but it looks ugly. Signed-off-by: Pavel Machek (CIP) <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5: simplify the return expression of mlx5_ec_init()Qinglang Miao1-7/+1
Simplify the return expression. Signed-off-by: Qinglang Miao <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: Use kfree() to free fd->g in accel_fs_tcp_create_groups()Denis Efremov1-1/+1
Memory ft->g in accel_fs_tcp_create_groups() is allocaed with kcalloc(). It's excessive to free ft->g with kvfree(). Use kfree() instead. Signed-off-by: Denis Efremov <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: IPsec: Use kvfree() for memory allocated with kvzalloc()Denis Efremov1-2/+2
Variables flow_group_in, spec in rx_fs_create() are allocated with kvzalloc(). It's incorrect to free them with kfree(). Use kvfree() instead. Fixes: 5e466345291a ("net/mlx5e: IPsec: Add IPsec steering in local NIC RX") Signed-off-by: Denis Efremov <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: Keep direct reference to mlx5_core_dev in tc ctAriel Levkovich1-19/+18
Keep and use a direct reference to the mlx5 core device in all of tc_ct code instead of accessing it via a pointer to mlx5 eswitch in order to support nic mode ct offload for VF devices that don't have a valid eswitch pointer set. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: TC: Remove unused parameter from mlx5_tc_ct_add_no_trk_match()Saeed Mahameed3-9/+4
priv is never used in this function Fixes: 7e36feeb0467 ("net/mlx5e: CT: Don't offload tuple rewrites for established tuples") Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: CT: Use the same counter for both directionsOz Shlomo1-9/+85
A connection is represented by two 5-tuple entries, one for each direction. Currently, each direction allocates its own hw counter, which is inefficient as ct aging is managed per connection. Share the counter that was allocated for the original direction with the reverse direction. Signed-off-by: Oz Shlomo <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: Support CT offload for tc nic flowsAriel Levkovich6-200/+348
Adding support to perform CT related tc actions and matching on CT states for nic flows. The ct flows management and handling will be done using a new instance of the ct database that is declared in this patch to keep it separate from the eswitch ct flows database. Offloading and unoffloading ct flows will be done using the existing ct offload api by providing it the relevant ct database reference in each mode. In addition, refactoring the tc ct api is introduced to make it agnostic to the flow type and perform the resource allocations and rule insertion to the proper steering domain in the device. In the initialization call, the api requests and stores in the ct database instance all the relevant information that distinguishes between nic flows and esw flows, such as chains database, steering namespace and mod hdr table. This way the operations of adding and removing ct flows to the device can later performed agnostically to the flow type. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: rework ct offload init messagesAriel Levkovich1-22/+17
The changes are: - Use mlx5_core print macros instead of netdev_warn since netdev is not always initialized at that stage. - Print a warning message in case the issue is with lack of support for CT offload without indicating an error. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: Add tc chains offload support for nic flowsAriel Levkovich4-57/+250
Allow adding nic tc flow rules with goto chain action. Connecting the nic flows to the mlx5 chains infrastructure in previous patches allows us to support the creation of chained flow tables and rules that direct to another chain for further packet processing. This is a required preparation to support CT offloads for nic tc flows. We allow the creation of 256 different chains for nic flows since we have 8 bits available for the chain restore tag in case of a miss. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5: Refactor tc flow attributes structureAriel Levkovich7-262/+376
In order to support chains and connection tracking offload for nic flows, there's a need to introduce a common flow attributes struct so that these features can be agnostic and have access to a single attributes struct, regardless of the flow type. Therefore, a new tc flow attributes format is introduced to allow access to attributes that are common to eswitch and nic flows. The common attributes will always get allocated for the new flows, regardless of their type, while the type specific attributes are separated into different structs and will be allocated based on the flow type to avoid memory waste. When allocating the flow attributes the caller provides the flow steering namespace and according the namespace type the additional space for the extra, type specific, attributes is determined and added to the total attribute allocation size. In addition, the attributes that are going to be common to both flow types are moved to the common attributes struct. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Reviewed-by: Vlad Buslov <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: Split nic tc flow allocation and creationAriel Levkovich2-46/+77
For future support of CT offload with nic tc flows, where the flow rule is not created immediately but rather following a future event, the patch is splitting the nic rule creation and deletion into 2 parts: 1. Creating/Deleting and setting the rule attributes. 2. Creating/Deleting the flow table and flow rule itself. This way the attributes can be prepared and stored in the flow handle when the tc flow is created but the rule can actually be created at any point in the future, using these pre allocated attributes. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5e: Tc nic flows to use mlx5_chains flow tablesAriel Levkovich2-31/+60
Change nic tc flows offload path to use the chains and prios infrastructure for the flow table creation as a preparation to support tc multi chains and priorities for nic flows. Adding an instance of the table chaining database to the nic tc struct and perform the root table creation and desctuction via the chains api while keeping the limit of a single chain (0) in nic tc mode. This will be extendable to supporting multiple chains in the following patches. The flow table sizes and default miss table parameters that are provided to the chains creation api are kept the same. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5: Allow ft level ignore for nic rx tablesAriel Levkovich1-2/+3
Allow setting a flow table with a lower level as a rule destination in nic rx tables. This is required in order to support table chaining of tc nic flows. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net/mlx5: Refactor multi chains and prios supportAriel Levkovich11-1066/+1174
Decouple the chains infrastructure from eswitch and make it generic to support other steering namespaces. The change defines an agnostic data structure to keep all the relevant information for maintaining flow table chaining in any steering namespace. Each namespace that requires table chaining will be required to allocate such data structure. The chains creation code will receive the steering namespace and flow table parameters from the caller so it will operate agnosticly when creating the required resources to maintain the table chaining function while Parts of the code that are relevant to eswitch specific functionality are moved to eswitch files. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-09-23net: realtek: Remove set but not used variableZheng Yongjun1-2/+2
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/realtek/8139cp.c: In function cp_tx_timeout: drivers/net/ethernet/realtek/8139cp.c:1242:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable] `rc` is never used, so remove it. Signed-off-by: Zheng Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23hinic: improve the comments of function headerLuo bin6-4/+11
Fix the warnings about function header comments when building hinic driver with "W=1" option. Signed-off-by: Luo bin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-6/+12
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-09-23 The following pull-request contains BPF updates for your *net-next* tree. We've added 95 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 4211 insertions(+), 2040 deletions(-). The main changes are: 1) Full multi function support in libbpf, from Andrii. 2) Refactoring of function argument checks, from Lorenz. 3) Make bpf_tail_call compatible with functions (subprograms), from Maciej. 4) Program metadata support, from YiFei. 5) bpf iterator optimizations, from Yonghong. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-23net: microchip: Make `lan743x_pm_suspend` function return right valueZheng Yongjun1-4/+1
drivers/net/ethernet/microchip/lan743x_main.c: In function lan743x_pm_suspend: `ret` is set but not used. In fact, `pci_prepare_to_sleep` function value should be the right value of `lan743x_pm_suspend` function, therefore, fix it. Signed-off-by: Zheng Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-22Merge tag 'mlx5-updates-2020-09-21' of ↵David S. Miller14-311/+637
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-09-21 Multi packet TX descriptor support for SKBs. This series introduces some refactoring of the regular TX data path in mlx5 and adds the Enhanced TX MPWQE feature support. MPWQE stands for multi-packet work queue element, and it can serve multiple packets, reducing the PCI bandwidth spent on control traffic. It should improve performance in scenarios where PCI is the bottleneck, and xmit_more is signaled by the kernel. The refactoring done in this series also improves the packet rate on its own. MPWQE is already implemented in the XDP tx path, this series adds the support of MPWQE for regular kernel SKB tx path. MPWQE is supported from ConnectX-5 and onward, for legacy devices we need to keep backward compatibility for regular (Single packet) WQE descriptor. MPWQE is not compatible with certain offloads and features, such as TLS offload, TSO, nonlinear SKBs. If such incompatible features are in use, the driver gracefully falls back to non-MPWQE per SKB. Prior to the final patch "net/mlx5e: Enhanced TX MPWQE for SKBs" that adds the actual support, Maxim did some refactoring to the tx data path to split it into stages and smaller helper functions that can be utilized and reused for both legacy and new MPWQE feature. Performance testing: UDP performance is improved in a single stream pktgen test: Packet rate: 16.86 Mpps (±0.15 Mpps) -> 20.94 Mpps (±0.33 Mpps) Instructions per packet: 434 -> 329 Cycles per packet: 158 -> 123 Instructions per cycle: 2.75 -> 2.67 TCP and XDP_TX single stream tests show no performance difference. MPWQE can reduce PCI bandwidth: PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 80.3% Inbound PCI utilization with MPWQE on: 59.0% PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 65.4% Inbound PCI utilization with MPWQE on: 49.3% MPWQE can also reduce CPU load, increasing the packet rate in case of CPU bottleneck: PCI Gen2, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 37.5 Mpps Packet rate with MPWQE on: 49.0 Mpps PCI Gen3, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 57.0 Mpps Packet rate with MPWQE on: 66.8 Mpps Burst size in all pktgen tests is 32. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64) NIC: Mellanox ConnectX-6 Dx GCC 10.2.0 ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller54-428/+664
Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing it's initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <[email protected]>
2020-09-21net/mlx5e: Enhanced TX MPWQE for SKBsMaxim Mikityanskiy11-18/+240
This commit adds support for Enhanced TX MPWQE feature in the regular (SKB) data path. A MPWQE (multi-packet work queue element) can serve multiple packets, reducing the PCI bandwidth on control traffic. Two new stats (tx*_mpwqe_blks and tx*_mpwqe_pkts) are added. The feature is on by default and controlled by the skb_tx_mpwqe private flag. In a MPWQE, eseg is shared among all packets, so eseg-based offloads (IPSEC, GENEVE, checksum) run on a separate eseg that is compared to the eseg of the current MPWQE session to decide if the new packet can be added to the same session. MPWQE is not compatible with certain offloads and features, such as TLS offload, TSO, nonlinear SKBs. If such incompatible features are in use, the driver gracefully falls back to non-MPWQE. This change has no performance impact in TCP single stream test and XDP_TX single stream test. UDP pktgen, 64-byte packets, single stream, MPWQE off: Packet rate: 16.96 Mpps (±0.12 Mpps) -> 17.01 Mpps (±0.20 Mpps) Instructions per packet: 421 -> 429 Cycles per packet: 156 -> 161 Instructions per cycle: 2.70 -> 2.67 UDP pktgen, 64-byte packets, single stream, MPWQE on: Packet rate: 16.96 Mpps (±0.12 Mpps) -> 20.94 Mpps (±0.33 Mpps) Instructions per packet: 421 -> 329 Cycles per packet: 156 -> 123 Instructions per cycle: 2.70 -> 2.67 Enabling MPWQE can reduce PCI bandwidth: PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 80.3% Inbound PCI utilization with MPWQE on: 59.0% PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 65.4% Inbound PCI utilization with MPWQE on: 49.3% Enabling MPWQE can also reduce CPU load, increasing the packet rate in case of CPU bottleneck: PCI Gen2, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 37.5 Mpps Packet rate with MPWQE on: 49.0 Mpps PCI Gen3, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 57.0 Mpps Packet rate with MPWQE on: 66.8 Mpps Burst size in all pktgen tests is 32. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64) NIC: Mellanox ConnectX-6 Dx GCC 10.2.0 Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>