aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-07-28net_sched: initialize timer earlier in red_init()Cong Wang1-4/+4
When red_init() fails, red_destroy() is called to clean up. If the timer is not initialized yet, del_timer_sync() will complain. So we have to move timer_setup() before any failure. Reported-and-tested-by: syzbot+6e95a4fabf88dc217145@syzkaller.appspotmail.com Fixes: aee9caa03fc3 ("net: sched: sch_red: Add qevents "early_drop" and "mark"") Cc: Petr Machata <petrm@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'hinic-add-some-error-messages-for-debug'David S. Miller21-81/+855
Luo bin says: ==================== hinic: add some error messages for debug patch #1: support to handle hw abnormal event patch #2: improve the error messages when functions return failure and dump relevant registers in some exception handling processes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28hinic: add log in exception handling processesLuo bin13-45/+151
improve the error message when functions return failure and dump relevant registers in some exception handling processes Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28hinic: add support to handle hw abnormal eventLuo bin10-36/+704
add support to handle hw abnormal event such as hardware failure, cable unplugged,link error Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'introduce-PLDM-firmware-update-library'David S. Miller24-3/+2953
Jacob Keller says: ==================== introduce PLDM firmware update library This series goal is to enable support for updating the ice hardware flash using the devlink flash command. The ice firmware update files are distributed using the file format described by the "PLDM for Firmware Update" standard: https://www.dmtf.org/documents/pmci/pldm-firmware-update-specification-100 Because this file format is standard, this series introduces a new library that handles the generic logic for parsing the PLDM file header. The library uses a design that is very similar to the Mellanox mlxfw module. That is, a simple ops table is setup and device drivers instantiate an instance of the pldmfw library with the device specific operations. Doing so allows for each device to implement the low level behavior for how to interact with its firmware. This series includes the library and an implementation for the ice hardware. I've removed all of the parameters, and the proposed changes to support overwrite mode. I'll be working on the overwrite mask suggestion from Jakub as a follow-up series. Because the PLDM file format is a standard and not something that is specific to the Intel hardware, I opted to place this update library in lib/pldmfw. I should note that while I tried to make the library generic, it does not attempt to mimic the complete "Update Agent" as defined in the standard. This is mostly due to the fact that the actual interfaces exposed to software for the ice hardware would not allow this. This series depends on some work just recently and is based on top of the patch series sent by Tony published at: https://lore.kernel.org/netdev/20200723234720.1547308-1-anthony.l.nguyen@intel.com/T/#t Changes since v2 RFC * Removed overwrite mode patches, as this can become a follow up series with a separate discussion * Fixed a minor bug in the pldm_timestamp structure not being packed. * Dropped Cc for other driver maintainers, as this series no longer includes changes to the core flash update command. * Re-ordered patches slightly. Changes since v1 RFC * Removed the "allow_downgrade_on_flash_update" parameter. Instead, the driver will always attempt to flash the device, even when firmware indicates that it would be a downgrade. A dev_warn is used to indicate when this occurs. * Removed the "ignore_pending_flash_update". Instead, the driver will always check for and cancel any previous pending update. A devlink flash status message will be sent when this cancellation occurs. * Removed the "reset_after_flash_update" parameter. This will instead be implemented as part of a devlink reset interface, work left for a future change. * Replaced the "flash_update_preservation_level" parameter with a new "overwrite" mode attribute on the flash update command. For ice, this mode will select the preservation level. For all other drivers, I modified them to check that the mode is "OVERWRITE_NOTHING", and have Cc'd the maintainers to get their input. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: implement device flash update via devlinkJacob Keller9-1/+1007
Use the newly added pldmfw library to implement device flash update for the Intel ice networking device driver. This support uses the devlink flash update interface. The main parts of the flash include the Option ROM, the netlist module, and the main NVM data. The PLDM firmware file contains modules for each of these components. Using the pldmfw library, the provided firmware file will be scanned for the three major components, "fw.undi" for the Option ROM, "fw.mgmt" for the main NVM module containing the primary device firmware, and "fw.netlist" containing the netlist module. The flash is separated into two banks, the active bank containing the running firmware, and the inactive bank which we use for update. Each module is updated in a staged process. First, the inactive bank is erased, preparing the device for update. Second, the contents of the component are copied to the inactive portion of the flash. After all components are updated, the driver signals the device to switch the active bank during the next EMP reset (which would usually occur during the next reboot). Although the firmware AdminQ interface does report an immediate status for each command, the NVM erase and NVM write commands receive status asynchronously. The driver must not continue writing until previous erase and write commands have finished. The real status of the NVM commands is returned over the receive AdminQ. Implement a simple interface that uses a wait queue so that the main update thread can sleep until the completion status is reported by firmware. For erasing the inactive banks, this can take quite a while in practice. To help visualize the process to the devlink application and other applications based on the devlink netlink interface, status is reported via the devlink_flash_update_status_notify. While we do report status after each 4k block when writing, there is no real status we can report during erasing. We simply must wait for the complete module erasure to finish. With this implementation, basic flash update for the ice hardware is supported. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: add flags indicating pending update of firmware moduleJacob Keller3-0/+24
After a flash update, the pending status of the update can be determined from the device capabilities. Read the appropriate device capability and store whether there is a pending update awaiting a reboot. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: Add AdminQ commands for FW updateCudzilo, Szymon T5-2/+281
Add structures, identifiers, and helper functions for several AdminQ commands related to performing a firmware update for the ice hardware. These will be used in future code for implementing the devlink .flash_update handler. Signed-off-by: Cudzilo, Szymon T <szymon.t.cudzilo@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: Add support for unified NVM update flow capabilityJacek Naczyk3-0/+11
Extends function parsing response from Discover Device Capability AQC to check if the device supports unified NVM update flow. Signed-off-by: Jacek Naczyk <jacek.naczyk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Add pldmfw library for PLDM firmware updateJacob Keller11-0/+1630
The pldmfw library is used to implement common logic needed to flash devices based on firmware files using the format described by the PLDM for Firmware Update standard. This library consists of logic to parse the PLDM file format from a firmware file object, as well as common logic for sending the relevant PLDM header data to the device firmware. A simple ops table is provided so that device drivers can implement device specific hardware interactions while keeping the common logic to the pldmfw library. This library will be used by the Intel ice networking driver as part of implementing device flash update via devlink. The library aims to be vendor and device agnostic. For this reason, it has been placed in lib/pldmfw, in the hopes that other devices which use the PLDM firmware file format may benefit from it in the future. However, do note that not all features defined in the PLDM standard have been implemented. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'mptcp-Exchange-MPTCP-DATA_FIN-DATA_ACK-before-TCP-FIN'David S. Miller4-66/+306
Mat Martineau says: ==================== mptcp: Exchange MPTCP DATA_FIN/DATA_ACK before TCP FIN This series allows the MPTCP-level connection to be closed with the peers exchanging DATA_FIN and DATA_ACK according to the state machine in appendix D of RFC 8684. The process is very similar to the TCP disconnect state machine. The prior code sends DATA_FIN only when TCP FIN packets are sent, and does not allow for the MPTCP-level connection to be half-closed. Patch 8 ("mptcp: Use full MPTCP-level disconnect state machine") is the core of the series. Earlier patches in the series have some small fixes and helpers in preparation, and the final four small patches do some cleanup. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Safely store sequence number when sending dataMat Martineau1-1/+1
The MPTCP socket's write_seq member can be read without the msk lock held, so use WRITE_ONCE() to store it. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Safely read sequence number when lock isn't heldMat Martineau1-1/+1
The MPTCP socket's write_seq member should be read with READ_ONCE() when the msk lock is not held. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Skip unnecessary skb extension allocation for bare acksMat Martineau1-3/+6
Bare TCP ack skbs are freed right after MPTCP sees them, so the work to allocate, zero, and populate the MPTCP skb extension is wasted. Detect these skbs and do not add skb extensions to them. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Only use subflow EOF signaling on fallback connectionsMat Martineau1-1/+2
The MPTCP state machine handles disconnections on non-fallback connections, but the mptcp_sock still needs to get notified when fallback subflows disconnect. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Use full MPTCP-level disconnect state machineMat Martineau3-17/+92
RFC 8684 appendix D describes the connection state machine for MPTCP. This patch implements the DATA_FIN / DATA_ACK exchanges and MPTCP-level socket state changes described in that appendix, rather than simply sending DATA_FIN along with TCP FIN when disconnecting subflows. DATA_FIN is now sent and acknowledged before shutting down the subflows. Received DATA_FIN information (if not part of a data packet) is written to the MPTCP socket when the incoming DSS option is parsed by the subflow, and the MPTCP worker is scheduled to process the flag. DATA_FIN received as part of a full DSS mapping will be handled when the mapping is processed. The DATA_FIN is acknowledged by the worker if the reader is caught up. If there is still data to be moved to the MPTCP-level queue, ack_seq will be incremented to account for the DATA_FIN when it reaches the end of the stream and a DATA_ACK will be sent to the peer. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Add helper to process acks of DATA_FINMat Martineau1-8/+46
After DATA_FIN has been sent, the peer will acknowledge it. An ack of the relevant MPTCP-level sequence number will update the MPTCP connection state appropriately. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Add mptcp_close_state() helperMat Martineau1-0/+27
This will be used to transition to the appropriate state on close and determine if a DATA_FIN needs to be sent for that state transition. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Track received DATA_FIN sequence number and add related helpersMat Martineau3-10/+115
Incoming DATA_FIN headers need to propagate the presence of the DATA_FIN bit and the associated sequence number to the MPTCP layer, even when arriving on a bare ACK that does not get added to the receive queue. Add structure members to store the DATA_FIN information and helpers to set and check those values. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Use MPTCP-level flag for sending DATA_FINMat Martineau3-24/+18
Since DATA_FIN information is the same for every subflow, store it only in the mptcp_sock. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Remove outdated and incorrect commentMat Martineau1-1/+0
mptcp_close() acquires the msk lock, so it clearly should not be held before the function is called. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Return EPIPE if sending is shut down during a sendmsgMat Martineau1-0/+5
A MPTCP socket where sending has been shut down should not attempt to send additional data, since DATA_FIN has already been sent. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Allow DATA_FIN in headers without TCP FINMat Martineau1-10/+3
RFC 8684-compliant DATA_FIN needs to be sent and ack'd before subflows are closed with TCP FIN, so write DATA_FIN DSS headers whenever their transmission has been enabled by the MPTCP connection-level socket. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'sockptr_t-fixes-v2'David S. Miller12-64/+61
Christoph Hellwig says: ==================== sockptr_t fixes v2 a bunch of fixes for the sockptr_t conversion Changes since v1: - fix a user pointer dereference braino in bpfilter ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: improve the user pointer check in init_user_sockptrChristoph Hellwig3-14/+8
Make sure not just the pointer itself but the whole range lies in the user address space. For that pass the length and then use the access_ok helper to do the check. Fixes: 6d04fe15f78a ("net: optimize the sockptr_t for unified kernel/user address spaces") Reported-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: remove sockptr_advanceChristoph Hellwig10-48/+49
sockptr_advance never properly worked. Replace it with _offset variants of copy_from_sockptr and copy_to_sockptr. Fixes: ba423fdaa589 ("net: add a new sockptr_t type") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: make sockptr_is_null strict aliasing safeChristoph Hellwig1-1/+3
While the kernel in general is not strict aliasing safe we can trivially do that in sockptr_is_null without affecting code generation, so always check the actually assigned union member. Reported-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28netfilter: arp_tables: restore a SPDX identifierChristoph Hellwig1-1/+1
This was accidentally removed in an unrelated commit. Fixes: c2f12630c60f ("netfilter: switch nf_setsockopt to sockptr_t") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'mlxsw-Add-support-for-QSFP-DD-transceiver-type'David S. Miller2-12/+44
Ido Schimmel says: ==================== mlxsw: Add support for QSFP-DD transceiver type This patch set from Vadim adds support for Quad Small Form Factor Pluggable Double Density (QSFP-DD) modules in mlxsw. Patch #1 enables dumping of QSFP-DD module information through ethtool. Patch #2 enables reading of temperature thresholds from QSFP-DD modules for hwmon and thermal zone purposes. Changes since v1 [1]: Only rebase on top of net-next. After discussing with Andrew and Adrian we agreed that current approach is OK and that in the future we can follow Andrew's suggestion to "make a new API where user space can request any pages it want, and specify the size of the page". This should allow us "to work around known issues when manufactures get their EEPROM wrong". [1] https://lore.kernel.org/netdev/20200626144724.224372-1-idosch@idosch.org/#t ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mlxsw: core: Add support for temperature thresholds reading for QSFP-DD ↵Vadim Pasternak2-10/+23
transceivers Allow QSFP-DD transceivers temperature thresholds reading for hardware monitoring and thermal control. For this type, the thresholds are located in page 02h according to the "Module and Lane Thresholds" description from Common Management Interface Specification. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mlxsw: core: Add ethtool support for QSFP-DD transceiversVadim Pasternak2-2/+21
The Quad Small Form Factor Pluggable Double Density (QSFP-DD) hardware specification defines a form factor that supports up to 400 Gbps in aggregate over an 8x50-Gbps electrical interface. The QSFP-DD supports both optical and copper interfaces. Implementation is based on Common Management Interface Specification; Rev 4.0 May 8, 2019. Table 8-2 "Identifier and Status Summary (Lower Page)" from this spec defines "Id and Status" fields located at offsets 00h - 02h. Bit 2 at offset 02h ("Flat_mem") specifies QSFP EEPROM memory mode, which could be "upper memory flat" or "paged". Flat memory mode is coded "1", and indicates that only page 00h is implemented in EEPROM. Paged memory is coded "0" and indicates that pages 00h, 01h, 02h, 10h and 11h are implemented. Pages 10h and 11h are currently not supported by the driver. "Flat" memory mode is used for the passive copper transceivers. For this type only page 00h (256 bytes) is available. "Paged" memory is used for the optical transceivers. For this type pages 00h (256 bytes), 01h (128 bytes) and 02h (128 bytes) are available. Upper page 01h contains static advertising field, while upper page 02h contains the module-defined thresholds and lane-specific monitors. Extend enumerator 'mlxsw_reg_mcia_eeprom_module_info_id' with additional field 'MLXSW_REG_MCIA_EEPROM_MODULE_INFO_TYPE_ID'. This field is used to indicate for QSFP-DD transceiver type which memory mode is to be used. Expose 256 bytes buffer for QSFP-DD passive copper transceiver and 512 bytes buffer for optical. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge tag 'mlx5-updates-2020-07-28' of ↵David S. Miller26-213/+302
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-07-28 Misc and small update to mlx5 driver: 1) Aya adds PCIe relaxed ordering support for mlx5 netdev queues. 2) Eran Refactors pages data base to be per vf/function to speedup unload time. 3) Parav changes eswitch steering initialization to account for tota_vports rather than for only active vports and Link non uplink representors to PCI device, for uniform naming scheme. 4) Tariq, trivial RX code improvements and missing inidirect calls wrappers. 5) Small cleanup patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28farsync: use generic power managementVaibhav Gupta1-6/+4
The .suspend() and .resume() callbacks are not defined for this driver. Still, their power management structure follows the legacy framework. To bring it under the generic framework, simply remove the binding of callbacks from "struct pci_driver". Change code indentation from space to tab in "struct pci_driver". Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net/mlx5: drop unnecessary list_emptyJulia Lawall2-10/+8
list_for_each_entry is able to handle an empty list. The only effect of avoiding the loop is not initializing the index variable. Drop list_empty tests in cases where these variables are not used. Note that list_for_each_entry is defined in terms of list_first_entry, which indicates that it should not be used on an empty list. But in list_for_each_entry, the element obtained by list_first_entry is not really accessed, only the address of its list_head field is compared to the address of the list head, so the list_first_entry is safe. The semantic patch that makes this change is as follows (with another variant for the no brace case): (http://coccinelle.lip6.fr/) <smpl> @@ expression x,e; iterator name list_for_each_entry; statement S; identifier i; @@ -if (!(list_empty(x))) { list_for_each_entry(i,x,...) S - } ... when != i ? i = e </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: Use fallthrough pseudo-keywordGustavo A. R. Silva8-13/+13
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: DR, Reduce print level for matcher printAlex Vesker1-1/+1
There is no need to print on each unsuccessful matcher ip_version combination since it probably will happen when trying to create all the possible combinations. On a real failure we have a print in the calling function. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Add support for PCI relaxed orderingAya Levin3-2/+13
The concept of Relaxed Ordering in the PCI Express environment allows switches in the path between the Requester and Completer to reorder some transactions just received before others that were previously enqueued. In ETH driver, there is no question of write integrity since each memory segment is written only once per cycle. In addition, the driver doesn't access the memory shared with the hardware until the corresponding CQE arrives indicating all PCI transactions are done. Running TCP single stream over ConnectX-4 LX, ARM CPU on remote-numa has 300% improvement in the bandwidth. With relaxed ordering turned off: BW:10 [GB/s] With relaxed ordering turned on: BW:40 [GB/s] The driver turns relaxed ordering with respect to the firmware capabilities and the return value from pcie_relaxed_ordering_enabled(). Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Use indirect call wrappers for RX post WQEs functionsTariq Toukan5-8/+7
Use the indirect call wrapper API macros for declaration and scope of the RX post WQEs functions. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Move exposure of datapath function to txrx headerTariq Toukan4-23/+29
Move them from the generic header file "en.h", to the datapath header file "txrx.h". Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: RX, Re-work initializaiton of RX function pointersTariq Toukan9-96/+112
Instead of exposing the RQ datapath handlers (from en_rx.c) so that they are set in the control path (in en_main.c), wrap this logic in a single function in en_rx.c and expose it alone. Every profile will now have a pointer to the new mlx5e_rx_handlers structure, instead of directly pointing to the previously-exposed RQ handlers. This significantly improves locality and modularity of the driver, and allows many functions in en_rx.c to become static. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Link non uplink representors to PCI deviceParav Pandit1-1/+1
Currently PF and VF representors are exposed as virtual device. They are not linked to its parent PCI device like how uplink representor is linked. Due to this, PF and VF representors cannot benefit of the systemd defined naming scheme. This requires special handling by the users. Hence, link the PF and VF representors to their parent PCI device similar to existing uplink representor netdevice. Example: udevadm output before linking to PCI device: $ udevadm test-builtin net_id /sys/class/net/eth6 Load module index Network interface NamePolicy= disabled on kernel command line, ignoring. Parsed configuration file /usr/lib/systemd/network/99-default.link Created link configuration context. Using default interface naming scheme 'v243'. ID_NET_NAMING_SCHEME=v243 Unload module index Unloaded link configuration context. udevadm output after linking to PCI device: $ udevadm test-builtin net_id /sys/class/net/eth6 Load module index Network interface NamePolicy= disabled on kernel command line, ignoring. Parsed configuration file /usr/lib/systemd/network/99-default.link Created link configuration context. Using default interface naming scheme 'v243'. ID_NET_NAMING_SCHEME=v243 ID_NET_NAME_PATH=enp0s8f0npf0vf0 Unload module index Unloaded link configuration context. In past there was little concern over seeing 10,000 lines output showing up at thread [1] is not applicable as ndo ops for VF handling is not exposed for all the 100 repesentors for mlx5 devices. Additionally alternative device naming [2] to overcome shorter device naming is also part of the latest systemd release v245. [1] https://marc.info/?l=linux-netdev&m=152657949117904&w=2 [2] https://lwn.net/Articles/814068/ Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: E-switch, Use eswitch total_vportsParav Pandit1-7/+7
Currently steering table and rx group initialization helper routines works on the total_vports passed as input parameter. Both eswitch helpers work on the mlx5_eswitch and thereby have access to esw->total_vports. Hence use it directly instead of passing it via function input arguments. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: E-switch, Reuse total_vports and avoid duplicate nvportsParav Pandit2-8/+6
Total e-switch vports are already stored in mlx5_eswitch total_vports. Avoid copy of it in nvports and reuse existing total_vports calculation. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: E-switch, Consider maximum vf vports for steering initParav Pandit1-7/+1
When eswitch is enabled, VFs might not be enabled. Hence, consider maximum number of VFs. This further closes the gap between handling VF vports between ECPF and PF. Fixes: ea2128fd632c ("net/mlx5: E-switch, Reduce dependency on num_vfs during mode set") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: Add function ID to reclaim pages debug logAvihu Hagag1-1/+2
Add function ID to reclaim pages debug log for better user visibility. Signed-off-by: Avihu Hagag <avihuh@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: Hold pages RB tree per VFEran Ben Elisha2-39/+105
Per page request event, FW request to allocated or release pages for a single function. Driver maintains FW pages object per function, so there is no need to hold one global page data-base. Instead, have a page data-base per function, which will improve performance release flow in all cases, especially for "release all pages". As the range of function IDs is large and not sequential, use xarray to store a per function ID page data-base, where the function ID is the key. Upon first allocation of a page to a function ID, create the page data-base per function. This data-base will be released only at pagealloc mechanism cleanup. NIC: ConnectX-4 Lx CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz Test case: 32 VFs, measure release pages on one VF as part of FLR Before: 0.021 Sec After: 0.014 Sec The improvement depends on amount of VFs and memory utilization by them. Time measurements above were taken from idle system. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-27net/mlx4: Use fallthrough pseudo-keywordGustavo A. R. Silva3-5/+5
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27Merge branch '1GbE' of ↵David S. Miller4-52/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Tony Nguyen says: ==================== 1GbE Intel Wired LAN Driver Updates 2020-07-27 This series contains updates to igc driver only. Sasha cleans up double definitions, unneeded and non applicable registers, and removes unused fields in structs. Ensures the Receive Descriptor Minimum Threshold Count is cleared and fixes a static checker error. v2: Remove fields from hw_stats in patches that removed their uses. Reworded patch descriptions for patches 1, 2, and 4. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27qed: fix assignment of n_rq_elems to incorrect params fieldColin Ian King1-1/+1
Currently n_rq_elems is being assigned to params.elem_size instead of the field params.num_elems. Coverity is detecting this as a double assingment to params.elem_size and reporting this as an usused value on the first assignment. Fix this. Addresses-Coverity: ("Unused value") Fixes: b6db3f71c976 ("qed: simplify chain allocation with init params struct") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27Merge branch 'sfc-driver-for-EF100-family-NICs-part-1'David S. Miller30-31/+2439
Edward Cree says: ==================== sfc: driver for EF100 family NICs, part 1 EF100 is a new NIC architecture under development at Xilinx, based partly on existing Solarflare technology. As many of the hardware interfaces resemble EF10, support is implemented within the 'sfc' driver, which previous patch series "commonised" for this purpose. In order to maintain bisectability while splitting into patches of a reasonable size, I had to do a certain amount of back-and-forth with stubs for things that the common code may try to call, mainly because we can't do them until we've set up MCDI, but we can't set up MCDI without probing the event queues, at which point a lot of the common machinery becomes reachable from event handlers. Consequently, this first series doesn't get as far as actually sending and receiving packets. I have a second series ready to follow it which implements the datapath (and a few other things like ethtool). Changes from v4: * Fix build on CONFIG_RETPOLINE=n by using plain prototypes instead of INDIRECT_CALLABLE_DECLARE. Changes from v3: * combine both drivers (sfc_ef100 and sfc) into a single module, to make non-modular builds work. Patch #4 now adds a few indirections to support this; the ones in the RX and TX path use indirect-call- wrappers to minimise the performance impact. Changes from v2: * remove MODULE_VERSION. * call efx_destroy_reset_workqueue() from ef100_exit_module(). * correct uint32_ts to u32s. While I was at it, I fixed a bunch of other style issues in the function-control-window code. All in patch #4. Changes from v1: * kernel test robot spotted a link error when sfc_ef100 was built without mdio. It turns out the thing we were trying to link to was a bogus thing to do on anything but Falcon, so new patch #1 removes it from this driver. * fix undeclared symbols in patch #4 by shuffling around prototypes and #includes and adding 'static' where appropriate. * fix uninitialised variable 'rc2' in patch #7. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>