aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)AuthorFilesLines
2019-07-31ice: separate out control queue lock creationJacob Keller3-29/+91
The ice_init_all_ctrlq and ice_shutdown_all_ctrlq functions create and destroy the locks used to protect the send and receive process of each control queue. This is problematic, as the driver may use these functions to shutdown and re-initialize the control queues at run time. For example, it may do this in response to a device reset. If the driver failed to recover from a reset, it might leave the control queues offline. In this case, the locks will no longer be initialized. A later call to ice_sq_send_cmd will then attempt to acquire a lock that has been destroyed. It is incorrect behavior to access a lock that has been destroyed. Indeed, ice_aq_send_cmd already tries to avoid accessing an offline control queue, but the check occurs inside the lock. The root of the problem is that the locks are destroyed at run time. Modify ice_init_all_ctrlq and ice_shutdown_all_ctrlq such that they no longer create or destroy the locks. Introduce new functions, ice_create_all_ctrlq and ice_destroy_all_ctrlq. Call these functions in ice_init_hw and ice_deinit_hw. Now, the control queue locks will remain valid for the life of the driver, and will not be destroyed until the driver unloads. This also allows removing a duplicate check of the sq.count and rq.count values when shutting down the controlqs. The ice_shutdown_ctrlq function already checks this value under the lock. Previously commit dec64ff10ed9 ("ice: use [sr]q.count when checking if queue is initialized") needed this check to happen outside the lock, because it prevented duplicate attempts at destroying the locks. The driver may now safely use ice_init_all_ctrlq and ice_shutdown_all_ctrlq while handling reset events, without causing the locks to be invalid. Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-31ice: Always set prefena when configuring an Rx queueBrett Creeley2-1/+9
Currently we are always setting prefena to 0. This is causing the hardware to only fetch descriptors when there are none free in the cache for a received packet instead of prefetching when it has used the last descriptor regardless of incoming packets. Fix this by allowing the hardware to prefetch Rx descriptors. Signed-off-by: Brett Creeley <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-31ice: Move vector base setup to PF VSITony Nguyen1-4/+4
When interrupt tracking was refactored, during rebuild, the call to ice_vsi_setup_vector_base() was inadvertently removed from the PF VSI instead of being removed from the VF VSI. During reset, the failure to properly setup the vector base generates a call trace. Correct this so that resets/rebuilds properly complete. Fixes: cbe66bfee6a0 ("ice: Refactor interrupt tracking") Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-31ice: track hardware stat registers past rolloverJacob Keller5-130/+91
Currently, ice_stat_update32 and ice_stat_update40 will limit the value of the software statistic to 32 or 40 bits wide, depending on which register is being read. This means that if a driver is running for a long time, the displayed software register values will roll over to zero at 40 bits or 32 bits. This occurs because the functions directly assign the difference between the previous value and current value of the hardware statistic. Instead, add this value to the current software statistic, and then update the previous value. In this way, each time ice_stat_update40 or ice_stat_update32 are called, they will increment the software tracking value by the difference of the hardware register from its last read. The software tracking value will correctly count up until it overflows a u64. The only requirement is that the ice_stat_update functions be called at least once each time the hardware register overflows. While we're fixing ice_stat_update40, modify it to use rd64 instead of two calls to rd32. Additionally, drop the now unnecessary hireg function parameter. Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-31ice: add lp_advertising flow control supportPaul Greenwalt1-32/+72
Add support for reporting link partner advertising when ETHTOOL_GLINKSETTINGS defined. Get pause param reports the Tx/Rx pause configured, and then ethtool issues ETHTOOL_GSET ioctl and ice_get_settings_link_up reports the negotiated Tx/Rx pause. Negotiated pause frame report per IEEE 802.3-2005 table 288-3. $ ethtool --show-pause ens6f0 Pause parameters for ens6f0: Autonegotiate: on RX: on TX: on RX negotiated: on TX negotiated: on $ ethtool ens6f0 Settings for ens6f0: Supported ports: [ FIBRE ] Supported link modes: 25000baseCR/Full Supported pause frame use: Symmetric Supports auto-negotiation: Yes Supported FEC modes: None BaseR RS Advertised link modes: 25000baseCR/Full Advertised pause frame use: Symmetric Receive-only Advertised auto-negotiation: Yes Advertised FEC modes: None BaseR RS Link partner advertised link modes: Not reported Link partner advertised pause frame use: Symmetric Link partner advertised auto-negotiation: Yes Link partner advertised FEC modes: Not reported Speed: 25000Mb/s Duplex: Full Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Auto-negotiation: on Supports Wake-on: g Wake-on: g Current message level: 0x00000007 (7) drv probe link Link detected: yes When ETHTOOL_GLINKSETTINGS is not defined, get pause param reports the negotiated Tx/Rx pause. Signed-off-by: Paul Greenwalt <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-30net: Use skb_frag_off accessorsJonathan Lemon3-4/+4
Use accessor functions for skb fragment's page_offset instead of direct references, in preparation for bvec conversion. Signed-off-by: Jonathan Lemon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-24Merge branch '1GbE' of ↵David S. Miller6-14/+22
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2019-07-24 This series contains updates to igc and e1000e client drivers only. Sasha provides a couple of cleanups to remove code that is not needed and reduce structure sizes. Updated the MAC reset flow to use the device reset flow instead of a port reset flow. Added addition device id's that will be supported. Kai-Heng Feng provides a workaround for a possible stalled packet issue in our ICH devices due to a clock recovery from the PCH being too slow. v2: removed the last patch in the series that supposedly fixed a MAC/PHY de-sync potential issue while waiting for additional information from hardware engineers. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-07-24net/ixgbevf: fix a compilation error of skb_frag_tQian Cai1-2/+5
The linux-next commit "net: Rename skb_frag_t size to bv_len" [1] introduced a compilation error on powerpc as it forgot to deal with the renaming from "size" to "bv_len" for ixgbevf. [1] https://lore.kernel.org/netdev/[email protected]/T/#md052f1c7de965ccd1bdcb6f92e1990a52298eac5 In file included from ./include/linux/cache.h:5, from ./include/linux/printk.h:9, from ./include/linux/kernel.h:15, from ./include/linux/list.h:9, from ./include/linux/module.h:9, from drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:12: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function 'ixgbevf_xmit_frame_ring': drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:51: error: 'skb_frag_t' {aka 'struct bio_vec'} has no member named 'size' count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size); ^ ./include/uapi/linux/kernel.h:13:40: note: in definition of macro '__KERNEL_DIV_ROUND_UP' #define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) ^ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:12: note: in expansion of macro 'TXD_USE_COUNT' count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size); Signed-off-by: Qian Cai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-24e1000e: add workaround for possible stalled packetKai-Heng Feng2-1/+11
This works around a possible stalled packet issue, which may occur due to clock recovery from the PCH being too slow, when the LAN is transitioning from K1 at 1G link speed. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204057 Signed-off-by: Kai-Heng Feng <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-24igc: Add more SKUs for i225 deviceSasha Neftin3-0/+9
Add support for more SKUs. Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-24igc: Update the MAC reset flowSasha Neftin2-2/+2
Use Device Reset flow instead of Port Reset flow. This flow performs a reset of the entire controller device, resulting in a state nearly approximating the state following a power-up reset or internal PCIe reset, except for system PCI configuration. Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-24igc: Remove the unused field from a device specification structureSasha Neftin1-5/+0
This patch comes to clean up the device specification structure. Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-24igc: Remove the polarity field from a PHY information structureSasha Neftin1-6/+0
Polarity and cable length fields is not applicable for the i225 device. This patch comes to clean up PHY information structure. Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-07-23igb: Use dev_get_drvdata where possibleChuhong Yuan1-2/+1
Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make code simpler. Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-23i40e: Use dev_get_drvdataChuhong Yuan1-5/+3
Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make code simpler. Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-23fm10k: Use dev_get_drvdataChuhong Yuan1-2/+2
Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make code simpler. Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-23e1000e: Use dev_get_drvdata where possibleChuhong Yuan1-4/+3
Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make code simpler. Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-22net: Use skb accessors in network driversMatthew Wilcox (Oracle)14-28/+28
In preparation for unifying the skb_frag and bio_vec, use the fine accessors which already exist and use skb_frag_t instead of struct skb_frag_struct. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-21igc: Prefer pcie_capability_read_word()Frederick Lawler1-8/+4
Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability") added accessors for the PCI Express Capability so that drivers didn't need to be aware of differences between v1 and v2 of the PCI Express Capability. Replace pci_read_config_word() and pci_write_config_word() calls with pcie_capability_read_word() and pcie_capability_write_word(). Signed-off-by: Frederick Lawler <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-15docs: driver-model: move it to the driver-api bookMauro Carvalho Chehab1-1/+1
The audience for the Kernel driver-model is clearly Kernel hackers. Signed-off-by: Mauro Carvalho Chehab <[email protected]> Acked-by: Jeff Kirsher <[email protected]> # ice driver changes
2019-07-12Merge tag 'driver-core-5.3-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...
2019-07-09net: flow_offload: rename tc_cls_flower_offload to flow_cls_offloadPablo Neira Ayuso3-30/+30
And any other existing fields in this structure that refer to tc. Specifically: * tc_cls_flower_offload_flow_rule() to flow_cls_offload_flow_rule(). * TC_CLSFLOWER_* to FLOW_CLS_*. * tc_cls_common_offload to tc_cls_common_offload. Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-09drivers: net: use flow block APIPablo Neira Ayuso4-4/+16
This patch updates flow_block_cb_setup_simple() to use the flow block API. Several drivers are also adjusted to use it. This patch introduces the per-driver list of flow blocks to account for blocks that are already in use. Remove tc_block_offload alias. Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-09net: flow_offload: add flow_block_cb_setup_simple()Pablo Neira Ayuso4-93/+19
Most drivers do the same thing to set up the flow block callbacks, this patch adds a helper function to do this. This preparation patch reduces the number of changes to adapt the existing drivers to use the flow block callback API. This new helper function takes a flow block list per-driver, which is set to NULL until this driver list is used. This patch also introduces the flow_block_command and flow_block_binder_type enumerations, which are renamed to use FLOW_BLOCK_* in follow up patches. There are three definitions (aliases) in order to reduce the number of updates in this patch, which go away once drivers are fully adapted to use this flow block API. Signed-off-by: Pablo Neira Ayuso <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2-11/+16
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-07-03 The following pull-request contains BPF updates for your *net-next* tree. There is a minor merge conflict in mlx5 due to 8960b38932be ("linux/dim: Rename externally used net_dim members") which has been pulled into your tree in the meantime, but resolution seems not that bad ... getting current bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed just so he's aware of the resolution below: ** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c: <<<<<<< HEAD static int mlx5e_open_cq(struct mlx5e_channel *c, struct dim_cq_moder moder, struct mlx5e_cq_param *param, struct mlx5e_cq *cq) ======= int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder, struct mlx5e_cq_param *param, struct mlx5e_cq *cq) >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497 Resolution is to take the second chunk and rename net_dim_cq_moder into dim_cq_moder. Also the signature for mlx5e_open_cq() in ... drivers/net/ethernet/mellanox/mlx5/core/en.h +977 ... and in mlx5e_open_xsk() ... drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64 ... needs the same rename from net_dim_cq_moder into dim_cq_moder. ** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c: <<<<<<< HEAD int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix)); struct dim_cq_moder icocq_moder = {0, 0}; struct net_device *netdev = priv->netdev; struct mlx5e_channel *c; unsigned int irq; ======= struct net_dim_cq_moder icocq_moder = {0, 0}; >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497 Take the second chunk and rename net_dim_cq_moder into dim_cq_moder as well. Let me know if you run into any issues. Anyway, the main changes are: 1) Long-awaited AF_XDP support for mlx5e driver, from Maxim. 2) Addition of two new per-cgroup BPF hooks for getsockopt and setsockopt along with a new sockopt program type which allows more fine-grained pass/reject settings for containers. Also add a sock_ops callback that can be selectively enabled on a per-socket basis and is executed for every RTT to help tracking TCP statistics, both features from Stanislav. 3) Follow-up fix from loops in precision tracking which was not propagating precision marks and as a result verifier assumed that some branches were not taken and therefore wrongly removed as dead code, from Alexei. 4) Fix BPF cgroup release synchronization race which could lead to a double-free if a leaf's cgroup_bpf object is released and a new BPF program is attached to the one of ancestor cgroups in parallel, from Roman. 5) Support for bulking XDP_TX on veth devices which improves performance in some cases by around 9%, from Toshiaki. 6) Allow for lookups into BPF devmap and improve feedback when calling into bpf_redirect_map() as lookup is now performed right away in the helper itself, from Toke. 7) Add support for fq's Earliest Departure Time to the Host Bandwidth Manager (HBM) sample BPF program, from Lawrence. 8) Various cleanups and minor fixes all over the place from many others. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-06-30Merge branch '10GbE' of ↵David S. Miller21-145/+686
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-06-28 This series contains a smorgasbord of updates to many of the Intel drivers. Gustavo A. R. Silva updates the ice and iavf drivers to use the strcut_size() helper where possible. Miguel increases the pause and refresh time for flow control in the e1000e driver during reset for certain devices. Dann Frazier fixes a potential NULL pointer dereference in ixgbe driver when using non-IPSec enabled devices. Colin Ian King fixes a potential overflow during a shift in the ixgbe driver. Also fixes a potential NULL pointer dereference in the iavf driver by adding a check. Venkatesh Srinivas converts the e1000 driver to use dma_wmb() instead of wmb() for doorbell writes to avoid SFENCEs in the transmit and receive paths. Arjan updates the e1000e driver to improve boot time by over 100 msec by reducing the usleep ranges suring system startup. Artem updates the igb driver register dump in ethtool, first prepares the register dump for future additions of registers in the dump, then secondly, adds the RR2DCDELAY register to the dump. When dealing with time-sensitive networks, this register is helpful in determining your latency from the device to the ring. Alex fixes the ixgbevf driver to use the current cached link state, rather than trying to re-check the value from the PF. Harshitha adds support for MACVLAN offloads in i40e by using channels as MACVLAN interfaces. Detlev Casanova updates the e1000e driver to use delayed work instead of timers to run the watchdog. Vitaly fixes an issue in e1000e, where when disconnecting and reconnecting the physical cable connection, the NIC enters a DMoff state. This state causes a mismatch in link and duplexing, so check the PCIm function state and perform a PHY reset when in this state to resolve the issue. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-06-28e1000e: PCIm function state supportVitaly Lifshits2-1/+20
Due to commit: 5d8682588605 ("[misc] mei: me: allow runtime pm for platform with D0i3") When disconnecting the cable and reconnecting it the NIC enters DMoff state. This caused wrong link indication and duplex mismatch. This bug is described in: https://bugzilla.redhat.com/show_bug.cgi?id=1689436 Checking PCIm function state and performing PHY reset after a timeout in watchdog task solves this issue. Signed-off-by: Vitaly Lifshits <[email protected]> Acked-by: Sasha Neftin <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28e1000e: Make watchdog use delayed workDetlev Casanova2-27/+32
Use delayed work instead of timers to run the watchdog of the e1000e driver. Simplify the code with one less middle function. Signed-off-by: Detlev Casanova <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28i40e: Add macvlan support on i40eHarshitha Ramamurthy2-2/+522
This patch enables macvlan offloads for i40e. The idea is to use channels as macvlan interfaces. The channels are VSIs of type VMDQ. When the first macvlan is created, the maximum number of channels possible are created. From then on, as a macvlan interface is created, a macvlan filter is added to these already created channels (VSIs). This patch utilizes subordinate device traffic classes to make queue groups(channels) available for an upper device like a macvlan. Steps to configure macvlan offloads: 1. ethtool -K ethx l2-fwd-offload on 2. ip link add link ethx name macvlan1 type macvlan 3. ip addr add <address> dev macvlan1 4. ip link set macvlan1 up Signed-off-by: Harshitha Ramamurthy <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28ixgbevf: Use cached link state instead of re-reading the value for ethtoolAlexander Duyck1-8/+2
Change the ethtool link settings call to just read the cached state out of the adapter structure instead of trying to recheck the value from the PF. Doing this should prevent excessive reading of the mailbox. Signed-off-by: Alexander Duyck <[email protected]> Reviewed-by: "Guilherme G. Piccoli" <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28iavf: fix dereference of null rx_buffer pointerColin Ian King1-2/+4
A recent commit efa14c3985828d ("iavf: allow null RX descriptors") added a null pointer sanity check on rx_buffer, however, rx_buffer is being dereferenced before that check, which implies a null pointer dereference bug can potentially occur. Fix this by only dereferencing rx_buffer until after the null pointer check. Addresses-Coverity: ("Dereference before null check") Signed-off-by: Colin Ian King <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28igb: add RR2DCDELAY to ethtool registers dumpArtem Bityutskiy2-1/+6
This patch adds the RR2DCDELAY register to the ethtool registers dump. RR2DCDELAY exists on I210 and I211 Intel Gigabit Ethernet chips and it stands for "Read Request To Data Completion Delay". Here is how this register is described in the I210 datasheet: "This field captures the maximum PCIe split time in 16 ns units, which is the maximum delay between the read request to the first data completion. This is giving an estimation of the PCIe round trip time." In other words, whenever I210 reads from the host memory (e.g., fetches a descriptor from the ring), the chip measures every PCI DMA read transaction and captures the maximum value. So it ends up containing the longest DMA transaction time. This register is very useful for troubleshooting and research purposes. If you are dealing with time-sensitive networks, this register can help you get an idea of your "I210-to-ring" latency. This helps answering questions like "should I have PCIe ASPM enabled?" or "should I enable deep C-states?" on my system. It is safe to read this register at any point, reading it has no effect on the I210 chip functionality. Signed-off-by: Artem Bityutskiy <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28igb: minor ethool regdump amendmentArtem Bityutskiy1-35/+35
This patch has no functional impact and it is just a preparation for the following patch. It removes an early return from the 'igb_get_regs()' function by moving the 82576-only registers dump into an "if" block. With this preparation, we can dump more non-82576 registers at the end of this function. Signed-off-by: Artem Bityutskiy <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28iavf: Fix up debug print macroJeff Kirsher1-3/+7
This aligns the iavf_debug() macro with the other Intel drivers. Add the bus number, bus_id field to i40e_bus_info so output shows each physical port(i.e func) in following format: [[[[<domain>]:]<bus>]:][<slot>][.[<func>]] domains are numbered from 0 to ffff), bus (0-ff), slot (0-1f) and function (0-7). Signed-off-by: Jeff Kirsher <[email protected]> Tested-by: Andrew Bowers <[email protected]>
2019-06-28e1000e: Reduce boot time by tightening sleep rangesArjan van de Ven7-28/+28
The e1000e driver is a great user of the usleep_range() API, and has nice ranges that in principle help power management. However the ranges that are used only during system startup are very long (and can add easily 100 msec to the boot time) while the power savings of such long ranges is irrelevant due to the one-off, boot only, nature of these functions. This patch shrinks some of the longest ranges to be shorter (while still using a power friendly 1 msec range); this saves 100msec+ of boot time on my BDW NUCs Signed-off-by: Arjan van de Ven <[email protected]> Signed-off-by: Paul Menzel <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28iavf: use struct_size() helperGustavo A. R. Silva1-21/+16
Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes, in particular in the context in which this code is being used. So, replace code of the following form: sizeof(struct virtchnl_ether_addr_list) + (count * sizeof(struct virtchnl_ether_addr)) with: struct_size(veal, list, count) and so on... This code was detected with the help of Coccinelle. Signed-off-by: "Gustavo A. R. Silva" <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28e1000: Use dma_wmb() instead of wmb() before doorbell writesVenkatesh Srinivas1-3/+3
e1000 writes to doorbells to post transmit descriptors and fill the receive ring. After writing descriptors to memory but before writing to doorbells, use dma_wmb() rather than wmb(). wmb() is more heavyweight than necessary for a device to see descriptor writes. On x86, this avoids SFENCEs before doorbell writes in both the Tx and Rx paths. On ARM, this converts DSB ST -> DMB OSHST. Tested: 82576EB / x86; QEMU (qemu emulates an 8257x) Signed-off-by: Venkatesh Srinivas <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28ixgbe: fix potential u32 overflow on shiftColin Ian King1-10/+4
The u32 variable rem is being shifted using u32 arithmetic however it is being passed to div_u64 that expects the expression to be a u64. The 32 bit shift may potentially overflow, so cast rem to a u64 before shifting to avoid this. Also remove comment about overflow. Addresses-Coverity: ("Unintentional integer overflow") Fixes: cd4583206990 ("ixgbe: implement support for SDP/PPS output on X550 hardware") Fixes: 68d9676fc04e ("ixgbe: fix PTP SDP pin setup on X540 hardware") Signed-off-by: Colin Ian King <[email protected]> Acked-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28ixgbe: Avoid NULL pointer dereference with VF on non-IPsec hwDann Frazier1-0/+3
An ipsec structure will not be allocated if the hardware does not support offload. Fixes the following Oops: [ 191.045452] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 191.054232] Mem abort info: [ 191.057014] ESR = 0x96000004 [ 191.060057] Exception class = DABT (current EL), IL = 32 bits [ 191.065963] SET = 0, FnV = 0 [ 191.069004] EA = 0, S1PTW = 0 [ 191.072132] Data abort info: [ 191.074999] ISV = 0, ISS = 0x00000004 [ 191.078822] CM = 0, WnR = 0 [ 191.081780] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000043d9e467 [ 191.088382] [0000000000000000] pgd=0000000000000000 [ 191.093252] Internal error: Oops: 96000004 [#1] SMP [ 191.098119] Modules linked in: vhost_net vhost tap vfio_pci vfio_virqfd vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter devlink ebtables ip6table_filter ip6_tables iptable_filter bpfilter ipmi_ssif nls_iso8859_1 input_leds joydev ipmi_si hns_roce_hw_v2 ipmi_devintf hns_roce ipmi_msghandler cppc_cpufreq sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 ses enclosure btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor hid_generic usbhid hid raid6_pq libcrc32c raid1 raid0 multipath linear ixgbevf hibmc_drm ttm [ 191.168607] drm_kms_helper aes_ce_blk aes_ce_cipher syscopyarea crct10dif_ce sysfillrect ghash_ce qla2xxx sysimgblt sha2_ce sha256_arm64 hisi_sas_v3_hw fb_sys_fops sha1_ce uas nvme_fc mpt3sas ixgbe drm hisi_sas_main nvme_fabrics usb_storage hclge scsi_transport_fc ahci libsas hnae3 raid_class libahci xfrm_algo scsi_transport_sas mdio aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64 [ 191.202952] CPU: 94 PID: 0 Comm: swapper/94 Not tainted 4.19.0-rc1+ #11 [ 191.209553] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.20.01 04/26/2019 [ 191.218064] pstate: 20400089 (nzCv daIf +PAN -UAO) [ 191.222873] pc : ixgbe_ipsec_vf_clear+0x60/0xd0 [ixgbe] [ 191.228093] lr : ixgbe_msg_task+0x2d0/0x1088 [ixgbe] [ 191.233044] sp : ffff000009b3bcd0 [ 191.236346] x29: ffff000009b3bcd0 x28: 0000000000000000 [ 191.241647] x27: ffff000009628000 x26: 0000000000000000 [ 191.246946] x25: ffff803f652d7600 x24: 0000000000000004 [ 191.252246] x23: ffff803f6a718900 x22: 0000000000000000 [ 191.257546] x21: 0000000000000000 x20: 0000000000000000 [ 191.262845] x19: 0000000000000000 x18: 0000000000000000 [ 191.268144] x17: 0000000000000000 x16: 0000000000000000 [ 191.273443] x15: 0000000000000000 x14: 0000000100000026 [ 191.278742] x13: 0000000100000025 x12: ffff8a5f7fbe0df0 [ 191.284042] x11: 000000010000000b x10: 0000000000000040 [ 191.289341] x9 : 0000000000001100 x8 : ffff803f6a824fd8 [ 191.294640] x7 : ffff803f6a825098 x6 : 0000000000000001 [ 191.299939] x5 : ffff000000f0ffc0 x4 : 0000000000000000 [ 191.305238] x3 : ffff000028c00000 x2 : ffff803f652d7600 [ 191.310538] x1 : 0000000000000000 x0 : ffff000000f205f0 [ 191.315838] Process swapper/94 (pid: 0, stack limit = 0x00000000addfed5a) [ 191.322613] Call trace: [ 191.325055] ixgbe_ipsec_vf_clear+0x60/0xd0 [ixgbe] [ 191.329927] ixgbe_msg_task+0x2d0/0x1088 [ixgbe] [ 191.334536] ixgbe_msix_other+0x274/0x330 [ixgbe] [ 191.339233] __handle_irq_event_percpu+0x78/0x270 [ 191.343924] handle_irq_event_percpu+0x40/0x98 [ 191.348355] handle_irq_event+0x50/0xa8 [ 191.352180] handle_fasteoi_irq+0xbc/0x148 [ 191.356263] generic_handle_irq+0x34/0x50 [ 191.360259] __handle_domain_irq+0x68/0xc0 [ 191.364343] gic_handle_irq+0x84/0x180 [ 191.368079] el1_irq+0xe8/0x180 [ 191.371208] arch_cpu_idle+0x30/0x1a8 [ 191.374860] do_idle+0x1dc/0x2a0 [ 191.378077] cpu_startup_entry+0x2c/0x30 [ 191.381988] secondary_start_kernel+0x150/0x1e0 [ 191.386506] Code: 6b15003f 54000320 f1404a9f 54000060 (79400260) Fixes: eda0333ac2930 ("ixgbe: add VF IPsec management") Signed-off-by: Dann Frazier <[email protected]> Acked-by: Shannon Nelson <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28e1000e: Increase pause and refresh timeMiguel Bernal Marin1-2/+2
Suggested-by: Tim Pepper <[email protected]> Signed-off-by: Miguel Bernal Marin <[email protected]> Signed-off-by: Paul Menzel <[email protected]> Acked-by: Sasha Neftin <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28ice: Use struct_size() helperGustavo A. R. Silva1-2/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; size = sizeof(struct foo) + count * sizeof(struct boo); instance = alloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: size = struct_size(instance, entry, count); This code was detected with the help of Coccinelle. Signed-off-by: "Gustavo A. R. Silva" <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-28igb: clear out skb->tstamp after reading the txtimeVedang Patel1-0/+1
If a packet which is utilizing the launchtime feature (via SO_TXTIME socket option) also requests the hardware transmit timestamp, the hardware timestamp is not delivered to the userspace. This is because the value in skb->tstamp is mistaken as the software timestamp. Applications, like ptp4l, request a hardware timestamp by setting the SOF_TIMESTAMPING_TX_HARDWARE socket option. Whenever a new timestamp is detected by the driver (this work is done in igb_ptp_tx_work() which calls igb_ptp_tx_hwtstamps() in igb_ptp.c[1]), it will queue the timestamp in the ERR_QUEUE for the userspace to read. When the userspace is ready, it will issue a recvmsg() call to collect this timestamp. The problem is in this recvmsg() call. If the skb->tstamp is not cleared out, it will be interpreted as a software timestamp and the hardware tx timestamp will not be successfully sent to the userspace. Look at skb_is_swtx_tstamp() and the callee function __sock_recv_timestamp() in net/socket.c for more details. Signed-off-by: Vedang Patel <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-06-27xsk: Return the whole xdp_desc from xsk_umem_consume_txMaxim Mikityanskiy2-11/+16
Some drivers want to access the data transmitted in order to implement acceleration features of the NICs. It is also useful in AF_XDP TX flow. Change the xsk_umem_consume_tx API to return the whole xdp_desc, that contains the data pointer, length and DMA address, instead of only the latter two. Adapt the implementation of i40e and ixgbe to this change. Signed-off-by: Maxim Mikityanskiy <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Acked-by: Saeed Mahameed <[email protected]> Cc: Björn Töpel <[email protected]> Cc: Magnus Karlsson <[email protected]> Acked-by: Björn Töpel <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-06-26i40e/i40e_virtchnl_pf: Use struct_size() in kzalloc()Gustavo A. R. Silva1-9/+6
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct virtchnl_iwarp_qvlist_info { ... struct virtchnl_iwarp_qv_info qv_info[1]; }; size = sizeof(struct virtchnl_iwarp_qvlist_info) + (sizeof(struct virtchnl_iwarp_qv_info) * count; instance = kzalloc(size, GFP_KERNEL); and struct virtchnl_vf_resource { ... struct virtchnl_vsi_resource vsi_res[1]; }; size = sizeof(struct virtchnl_vf_resource) + sizeof(struct virtchnl_vsi_resource) * count; instance = kzalloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, qv_info, count), GFP_KERNEL); and instance = kzalloc(struct_size(instance, vsi_res, count), GFP_KERNEL); Notice that, in the first case above, variable size is not necessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: "Gustavo A. R. Silva" <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-26i40e: update copyright stringAlice Michael1-1/+1
It was found that the string that prints our copyright was not up to date. Updating to reflect our copyright. Signed-off-by: Alice Michael <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-26i40e: Fix descriptor count manipulationMaciej Fijalkowski4-14/+34
Changing descriptor count via 'ethtool -G' is not persistent across resets. When PF reset occurs, we roll back to the default value of vsi->num_desc, which is used then in i40e_alloc_rings to set descriptor count. XDP does a PF reset so when user has changed the descriptor count and load XDP program, the default count will be back there. To fix this: * introduce new VSI members - num_tx_desc and num_rx_desc in favour of num_desc * set them in i40e_set_ringparam to user's values * set them to default values in i40e_set_num_rings_in_vsi only when they don't have previous values Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-26i40e: missing priorities for any QoS trafficAleksandr Loktionov1-3/+54
This patch fixes reading f/w LLDP agent status at DCB init time. It's done by removing direct NVM reading in i40e_update_dcb_config() and checking whether f/w LLDP agent is disabled via I40E_FLAG_DISABLE_FW_LLDP flag in i40e_init_pf_dcb(). The function i40e_update_dcb_config() in i40e_main.c is a temporary solution which will be later renamed to i40e_init_dcb() in the i40e_dcb module. Also logging was extended to make visible if f/w LLDP agent is running or not and always log a message when DCB was not initialized. Without this patch for new f/w versions f/w LLDP agent status was always read from NVM as disabled and DCB initialization failed without clear reason in logs. Signed-off-by: Aleksandr Loktionov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-26i40e: Add log entry while creating or deleting TC0Piotr Kwapulinski1-0/+4
Generate log entry when TC0 is created or deleted. Log entry is generated during main VSI setup. Before there was no log info about adding or deleting TC0. Signed-off-by: Piotr Kwapulinski <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-26i40e: fix incorrect function documentation commentJacob Keller1-2/+1
Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2019-06-26i40e: Fix for missing "link modes" info in ethtoolMartyna Szapar1-2/+1
Fix for missing "Supported link modes" and "Advertised link modes" info in ethtool after changed speed on X722 devices with BASE-T PHY with FW API version >= 1.7. The same FW API version on X710 and X722 does not mean the same feature set so the change was needed as mac type of the device should also be checked instead of FW API version only. Signed-off-by: Martyna Szapar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>