Age | Commit message (Collapse) | Author | Files | Lines |
|
The ice_init_all_ctrlq and ice_shutdown_all_ctrlq functions create and
destroy the locks used to protect the send and receive process of each
control queue.
This is problematic, as the driver may use these functions to shutdown
and re-initialize the control queues at run time. For example, it may do
this in response to a device reset.
If the driver failed to recover from a reset, it might leave the control
queues offline. In this case, the locks will no longer be initialized.
A later call to ice_sq_send_cmd will then attempt to acquire a lock that
has been destroyed.
It is incorrect behavior to access a lock that has been destroyed.
Indeed, ice_aq_send_cmd already tries to avoid accessing an offline
control queue, but the check occurs inside the lock.
The root of the problem is that the locks are destroyed at run time.
Modify ice_init_all_ctrlq and ice_shutdown_all_ctrlq such that they no
longer create or destroy the locks.
Introduce new functions, ice_create_all_ctrlq and ice_destroy_all_ctrlq.
Call these functions in ice_init_hw and ice_deinit_hw.
Now, the control queue locks will remain valid for the life of the
driver, and will not be destroyed until the driver unloads.
This also allows removing a duplicate check of the sq.count and
rq.count values when shutting down the controlqs. The ice_shutdown_ctrlq
function already checks this value under the lock. Previously
commit dec64ff10ed9 ("ice: use [sr]q.count when checking if queue is
initialized") needed this check to happen outside the lock, because it
prevented duplicate attempts at destroying the locks.
The driver may now safely use ice_init_all_ctrlq and
ice_shutdown_all_ctrlq while handling reset events, without causing the
locks to be invalid.
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Currently we are always setting prefena to 0. This is causing the
hardware to only fetch descriptors when there are none free in the cache
for a received packet instead of prefetching when it has used the last
descriptor regardless of incoming packets. Fix this by allowing the
hardware to prefetch Rx descriptors.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
When interrupt tracking was refactored, during rebuild, the call to
ice_vsi_setup_vector_base() was inadvertently removed from the PF VSI
instead of being removed from the VF VSI. During reset, the failure to
properly setup the vector base generates a call trace. Correct this so
that resets/rebuilds properly complete.
Fixes: cbe66bfee6a0 ("ice: Refactor interrupt tracking")
Signed-off-by: Tony Nguyen <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Currently, ice_stat_update32 and ice_stat_update40 will limit the
value of the software statistic to 32 or 40 bits wide, depending on
which register is being read.
This means that if a driver is running for a long time, the displayed
software register values will roll over to zero at 40 bits or 32 bits.
This occurs because the functions directly assign the difference between
the previous value and current value of the hardware statistic.
Instead, add this value to the current software statistic, and then
update the previous value.
In this way, each time ice_stat_update40 or ice_stat_update32 are
called, they will increment the software tracking value by the
difference of the hardware register from its last read. The software
tracking value will correctly count up until it overflows a u64.
The only requirement is that the ice_stat_update functions be called at
least once each time the hardware register overflows.
While we're fixing ice_stat_update40, modify it to use rd64 instead of
two calls to rd32. Additionally, drop the now unnecessary hireg
function parameter.
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Add support for reporting link partner advertising when
ETHTOOL_GLINKSETTINGS defined. Get pause param reports the Tx/Rx
pause configured, and then ethtool issues ETHTOOL_GSET ioctl and
ice_get_settings_link_up reports the negotiated Tx/Rx pause. Negotiated
pause frame report per IEEE 802.3-2005 table 288-3.
$ ethtool --show-pause ens6f0
Pause parameters for ens6f0:
Autonegotiate: on
RX: on
TX: on
RX negotiated: on
TX negotiated: on
$ ethtool ens6f0
Settings for ens6f0:
Supported ports: [ FIBRE ]
Supported link modes: 25000baseCR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: Yes
Supported FEC modes: None BaseR RS
Advertised link modes: 25000baseCR/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: None BaseR RS
Link partner advertised link modes: Not reported
Link partner advertised pause frame use: Symmetric
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 25000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
When ETHTOOL_GLINKSETTINGS is not defined, get pause param reports the
negotiated Tx/Rx pause.
Signed-off-by: Paul Greenwalt <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.
Signed-off-by: Jonathan Lemon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2019-07-24
This series contains updates to igc and e1000e client drivers only.
Sasha provides a couple of cleanups to remove code that is not needed
and reduce structure sizes. Updated the MAC reset flow to use the
device reset flow instead of a port reset flow. Added addition device
id's that will be supported.
Kai-Heng Feng provides a workaround for a possible stalled packet issue
in our ICH devices due to a clock recovery from the PCH being too slow.
v2: removed the last patch in the series that supposedly fixed a MAC/PHY
de-sync potential issue while waiting for additional information from
hardware engineers.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The linux-next commit "net: Rename skb_frag_t size to bv_len" [1]
introduced a compilation error on powerpc as it forgot to deal with the
renaming from "size" to "bv_len" for ixgbevf.
[1] https://lore.kernel.org/netdev/[email protected]/T/#md052f1c7de965ccd1bdcb6f92e1990a52298eac5
In file included from ./include/linux/cache.h:5,
from ./include/linux/printk.h:9,
from ./include/linux/kernel.h:15,
from ./include/linux/list.h:9,
from ./include/linux/module.h:9,
from
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:12:
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function
'ixgbevf_xmit_frame_ring':
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:51: error:
'skb_frag_t' {aka 'struct bio_vec'} has no member named 'size'
count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
^
./include/uapi/linux/kernel.h:13:40: note: in definition of macro
'__KERNEL_DIV_ROUND_UP'
#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
^
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:12: note: in
expansion of macro 'TXD_USE_COUNT'
count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This works around a possible stalled packet issue, which may occur due to
clock recovery from the PCH being too slow, when the LAN is transitioning
from K1 at 1G link speed.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204057
Signed-off-by: Kai-Heng Feng <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Add support for more SKUs.
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Use Device Reset flow instead of Port Reset flow.
This flow performs a reset of the entire controller device,
resulting in a state nearly approximating the state
following a power-up reset or internal PCIe reset,
except for system PCI configuration.
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
This patch comes to clean up the device specification structure.
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Polarity and cable length fields is not applicable for the i225 device.
This patch comes to clean up PHY information structure.
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In preparation for unifying the skb_frag and bio_vec, use the fine
accessors which already exist and use skb_frag_t instead of
struct skb_frag_struct.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.
Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().
Signed-off-by: Frederick Lawler <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The audience for the Kernel driver-model is clearly Kernel hackers.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Acked-by: Jeff Kirsher <[email protected]> # ice driver changes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse easier
due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"
* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...
|
|
And any other existing fields in this structure that refer to tc.
Specifically:
* tc_cls_flower_offload_flow_rule() to flow_cls_offload_flow_rule().
* TC_CLSFLOWER_* to FLOW_CLS_*.
* tc_cls_common_offload to tc_cls_common_offload.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch updates flow_block_cb_setup_simple() to use the flow block API.
Several drivers are also adjusted to use it.
This patch introduces the per-driver list of flow blocks to account for
blocks that are already in use.
Remove tc_block_offload alias.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Most drivers do the same thing to set up the flow block callbacks, this
patch adds a helper function to do this.
This preparation patch reduces the number of changes to adapt the
existing drivers to use the flow block callback API.
This new helper function takes a flow block list per-driver, which is
set to NULL until this driver list is used.
This patch also introduces the flow_block_command and
flow_block_binder_type enumerations, which are renamed to use
FLOW_BLOCK_* in follow up patches.
There are three definitions (aliases) in order to reduce the number of
updates in this patch, which go away once drivers are fully adapted to
use this flow block API.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-07-03
The following pull-request contains BPF updates for your *net-next* tree.
There is a minor merge conflict in mlx5 due to 8960b38932be ("linux/dim:
Rename externally used net_dim members") which has been pulled into your
tree in the meantime, but resolution seems not that bad ... getting current
bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed
just so he's aware of the resolution below:
** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
<<<<<<< HEAD
static int mlx5e_open_cq(struct mlx5e_channel *c,
struct dim_cq_moder moder,
struct mlx5e_cq_param *param,
struct mlx5e_cq *cq)
=======
int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder,
struct mlx5e_cq_param *param, struct mlx5e_cq *cq)
>>>>>>> e5a3e259ef239f443951d401db10db7d426c9497
Resolution is to take the second chunk and rename net_dim_cq_moder into
dim_cq_moder. Also the signature for mlx5e_open_cq() in ...
drivers/net/ethernet/mellanox/mlx5/core/en.h +977
... and in mlx5e_open_xsk() ...
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64
... needs the same rename from net_dim_cq_moder into dim_cq_moder.
** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
<<<<<<< HEAD
int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix));
struct dim_cq_moder icocq_moder = {0, 0};
struct net_device *netdev = priv->netdev;
struct mlx5e_channel *c;
unsigned int irq;
=======
struct net_dim_cq_moder icocq_moder = {0, 0};
>>>>>>> e5a3e259ef239f443951d401db10db7d426c9497
Take the second chunk and rename net_dim_cq_moder into dim_cq_moder
as well.
Let me know if you run into any issues. Anyway, the main changes are:
1) Long-awaited AF_XDP support for mlx5e driver, from Maxim.
2) Addition of two new per-cgroup BPF hooks for getsockopt and
setsockopt along with a new sockopt program type which allows more
fine-grained pass/reject settings for containers. Also add a sock_ops
callback that can be selectively enabled on a per-socket basis and is
executed for every RTT to help tracking TCP statistics, both features
from Stanislav.
3) Follow-up fix from loops in precision tracking which was not propagating
precision marks and as a result verifier assumed that some branches were
not taken and therefore wrongly removed as dead code, from Alexei.
4) Fix BPF cgroup release synchronization race which could lead to a
double-free if a leaf's cgroup_bpf object is released and a new BPF
program is attached to the one of ancestor cgroups in parallel, from Roman.
5) Support for bulking XDP_TX on veth devices which improves performance
in some cases by around 9%, from Toshiaki.
6) Allow for lookups into BPF devmap and improve feedback when calling into
bpf_redirect_map() as lookup is now performed right away in the helper
itself, from Toke.
7) Add support for fq's Earliest Departure Time to the Host Bandwidth
Manager (HBM) sample BPF program, from Lawrence.
8) Various cleanups and minor fixes all over the place from many others.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2019-06-28
This series contains a smorgasbord of updates to many of the Intel
drivers.
Gustavo A. R. Silva updates the ice and iavf drivers to use the
strcut_size() helper where possible.
Miguel increases the pause and refresh time for flow control in the
e1000e driver during reset for certain devices.
Dann Frazier fixes a potential NULL pointer dereference in ixgbe driver
when using non-IPSec enabled devices.
Colin Ian King fixes a potential overflow during a shift in the ixgbe
driver. Also fixes a potential NULL pointer dereference in the iavf
driver by adding a check.
Venkatesh Srinivas converts the e1000 driver to use dma_wmb() instead of
wmb() for doorbell writes to avoid SFENCEs in the transmit and receive
paths.
Arjan updates the e1000e driver to improve boot time by over 100 msec by
reducing the usleep ranges suring system startup.
Artem updates the igb driver register dump in ethtool, first prepares
the register dump for future additions of registers in the dump, then
secondly, adds the RR2DCDELAY register to the dump. When dealing with
time-sensitive networks, this register is helpful in determining your
latency from the device to the ring.
Alex fixes the ixgbevf driver to use the current cached link state,
rather than trying to re-check the value from the PF.
Harshitha adds support for MACVLAN offloads in i40e by using channels as
MACVLAN interfaces.
Detlev Casanova updates the e1000e driver to use delayed work instead of
timers to run the watchdog.
Vitaly fixes an issue in e1000e, where when disconnecting and
reconnecting the physical cable connection, the NIC enters a DMoff
state. This state causes a mismatch in link and duplexing, so check the
PCIm function state and perform a PHY reset when in this state to
resolve the issue.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Due to commit: 5d8682588605 ("[misc] mei: me: allow runtime
pm for platform with D0i3")
When disconnecting the cable and reconnecting it the NIC
enters DMoff state. This caused wrong link indication
and duplex mismatch. This bug is described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1689436
Checking PCIm function state and performing PHY reset after a
timeout in watchdog task solves this issue.
Signed-off-by: Vitaly Lifshits <[email protected]>
Acked-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Use delayed work instead of timers to run the watchdog of the e1000e
driver.
Simplify the code with one less middle function.
Signed-off-by: Detlev Casanova <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
This patch enables macvlan offloads for i40e. The idea is to use
channels as macvlan interfaces. The channels are VSIs of
type VMDQ. When the first macvlan is created, the maximum number of
channels possible are created. From then on, as a macvlan interface
is created, a macvlan filter is added to these already created
channels (VSIs).
This patch utilizes subordinate device traffic classes to make queue
groups(channels) available for an upper device like a macvlan.
Steps to configure macvlan offloads:
1. ethtool -K ethx l2-fwd-offload on
2. ip link add link ethx name macvlan1 type macvlan
3. ip addr add <address> dev macvlan1
4. ip link set macvlan1 up
Signed-off-by: Harshitha Ramamurthy <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Change the ethtool link settings call to just read the cached state out of
the adapter structure instead of trying to recheck the value from the PF.
Doing this should prevent excessive reading of the mailbox.
Signed-off-by: Alexander Duyck <[email protected]>
Reviewed-by: "Guilherme G. Piccoli" <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
A recent commit efa14c3985828d ("iavf: allow null RX descriptors") added
a null pointer sanity check on rx_buffer, however, rx_buffer is being
dereferenced before that check, which implies a null pointer dereference
bug can potentially occur. Fix this by only dereferencing rx_buffer
until after the null pointer check.
Addresses-Coverity: ("Dereference before null check")
Signed-off-by: Colin Ian King <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
This patch adds the RR2DCDELAY register to the ethtool registers dump.
RR2DCDELAY exists on I210 and I211 Intel Gigabit Ethernet chips and it stands
for "Read Request To Data Completion Delay". Here is how this register is
described in the I210 datasheet:
"This field captures the maximum PCIe split time in 16 ns units, which is the
maximum delay between the read request to the first data completion. This is
giving an estimation of the PCIe round trip time."
In other words, whenever I210 reads from the host memory (e.g., fetches a
descriptor from the ring), the chip measures every PCI DMA read transaction and
captures the maximum value. So it ends up containing the longest DMA
transaction time.
This register is very useful for troubleshooting and research purposes. If you
are dealing with time-sensitive networks, this register can help you get
an idea of your "I210-to-ring" latency. This helps answering questions like
"should I have PCIe ASPM enabled?" or "should I enable deep C-states?" on
my system.
It is safe to read this register at any point, reading it has no effect on
the I210 chip functionality.
Signed-off-by: Artem Bityutskiy <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
This patch has no functional impact and it is just a preparation
for the following patch. It removes an early return from the
'igb_get_regs()' function by moving the 82576-only registers
dump into an "if" block. With this preparation, we can dump more
non-82576 registers at the end of this function.
Signed-off-by: Artem Bityutskiy <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
This aligns the iavf_debug() macro with the other Intel drivers.
Add the bus number, bus_id field to i40e_bus_info so output shows
each physical port(i.e func) in following format:
[[[[<domain>]:]<bus>]:][<slot>][.[<func>]]
domains are numbered from 0 to ffff), bus (0-ff), slot (0-1f) and
function (0-7).
Signed-off-by: Jeff Kirsher <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
|
|
The e1000e driver is a great user of the usleep_range() API,
and has nice ranges that in principle help power management.
However the ranges that are used only during system startup are
very long (and can add easily 100 msec to the boot time) while
the power savings of such long ranges is irrelevant due to the
one-off, boot only, nature of these functions.
This patch shrinks some of the longest ranges to be shorter
(while still using a power friendly 1 msec range); this saves
100msec+ of boot time on my BDW NUCs
Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Paul Menzel <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.
So, replace code of the following form:
sizeof(struct virtchnl_ether_addr_list) + (count * sizeof(struct virtchnl_ether_addr))
with:
struct_size(veal, list, count)
and so on...
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
e1000 writes to doorbells to post transmit descriptors and fill the
receive ring. After writing descriptors to memory but before
writing to doorbells, use dma_wmb() rather than wmb(). wmb() is more
heavyweight than necessary for a device to see descriptor writes.
On x86, this avoids SFENCEs before doorbell writes in both the
Tx and Rx paths. On ARM, this converts DSB ST -> DMB OSHST.
Tested: 82576EB / x86; QEMU (qemu emulates an 8257x)
Signed-off-by: Venkatesh Srinivas <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
The u32 variable rem is being shifted using u32 arithmetic however
it is being passed to div_u64 that expects the expression to be a u64.
The 32 bit shift may potentially overflow, so cast rem to a u64 before
shifting to avoid this. Also remove comment about overflow.
Addresses-Coverity: ("Unintentional integer overflow")
Fixes: cd4583206990 ("ixgbe: implement support for SDP/PPS output on X550 hardware")
Fixes: 68d9676fc04e ("ixgbe: fix PTP SDP pin setup on X540 hardware")
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Jacob Keller <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
An ipsec structure will not be allocated if the hardware does not support
offload. Fixes the following Oops:
[ 191.045452] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 191.054232] Mem abort info:
[ 191.057014] ESR = 0x96000004
[ 191.060057] Exception class = DABT (current EL), IL = 32 bits
[ 191.065963] SET = 0, FnV = 0
[ 191.069004] EA = 0, S1PTW = 0
[ 191.072132] Data abort info:
[ 191.074999] ISV = 0, ISS = 0x00000004
[ 191.078822] CM = 0, WnR = 0
[ 191.081780] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000043d9e467
[ 191.088382] [0000000000000000] pgd=0000000000000000
[ 191.093252] Internal error: Oops: 96000004 [#1] SMP
[ 191.098119] Modules linked in: vhost_net vhost tap vfio_pci vfio_virqfd vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter devlink ebtables ip6table_filter ip6_tables iptable_filter bpfilter ipmi_ssif nls_iso8859_1 input_leds joydev ipmi_si hns_roce_hw_v2 ipmi_devintf hns_roce ipmi_msghandler cppc_cpufreq sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 ses enclosure btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor hid_generic usbhid hid raid6_pq libcrc32c raid1 raid0 multipath linear ixgbevf hibmc_drm ttm
[ 191.168607] drm_kms_helper aes_ce_blk aes_ce_cipher syscopyarea crct10dif_ce sysfillrect ghash_ce qla2xxx sysimgblt sha2_ce sha256_arm64 hisi_sas_v3_hw fb_sys_fops sha1_ce uas nvme_fc mpt3sas ixgbe drm hisi_sas_main nvme_fabrics usb_storage hclge scsi_transport_fc ahci libsas hnae3 raid_class libahci xfrm_algo scsi_transport_sas mdio aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[ 191.202952] CPU: 94 PID: 0 Comm: swapper/94 Not tainted 4.19.0-rc1+ #11
[ 191.209553] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.20.01 04/26/2019
[ 191.218064] pstate: 20400089 (nzCv daIf +PAN -UAO)
[ 191.222873] pc : ixgbe_ipsec_vf_clear+0x60/0xd0 [ixgbe]
[ 191.228093] lr : ixgbe_msg_task+0x2d0/0x1088 [ixgbe]
[ 191.233044] sp : ffff000009b3bcd0
[ 191.236346] x29: ffff000009b3bcd0 x28: 0000000000000000
[ 191.241647] x27: ffff000009628000 x26: 0000000000000000
[ 191.246946] x25: ffff803f652d7600 x24: 0000000000000004
[ 191.252246] x23: ffff803f6a718900 x22: 0000000000000000
[ 191.257546] x21: 0000000000000000 x20: 0000000000000000
[ 191.262845] x19: 0000000000000000 x18: 0000000000000000
[ 191.268144] x17: 0000000000000000 x16: 0000000000000000
[ 191.273443] x15: 0000000000000000 x14: 0000000100000026
[ 191.278742] x13: 0000000100000025 x12: ffff8a5f7fbe0df0
[ 191.284042] x11: 000000010000000b x10: 0000000000000040
[ 191.289341] x9 : 0000000000001100 x8 : ffff803f6a824fd8
[ 191.294640] x7 : ffff803f6a825098 x6 : 0000000000000001
[ 191.299939] x5 : ffff000000f0ffc0 x4 : 0000000000000000
[ 191.305238] x3 : ffff000028c00000 x2 : ffff803f652d7600
[ 191.310538] x1 : 0000000000000000 x0 : ffff000000f205f0
[ 191.315838] Process swapper/94 (pid: 0, stack limit = 0x00000000addfed5a)
[ 191.322613] Call trace:
[ 191.325055] ixgbe_ipsec_vf_clear+0x60/0xd0 [ixgbe]
[ 191.329927] ixgbe_msg_task+0x2d0/0x1088 [ixgbe]
[ 191.334536] ixgbe_msix_other+0x274/0x330 [ixgbe]
[ 191.339233] __handle_irq_event_percpu+0x78/0x270
[ 191.343924] handle_irq_event_percpu+0x40/0x98
[ 191.348355] handle_irq_event+0x50/0xa8
[ 191.352180] handle_fasteoi_irq+0xbc/0x148
[ 191.356263] generic_handle_irq+0x34/0x50
[ 191.360259] __handle_domain_irq+0x68/0xc0
[ 191.364343] gic_handle_irq+0x84/0x180
[ 191.368079] el1_irq+0xe8/0x180
[ 191.371208] arch_cpu_idle+0x30/0x1a8
[ 191.374860] do_idle+0x1dc/0x2a0
[ 191.378077] cpu_startup_entry+0x2c/0x30
[ 191.381988] secondary_start_kernel+0x150/0x1e0
[ 191.386506] Code: 6b15003f 54000320 f1404a9f 54000060 (79400260)
Fixes: eda0333ac2930 ("ixgbe: add VF IPsec management")
Signed-off-by: Dann Frazier <[email protected]>
Acked-by: Shannon Nelson <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Suggested-by: Tim Pepper <[email protected]>
Signed-off-by: Miguel Bernal Marin <[email protected]>
Signed-off-by: Paul Menzel <[email protected]>
Acked-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
size = sizeof(struct foo) + count * sizeof(struct boo);
instance = alloc(size, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
size = struct_size(instance, entry, count);
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
If a packet which is utilizing the launchtime feature (via SO_TXTIME socket
option) also requests the hardware transmit timestamp, the hardware
timestamp is not delivered to the userspace. This is because the value in
skb->tstamp is mistaken as the software timestamp.
Applications, like ptp4l, request a hardware timestamp by setting the
SOF_TIMESTAMPING_TX_HARDWARE socket option. Whenever a new timestamp is
detected by the driver (this work is done in igb_ptp_tx_work() which calls
igb_ptp_tx_hwtstamps() in igb_ptp.c[1]), it will queue the timestamp in the
ERR_QUEUE for the userspace to read. When the userspace is ready, it will
issue a recvmsg() call to collect this timestamp. The problem is in this
recvmsg() call. If the skb->tstamp is not cleared out, it will be
interpreted as a software timestamp and the hardware tx timestamp will not
be successfully sent to the userspace. Look at skb_is_swtx_tstamp() and the
callee function __sock_recv_timestamp() in net/socket.c for more details.
Signed-off-by: Vedang Patel <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Some drivers want to access the data transmitted in order to implement
acceleration features of the NICs. It is also useful in AF_XDP TX flow.
Change the xsk_umem_consume_tx API to return the whole xdp_desc, that
contains the data pointer, length and DMA address, instead of only the
latter two. Adapt the implementation of i40e and ixgbe to this change.
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Acked-by: Saeed Mahameed <[email protected]>
Cc: Björn Töpel <[email protected]>
Cc: Magnus Karlsson <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct virtchnl_iwarp_qvlist_info {
...
struct virtchnl_iwarp_qv_info qv_info[1];
};
size = sizeof(struct virtchnl_iwarp_qvlist_info) + (sizeof(struct virtchnl_iwarp_qv_info) * count;
instance = kzalloc(size, GFP_KERNEL);
and
struct virtchnl_vf_resource {
...
struct virtchnl_vsi_resource vsi_res[1];
};
size = sizeof(struct virtchnl_vf_resource) + sizeof(struct virtchnl_vsi_resource) * count;
instance = kzalloc(size, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, qv_info, count), GFP_KERNEL);
and
instance = kzalloc(struct_size(instance, vsi_res, count), GFP_KERNEL);
Notice that, in the first case above, variable size is not necessary, hence it
is removed.
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
It was found that the string that prints our copyright was
not up to date. Updating to reflect our copyright.
Signed-off-by: Alice Michael <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Changing descriptor count via 'ethtool -G' is not persistent across resets.
When PF reset occurs, we roll back to the default value of vsi->num_desc,
which is used then in i40e_alloc_rings to set descriptor count. XDP does a
PF reset so when user has changed the descriptor count and load XDP
program, the default count will be back there.
To fix this:
* introduce new VSI members - num_tx_desc and num_rx_desc in favour of
num_desc
* set them in i40e_set_ringparam to user's values
* set them to default values in i40e_set_num_rings_in_vsi only when they
don't have previous values
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
This patch fixes reading f/w LLDP agent status at DCB init time.
It's done by removing direct NVM reading in i40e_update_dcb_config()
and checking whether f/w LLDP agent is disabled via
I40E_FLAG_DISABLE_FW_LLDP flag in i40e_init_pf_dcb(). The function
i40e_update_dcb_config() in i40e_main.c is a temporary solution which
will be later renamed to i40e_init_dcb() in the i40e_dcb module. Also
logging was extended to make visible if f/w LLDP agent is running or not
and always log a message when DCB was not initialized. Without this
patch for new f/w versions f/w LLDP agent status was always read
from NVM as disabled and DCB initialization failed without
clear reason in logs.
Signed-off-by: Aleksandr Loktionov <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Generate log entry when TC0 is created or deleted.
Log entry is generated during main VSI setup.
Before there was no log info about adding or deleting TC0.
Signed-off-by: Piotr Kwapulinski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Fix for missing "Supported link modes" and "Advertised link modes"
info in ethtool after changed speed on X722 devices with BASE-T PHY
with FW API version >= 1.7.
The same FW API version on X710 and X722 does not mean the same
feature set so the change was needed as mac type of the device
should also be checked instead of FW API version only.
Signed-off-by: Martyna Szapar <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|