Age | Commit message (Collapse) | Author | Files | Lines |
|
There exist races when port is being configured and remove is
triggered.
unregister_netdev is not and can't be called under crit_lock
mutex since it is calling ndo_stop -> iavf_close which requires
this lock. Depending on init state the netdev could be still
unregistered so unregister_netdev never cleans up, when shortly
after that the device could become registered.
Make iavf_remove wait until port finishes initialization.
All critical state changes are atomic (under crit_lock).
Crashes that come from iavf_reset_interrupt_capability and
iavf_free_traffic_irqs should now be solved in a graceful
manner.
Fixes: 605ca7c5c6707 ("iavf: Fix kernel BUG in free_msi_irqs")
Signed-off-by: Slawomir Laba <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver used to crash in multiple spots when put to stress testing
of the init, reset and remove paths.
The user would experience call traces or hangs when creating,
resetting, removing VFs. Depending on the machines, the call traces
are happening in random spots, like reset restoring resources racing
with driver remove.
Make adapter->crit_lock mutex a mandatory lock for guarding the
operations performed on all workqueues and functions dealing with
resource allocation and disposal.
Make __IAVF_REMOVE a final state of the driver respected by
workqueues that shall not requeue, when they fail to obtain the
crit_lock.
Make the IRQ handler not to queue the new work for adminq_task
when the __IAVF_REMOVE state is set.
Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections")
Signed-off-by: Slawomir Laba <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
tools/testing/selftests/net/mptcp/mptcp_join.sh
34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers")
857898eb4b28 ("selftests: mptcp: add missing join check")
6ef84b1517e0 ("selftests: mptcp: more robust signal race test")
https://lore.kernel.org/all/[email protected]/
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c
fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions")
c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information")
09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr")
84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr")
efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow")
3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Revert of a patch that instead of fixing a AQ error when trying
to reset BW limit introduced several regressions related to
creation and managing TC. Currently there are errors when creating
a TC on both PF and VF.
Error log:
[17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391
[17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14
[17428.783259] i40e 0000:3b:00.1: Unable to configure TC map 0 for VSI 391
This reverts commit 3d2504663c41104b4359a15f35670cfa82de1bbf.
Fixes: 3d2504663c41 (i40e: Fix reset bw limit when DCB enabled with 1 TC)
Signed-off-by: Mateusz Palczewski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Remove non-inclusive language from the driver.
Additionally correct the duplication "from from"
reported by checkpatch after the changes above.
Signed-off-by: Piotr Skajewski <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Eliminate the follow smatch warning:
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:2756
ixgbevf_alloc_q_vector() warn: inconsistent indenting
Reported-by: Abaci Robot <[email protected]>
Signed-off-by: Yang Li <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The 'if (ntu == rx_ring->count)' block in i40e_alloc_rx_buffers_zc()
was previously residing in the loop, but after introducing the
batched interface it is used only to wrap-around the NTU descriptor,
thus no more need to assign 'xdp'.
'cleaned_count' in i40e_clean_rx_irq_zc() was previously being
incremented in the loop, but after commit f12738b6ec06
("i40e: remove unnecessary cleaned_count updates") it gets
assigned only once after it, so the initialization can be dropped.
Fixes: 6aab0bb0c5cd ("i40e: Use the xsk batched rx allocation interface")
Fixes: f12738b6ec06 ("i40e: remove unnecessary cleaned_count updates")
Signed-off-by: Alexander Lobakin <[email protected]>
Acked-by: Maciej Fijalkowski <[email protected]>
Tested-by: George Kuruvinakunnel <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Clang static analysis reports this issues
ice_common.c:5008:21: warning: The left expression of the compound
assignment is an uninitialized value. The computed value will
also be garbage
ldo->phy_type_low |= ((u64)buf << (i * 16));
~~~~~~~~~~~~~~~~~ ^
When called from ice_cfg_phy_fec() ldo is the uninitialized local
variable tlv. So initialize.
Fixes: ea78ce4dab05 ("ice: add link lenient and default override support")
Signed-off-by: Tom Rix <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Clang static analysis reports this issue
time64.h:69:50: warning: The left operand of '+'
is a garbage value
set_normalized_timespec64(&ts_delta, lhs.tv_sec + rhs.tv_sec,
~~~~~~~~~~ ^
In ice_ptp_adjtime_nonatomic(), the timespec64 variable 'now'
is set by ice_ptp_gettimex64(). This function can fail
with -EBUSY, so 'now' can have a gargbage value.
So check the return.
Fixes: 06c16d89d2cb ("ice: register 1588 PTP clock device object for E810 devices")
Signed-off-by: Tom Rix <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Commit c503e63200c6 ("ice: Stop processing VF messages during teardown")
introduced a driver state flag, ICE_VF_DEINIT_IN_PROGRESS, which is
intended to prevent some issues with concurrently handling messages from
VFs while tearing down the VFs.
This change was motivated by crashes caused while tearing down and
bringing up VFs in rapid succession.
It turns out that the fix actually introduces issues with the VF driver
caused because the PF no longer responds to any messages sent by the VF
during its .remove routine. This results in the VF potentially removing
its DMA memory before the PF has shut down the device queues.
Additionally, the fix doesn't actually resolve concurrency issues within
the ice driver. It is possible for a VF to initiate a reset just prior
to the ice driver removing VFs. This can result in the remove task
concurrently operating while the VF is being reset. This results in
similar memory corruption and panics purportedly fixed by that commit.
Fix this concurrency at its root by protecting both the reset and
removal flows using the existing VF cfg_lock. This ensures that we
cannot remove the VF while any outstanding critical tasks such as a
virtchnl message or a reset are occurring.
This locking change also fixes the root cause originally fixed by commit
c503e63200c6 ("ice: Stop processing VF messages during teardown"), so we
can simply revert it.
Note that I kept these two changes together because simply reverting the
original commit alone would leave the driver vulnerable to worse race
conditions.
Fixes: c503e63200c6 ("ice: Stop processing VF messages during teardown")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Accidentally filter flag for none encapsulated l4 port field is always
set. Even if user wants to add encapsulated l4 port field.
Remove this unnecessary flag setting.
Fixes: 9e300987d4a81 ("ice: VXLAN and Geneve TC support")
Signed-off-by: Michal Swiatkowski <[email protected]>
Tested-by: Sandeep Penigalapati <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In switchdev mode, slow-path rules need to match all protocols, in order
to correctly redirect unfiltered or missed packets to the uplink. To set
this up for the virtual function to uplink flow, the rule that redirects
packets to the control VSI must have the tunnel type set to
ICE_SW_TUN_AND_NON_TUN. As a result of that new tunnel type being set,
ice_get_compat_fv_bitmap will select ICE_PROF_ALL. At that point all
profiles would be selected for this rule, resulting in the desired
behavior. Without this change slow-path would not work with
tunnel protocols.
Fixes: 8b032a55c1bd ("ice: low level support for tunnels")
Signed-off-by: Wojciech Drewek <[email protected]>
Tested-by: Sandeep Penigalapati <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The status of support for RDMA is currently being tracked with two
separate status flags. This is unnecessary with the current state of
the driver.
Simplify status tracking down to a single flag.
Rename the helper function to denote the RDMA specific status and
universally use the helper function to test the status bit.
Signed-off-by: Dave Ertman <[email protected]>
Tested-by: Leszek Kaliszczuk <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The COMMS package can enable the hardware parser to recognize IPSEC
frames with ESP header and SPI identifier. If this package is available
and configured for loading in /lib/firmware, then the driver will
succeed in enabling this protocol type for RSS.
This in turn allows the hardware to hash over the SPI and use it to pick
a consistent receive queue for the same secure flow. Without this all
traffic is steered to the same queue for multiple traffic threads from
the same IP address. For that reason this is marked as a fix, as the
driver supports the model, but it wasn't enabled.
If the package is not available, adding this type will fail, but the
failure is ignored on purpose as it has no negative affect.
Fixes: c90ed40cefe1 ("ice: Enable writing hardware filtering tables")
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If a call to re-create the auxiliary device happens in a context that has
already taken the RTNL lock, then the call flow that recreates auxiliary
device can hang if there is another attempt to claim the RTNL lock by the
auxiliary driver.
To avoid this, any call to re-create auxiliary devices that comes from
an source that is holding the RTNL lock (e.g. netdev notifier when
interface exits a bond) should execute in a separate thread. To
accomplish this, add a flag to the PF that will be evaluated in the
service task and dealt with there.
Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
Signed-off-by: Dave Ertman <[email protected]>
Reviewed-by: Jonathan Toppins <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Currently, the same handler is called for both a NETDEV_BONDING_INFO
LAG unlink notification as for a NETDEV_UNREGISTER call. This is
causing a problem though, since the netdev_notifier_info passed has
a different structure depending on which event is passed. The problem
manifests as a call trace from a BUG: KASAN stack-out-of-bounds error.
Fix this by creating a handler specific to NETDEV_UNREGISTER that only
is passed valid elements in the netdev_notifier_info struct for the
NETDEV_UNREGISTER event.
Also included is the removal of an unbalanced dev_put on the peer_netdev
and related braces.
Fixes: 6a8b357278f5 ("ice: Respond to a NETDEV_UNREGISTER event for LAG")
Signed-off-by: Dave Ertman <[email protected]>
Acked-by: Jonathan Toppins <[email protected]>
Tested-by: Sunitha Mekala <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver was avoiding offload for IPIP (at least) frames due to
parsing the inner header offsets incorrectly when trying to check
lengths.
This length check works for VXLAN frames but fails on IPIP frames
because skb_transport_offset points to the inner header in IPIP
frames, which meant the subtraction of transport_header from
inner_network_header returns a negative value (-20).
With the code before this patch, everything continued to work, but GSO
was being used to segment, causing throughputs of 1.5Gb/s per thread.
After this patch, throughput is more like 10Gb/s per thread for IPIP
traffic.
Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version")
Signed-off-by: Jesse Brandeburg <[email protected]>
Reviewed-by: Paul Menzel <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Propagate the error code from ice_get_link_default_override() instead
of returning success.
Fixes: ea78ce4dab05 ("ice: add link lenient and default override support")
Signed-off-by: Dan Carpenter <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2022-02-09
This series contains updates to ice driver only.
Brett adds support for QinQ. This begins with code refactoring and
re-organization of VLAN configuration functions to allow for
introduction of VSI VLAN ops to enable setting and calling of
respective operations based on device support of single or double
VLANs. Implementations are added for outer VLAN support.
To support QinQ, the device must be set to double VLAN mode (DVM).
In order for this to occur, the DDP package and NVM must also support
DVM. Functions to determine compatibility and properly configure the
device are added as well as setting the proper bits to advertise and
utilize the proper offloads. Support for VIRTCHNL_VF_OFFLOAD_VLAN_V2
is also included to allow for VF to negotiate and utilize this
functionality.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-02-09
We've added 126 non-merge commits during the last 16 day(s) which contain
a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).
The main changes are:
1) Add custom BPF allocator for JITs that pack multiple programs into a huge
page to reduce iTLB pressure, from Song Liu.
2) Add __user tagging support in vmlinux BTF and utilize it from BPF
verifier when generating loads, from Yonghong Song.
3) Add per-socket fast path check guarding from cgroup/BPF overhead when
used by only some sockets, from Pavel Begunkov.
4) Continued libbpf deprecation work of APIs/features and removal of their
usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
and various others.
5) Improve BPF instruction set documentation by adding byte swap
instructions and cleaning up load/store section, from Christoph Hellwig.
6) Switch BPF preload infra to light skeleton and remove libbpf dependency
from it, from Alexei Starovoitov.
7) Fix architecture-agnostic macros in libbpf for accessing syscall
arguments from BPF progs for non-x86 architectures,
from Ilya Leoshkevich.
8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
of 16-bit field with anonymous zero padding, from Jakub Sitnicki.
9) Add new bpf_copy_from_user_task() helper to read memory from a different
task than current. Add ability to create sleepable BPF iterator progs,
from Kenny Yu.
10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.
11) Generate temporary netns names for BPF selftests to avoid naming
collisions, from Hangbin Liu.
12) Implement bpf_core_types_are_compat() with limited recursion for
in-kernel usage, from Matteo Croce.
13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.
14) Misc minor fixes to libbpf and selftests from various folks.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
libbpf: Fix compilation warning due to mismatched printf format
selftests/bpf: Test BPF_KPROBE_SYSCALL macro
libbpf: Add BPF_KPROBE_SYSCALL macro
libbpf: Fix accessing the first syscall argument on s390
libbpf: Fix accessing the first syscall argument on arm64
libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
libbpf: Fix accessing syscall arguments on riscv
libbpf: Fix riscv register names
libbpf: Fix accessing syscall arguments on powerpc
selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
libbpf: Add PT_REGS_SYSCALL_REGS macro
selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
bpf: Fix leftover header->pages in sparc and powerpc code.
libbpf: Fix signedness bug in btf_dump_array_data()
selftests/bpf: Do not export subtest as standalone test
bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
VFs by default are able to see all tagged traffic regardless of trust
and VLAN filters. Based on legacy devices (i.e. ixgbe, i40e), customers
expect VFs to receive all VLAN tagged traffic with a matching
destination MAC.
Add an ethtool private flag 'vf-vlan-pruning' and set the default to
off so VFs will receive all VLAN traffic directed towards them. When
the flag is turned on, VF will only be able to receive untagged
traffic or traffic with VLAN tags it has created interfaces for.
Also, the flag cannot be changed while any VFs are allocated. This was
done to simplify the implementation. So, if this flag is needed, then
the PF admin must enable it. If the user tries to enable the flag while
VFs are active, then print an unsupported message with the
vf-vlan-pruning flag included. In case multiple flags were specified, this
makes it clear to the user which flag failed.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Currently there is only support for 802.1Q port VLANs on SR-IOV VFs. Add
support to also allow 802.1ad port VLANs when double VLAN mode is
enabled.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In order for the driver to support 802.1ad VLAN filtering and offloads,
it needs to advertise those VLAN features and also support modifying
those VLAN features, so make the necessary changes to
ice_set_netdev_features(). By default, enable CTAG insertion/stripping
and CTAG filtering for both Single and Double VLAN Modes (SVM/DVM).
Also, in DVM, enable STAG filtering by default. This is done by
setting the feature bits in netdev->features. Also, in DVM, support
toggling of STAG insertion/stripping, but don't enable them by
default. This is done by setting the feature bits in
netdev->hw_features.
Since 802.1ad VLAN filtering and offloads are only supported in DVM, make
sure they are not enabled by default and that they cannot be enabled
during runtime, when the device is in SVM.
Add an implementation for the ndo_fix_features() callback. This is
needed since the hardware cannot support multiple VLAN ethertypes for
VLAN insertion/stripping simultaneously and all supported VLAN filtering
must either be enabled or disabled together.
Disable inner VLAN stripping by default when DVM is enabled. If a VSI
supports stripping the inner VLAN in DVM, then it will have to configure
that during runtime. For example if a VF is configured in a port VLAN
while DVM is enabled it will be allowed to offload inner VLANs.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In order to support configuring the device in Double VLAN Mode (DVM),
the DDP and FW have to support DVM. If both support DVM, the PF that
downloads the package needs to update the default recipes, set the
VLAN mode, and update boost TCAM entries.
To support updating the default recipes in DVM, add support for
updating an existing switch recipe's lkup_idx and mask. This is done
by first calling the get recipe AQ (0x0292) with the desired recipe
ID. Then, if that is successful update one of the lookup indices
(lkup_idx) and its associated mask if the mask is valid otherwise
the already existing mask will be used.
The VLAN mode of the device has to be configured while the global
configuration lock is held while downloading the DDP, specifically after
the DDP has been downloaded. If supported, the device will default to
DVM.
Co-developed-by: Dan Nowlin <[email protected]>
Signed-off-by: Dan Nowlin <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add support for the VF driver to be able to request
VIRTCHNL_VF_OFFLOAD_VLAN_V2, negotiate its VLAN capabilities via
VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS, add/delete VLAN filters, and
enable/disable VLAN offloads.
VFs supporting VIRTCHNL_OFFLOAD_VLAN_V2 will be able to use the
following virtchnl opcodes:
VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS
VIRTCHNL_OP_ADD_VLAN_V2
VIRTCHNL_OP_DEL_VLAN_V2
VIRTCHNL_OP_ENABLE_VLAN_STRIPPING_V2
VIRTCHNL_OP_DISABLE_VLAN_STRIPPING_V2
VIRTCHNL_OP_ENABLE_VLAN_INSERTION_V2
VIRTCHNL_OP_DISABLE_VLAN_INSERTION_V2
Legacy VF drivers may expect the initial VLAN stripping settings to be
configured by the PF, so the PF initializes VLAN stripping based on the
VIRTCHNL_OP_GET_VF_RESOURCES opcode. However, with VLAN support via
VIRTCHNL_VF_OFFLOAD_VLAN_V2, this function is only expected to be used
for VFs that only support VIRTCHNL_VF_OFFLOAD_VLAN, which will only
be supported when a port VLAN is configured. Update the function
based on the new expectations. Also, change the message when the PF
can't enable/disable VLAN stripping to a dev_dbg() as this isn't fatal.
When a VF isn't in a port VLAN and it only supports
VIRTCHNL_VF_OFFLOAD_VLAN when Double VLAN Mode (DVM) is enabled, then
the PF needs to reject the VIRTCHNL_VF_OFFLOAD_VLAN capability and
configure the VF in software only VLAN mode. To do this add the new
function ice_vf_vsi_cfg_legacy_vlan_mode(), which updates the VF's
inner and outer ice_vsi_vlan_ops functions and sets up software only
VLAN mode.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Currently the driver only supports 802.1Q VLAN insertion and stripping.
However, once Double VLAN Mode (DVM) is fully supported, then both 802.1Q
and 802.1ad VLAN insertion and stripping will be supported. Unfortunately
the VSI context parameters only allow for one VLAN ethertype at a time
for VLAN offloads so only one or the other VLAN ethertype offload can be
supported at once.
To support this, multiple changes are needed.
Rx path changes:
[1] In DVM, the Rx queue context l2tagsel field needs to be cleared so
the outermost tag shows up in the l2tag2_2nd field of the Rx flex
descriptor. In Single VLAN Mode (SVM), the l2tagsel field should remain
1 to support SVM configurations.
[2] Modify the ice_test_staterr() function to take a __le16 instead of
the ice_32b_rx_flex_desc union pointer so this function can be used for
both rx_desc->wb.status_error0 and rx_desc->wb.status_error1.
[3] Add the new inline function ice_get_vlan_tag_from_rx_desc() that
checks if there is a VLAN tag in l2tag1 or l2tag2_2nd.
[4] In ice_receive_skb(), add a check to see if NETIF_F_HW_VLAN_STAG_RX
is enabled in netdev->features. If it is, then this is the VLAN
ethertype that needs to be added to the stripping VLAN tag. Since
ice_fix_features() prevents CTAG_RX and STAG_RX from being enabled
simultaneously, the VLAN ethertype will only ever be 802.1Q or 802.1ad.
Tx path changes:
[1] In DVM, the VLAN tag needs to be placed in the l2tag2 field of the Tx
context descriptor. The new define ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN was
added to the list of tx_flags to handle this case.
[2] When the stack requests the VLAN tag to be offloaded on Tx, the
driver needs to set either ICE_TX_FLAGS_HW_OUTER_SINGLE_VLAN or
ICE_TX_FLAGS_HW_VLAN, so the tag is inserted in l2tag2 or l2tag1
respectively. To determine which location to use, set a bit in the Tx
ring flags field during ring allocation that can be used to determine
which field to use in the Tx descriptor. In DVM, always use l2tag2,
and in SVM, always use l2tag1.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add a new outer_vlan_ops member to the ice_vsi structure as outer VLAN
ops are only available when the device is in Double VLAN Mode (DVM).
Depending on the VSI type, the requirements for what operations to
use/allow differ.
By default all VSI's have unsupported inner and outer VSI VLAN ops. This
implementation was chosen to prevent unexpected crashes due to null
pointer dereferences. Instead, if a VSI calls an unsupported op, it will
just return -EOPNOTSUPP.
Add implementations to support modifying outer VLAN fields for VSI
context. This includes the ability to modify VLAN stripping, insertion,
and the port VLAN based on the outer VLAN handling fields of the VSI
context.
These functions should only ever be used if DVM is enabled because that
means the firmware supports the outer VLAN fields in the VSI context. If
the device is in DVM, then always use the outer_vlan_ops, else use the
vlan_ops since the device is in Single VLAN Mode (SVM).
Also, move adding the untagged VLAN 0 filter from ice_vsi_setup() to
ice_vsi_vlan_setup() as the latter function is specific to the PF and
all other VSI types that need an untagged VLAN 0 filter already do this
in their specific flows. Without this change, Flow Director is failing
to initialize because it does not implement any VSI VLAN ops.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Current operations act on inner VLAN fields. To support double VLAN, outer
VLAN operations and functions will be implemented. Add the "inner" naming
to existing VLAN operations to distinguish them from the upcoming outer
values and functions. Some spacing adjustments are made to align
values.
Note that the inner is not talking about a tunneled VLAN, but the second
VLAN in the packet. For SVM the driver uses inner or single VLAN
filtering and offloads and in Double VLAN Mode the driver uses the
inner filtering and offloads for SR-IOV VFs in port VLANs in order to
support offloading the guest VLAN while a port VLAN is configured.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Currently the proto argument is unused. This is because the driver only
supports 802.1Q VLAN filtering. This policy is enforced via netdev
features that the driver sets up when configuring the netdev, so the
proto argument won't ever be anything other than 802.1Q. However, this
will allow for future iterations of the driver to seemlessly support
802.1ad filtering. Begin using the proto argument and extend the related
structures to support its use.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The current vf->port_vlan_info variable is a packed u16 that contains
the port VLAN ID and QoS/prio value. This is fine, but changes are
incoming that allow for an 802.1ad port VLAN. Add flexibility by
changing the vf->port_vlan_info member to be an ice_vlan structure.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add a new struct for VLAN related information. Currently this holds
VLAN ID and priority values, but will be expanded to hold TPID value.
This reduces the changes necessary if any other values are added in
future. Remove the action argument from these calls as it's always
ICE_FWD_VSI.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Incoming changes to support 802.1Q and/or 802.1ad VLAN filtering and
offloads require more flexibility when configuring VLANs. The VSI VLAN
interface will allow flexibility for configuring VLANs for all VSI
types. Add new files to separate the VSI VLAN ops and move functions to
make the code more organized.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
There are multiple places where VLAN 0 is being added. Create a function
to be called in order to minimize changes as the implementation is expanded
to support double VLAN and avoid duplicated code.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add functions to configure Tx VLAN antispoof based on iproute
configuration and/or VLAN mode and VF driver support. This is needed
later so the driver can control when it can be configured. Also, add
functions that can be used to enable and disable MAC and VLAN
spoofcheck. Move spoofchk configuration during VSI setup into the
SR-IOV initialization path and into the post VSI rebuild flow for VF
VSIs.
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
-queue
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-02-08
Joe Damato says:
This patch set makes several updates to the i40e driver stats collection
and reporting code to help users of i40e get a better sense of how the
driver is performing and interacting with the rest of the kernel.
These patches include some new stats (like waived and busy) which were
inspired by other drivers that track stats using the same nomenclature.
The new stats and an existing stat, rx_reuse, are now accessible with
ethtool to make harvesting this data more convenient for users.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
1GbE Intel Wired LAN Driver Updates 2022-02-07
Corinna Vinschen says:
Fix the kernel warning "Missing unregister, handled but fix driver"
when running, e.g.,
$ ethtool -G eth0 rx 1024
on igc. Remove memset hack from igb and align igb code to igc.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux
Nguyen, Anthony L says:
====================
iwl-next Intel Wired LAN Driver Updates 2022-02-07
Dave adds support for ice driver to provide DSCP QoS mappings to irdma
driver.
[1] https://lore.kernel.org/netdev/[email protected]/
* 'iwl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux:
ice: add support for DSCP QoS for IDC
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In some cases, pages cannot be reused by i40e because the page is busy. Add
a counter for this event.
Busy page count is accessible via ethtool.
Signed-off-by: Joe Damato <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In some cases, pages can not be reused because they are not associated with
the correct NUMA zone. Knowing how often pages are waived helps users to
understand the interaction between the driver's memory usage and their
system.
Pass rx_stats through to i40e_can_reuse_rx_page to allow tracking when
pages are waived.
The page waive count is accessible via ethtool.
Signed-off-by: Joe Damato <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add a counter for new page allocations in the i40e RX path. This stat is
accessible with ethtool.
Signed-off-by: Joe Damato <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
rx page reuse was already being tracked by the i40e driver per RX ring.
Aggregate the counts and make them accessible via ethtool.
Signed-off-by: Joe Damato <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Page reuse was being tracked from two locations:
- i40e_reuse_rx_page (via 40e_clean_rx_irq), and
- i40e_alloc_mapped_page
Remove the double count and only count reuse from i40e_alloc_mapped_page
when the page is about to be reused.
Signed-off-by: Joe Damato <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
On changing the RX ring parameters igb uses a hack to avoid a warning
when calling xdp_rxq_info_reg via igb_setup_rx_resources. It just
clears the struct xdp_rxq_info content.
Instead, change this to unregister if we're already registered. Align
code to the igc code.
Fixes: 9cbc948b5a20c ("igb: add XDP support")
Signed-off-by: Corinna Vinschen <[email protected]>
Acked-by: Vinicius Costa Gomes <[email protected]>
Tested-by: Sandeep Penigalapati <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Calling ethtool changing the RX ring parameters like this:
$ ethtool -G eth0 rx 1024
on igc triggers kernel warnings like this:
[ 225.198467] ------------[ cut here ]------------
[ 225.198473] Missing unregister, handled but fix driver
[ 225.198485] WARNING: CPU: 7 PID: 959 at net/core/xdp.c:168
xdp_rxq_info_reg+0x79/0xd0
[...]
[ 225.198601] Call Trace:
[ 225.198604] <TASK>
[ 225.198609] igc_setup_rx_resources+0x3f/0xe0 [igc]
[ 225.198617] igc_ethtool_set_ringparam+0x30e/0x450 [igc]
[ 225.198626] ethnl_set_rings+0x18a/0x250
[ 225.198631] genl_family_rcv_msg_doit+0xca/0x110
[ 225.198637] genl_rcv_msg+0xce/0x1c0
[ 225.198640] ? rings_prepare_data+0x60/0x60
[ 225.198644] ? genl_get_cmd+0xd0/0xd0
[ 225.198647] netlink_rcv_skb+0x4e/0xf0
[ 225.198652] genl_rcv+0x24/0x40
[ 225.198655] netlink_unicast+0x20e/0x330
[ 225.198659] netlink_sendmsg+0x23f/0x480
[ 225.198663] sock_sendmsg+0x5b/0x60
[ 225.198667] __sys_sendto+0xf0/0x160
[ 225.198671] ? handle_mm_fault+0xb2/0x280
[ 225.198676] ? do_user_addr_fault+0x1eb/0x690
[ 225.198680] __x64_sys_sendto+0x20/0x30
[ 225.198683] do_syscall_64+0x38/0x90
[ 225.198687] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 225.198693] RIP: 0033:0x7f7ae38ac3aa
igc_ethtool_set_ringparam() copies the igc_ring structure but neglects to
reset the xdp_rxq_info member before calling igc_setup_rx_resources().
This in turn calls xdp_rxq_info_reg() with an already registered xdp_rxq_info.
Make sure to unregister the xdp_rxq_info structure first in
igc_setup_rx_resources.
Fixes: 73f1071c1d29 ("igc: Add support for XDP_TX action")
Reported-by: Lennert Buytenhek <[email protected]>
Signed-off-by: Corinna Vinschen <[email protected]>
Acked-by: Vinicius Costa Gomes <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
From 4.17 onwards the ixgbevf driver uses build_skb() to build an skb
around new data in the page buffer shared with the ixgbe PF.
This uses either a 2K or 3K buffer, and offsets the DMA mapping by
NET_SKB_PAD + NET_IP_ALIGN. When using a smaller buffer RXDCTL is set to
ensure the PF does not write a full 2K bytes into the buffer, which is
actually 2K minus the offset.
However on the 82599 virtual function, the RXDCTL mechanism is not
available. The driver attempts to work around this by using the SET_LPE
mailbox method to lower the maximm frame size, but the ixgbe PF driver
ignores this in order to keep the PF and all VFs in sync[0].
This means the PF will write up to the full 2K set in SRRCTL, causing it
to write NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the buffer.
With 4K pages split into two buffers, this means it either writes
NET_SKB_PAD + NET_IP_ALIGN bytes past the first buffer (and into the
second), or NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the DMA
mapping.
Avoid this by only enabling build_skb when using "large" buffers (3K).
These are placed in each half of an order-1 page, preventing the PF from
writing past the end of the mapping.
[0]: Technically it only ever raises the max frame size, see
ixgbe_set_vf_lpe() in ixgbe_sriov.c
Fixes: f15c5ba5b6cd ("ixgbevf: add support for using order 1 pages to receive large frames")
Signed-off-by: Samuel Mendoza-Jonas <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-02-03
This series contains updates to the i40e client header file and driver.
Mateusz disables HW TC offload by default.
Joe Damato removes a no longer used statistic.
Jakub Kicinski removes an unused enum from the client header file.
Jedrzej changes some admin queue commands to occur under atomic context
and adds new functions for admin queue MAC VLAN filters to avoid a
potential race that could occur due storing results in a structure that
could be overwritten by the next admin queue call.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The ice driver provides QoS information to auxiliary drivers
through the exported function ice_get_qos_params. This function
doesn't currently support L3 DSCP QoS.
Add the necessary defines, structure elements and code to support
DSCP QoS through the IIDC functions.
Signed-off-by: Dave Ertman <[email protected]>
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|