Age | Commit message (Collapse) | Author | Files | Lines |
|
When an XDP redirect happens before the link is ready, that
transmission will not finish and will timeout, causing an adapter
reset. If the redirects do not stop, the adapter will not stop
resetting.
Wait for the driver to signal that there's a carrier before allowing
transmissions to proceed.
Previous code was relying that when __IGC_DOWN is cleared, the NIC is
ready to transmit as all the queues are ready, what happens is that
the carrier presence will only be signaled later, after the watchdog
workqueue has a chance to run. And during this interval (between
clearing __IGC_DOWN and the watchdog running) if any transmission
happens the timeout is emitted (detected by igc_tx_timeout()) which
causes the reset, with the potential for the infinite loop.
Fixes: 4ff320361092 ("igc: Add support for XDP_REDIRECT action")
Reported-by: Ferenc Fejes <[email protected]>
Closes: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Tested-by: Ferenc Fejes <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Introduce Intel IDPF driver
Pavan Kumar Linga says:
This patch series introduces the Intel Infrastructure Data Path Function
(IDPF) driver. It is used for both physical and virtual functions. Except
for some of the device operations the rest of the functionality is the
same for both PF and VF. IDPF uses virtchnl version2 opcodes and
structures defined in the virtchnl2 header file which helps the driver
to learn the capabilities and register offsets from the device
Control Plane (CP) instead of assuming the default values.
The format of the series follows the driver init flow to interface open.
To start with, probe gets called and kicks off the driver initialization
by spawning the 'vc_event_task' work queue which in turn calls the
'hard reset' function. As part of that, the mailbox is initialized which
is used to send/receive the virtchnl messages to/from the CP. Once that is
done, 'core init' kicks in which requests all the required global resources
from the CP and spawns the 'init_task' work queue to create the vports.
Based on the capability information received, the driver creates the said
number of vports (one or many) where each vport is associated to a netdev.
Also, each vport has its own resources such as queues, vectors etc.
From there, rest of the netdev_ops and data path are added.
IDPF implements both single queue which is traditional queueing model
as well as split queue model. In split queue model, it uses separate queue
for both completion descriptors and buffers which helps to implement
out-of-order completions. It also helps to implement asymmetric queues,
for example multiple RX completion queues can be processed by a single
RX buffer queue and multiple TX buffer queues can be processed by a
single TX completion queue. In single queue model, same queue is used
for both descriptor completions as well as buffer completions. It also
supports features such as generic checksum offload, generic receive
offload (hardware GRO) etc.
---
v7:
Patch 2:
* removed pci_[disable|enable]_pcie_error_reporting as they are dropped
from the core
Patch 4, 9:
* used 'kasprintf' instead of 'snprintf' to avoid providing explicit
character string size which also fixes "-Wformat-truncation" warnings
Patch 14:
* used 'ethtool_sprintf' instead of 'snprintf' to avoid providing explicit
character string size which also fixes "-Wformat-truncation" warning
* add string format argument to the 'ethtool_sprintf' to avoid warning on
"-Wformat-security"
v6: https://lore.kernel.org/netdev/[email protected]/
Note: 'Acked-by' was only added to patches 1, 2, 12 and not to the other
patches because of the changes in v6
Patch 3, 4, 5, 6, 7, 8, 9, 11, 13, 14, 15:
* renamed 'reset_lock' to 'vport_ctrl_lock' to reflect the lock usage
* to avoid defensive programming, used 'vport_ctrl_lock' for the user
callbacks that access the 'vport' to prevent the hardware reset thread
from releasing the 'vport', when the user callback is in progress
* added some variables to netdev private structure to avoid vport access
if possible from ethtool and ndo callbacks
* moved 'mac_filter_list_lock' and MAC related flags to vport_config
structure and refactored mac filter flow to handle asynchronous
ndo mac filter callbacks
* stop the queues before starting the reset flow to avoid TX hangs
* removed 'sw_mutex' and 'stop_mutex' as they are not needed anymore
* added missing clear bit in 'init_task' error path
* renamed labels appropriately
Patch 8:
* replaced page_pool_put_page with page_pool_put_full_page
* for the page pool max_len, used PAGE_SIZE
Patch 10, 11, 13:
* made use of the 'netif_txq_maybe_stop', '__netif_txq_completed_wake'
helper macros
Patch 13:
* removed IDPF_HR_RESET_IN_PROG flag check in idpf_tx_singleq_start
as it is defensive
Patch 14:
* removed max descriptor check as the core does that
* removed unnecessary error messages
* removed the stats that are common between the ones reported by ethtool
and ip link
* replaced snprintf with ethtool_sprintf
* added a comment to explain the reason for the max queue check
* as the netdev queues are set on alloc, there is no need to set
them again on reset unless there is a queue change, so move the
'idpf_set_real_num_queues' to 'idpf_initiate_soft_reset'
Patch 15:
* reworded the 'configure SRIOV' in the commit message
v5: https://lore.kernel.org/netdev/[email protected]/
Most Patches:
* wrapped line limit to 80 chars to those which don't effect readability
Patch 12:
* in skb_add_rx_frag, offset 'headlen' w.r.t page_offset when adding a
frag to avoid adding the header again
Patch 14:
* added NULL check for 'rxq' when dereferencing it in page_pool_get_stats
v4: https://lore.kernel.org/netdev/[email protected]/
Patch 1:
* s/virtcnl/virtchnl
* removed the kernel doc for the error code definitions that don't exist
* reworded the summary part in the virtchnl2 header
Patch 3:
* don't set local variable to NULL on error
* renamed sq_send_command_out label with err_unlock
* don't use __GFP_ZERO in dma_alloc_coherent
Patch 4:
* introduced mailbox workqueue to process mailbox interrupts
Patch 3, 4, 5, 6, 7, 8, 9, 11, 15:
* removed unnecessary variable 0-init
Patch 3, 5, 7, 8, 9, 15:
* removed defensive programming checks wherever applicable
* removed IDPF_CAP_FIELD_LAST as it can be treated as defensive
programming
Patch 3, 4, 5, 6, 7:
* replaced IDPF_DFLT_MBX_BUF_SIZE with IDPF_CTLQ_MAX_BUF_LEN
Patch 2 to 15:
* add kernel-doc for idpf.h and idpf_txrx.h enums and structures
Patch 4, 5, 15:
* adjusted the destroy sequence of the workqueues as per the alloc
sequence
Patch 4, 5, 9, 15:
* scrub unnecessary flags in 'idpf_flags'
- IDPF_REMOVE_IN_PROG flag can take care of the cases where
IDPF_REL_RES_IN_PROG is used, removed the later one
- IDPF_REQ_[TX|RX]_SPLITQ are replaced with struct variables
- IDPF_CANCEL_[SERVICE|STATS]_TASK are redundant as the work queue
doesn't get rescheduled again after 'cancel_delayed_work_sync'
- IDPF_HR_CORE_RESET is removed as there is no set_bit for this flag
- IDPF_MB_INTR_TRIGGER is removed as it is not needed anymore with the
mailbox workqueue implementation
Patch 7 to 15:
* replaced the custom buffer recycling code with page pool API
* switched the header split buffer allocations from using a bunch of
pages to using one large chunk of DMA memory
* reordered some of the flows in vport_open to support page pool
Patch 8, 12:
* don't suppress the alloc errors by using __GFP_NOWARN
Patch 9:
* removed dyn_ctl_clrpba_m as it is not being used
Patch 14:
* introduced enum idpf_vport_reset_cause instead of using vport flags
* introduced page pool stats
v3: https://lore.kernel.org/netdev/[email protected]/
Patch 5:
* instead of void, used 'struct virtchnl2_create_vport' type for
vport_params_recvd and vport_params_reqd and removed the typecasting
* used u16/u32 as needed instead of int for variables which cannot be
negative and updated in all the places whereever applicable
Patch 6:
* changed the commit message to "add ptypes and MAC filter support"
* used the sender Signed-off-by as the last tag on all the patches
* removed unnecessary variables 0-init
* instead of fixing the code in this commit, fixed it in the commit
where the change was introduced first
* moved get_type_info struct on to the stack instead of memory alloc
* moved mutex_lock and ptype_info memory alloc outside while loop and
adjusted the return flow
* used 'break' instead of 'continue' in ptype id switch case
v2: https://lore.kernel.org/netdev/[email protected]/
Patch 2:
* added "Intel(R)" to the DRV_SUMMARY and Makefile.
Patch 4, 5, 6, 15:
* replaced IDPF_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for the
adapter related virtchnl opcodes.
* get the mutex lock in the virtchnl send thread itself instead of
in receive thread.
Patch 5, 6, 7, 8, 9, 11, 14, 15:
* replaced IDPF_VPORT_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for
the vport related virtchnl opcodes.
* get the mutex lock in the virtchnl send thread itself instead of
in receive thread.
Patch 6:
* converted get_ptype_info logic from 1:N to 1:1 message exchange for
better handling of mutex lock.
Patch 15:
* introduced 'stats_lock' spinlock to avoid concurrent stats update.
v1: https://lore.kernel.org/netdev/[email protected]/
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
If port VLAN is configured on a VF then any other VLANs on top of this VF
are broken.
During i40e_ndo_set_vf_port_vlan() call the i40e driver reset the VF and
iavf driver asks PF (using VIRTCHNL_OP_GET_VF_RESOURCES) for VF capabilities
but this reset occurs too early, prior setting of vf->info.pvid field
and because this field can be zero during i40e_vc_get_vf_resources_msg()
then VIRTCHNL_VF_OFFLOAD_VLAN capability is reported to iavf driver.
This is wrong because iavf driver should not report VLAN offloading
capability when port VLAN is configured as i40e does not support QinQ
offloading.
Fix the issue by moving VF reset after setting of vf->port_vlan_id
field.
Without this patch:
$ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
$ ip link set enp2s0f0 vf 0 vlan 3
$ ip link set enp2s0f0v0 up
$ ip link add link enp2s0f0v0 name vlan4 type vlan id 4
$ ip link set vlan4 up
...
$ ethtool -k enp2s0f0v0 | grep vlan-offload
rx-vlan-offload: on
tx-vlan-offload: on
$ dmesg -l err | grep iavf
[1292500.742914] iavf 0000:02:02.0: Failed to add VLAN filter, error IAVF_ERR_INVALID_QP_ID
With this patch:
$ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
$ ip link set enp2s0f0 vf 0 vlan 3
$ ip link set enp2s0f0v0 up
$ ip link add link enp2s0f0v0 name vlan4 type vlan id 4
$ ip link set vlan4 up
...
$ ethtool -k enp2s0f0v0 | grep vlan-offload
rx-vlan-offload: off [requested on]
tx-vlan-offload: off [requested on]
$ dmesg -l err | grep iavf
Fixes: f9b4b6278d51 ("i40e: Reset the VF upon conflicting VLAN configuration")
Signed-off-by: Ivan Vecera <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
When the iavf driver wants to reconfigure the VLAN filters
(iavf_add_vlan, iavf_del_vlan), it sets a flag in
aq_required:
adapter->aq_required |= IAVF_FLAG_AQ_ADD_VLAN_FILTER;
or:
adapter->aq_required |= IAVF_FLAG_AQ_DEL_VLAN_FILTER;
This is later processed by the watchdog_task, but it runs periodically
every 2 seconds, so it can be a long time before it processes the request.
In the worst case, the interface is unable to receive traffic for more
than 2 seconds for no objective reason.
Fixes: 5eae00c57f5e ("i40evf: main driver core")
Signed-off-by: Petr Oros <[email protected]>
Co-developed-by: Michal Schmidt <[email protected]>
Signed-off-by: Michal Schmidt <[email protected]>
Co-developed-by: Ivan Vecera <[email protected]>
Signed-off-by: Ivan Vecera <[email protected]>
Reviewed-by: Ahmed Zaki <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add helper for set iavf aq request AVF_FLAG_AQ_* and immediately
schedule watchdog_task. Helper will be used in cases where it is
necessary to run aq requests asap
Signed-off-by: Petr Oros <[email protected]>
Co-developed-by: Michal Schmidt <[email protected]>
Signed-off-by: Michal Schmidt <[email protected]>
Co-developed-by: Ivan Vecera <[email protected]>
Signed-off-by: Ivan Vecera <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Prevent schedule operations for adminq during device remove and when
__IAVF_IN_REMOVE_TASK flag is set. Currently, the iavf_down function
adds operations for adminq that shouldn't be processed when the device
is in the __IAVF_REMOVE state.
Reproduction:
echo 4 > /sys/bus/pci/devices/0000:17:00.0/sriov_numvfs
ip link set dev ens1f0 vf 0 trust on
ip link set dev ens1f0 vf 1 trust on
ip link set dev ens1f0 vf 2 trust on
ip link set dev ens1f0 vf 3 trust on
ip link set dev ens1f0 vf 0 mac 00:22:33:44:55:66
ip link set dev ens1f0 vf 1 mac 00:22:33:44:55:67
ip link set dev ens1f0 vf 2 mac 00:22:33:44:55:68
ip link set dev ens1f0 vf 3 mac 00:22:33:44:55:69
echo 0000:17:02.0 > /sys/bus/pci/devices/0000\:17\:02.0/driver/unbind
echo 0000:17:02.1 > /sys/bus/pci/devices/0000\:17\:02.1/driver/unbind
echo 0000:17:02.2 > /sys/bus/pci/devices/0000\:17\:02.2/driver/unbind
echo 0000:17:02.3 > /sys/bus/pci/devices/0000\:17\:02.3/driver/unbind
sleep 10
echo 0000:17:02.0 > /sys/bus/pci/drivers/iavf/bind
echo 0000:17:02.1 > /sys/bus/pci/drivers/iavf/bind
echo 0000:17:02.2 > /sys/bus/pci/drivers/iavf/bind
echo 0000:17:02.3 > /sys/bus/pci/drivers/iavf/bind
modprobe vfio-pci
echo 8086 154c > /sys/bus/pci/drivers/vfio-pci/new_id
qemu-system-x86_64 -accel kvm -m 4096 -cpu host \
-drive file=centos9.qcow2,if=none,id=virtio-disk0 \
-device virtio-blk-pci,drive=virtio-disk0,bootindex=0 -smp 4 \
-device vfio-pci,host=17:02.0 -net none \
-device vfio-pci,host=17:02.1 -net none \
-device vfio-pci,host=17:02.2 -net none \
-device vfio-pci,host=17:02.3 -net none \
-daemonize -vnc :5
Current result:
There is a probability that the mac of VF in guest is inconsistent with
it in host
Expected result:
When passthrough NIC VF to guest, the VF in guest should always get
the same mac as it in host.
Fixes: 14756b2ae265 ("iavf: Fix __IAVF_RESETTING state usage")
Signed-off-by: Radoslaw Tyl <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Add support for SRIOV: send the requested number of VFs
to the device Control Plane, via the virtchnl message
and then enable the VFs using 'pci_enable_sriov'.
Add other ndo ops supported by the driver such as features_check,
set_rx_mode, validate_addr, set_mac_address, change_mtu, get_stats64,
set_features, and tx_timeout. Initialize the statistics task which
requests the queue related statistics to the CP. Add loopback
and promiscuous mode support and the respective virtchnl messages.
Finally, add documentation and build support for the driver.
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Initialize all the ethtool ops that are supported by the driver and
add the necessary support for the ethtool callbacks. Also add
asynchronous link notification virtchnl support where the device
Control Plane sends the link status and link speed as an
asynchronous event message. Driver report the link speed on
ethtool .idpf_get_link_ksettings query.
Introduce soft reset function which is used by some of the ethtool
callbacks such as .set_channels, .set_ringparam etc. to change the
existing queue configuration. It deletes the existing queues by sending
delete queues virtchnl message to the CP and calls the 'vport_stop' flow
which disables the queues, vport etc. New set of queues are requested to
the CP and reconfigure the queue context by calling the 'vport_open'
flow. Soft reset flow also adjusts the number of vectors associated to a
vport if .set_channels is called.
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Alice Michael <[email protected]>
Signed-off-by: Alice Michael <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add the start_xmit, TX and RX napi poll support for the single queue
model. Unlike split queue model, single queue uses same queue to post
buffer descriptors and completed descriptors.
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add support to handle interrupts for the RX completion queue and
RX buffer queue. When the interrupt fires on RX completion queue,
process the RX descriptors that are received. Allocate and prepare
the SKB with the RX packet info, for both data and header buffer.
IDPF uses software maintained refill queues to manage buffers between
RX queue producer and the buffer queue consumer. They are required in
order to maintain a lockless buffer management system and are strictly
software only constructs. Instead of updating the RX buffer queue tail
with available buffers right after the clean routine, it posts the
buffer ids to the refill queues, only to post them to the HW later.
If the generic receive offload (GRO) is enabled in the capabilities
and turned on by default or via ethtool, then HW performs the
packet coalescing if certain criteria are met by the incoming
packets and updates the RX descriptor. Similar to GRO, if generic
checksum is enabled, HW computes the checksum and updates the
respective fields in the descriptor. Add support to update the
SKB fields with the GRO and the generic checksum received.
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add support to handle the interrupts for the TX completion queue and
process the various completion types.
In the flow scheduling mode, the driver processes primarily buffer
completions as well as descriptor completions occasionally. This mode
supports out of order TX completions. To do so, HW generates one buffer
completion per packet. Each of those completions contains the unique tag
provided during the TX encoding which is used to locate the packet either
on the TX buffer ring or in a hash table. The hash table is used to track
TX buffer information so the descriptor(s) for a given packet can be
reused while the driver is still waiting on the buffer completion(s).
Packets end up in the hash table in one of 2 ways: 1) a packet was
stashed during descriptor completion cleaning, or 2) because an out of
order buffer completion was processed. A descriptor completion arrives
only every so often and is primarily used to guarantee the TX descriptor
ring can be reused without having to wait on the individual buffer
completions. E.g. a descriptor completion for N+16 guarantees HW read all
of the descriptors for packets N through N+15, therefore all of the
buffers for packets N through N+15 are stashed into the hash table and the
descriptors can be reused for more TX packets. Similarly, a packet can be
stashed in the hash table because an out an order buffer completion was
processed. E.g. processing a buffer completion for packet N+3 implies that
HW read all of the descriptors for packets N through N+3 and they can be
reused. However, the HW did not do the DMA yet. The buffers for packets N
through N+2 cannot be freed, so they are stashed in the hash table.
In either case, the buffer completions will eventually be processed for
all of the stashed packets, and all of the buffers will be cleaned from
the hash table.
In queue based scheduling mode, the driver processes primarily descriptor
completions and cleans the TX ring the conventional way.
Finally, the driver triggers a TX queue drain after sending the disable
queues virtchnl message. When the HW completes the queue draining, it
sends the driver a queue marker packet completion. The driver determines
when all TX queues have been drained and proceeds with the disable flow.
With this, the driver can send TX packets and clean up the resources
properly.
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add start_xmit support for split queue model. To start with, add the
necessary checks to linearize the skb if it uses more number of
buffers than the hardware supported limit. Stop the transmit queue
if there are no enough descriptors available for the skb to use or
if there we're going to potentially overrun the completion queue.
Finally prepare the descriptor with all the required
information and update the tail.
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
To further continue 'vport open', initialize all the resources
required for the interrupts. To start with, initialize the
queue vector indices with the ones received from the device
Control Plane. Now that all the TX and RX queues are initialized,
map the RX descriptor and buffer queues as well as TX completion
queues to the allocated vectors. Initialize and enable the napi
handler for the napi polling. Finally, request the IRQs for the
interrupt vectors from the stack and setup the interrupt handler.
Once the interrupt init is done, send 'map queue vector', 'enable
queues' and 'enable vport' virtchnl messages to the CP to complete
the 'vport open' flow.
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Similar to the TX, RX also supports both single and split queue models.
In single queue model, the same descriptor queue is used by SW to post
buffer descriptors to HW and by HW to post completed descriptors
to SW. In split queue model, "RX buffer queues" are used to pass
descriptor buffers from SW to HW whereas "RX queues" are used to
post the descriptor completions i.e. descriptors that point to
completed buffers, from HW to SW. "RX queue group" is a set of
RX queues grouped together and will be serviced by a "RX buffer queue
group". IDPF supports 2 buffer queues i.e. large buffer (4KB) queue
and small buffer (2KB) queue per buffer queue group. HW uses large
buffers for 'hardware gro' feature and also if the packet size is
more than 2KB, if not 2KB buffers are used.
Add all the resources required for the RX queues initialization.
Allocate memory for the RX queue and RX buffer queue groups. Initialize
the software maintained refill queues for buffer management algorithm.
Same like the TX queues, initialize the queue parameters for the RX
queues and send the config RX queue virtchnl message to the device
Control Plane.
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Alice Michael <[email protected]>
Signed-off-by: Alice Michael <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
IDPF supports two queue models i.e. single queue which is a traditional
queueing model as well as split queue model. In single queue model,
the same descriptor queue is used by SW to post descriptors to the HW,
HW to post completed descriptors to SW. In split queue model, "TX Queues"
are used to pass buffers from SW to HW and "TX Completion Queues"
are used to post descriptor completions from HW to SW. Device supports
asymmetric ratio of TX queues to TX completion queues. Considering
this, queue group mechanism is used i.e. some TX queues are grouped
together which will be serviced by only one TX completion queue
per TX queue group.
Add all the resources required for the TX queues initialization.
To start with, allocate memory for the TX queue groups, TX queues and
TX completion queues. Then, allocate the descriptors for both TX and
TX completion queues, and bookkeeping buffers for TX queues alone.
Also, allocate queue vectors for the vport and initialize the TX queue
related fields for each queue vector.
Initialize the queue parameters such as q_id, q_type and tail register
offset with the info received from the device control plane (CP).
Once all the TX queues are configured, send config TX queue virtchnl
message to the CP with all the TX queue context information.
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Alice Michael <[email protected]>
Signed-off-by: Alice Michael <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add the virtchnl support to request the packet types. Parse the responses
received from CP and based on the protocol headers, populate the packet
type structure with necessary information. Initialize the MAC address
and add the virtchnl support to add and del MAC address.
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Co-developed-by: Shailendra Bhatnagar <[email protected]>
Signed-off-by: Shailendra Bhatnagar <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add the required support to create a vport by spawning
the init task. Once the vport is created, initialize and
allocate the resources needed for it. Configure and register
a netdev for each vport with all the features supported
by the device based on the capabilities received from the
device Control Plane. Spawn the init task till all the default
vports are created.
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Co-developed-by: Shailendra Bhatnagar <[email protected]>
Signed-off-by: Shailendra Bhatnagar <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
As the mailbox is setup, add the necessary send and receive
mailbox message framework to support the virtchnl communication
between the driver and device Control Plane (CP).
Add the core initialization. To start with, driver confirms the
virtchnl version with the CP. Once that is done, it requests
and gets the required capabilities and resources needed such as
max vectors, queues etc.
Based on the vector information received in 'VIRTCHNL2_OP_GET_CAPS',
request the stack to allocate the required vectors. Finally add
the interrupt handling mechanism for the mailbox queue and enable
the interrupt.
Note: Checkpatch issues a warning about IDPF_FOREACH_VPORT_VC_STATE and
IDPF_GEN_STRING being complex macros and should be enclosed in parentheses
but it's not the case. They are never used as a statement and instead only
used to define the enum and array.
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Emil Tantilov <[email protected]>
Signed-off-by: Emil Tantilov <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Co-developed-by: Shailendra Bhatnagar <[email protected]>
Signed-off-by: Shailendra Bhatnagar <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
At the end of the probe, initialize and schedule the event workqueue.
It calls the hard reset function where reset checks are done to find
if the device is out of the reset. Control queue initialization and
the necessary control queue support is added.
Introduce function pointers for the register operations which are
different between PF and VF devices.
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Co-developed-by: Shailendra Bhatnagar <[email protected]>
Signed-off-by: Shailendra Bhatnagar <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add the required support to register IDPF PCI driver, as well as
probe and remove call backs. Enable the PCI device and request
the kernel to reserve the memory resources that will be used by the
driver. Finally map the BAR0 address space.
Signed-off-by: Phani Burra <[email protected]>
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Shailendra Bhatnagar <[email protected]>
Signed-off-by: Shailendra Bhatnagar <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Co-developed-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Virtchnl version 1 is an interface used by the current generation of
foundational NICs to negotiate the capabilities and configure the
HW resources such as queues, vectors, RSS LUT, etc between the PF
and VF drivers. It is not extensible to enable new features supported
in the next generation of NICs/IPUs and to negotiate descriptor types,
packet types and register offsets.
To overcome the limitations of the existing interface, introduce
the virtchnl version 2 and add the necessary opcodes, structures,
definitions, and descriptor formats. The driver also learns the
data queue and other register offsets to use instead of hardcoding
them. The advantage of this approach is that it gives the flexibility
to modify the register offsets if needed, restrict the use of
certain descriptor types and negotiate the supported packet types.
Co-developed-by: Alan Brady <[email protected]>
Signed-off-by: Alan Brady <[email protected]>
Co-developed-by: Joshua Hay <[email protected]>
Signed-off-by: Joshua Hay <[email protected]>
Co-developed-by: Madhu Chittim <[email protected]>
Signed-off-by: Madhu Chittim <[email protected]>
Co-developed-by: Phani Burra <[email protected]>
Signed-off-by: Phani Burra <[email protected]>
Co-developed-by: Sridhar Samudrala <[email protected]>
Signed-off-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Sridhar Samudrala <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Pavan Kumar Linga <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Previously CRC stripping was always enabled for VF.
Now it is possible to turn off CRC stripping via ethtool:
#ethtool -K <interface> rx-fcs on
To turn off CRC stripping, first VLAN stripping must be disabled:
#ethtool -K <interface> rx-vlan-offload off
if any VLAN interfaces exists, otherwise VLAN stripping will be turned
off by the driver.
In iavf_configure_queues add check if CRC stripping is enabled for
VF, if it's enabled then set crc_disabled to false on every VF's
queue. In iavf_set_features add check if CRC stripping setting was
changed then schedule reset.
Signed-off-by: Norbert Zulinski <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Ahmed Zaki <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
When VLAN strip is enabled, the CRC strip must not be disabled. And when
the CRC strip is disabled, the VLAN strip should not be enabled.
The driver needs to check CRC strip disable setting parameter before
configuring the Rx/Tx queues, otherwise, in current error handling,
the already set Tx queue context doesn't roll back correctly, it will
cause the Tx queue setup failure next time:
"Failed to set LAN Tx queue context"
Signed-off-by: Haiyue Wang <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Reviewed-by: Paul Menzel <[email protected]>
Signed-off-by: Ahmed Zaki <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
To support CRC strip enable/disable functionality, VF needs the explicit
request VIRTCHNL_VF_OFFLOAD_CRC offload. Then according to crc_disable
flag of Rx queue configuration information to set up the queue context.
Signed-off-by: Haiyue Wang <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: Ahmed Zaki <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
After commit 50f303496d92 ("igb: Enable SR-IOV after reinit"), removing
the igb module could hang or crash (depending on the machine) when the
module has been loaded with the max_vfs parameter set to some value != 0.
In case of one test machine with a dual port 82580, this hang occurred:
[ 232.480687] igb 0000:41:00.1: removed PHC on enp65s0f1
[ 233.093257] igb 0000:41:00.1: IOV Disabled
[ 233.329969] pcieport 0000:40:01.0: AER: Multiple Uncorrected (Non-Fatal) err0
[ 233.340302] igb 0000:41:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[ 233.352248] igb 0000:41:00.0: device [8086:1516] error status/mask=00100000
[ 233.361088] igb 0000:41:00.0: [20] UnsupReq (First)
[ 233.368183] igb 0000:41:00.0: AER: TLP Header: 40000001 0000040f cdbfc00c c
[ 233.376846] igb 0000:41:00.1: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[ 233.388779] igb 0000:41:00.1: device [8086:1516] error status/mask=00100000
[ 233.397629] igb 0000:41:00.1: [20] UnsupReq (First)
[ 233.404736] igb 0000:41:00.1: AER: TLP Header: 40000001 0000040f cdbfc00c c
[ 233.538214] pci 0000:41:00.1: AER: can't recover (no error_detected callback)
[ 233.538401] igb 0000:41:00.0: removed PHC on enp65s0f0
[ 233.546197] pcieport 0000:40:01.0: AER: device recovery failed
[ 234.157244] igb 0000:41:00.0: IOV Disabled
[ 371.619705] INFO: task irq/35-aerdrv:257 blocked for more than 122 seconds.
[ 371.627489] Not tainted 6.4.0-dirty #2
[ 371.632257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this.
[ 371.641000] task:irq/35-aerdrv state:D stack:0 pid:257 ppid:2 f0
[ 371.650330] Call Trace:
[ 371.653061] <TASK>
[ 371.655407] __schedule+0x20e/0x660
[ 371.659313] schedule+0x5a/0xd0
[ 371.662824] schedule_preempt_disabled+0x11/0x20
[ 371.667983] __mutex_lock.constprop.0+0x372/0x6c0
[ 371.673237] ? __pfx_aer_root_reset+0x10/0x10
[ 371.678105] report_error_detected+0x25/0x1c0
[ 371.682974] ? __pfx_report_normal_detected+0x10/0x10
[ 371.688618] pci_walk_bus+0x72/0x90
[ 371.692519] pcie_do_recovery+0xb2/0x330
[ 371.696899] aer_process_err_devices+0x117/0x170
[ 371.702055] aer_isr+0x1c0/0x1e0
[ 371.705661] ? __set_cpus_allowed_ptr+0x54/0xa0
[ 371.710723] ? __pfx_irq_thread_fn+0x10/0x10
[ 371.715496] irq_thread_fn+0x20/0x60
[ 371.719491] irq_thread+0xe6/0x1b0
[ 371.723291] ? __pfx_irq_thread_dtor+0x10/0x10
[ 371.728255] ? __pfx_irq_thread+0x10/0x10
[ 371.732731] kthread+0xe2/0x110
[ 371.736243] ? __pfx_kthread+0x10/0x10
[ 371.740430] ret_from_fork+0x2c/0x50
[ 371.744428] </TASK>
The reproducer was a simple script:
#!/bin/sh
for i in `seq 1 5`; do
modprobe -rv igb
modprobe -v igb max_vfs=1
sleep 1
modprobe -rv igb
done
It turned out that this could only be reproduce on 82580 (quad and
dual-port), but not on 82576, i350 and i210. Further debugging showed
that igb_enable_sriov()'s call to pci_enable_sriov() is failing, because
dev->is_physfn is 0 on 82580.
Prior to commit 50f303496d92 ("igb: Enable SR-IOV after reinit"),
igb_enable_sriov() jumped into the "err_out" cleanup branch. After this
commit it only returned the error code.
So the cleanup didn't take place, and the incorrect VF setup in the
igb_adapter structure fooled the igb driver into assuming that VFs have
been set up where no VF actually existed.
Fix this problem by cleaning up again if pci_enable_sriov() fails.
Fixes: 50f303496d92 ("igb: Enable SR-IOV after reinit")
Signed-off-by: Corinna Vinschen <[email protected]>
Reviewed-by: Akihiko Odaki <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The commit in fixes introduced flags to control the status of hardware
configuration while processing packets. At the same time another structure
is used to provide configuration of timestamper to user-space applications.
The way it was coded makes this structures go out of sync easily. The
repro is easy for 82599 chips:
[root@hostname ~]# hwstamp_ctl -i eth0 -r 12 -t 1
current settings:
tx_type 0
rx_filter 0
new settings:
tx_type 1
rx_filter 12
The eth0 device is properly configured to timestamp any PTPv2 events.
[root@hostname ~]# hwstamp_ctl -i eth0 -r 1 -t 1
current settings:
tx_type 1
rx_filter 12
SIOCSHWTSTAMP failed: Numerical result out of range
The requested time stamping mode is not supported by the hardware.
The error is properly returned because HW doesn't support all packets
timestamping. But the adapter->flags is cleared of timestamp flags
even though no HW configuration was done. From that point no RX timestamps
are received by user-space application. But configuration shows good
values:
[root@hostname ~]# hwstamp_ctl -i eth0
current settings:
tx_type 1
rx_filter 12
Fix the issue by applying new flags only when the HW was actually
configured.
Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices")
Signed-off-by: Vadim Fedorenko <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently when configuring promiscuous mode on the AVF we detect a
change in the netdev->flags. We use IFF_PROMISC and IFF_ALLMULTI to
determine whether or not we need to request/release promiscuous mode
and/or multicast promiscuous mode. The problem is that the AQ calls for
setting/clearing promiscuous/multicast mode are treated separately. This
leads to a case where we can trigger two promiscuous mode AQ calls in
a row with the incorrect state. To fix this make a few changes.
Use IAVF_FLAG_AQ_CONFIGURE_PROMISC_MODE instead of the previous
IAVF_FLAG_AQ_[REQUEST|RELEASE]_[PROMISC|ALLMULTI] flags.
In iavf_set_rx_mode() detect if there is a change in the
netdev->flags in comparison with adapter->flags and set the
IAVF_FLAG_AQ_CONFIGURE_PROMISC_MODE aq_required bit. Then in
iavf_process_aq_command() only check for IAVF_FLAG_CONFIGURE_PROMISC_MODE
and call iavf_set_promiscuous() if it's set.
In iavf_set_promiscuous() check again to see which (if any) promiscuous
mode bits have changed when comparing the netdev->flags with the
adapter->flags. Use this to set the flags which get sent to the PF
driver.
Add a spinlock that is used for updating current_netdev_promisc_flags
and only allows one promiscuous mode AQ at a time.
[1] Fixes the fact that we will only have one AQ call in the aq_required
queue at any one time.
[2] Streamlines the change in promiscuous mode to only set one AQ
required bit.
[3] This allows us to keep track of the current state of the flags and
also makes it so we can take the most recent netdev->flags promiscuous
mode state.
[4] This fixes the problem where a change in the netdev->flags can cause
IAVF_FLAG_AQ_CONFIGURE_PROMISC_MODE to be set in iavf_set_rx_mode(),
but cleared in iavf_set_promiscuous() before the change is ever made via
AQ call.
Fixes: 47d3483988f6 ("i40evf: Add driver support for promiscuous mode")
Signed-off-by: Brett Creeley <[email protected]>
Signed-off-by: Ahmed Zaki <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Instead of freeing memory of a single VSI, make sure
the memory for all VSIs is cleared before releasing VSIs.
Add releasing of their resources in a loop with the iteration
number equal to the number of allocated VSIs.
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Andrii Staikov <[email protected]>
Signed-off-by: Aleksandr Loktionov <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igb devices can use as low as 64 descriptors.
This change will unify igb with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver")
Signed-off-by: Olga Zaborska <[email protected]>
Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igbvf devices can use as low as 64 descriptors.
This change will unify igbvf with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
Fixes: d4e0fe01a38a ("igbvf: add new driver to support 82576 virtual functions")
Signed-off-by: Olga Zaborska <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igc devices can use as low as 64 descriptors.
This change will unify igc with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers")
Signed-off-by: Olga Zaborska <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Disable virtualization features on 82580 just as on i210/i211.
This avoids that virt functions are acidentally called on 82850.
Fixes: 55cac248caa4 ("igb: Add full support for 82580 devices")
Signed-off-by: Corinna Vinschen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Merge in late fixes to prepare for the 6.6 net-next PR.
No conflicts.
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Increase the RX buffer size to 3K when the SBP bit is on. The size of
the RX buffer determines the number of pages allocated which may not
be sufficient for receive frames larger than the set MTU size.
Cc: [email protected]
Fixes: 89eaefb61dc9 ("igb: Support RX-ALL feature flag.")
Reported-by: Manfred Rudigier <[email protected]>
Signed-off-by: Radoslaw Tyl <[email protected]>
Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ice hardware has a synchronization mechanism used to drive the
simultaneous application of commands on both PHY ports and the source timer
in the MAC.
When issuing a sync via ice_ptp_exec_tmr_cmd(), the hardware will
simultaneously apply the commands programmed for the main timer and each
PHY port. Neither the main timer command register, nor the PHY port command
registers auto clear on command execution.
During the execution of a timer command intended for a single port on E822
devices, such as those used to configure a PHY during link up, the driver
is not correctly clearing the previous commands.
This results in unintentionally executing the last programmed command on
the main timer and other PHY ports whenever performing reconfiguration on
E822 ports after link up. This results in unintended side effects on other
timers, depending on what command was previously programmed.
To fix this, the driver must ensure that the main timer and all other PHY
ports are properly initialized to perform no action.
The enumeration for timer commands does not include an enumeration value
for doing nothing. Introduce ICE_PTP_NOP for this purpose. When writing a
timer command to hardware, leave the command bits set to zero which
indicates that no operation should be performed on that port.
Modify ice_ptp_one_port_cmd() to always initialize all ports. For all ports
other than the one being configured, write their timer command register to
ICE_PTP_NOP. This ensures that no side effect happens on the timer command.
To fix this for the PHY ports, modify ice_ptp_one_port_cmd() to always
initialize all other ports to ICE_PTP_NOP. This ensures that no side
effects happen on the other ports.
Call ice_ptp_src_cmd() with a command value if ICE_PTP_NOP in
ice_sync_phy_timer_e822() and ice_start_phy_timer_e822().
With both of these changes, the driver should no longer execute a stale
command on the main timer or another PHY port when reconfiguring one of the
PHY ports after link up.
Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support")
Signed-off-by: Siddaraju DH <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Sunitha Mekala <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add devices IDs for the next LOM generations that will be available on the
next Intel Client platforms. This patch provides the initial support for
these devices.
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
With the 10us interval, we were seeing PTM transactions take around 12us.
Hardware team suggested this interval could be lowered to 1us which was
confirmed with PCIe sniffer. With the 1us interval, PTM dialogs took
around 2us.
Suggested-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Muhammad Husaini Zulkifli <[email protected]>
Reviewed-by: Muhammad Husaini Zulkifli <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add support for using the four sets of timestamping registers that
i225/i226 have available for TX.
In some workloads, where multiple applications request hardware
transmission timestamps, it was possible that some of those requests
were denied because the only in use register was already occupied.
This is also in preparation to future support for hardware
timestamping with multiple PTP domains. With multiple domains chances
of multiple TX timestamps being requested at the same time increase.
Before:
$ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o 37
| responses | TX timestamp offset (ns)
rate clients | lost invalid basic xleave | min mean max stddev
1000 100 0.00% 0.00% 0.00% 100.00% +1 +41 +73 13
1500 150 0.00% 0.00% 0.00% 100.00% +9 +49 +87 15
2250 225 0.00% 0.00% 0.00% 100.00% +9 +42 +79 13
3375 337 0.00% 0.00% 0.00% 100.00% +11 +46 +81 13
5062 506 0.00% 0.00% 0.00% 100.00% +7 +44 +80 13
7593 759 0.00% 0.00% 0.00% 100.00% +9 +44 +79 12
11389 1138 0.00% 0.00% 0.00% 100.00% +14 +51 +87 13
17083 1708 0.00% 0.00% 0.00% 100.00% +1 +41 +80 14
25624 2562 0.00% 0.00% 0.00% 100.00% +11 +50 +5107 51
38436 3843 0.00% 0.00% 0.00% 100.00% -2 +36 +7843 38
57654 5765 0.00% 0.00% 0.00% 100.00% +4 +42 +10503 69
86481 8648 0.00% 0.00% 0.00% 100.00% +11 +54 +5492 65
129721 12972 0.00% 0.00% 0.00% 100.00% +31 +2680 +6942 2606
194581 16384 16.79% 0.00% 0.87% 82.34% +73 +4444 +15879 3116
291871 16384 35.05% 0.00% 1.53% 63.42% +188 +5381 +17019 3035
437806 16384 54.95% 0.00% 2.55% 42.50% +233 +6302 +13885 2846
After:
$ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o 37
| responses | TX timestamp offset (ns)
rate clients | lost invalid basic xleave | min mean max stddev
1000 100 0.00% 0.00% 0.00% 100.00% -20 +12 +43 13
1500 150 0.00% 0.00% 0.00% 100.00% -23 +18 +57 14
2250 225 0.00% 0.00% 0.00% 100.00% -2 +33 +67 13
3375 337 0.00% 0.00% 0.00% 100.00% +1 +38 +76 13
5062 506 0.00% 0.00% 0.00% 100.00% +9 +52 +93 14
7593 759 0.00% 0.00% 0.00% 100.00% +11 +47 +82 13
11389 1138 0.00% 0.00% 0.00% 100.00% -9 +27 +74 13
17083 1708 0.00% 0.00% 0.00% 100.00% -13 +25 +66 14
25624 2562 0.00% 0.00% 0.00% 100.00% -8 +28 +65 13
38436 3843 0.00% 0.00% 0.00% 100.00% -13 +28 +69 13
57654 5765 0.00% 0.00% 0.00% 100.00% -11 +32 +71 14
86481 8648 0.00% 0.00% 0.00% 100.00% +2 +44 +83 14
129721 12972 15.36% 0.00% 0.35% 84.29% -2 +2248 +22907 4252
194581 16384 42.98% 0.00% 1.98% 55.04% -4 +5278 +65039 5856
291871 16384 54.33% 0.00% 2.21% 43.46% -3 +6306 +22608 5665
We can see that with 4 registers, as expected, we are able to handle a
increasing number of requests more consistently, but as soon as all
registers are in use, the decrease in quality of service happens in a
sharp step.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Reviewed-by: Muhammad Husaini Zulkifli <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
include/net/inet_sock.h
f866fbc842de ("ipv4: fix data-races around inet->inet_id")
c274af224269 ("inet: introduce inet->inet_flags")
https://lore.kernel.org/all/[email protected]/
Adjacent changes:
drivers/net/bonding/bond_alb.c
e74216b8def3 ("bonding: fix macvlan over alb bond support")
f11e5bd159b0 ("bonding: support balance-alb with openvswitch")
drivers/net/ethernet/broadcom/bgmac.c
d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()")
23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()")
drivers/net/ethernet/broadcom/genet/bcmmii.c
32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()")
acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()")
net/sctp/socket.c
f866fbc842de ("ipv4: fix data-races around inet->inet_id")
b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add check for pf->vf not being NULL before dereferencing
pf->vf[vsi->vf_id] in updating VSI filter sync.
Add a similar check before dereferencing !pf->vf[vsi->vf_id].trusted
in the condition for clearing promisc mode bit.
Fixes: c87c938f62d8 ("i40e: Add VF VLAN pruning")
Signed-off-by: Andrii Staikov <[email protected]>
Signed-off-by: Aleksandr Loktionov <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The IGC_PTM_CTRL_SHRT_CYC defines the time between two consecutive PTM
requests. The bit resolution of this field is six bits. That bit five was
missing in the mask. This patch comes to correct the typo in the
IGC_PTM_CTRL_SHRT_CYC macro.
Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()")
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If ptp_clock_register() fails or CONFIG_PTP isn't enabled, avoid starting
PTP related workqueues.
In this way we can fix this:
BUG: unable to handle page fault for address: ffffc9000440b6f8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 100000067 P4D 100000067 PUD 1001e0067 PMD 107dc5067 PTE 0
Oops: 0000 [#1] PREEMPT SMP
[...]
Workqueue: events igb_ptp_overflow_check
RIP: 0010:igb_rd32+0x1f/0x60
[...]
Call Trace:
igb_ptp_read_82580+0x20/0x50
timecounter_read+0x15/0x60
igb_ptp_overflow_check+0x1a/0x50
process_one_work+0x1cb/0x3c0
worker_thread+0x53/0x3f0
? rescuer_thread+0x370/0x370
kthread+0x142/0x160
? kthread_associate_blkcg+0xc0/0xc0
ret_from_fork+0x1f/0x30
Fixes: 1f6e8178d685 ("igb: Prevent dropped Tx timestamps via work items and interrupts.")
Fixes: d339b1331616 ("igb: add PTP Hardware Clock code")
Signed-off-by: Alessio Igor Bogani <[email protected]>
Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
During stress test with attaching and detaching VF from KVM and
simultaneously changing VFs spoofcheck and trust there was a
NULL pointer dereference in ice_reset_vf that VF's VSI is null.
More than one instance of ice_reset_vf() can be running at a given
time. When we rebuild the VSI in ice_reset_vf, another reset can be
triaged from ice_service_task. In this case we can access the currently
uninitialized VSI and cause panic. The window for this racing condition
has been around for a long time but it's much worse after commit
227bf4500aaa ("ice: move VSI delete outside deconfig") because
the reset runs faster. ice_reset_vf() using vf->cfg_lock and when
we move this lock before accessing to the VF VSI, we can fix
BUG for all cases.
Panic occurs sometimes in ice_vsi_is_rx_queue_active() and sometimes
in ice_vsi_stop_all_rx_rings()
With our reproducer, we can hit BUG:
~8h before commit 227bf4500aaa ("ice: move VSI delete outside deconfig").
~20m after commit 227bf4500aaa ("ice: move VSI delete outside deconfig").
After this fix we are not able to reproduce it after ~48h
There was commit cf90b74341ee ("ice: Fix call trace with null VSI during
VF reset") which also tried to fix this issue, but it was only
partially resolved and the bug still exists.
[ 6420.658415] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 6420.665382] #PF: supervisor read access in kernel mode
[ 6420.670521] #PF: error_code(0x0000) - not-present page
[ 6420.675659] PGD 0
[ 6420.677679] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 6420.682038] CPU: 53 PID: 326472 Comm: kworker/53:0 Kdump: loaded Not tainted 5.14.0-317.el9.x86_64 #1
[ 6420.691250] Hardware name: Dell Inc. PowerEdge R750/04V528, BIOS 1.6.5 04/15/2022
[ 6420.698729] Workqueue: ice ice_service_task [ice]
[ 6420.703462] RIP: 0010:ice_vsi_is_rx_queue_active+0x2d/0x60 [ice]
[ 6420.705860] ice 0000:ca:00.0: VF 0 is now untrusted
[ 6420.709494] Code: 00 00 66 83 bf 76 04 00 00 00 48 8b 77 10 74 3e 31 c0 eb 0f 0f b7 97 76 04 00 00 48 83 c0 01 39 c2 7e 2b 48 8b 97 68 04 00 00 <0f> b7 0c 42 48 8b 96 20 13 00 00 48 8d 94 8a 00 00 12 00 8b 12 83
[ 6420.714426] ice 0000:ca:00.0 ens7f0: Setting MAC 22:22:22:22:22:00 on VF 0. VF driver will be reinitialized
[ 6420.733120] RSP: 0018:ff778d2ff383fdd8 EFLAGS: 00010246
[ 6420.733123] RAX: 0000000000000000 RBX: ff2acf1916294000 RCX: 0000000000000000
[ 6420.733125] RDX: 0000000000000000 RSI: ff2acf1f2c6401a0 RDI: ff2acf1a27301828
[ 6420.762346] RBP: ff2acf1a27301828 R08: 0000000000000010 R09: 0000000000001000
[ 6420.769476] R10: ff2acf1916286000 R11: 00000000019eba3f R12: ff2acf19066460d0
[ 6420.776611] R13: ff2acf1f2c6401a0 R14: ff2acf1f2c6401a0 R15: 00000000ffffffff
[ 6420.783742] FS: 0000000000000000(0000) GS:ff2acf28ffa80000(0000) knlGS:0000000000000000
[ 6420.791829] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6420.797575] CR2: 0000000000000000 CR3: 00000016ad410003 CR4: 0000000000773ee0
[ 6420.804708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6420.811034] vfio-pci 0000:ca:01.0: enabling device (0000 -> 0002)
[ 6420.811840] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 6420.811841] PKRU: 55555554
[ 6420.811842] Call Trace:
[ 6420.811843] <TASK>
[ 6420.811844] ice_reset_vf+0x9a/0x450 [ice]
[ 6420.811876] ice_process_vflr_event+0x8f/0xc0 [ice]
[ 6420.841343] ice_service_task+0x23b/0x600 [ice]
[ 6420.845884] ? __schedule+0x212/0x550
[ 6420.849550] process_one_work+0x1e2/0x3b0
[ 6420.853563] ? rescuer_thread+0x390/0x390
[ 6420.857577] worker_thread+0x50/0x3a0
[ 6420.861242] ? rescuer_thread+0x390/0x390
[ 6420.865253] kthread+0xdd/0x100
[ 6420.868400] ? kthread_complete_and_exit+0x20/0x20
[ 6420.873194] ret_from_fork+0x1f/0x30
[ 6420.876774] </TASK>
[ 6420.878967] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iavf vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables bridge stp llc sctp ip6_udp_tunnel udp_tunnel nfp tls nfnetlink bluetooth mlx4_en mlx4_core rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp irdma kvm_intel i40e kvm iTCO_wdt dcdbas ib_uverbs irqbypass iTCO_vendor_support mgag200 mei_me ib_core dell_smbios isst_if_mmio isst_if_mbox_pci rapl i2c_algo_bit drm_shmem_helper intel_cstate drm_kms_helper syscopyarea sysfillrect isst_if_common sysimgblt intel_uncore fb_sys_fops dell_wmi_descriptor wmi_bmof intel_vsec mei i2c_i801 acpi_ipmi ipmi_si i2c_smbus ipmi_devintf intel_pch_thermal acpi_power_meter pcspk
r
Fixes: efe41860008e ("ice: Fix memory corruption in VF driver")
Fixes: f23df5220d2b ("ice: Fix spurious interrupt during removal of trusted VF")
Signed-off-by: Petr Oros <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
This reverts commit 7255355a0636b4eff08d5e8139c77d98f151c4fc.
After this commit we are not able to attach VF to VM:
virsh attach-interface v0 hostdev --managed 0000:41:01.0 --mac 52:52:52:52:52:52
error: Failed to attach interface
error: Cannot set interface MAC to 52:52:52:52:52:52 for ifname enp65s0f0np0 vf 0: Resource temporarily unavailable
ice_check_vf_ready_for_cfg() already contain waiting for reset.
New condition in ice_check_vf_ready_for_reset() causing only problems.
Fixes: 7255355a0636 ("ice: Fix ice VF reset during iavf initialization")
Signed-off-by: Petr Oros <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver is misconfiguring the hardware for some values of MTU such that
it could use multiple descriptors to receive a packet when it could have
simply used one.
Change the driver to use a round-up instead of the result of a shift, as
the shift can truncate the lower bits of the size, and result in the
problem noted above. It also aligns this driver with similar code in i40e.
The insidiousness of this problem is that everything works with the wrong
size, it's just not working as well as it could, as some MTU sizes end up
using two or more descriptors, and there is no way to tell that is
happening without looking at ice_trace or a bus analyzer.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-08-17 (ice)
This series contains updates to ice driver only.
Jan removes unused functions and refactors code to make, possible,
functions static.
Jake rearranges some functions to be logically grouped.
Marcin removes an unnecessary call to disable VLAN stripping.
Yang Yingliang utilizes list_for_each_entry() helper for a couple list
traversals.
Przemek removes some parameters from ice_aq_alloc_free_res() which were
always the same and reworks ice_aq_wait_for_event() to reduce chance of
race.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: split ice_aq_wait_for_event() func into two
ice: embed &ice_rq_event_info event into struct ice_aq_task
ice: ice_aq_check_events: fix off-by-one check when filling buffer
ice: drop two params from ice_aq_alloc_free_res()
ice: use list_for_each_entry() helper
ice: Remove redundant VSI configuration in eswitch setup
ice: move E810T functions to before device agnostic ones
ice: refactor ice_vsi_is_vlan_pruning_ena
ice: refactor ice_ptp_hw to make functions static
ice: refactor ice_sched to make functions static
ice: Utilize assign_bit() helper
ice: refactor ice_vf_lib to make functions static
ice: refactor ice_lib to make functions static
ice: refactor ice_ddp to make functions static
ice: remove unused methods
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
virtchnl: fix fake 1-elem arrays
Alexander Lobakin says:
6.5-rc1 started spitting warning splats when composing virtchnl
messages, precisely on virtchnl_rss_key and virtchnl_lut:
[ 84.167709] memcpy: detected field-spanning write (size 52) of single
field "vrk->key" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1095
(size 1)
[ 84.169915] WARNING: CPU: 3 PID: 11 at drivers/net/ethernet/intel/
iavf/iavf_virtchnl.c:1095 iavf_set_rss_key+0x123/0x140 [iavf]
...
[ 84.191982] Call Trace:
[ 84.192439] <TASK>
[ 84.192900] ? __warn+0xc9/0x1a0
[ 84.193353] ? iavf_set_rss_key+0x123/0x140 [iavf]
[ 84.193818] ? report_bug+0x12c/0x1b0
[ 84.194266] ? handle_bug+0x42/0x70
[ 84.194714] ? exc_invalid_op+0x1a/0x50
[ 84.195149] ? asm_exc_invalid_op+0x1a/0x20
[ 84.195592] ? iavf_set_rss_key+0x123/0x140 [iavf]
[ 84.196033] iavf_watchdog_task+0xb0c/0xe00 [iavf]
...
[ 84.225476] memcpy: detected field-spanning write (size 64) of single
field "vrl->lut" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1127
(size 1)
[ 84.227190] WARNING: CPU: 27 PID: 1044 at drivers/net/ethernet/intel/
iavf/iavf_virtchnl.c:1127 iavf_set_rss_lut+0x123/0x140 [iavf]
...
[ 84.246601] Call Trace:
[ 84.247228] <TASK>
[ 84.247840] ? __warn+0xc9/0x1a0
[ 84.248263] ? iavf_set_rss_lut+0x123/0x140 [iavf]
[ 84.248698] ? report_bug+0x12c/0x1b0
[ 84.249122] ? handle_bug+0x42/0x70
[ 84.249549] ? exc_invalid_op+0x1a/0x50
[ 84.249970] ? asm_exc_invalid_op+0x1a/0x20
[ 84.250390] ? iavf_set_rss_lut+0x123/0x140 [iavf]
[ 84.250820] iavf_watchdog_task+0xb16/0xe00 [iavf]
Gustavo already tried to fix those back in 2021[0][1]. Unfortunately,
a VM can run a different kernel than the host, meaning that those
structures are sorta ABI.
However, it is possible to have proper flex arrays + struct_size()
calculations and still send the very same messages with the same sizes.
The common rule is:
elem[1] -> elem[]
size = struct_size() + <difference between the old and the new msg size>
The "old" size in the current code is calculated 3 different ways for
10 virtchnl structures total. Each commit addresses one of the ways
cumulatively instead of per-structure.
I was planning to send it to -net initially, but given that virtchnl was
renamed from i40evf and got some fat style cleanup commits in the past,
it's not very straightforward to even pick appropriate SHAs, not
speaking of automatic portability. I may send manual backports for
a couple of the latest supported kernels later on if anyone needs it
at all.
[0] https://lore.kernel.org/all/20210525230912.GA175802@embeddedor
[1] https://lore.kernel.org/all/20210525231851.GA176647@embeddedor
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
virtchnl: fix fake 1-elem arrays for structures allocated as `nents`
virtchnl: fix fake 1-elem arrays in structures allocated as `nents + 1`
virtchnl: fix fake 1-elem arrays in structs allocated as `nents + 1` - 1
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/sfc/tc.c
fa165e194997 ("sfc: don't unregister flow_indr if it was never registered")
3bf969e88ada ("sfc: add MAE table machinery for conntrack table")
https://lore.kernel.org/all/[email protected]/
No adjacent changes.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-08-16 (iavf, i40e)
This series contains updates to iavf and i40e drivers.
Piotr adds checks for unsupported Flow Director rules on iavf.
Andrii replaces incorrect 'write' messaging on read operations for i40e.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
i40e: fix misleading debug logs
iavf: fix FDIR rule fields masks validation
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|