Age | Commit message (Collapse) | Author | Files | Lines |
|
When VSI set up failed in i40e_probe() as part of PF switch set up
driver was trying to free misc IRQ vectors in
i40e_clear_interrupt_scheme and produced a kernel Oops:
Trying to free already-free IRQ 266
WARNING: CPU: 0 PID: 5 at kernel/irq/manage.c:1731 __free_irq+0x9a/0x300
Workqueue: events work_for_cpu_fn
RIP: 0010:__free_irq+0x9a/0x300
Call Trace:
? synchronize_irq+0x3a/0xa0
free_irq+0x2e/0x60
i40e_clear_interrupt_scheme+0x53/0x190 [i40e]
i40e_probe.part.108+0x134b/0x1a40 [i40e]
? kmem_cache_alloc+0x158/0x1c0
? acpi_ut_update_ref_count.part.1+0x8e/0x345
? acpi_ut_update_object_reference+0x15e/0x1e2
? strstr+0x21/0x70
? irq_get_irq_data+0xa/0x20
? mp_check_pin_attr+0x13/0xc0
? irq_get_irq_data+0xa/0x20
? mp_map_pin_to_irq+0xd3/0x2f0
? acpi_register_gsi_ioapic+0x93/0x170
? pci_conf1_read+0xa4/0x100
? pci_bus_read_config_word+0x49/0x70
? do_pci_enable_device+0xcc/0x100
local_pci_probe+0x41/0x90
work_for_cpu_fn+0x16/0x20
process_one_work+0x1a7/0x360
worker_thread+0x1cf/0x390
? create_worker+0x1a0/0x1a0
kthread+0x112/0x130
? kthread_flush_work_fn+0x10/0x10
ret_from_fork+0x1f/0x40
The problem is that at that point misc IRQ vectors
were not allocated yet and we get a call trace
that driver is trying to free already free IRQ vectors.
Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED
PF state before calling i40e_free_misc_vector. This state is set only if
misc IRQ vectors were properly initialized.
Fixes: c17401a1dd21 ("i40e: use separate state bit for miscellaneous IRQ setup")
Reported-by: PJ Waskiewicz <[email protected]>
Signed-off-by: Sylwester Dziedziuch <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The loop in i40e_get_capabilities can never end. The problem is that
although i40e_aq_discover_capabilities returns with an error if there's
a firmware problem, the returned error is not checked. There is a check for
pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most
firmware problems.
When i40e_aq_discover_capabilities encounters a firmware problem, it will
encounter the same problem on its next invocation. As the result, the loop
becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking
at the code, it can happen with a range of other firmware errors.
I don't know what the correct behavior should be: whether the firmware
should be retried a few times, or whether pf->hw.aq.asq_last_status should
be always set to the encountered firmware error (but then it would be
pointless and can be just replaced by the i40e_aq_discover_capabilities
return value). However, the current behavior with an endless loop under the
rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we
explained the bug to them 7 months ago.
This may not be the best possible fix but it's better than hanging the whole
system on a firmware bug.
Fixes: 56a62fc86895 ("i40e: init code and hardware support")
Tested-by: Stefan Assmann <[email protected]>
Signed-off-by: Jiri Benc <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Convert all Ethernet drivers from memcpy(... dev->addr_len)
to eth_hw_addr_set():
@@
expression dev, np;
@@
- memcpy(dev->dev_addr, np, dev->addr_len)
+ eth_hw_addr_set(dev, np)
In theory addr_len may not be ETH_ALEN, but we don't expect
non-Ethernet devices to live under this directory, and only
the following cases of setting addr_len exist:
- cxgb4 for mgmt device,
and the drivers which set it to ETH_ALEN: s2io, mlx4, vxge.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
eth_hw_addr_set() takes a u8 pointer, like other
etherdevice helpers. Convert the few drivers which
require casts because they memcpy from "endian marked"
types.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Convert Ethernet from ether_addr_copy() to eth_hw_addr_set():
@@
expression dev, np;
@@
- ether_addr_copy(dev->dev_addr, np)
+ eth_hw_addr_set(dev, np)
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Daniel Borkmann says:
====================
bpf-next 2021-10-02
We've added 85 non-merge commits during the last 15 day(s) which contain
a total of 132 files changed, 13779 insertions(+), 6724 deletions(-).
The main changes are:
1) Massive update on test_bpf.ko coverage for JITs as preparatory work for
an upcoming MIPS eBPF JIT, from Johan Almbladh.
2) Add a batched interface for RX buffer allocation in AF_XDP buffer pool,
with driver support for i40e and ice from Magnus Karlsson.
3) Add legacy uprobe support to libbpf to complement recently merged legacy
kprobe support, from Andrii Nakryiko.
4) Add bpf_trace_vprintk() as variadic printk helper, from Dave Marchevsky.
5) Support saving the register state in verifier when spilling <8byte bounded
scalar to the stack, from Martin Lau.
6) Add libbpf opt-in for stricter BPF program section name handling as part
of libbpf 1.0 effort, from Andrii Nakryiko.
7) Add a document to help clarifying BPF licensing, from Alexei Starovoitov.
8) Fix skel_internal.h to propagate errno if the loader indicates an internal
error, from Kumar Kartikeya Dwivedi.
9) Fix build warnings with -Wcast-function-type so that the option can later
be enabled by default for the kernel, from Kees Cook.
10) Fix libbpf to ignore STT_SECTION symbols in legacy map definitions as it
otherwise errors out when encountering them, from Toke Høiland-Jørgensen.
11) Teach libbpf to recognize specialized maps (such as for perf RB) and
internally remove BTF type IDs when creating them, from Hengqi Chen.
12) Various fixes and improvements to BPF selftests.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
drivers/net/phy/bcm7xxx.c
d88fd1b546ff ("net: phy: bcm7xxx: Fixed indirect MMD operations")
f68d08c437f9 ("net: phy: bcm7xxx: Add EPHY entry for 72165")
net/sched/sch_api.c
b193e15ac69d ("net: prevent user from passing illegal stab size")
69508d43334e ("net_sched: Use struct_size() and flex_array_size() helpers")
Both cases trivial - adjacent code additions.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the
server is equipped with more than 64 cpus online. So it turns out that
the loading of xdpdrv causes the "NOMEM" failure.
Actually, we can adjust the algorithm and then make it work through
mapping the current cpu to some xdp ring with the protect of @tx_lock.
Here are some numbers before/after applying this patch with xdp-example
loaded on the eth0X:
As client (tx path):
Before After
TCP_STREAM send-64 734.14 714.20
TCP_STREAM send-128 1401.91 1395.05
TCP_STREAM send-512 5311.67 5292.84
TCP_STREAM send-1k 9277.40 9356.22 (not stable)
TCP_RR send-1 22559.75 21844.22
TCP_RR send-128 23169.54 22725.13
TCP_RR send-512 21670.91 21412.56
As server (rx path):
Before After
TCP_STREAM send-64 1416.49 1383.12
TCP_STREAM send-128 3141.49 3055.50
TCP_STREAM send-512 9488.73 9487.44
TCP_STREAM send-1k 9491.17 9356.22 (not stable)
TCP_RR send-1 23617.74 23601.60
...
Notice: the TCP_RR mode is unstable as the official document explains.
I tested many times with different parameters combined through netperf.
Though the result is not that accurate, I cannot see much influence on
this patch. The static key is places on the hot path, but it actually
shouldn't cause a huge regression theoretically.
Co-developed-by: Shujin Li <[email protected]>
Signed-off-by: Shujin Li <[email protected]>
Signed-off-by: Jason Xing <[email protected]>
Tested-by: Sandeep Penigalapati <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ixgbe driver currently generates a NULL pointer dereference with
some machine (online cpus < 63). This is due to the fact that the
maximum value of num_xdp_queues is nr_cpu_ids. Code is in
"ixgbe_set_rss_queues"".
Here's how the problem repeats itself:
Some machine (online cpus < 63), And user set num_queues to 63 through
ethtool. Code is in the "ixgbe_set_channels",
adapter->ring_feature[RING_F_FDIR].limit = count;
It becomes 63.
When user use xdp, "ixgbe_set_rss_queues" will set queues num.
adapter->num_rx_queues = rss_i;
adapter->num_tx_queues = rss_i;
adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
And rss_i's value is from
f = &adapter->ring_feature[RING_F_FDIR];
rss_i = f->indices = f->limit;
So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
for (i = 0; i < adapter->num_rx_queues; i++)
if (adapter->xdp_ring[i]->xsk_umem)
It leads to panic.
Call trace:
[exception RIP: ixgbe_xdp+368]
RIP: ffffffffc02a76a0 RSP: ffff9fe16202f8d0 RFLAGS: 00010297
RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000001c RDI: ffffffffa94ead90
RBP: ffff92f8f24c0c18 R8: 0000000000000000 R9: 0000000000000000
R10: ffff9fe16202f830 R11: 0000000000000000 R12: ffff92f8f24c0000
R13: ffff9fe16202fc01 R14: 000000000000000a R15: ffffffffc02a7530
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c
So I fix ixgbe_max_channels so that it will not allow a setting of queues
to be higher than the num_online_cpus(). And when run to ixgbe_xdp_setup,
take the smaller value of num_rx_queues and num_xdp_queues.
Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP")
Signed-off-by: Feng Zhou <[email protected]>
Tested-by: Sandeep Penigalapati <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
In this case this is not actually dynamic sizes: both sides of the
multiplication are constant values. However it is best to refactor this
anyway, just to keep the open-coded math idiom out of code.
So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.
[1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Signed-off-by: Len Baker <[email protected]>
Reviewed-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In IPv4 header, fragment flags indicate whether the packet needs
to be fragmented or not. The value 0x20 represents MF (More Fragment); fix
the macro name to match this.
Signed-off-by: Ting Xu <[email protected]>
Signed-off-by: Jeff Guo <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
After commit a8f89fa27773 ("ice: do not abort devlink info if board
identifier can't be found"), the getter/fallback() functions no longer
report an error. Convert the interface to a void so that it is no
longer possible to add a version field that is fatal. This makes
sense, because we should not fail to report other versions just
because one of the version pieces could not be found.
Finally, clean up the getter functions line wrapping so that none of
them take more than 80 columns, as is the usual style for networking
files.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The messaging for unsupported module detection is different for
lenient mode and strict mode. Update the code to print the right
messaging for a given link mode.
Media topology conflict is not an error in lenient mode, so return
an error code only if not in lenient mode.
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
DSCP a.k.a L3 QoS is only supported on certain devices. To enforce this,
this patch introduces a bitmap of features and helper functions.
The feature bitmap is set based on device IDs on driver init. Currently,
DSCP is the only feature in this bitmap, but there will be more in the
future. In the DCB netlink flow, check if the feature bit is set before
exercising DSCP.
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Implement code to handle submission of APP TLV's
containing DSCP to TC mapping.
The first such mapping received on an interface
will cause that PF to switch to L3 DSCP QoS mode,
apply the default config for that mode, and apply
the received mapping.
Only one such mapping will be allowed per DSCP value,
and when the last DSCP mapping is deleted, the PF
will switch back into L2 VLAN QoS mode, applying the
appropriate default QoS settings.
L3 DSCP QoS mode will only be allowed in SW DCBx
mode, in other words, when the FW LLDP engine is
disabled. Commands that break this mutual exclusivity
will be blocked.
Co-developed-by: Anirudh Venkataramanan <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Signed-off-by: Dave Ertman <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Use the new xsk batched rx allocation interface for the zero-copy data
path. As the array of struct xdp_buff pointers kept by the driver is
really a ring that wraps, the allocation routine is modified to detect
a wrap and in that case call the allocation function twice. The
allocation function cannot deal with wrapped rings, only arrays. As we
now know exactly how many buffers we get and that there is no
wrapping, the allocation function can be simplified even more as all
if-statements in the allocation loop can be removed, improving
performance.
Signed-off-by: Magnus Karlsson <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Use the new xsk batched rx allocation interface for the zero-copy data
path. As the array of struct xdp_buff pointers kept by the driver is
really a ring that wraps, the allocation routine is modified to detect
a wrap and in that case call the allocation function twice. The
allocation function cannot deal with wrapped rings, only arrays. As we
now know exactly how many buffers we get and that there is no
wrapping, the allocation function can be simplified even more as all
if-statements in the allocation loop can be removed, improving
performance.
Signed-off-by: Magnus Karlsson <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
In order to use the new xsk batched buffer allocation interface, a
pointer to an array of struct xsk_buff pointers need to be provided so
that the function can put the result of the allocation there. In the
ice driver, we already have a ring that stores pointers to
xdp_buffs. This is only used for the xsk zero-copy driver and is a
union with the structure that is used for the regular non zero-copy
path. Unfortunately, that structure is larger than the xdp_buffs
pointers which mean that there will be a stride (of 20 bytes) between
each xdp_buff pointer. And feeding this into the xsk_buff_alloc_batch
interface will not work since it assumes a regular array of xdp_buff
pointers (each 8 bytes with 0 bytes in-between them on a 64-bit
system).
To fix this, remove the xdp_buff pointer from the rx_buf union and
move it one step higher to the union above which only has pointers to
arrays in it. This solves the problem and we can directly feed the SW
ring of xdp_buff pointers straight into the allocation function in the
next patch when that interface is used. This will improve performance.
Signed-off-by: Magnus Karlsson <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The e100_get_regs function is used to implement a simple register dump
for the e100 device. The data is broken into a couple of MAC control
registers, and then a series of PHY registers, followed by a memory dump
buffer.
The total length of the register dump is defined as (1 + E100_PHY_REGS)
* sizeof(u32) + sizeof(nic->mem->dump_buf).
The logic for filling in the PHY registers uses a convoluted inverted
count for loop which counts from E100_PHY_REGS (0x1C) down to 0, and
assigns the slots 1 + E100_PHY_REGS - i. The first loop iteration will
fill in [1] and the final loop iteration will fill in [1 + 0x1C]. This
is actually one more than the supposed number of PHY registers.
The memory dump buffer is then filled into the space at
[2 + E100_PHY_REGS] which will cause that memcpy to assign 4 bytes past
the total size.
The end result is that we overrun the total buffer size allocated by the
kernel, which could lead to a panic or other issues due to memory
corruption.
It is difficult to determine the actual total number of registers
here. The only 8255x datasheet I could find indicates there are 28 total
MDI registers. However, we're reading 29 here, and reading them in
reverse!
In addition, the ethtool e100 register dump interface appears to read
the first PHY register to determine if the device is in MDI or MDIx
mode. This doesn't appear to be documented anywhere within the 8255x
datasheet. I can only assume it must be in register 28 (the extra
register we're reading here).
Lets not change any of the intended meaning of what we copy here. Just
extend the space by 4 bytes to account for the extra register and
continue copying the data out in the same order.
Change the E100_PHY_REGS value to be the correct total (29) so that the
total register dump size is calculated properly. Fix the offset for
where we copy the dump buffer so that it doesn't overrun the total size.
Re-write the for loop to use counting up instead of the convoluted
down-counting. Correct the mdio_read offset to use the 0-based register
offsets, but maintain the bizarre reverse ordering so that we have the
ABI expected by applications like ethtool. This requires and additional
subtraction of 1. It seems a bit odd but it makes the flow of assignment
into the register buffer easier to follow.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Felicitas Hetzelt <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
commit abf9b902059f ("e100: cleanup unneeded math") tried to simplify
e100_get_regs_len and remove a double 'divide and then multiply'
calculation that the e100_reg_regs_len function did.
This change broke the size calculation entirely as it failed to account
for the fact that the numbered registers are actually 4 bytes wide and
not 1 byte. This resulted in a significant under allocation of the
register buffer used by e100_get_regs.
Fix this by properly multiplying the register count by u32 first before
adding the size of the dump buffer.
Fixes: abf9b902059f ("e100: cleanup unneeded math")
Reported-by: Felicitas Hetzelt <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Move devlink_registration routine to be the last command, when the
device is fully initialized.
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
PF pointer is always valid when PCI core calls its .shutdown() and
.remove() callbacks. There is no need to check it again.
Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
net/mptcp/protocol.c
977d293e23b4 ("mptcp: ensure tx skbs always have the MPTCP ext")
efe686ffce01 ("mptcp: ensure tx skbs always have the MPTCP ext")
same patch merged in both trees, keep net-next.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
devlink_register() can't fail and always returns success, but all drivers
are obligated to check returned status anyway. This adds a lot of boilerplate
code to handle impossible flow.
Make devlink_register() void and simplify the drivers that use that
API call.
Signed-off-by: Leon Romanovsky <[email protected]>
Acked-by: Simon Horman <[email protected]>
Acked-by: Vladimir Oltean <[email protected]> # dsa
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When IGC=y and PTP_1588_CLOCK=m, the ptp_*() interface family is
not available to the igc driver. Make this driver depend on
PTP_1588_CLOCK_OPTIONAL so that it will build without errors.
Various igc commits have used ptp_*() functions without checking
that PTP_1588_CLOCK is enabled. Fix all of these here.
Fixes these build errors:
ld: drivers/net/ethernet/intel/igc/igc_main.o: in function `igc_msix_other':
igc_main.c:(.text+0x6494): undefined reference to `ptp_clock_event'
ld: igc_main.c:(.text+0x64ef): undefined reference to `ptp_clock_event'
ld: igc_main.c:(.text+0x6559): undefined reference to `ptp_clock_event'
ld: drivers/net/ethernet/intel/igc/igc_ethtool.o: in function `igc_ethtool_get_ts_info':
igc_ethtool.c:(.text+0xc7a): undefined reference to `ptp_clock_index'
ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_feature_enable_i225':
igc_ptp.c:(.text+0x330): undefined reference to `ptp_find_pin'
ld: igc_ptp.c:(.text+0x36f): undefined reference to `ptp_find_pin'
ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_init':
igc_ptp.c:(.text+0x11cd): undefined reference to `ptp_clock_register'
ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_stop':
igc_ptp.c:(.text+0x12dd): undefined reference to `ptp_clock_unregister'
ld: drivers/platform/x86/dell/dell-wmi-privacy.o: in function `dell_privacy_wmi_probe':
Fixes: 64433e5bf40ab ("igc: Enable internal i225 PPS")
Fixes: 60dbede0c4f3d ("igc: Add support for ethtool GET_TS_INFO command")
Fixes: 87938851b6efb ("igc: enable auxiliary PHC functions for the i225")
Fixes: 5f2958052c582 ("igc: Add basic skeleton for PTP")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Ederson de Souza <[email protected]>
Cc: Tony Nguyen <[email protected]>
Cc: Vinicius Costa Gomes <[email protected]>
Cc: Jeff Kirsher <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jesse Brandeburg <[email protected]>
Cc: [email protected]
Acked-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
After I turn on the CONFIG_LOCK_STAT=y, insmod e1000e.ko will report:
[ 5.641579] e1000e: Unknown symbol mutex_lock (err -2)
[ 90.775705] e1000e: Unknown symbol mutex_lock (err -2)
[ 132.252339] e1000e: Unknown symbol mutex_lock (err -2)
This problem fixed after include <linux/mutex.h>.
Signed-off-by: Hao Chen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Checking tunnel offloading, it turns out that offloading doesn't work
as expected. The following script allows to reproduce the issue.
Call it as `testscript DEVICE LOCALIP REMOTEIP NETMASK'
=== SNIP ===
if [ $# -ne 4 ]
then
echo "Usage $0 DEVICE LOCALIP REMOTEIP NETMASK"
exit 1
fi
DEVICE="$1"
LOCAL_ADDRESS="$2"
REMOTE_ADDRESS="$3"
NWMASK="$4"
echo "Driver: $(ethtool -i ${DEVICE} | awk '/^driver:/{print $2}') "
ethtool -k "${DEVICE}" | grep tx-udp
echo
echo "Set up NIC and tunnel..."
ip addr add "${LOCAL_ADDRESS}/${NWMASK}" dev "${DEVICE}"
ip link set "${DEVICE}" up
sleep 2
ip link add vxlan1 type vxlan id 42 \
remote "${REMOTE_ADDRESS}" \
local "${LOCAL_ADDRESS}" \
dstport 0 \
dev "${DEVICE}"
ip addr add fc00::1/64 dev vxlan1
ip link set vxlan1 up
sleep 2
rm -f vxlan.pcap
echo "Running tcpdump and iperf3..."
( nohup tcpdump -i any -w vxlan.pcap >/dev/null 2>&1 ) &
sleep 2
iperf3 -c fc00::2 >/dev/null
pkill tcpdump
echo
echo -n "Max. Paket Size: "
tcpdump -r vxlan.pcap -nnle 2>/dev/null \
| grep "${LOCAL_ADDRESS}.*> ${REMOTE_ADDRESS}.*OTV" \
| awk '{print $8}' | awk -F ':' '{print $1}' \
| sort -n | tail -1
echo
ip link del vxlan1
ip addr del ${LOCAL_ADDRESS}/${NWMASK} dev "${DEVICE}"
=== SNAP ===
The expected outcome is
Max. Paket Size: 64904
This is what you see on igb, the code igc has been taken from.
However, on igc the output is
Max. Paket Size: 1516
so the GSO aggregate packets are segmented by the kernel before calling
igc_xmit_frame. Inside the subsequent call to igc_tso, the check for
skb_is_gso(skb) fails and the function returns prematurely.
It turns out that this occurs because the feature flags aren't set
entirely correctly in igc_probe. In contrast to the original code
from igb_probe, igc_probe neglects to set the flags required to allow
tunnel offloading.
Setting the same flags as igb fixes the issue on igc.
Fixes: 34428dff3679 ("igc: Add GSO partial support")
Signed-off-by: Paolo Abeni <[email protected]>
Tested-by: Corinna Vinschen <[email protected]>
Acked-by: Sasha Neftin <[email protected]>
Tested-by: Nechama Kraus <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There are two cases where the current PF does not support RDMA
functionality. The first is if the NVM loaded on the device is set
to not support RDMA (common_caps.rdma is false). The second is if
the kernel bonding driver has included the current PF in an active
link aggregate.
When the driver has determined that this PF does not support RDMA, then
auxiliary devices should not be created on the auxiliary bus. Without
a device on the auxiliary bus, even if the irdma driver is present, there
will be no RDMA activity attempted on this PF.
Currently, in the reset flow, an attempt to create auxiliary devices is
performed without regard to the ability of the PF. There needs to be a
check in ice_aux_plug_dev (as the central point that creates auxiliary
devices) to see if the PF is in a state to support the functionality.
When disabling and re-enabling RDMA due to the inclusion/removal of the PF
in a link aggregate, we also need to set/clear the bit which controls
auxiliary device creation so that a reset recovery in a link aggregate
situation doesn't try to create auxiliary devices when it shouldn't.
Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
Reported-by: Yongxin Liu <[email protected]>
Signed-off-by: Dave Ertman <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
include/linux/netdevice.h
net/socket.c
d0efb16294d1 ("net: don't unconditionally copy_from_user a struct ifreq for socket ioctls")
876f0bf9d0d5 ("net: socket: simplify dev_ifconf handling")
29c4964822aa ("net: socket: rework compat_ifreq_ioctl()")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
commit 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync
list") introduced calls to netif_addr_lock_bh() and
netif_addr_unlock_bh() in the driver's ndo_set_mac() callback. This is
fine since the driver is updated the netdev's dev_addr, but since this
is a spinlock, the driver cannot sleep when the lock is held.
Unfortunately the functions to add/delete MAC filters depend on a mutex.
This was causing a trace with the lock debug kernel config options
enabled when changing the mac address via iproute.
[ 203.273059] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
[ 203.273065] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6698, name: ip
[ 203.273068] Preemption disabled at:
[ 203.273068] [<ffffffffc04aaeab>] ice_set_mac_address+0x8b/0x1c0 [ice]
[ 203.273097] CPU: 31 PID: 6698 Comm: ip Tainted: G S W I 5.14.0-rc4 #2
[ 203.273100] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ 203.273102] Call Trace:
[ 203.273107] dump_stack_lvl+0x33/0x42
[ 203.273113] ? ice_set_mac_address+0x8b/0x1c0 [ice]
[ 203.273124] ___might_sleep.cold.150+0xda/0xea
[ 203.273131] mutex_lock+0x1c/0x40
[ 203.273136] ice_remove_mac+0xe3/0x180 [ice]
[ 203.273155] ? ice_fltr_add_mac_list+0x20/0x20 [ice]
[ 203.273175] ice_fltr_prepare_mac+0x43/0xa0 [ice]
[ 203.273194] ice_set_mac_address+0xab/0x1c0 [ice]
[ 203.273206] dev_set_mac_address+0xb8/0x120
[ 203.273210] dev_set_mac_address_user+0x2c/0x50
[ 203.273212] do_setlink+0x1dd/0x10e0
[ 203.273217] ? __nla_validate_parse+0x12d/0x1a0
[ 203.273221] __rtnl_newlink+0x530/0x910
[ 203.273224] ? __kmalloc_node_track_caller+0x17f/0x380
[ 203.273230] ? preempt_count_add+0x68/0xa0
[ 203.273236] ? _raw_spin_lock_irqsave+0x1f/0x30
[ 203.273241] ? kmem_cache_alloc_trace+0x4d/0x440
[ 203.273244] rtnl_newlink+0x43/0x60
[ 203.273245] rtnetlink_rcv_msg+0x13a/0x380
[ 203.273248] ? rtnl_calcit.isra.40+0x130/0x130
[ 203.273250] netlink_rcv_skb+0x4e/0x100
[ 203.273256] netlink_unicast+0x1a2/0x280
[ 203.273258] netlink_sendmsg+0x242/0x490
[ 203.273260] sock_sendmsg+0x58/0x60
[ 203.273263] ____sys_sendmsg+0x1ef/0x260
[ 203.273265] ? copy_msghdr_from_user+0x5c/0x90
[ 203.273268] ? ____sys_recvmsg+0xe6/0x170
[ 203.273270] ___sys_sendmsg+0x7c/0xc0
[ 203.273272] ? copy_msghdr_from_user+0x5c/0x90
[ 203.273274] ? ___sys_recvmsg+0x89/0xc0
[ 203.273276] ? __netlink_sendskb+0x50/0x50
[ 203.273278] ? mod_objcg_state+0xee/0x310
[ 203.273282] ? __dentry_kill+0x114/0x170
[ 203.273286] ? get_max_files+0x10/0x10
[ 203.273288] __sys_sendmsg+0x57/0xa0
[ 203.273290] do_syscall_64+0x37/0x80
[ 203.273295] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 203.273296] RIP: 0033:0x7f8edf96e278
[ 203.273298] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 63 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
[ 203.273300] RSP: 002b:00007ffcb8bdac08 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 203.273303] RAX: ffffffffffffffda RBX: 000000006115e0ae RCX: 00007f8edf96e278
[ 203.273304] RDX: 0000000000000000 RSI: 00007ffcb8bdac70 RDI: 0000000000000003
[ 203.273305] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffcb8bda5b0
[ 203.273306] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[ 203.273306] R13: 0000555e10092020 R14: 0000000000000000 R15: 0000000000000005
Fix this by only locking when changing the netdev->dev_addr. Also, make
sure to restore the old netdev->dev_addr on any failures.
Fixes: 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync list")
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
When we enabled auxiliary input/output support for the E810 device, we
forgot to add logic to restart the output when we change time. This is
important as the periodic output will be incorrect after a time change
otherwise.
This unfortunately includes the adjust time function, even though it
uses an atomic hardware interface. The atomic adjustment can still cause
the pin output to stall permanently, so we need to stop and restart it.
Introduce wrapper functions to temporarily disable and then re-enable
the clock outputs.
Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Sunitha D Mekala <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Implement support for Credit-based shaper(CBS) Qdisc hardware
offload mode in the driver. There are two sets of IEEE802.1Qav
(CBS) HW logic in i225 controller and this patch supports
enabling them in the top two priority TX queues.
Driver implemented as recommended by Foxville External
Architecture Specification v0.993. Idleslope and Hi-credit are
the CBS tunable parameters for i225 NIC, programmed in TQAVCC
and TQAVHC registers respectively.
In-order for IEEE802.1Qav (CBS) algorithm to work as intended
and provide BW reservation CBS should be enabled in highest
priority queue first. If we enable CBS on any of low priority
queues, the traffic in high priority queue does not allow low
priority queue to be selected for transmission and bandwidth
reservation is not guaranteed.
Signed-off-by: Aravindhan Gunasekaran <[email protected]>
Signed-off-by: Mallikarjuna Chilakala <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Separates the procedure done during reset from applying a
configuration, knowing when the code is executing allow us to
separate the better what changes the hardware state from what
changes only the driver state.
Introduces a flag for bookkeeping the driver state of TSN
features. When Qav and frame-preemption is also implemented
this flag makes it easier to keep track on whether a TSN feature
driver state is enabled or not though controller state changes,
say, during a reset.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: Aravindhan Gunasekaran <[email protected]>
Signed-off-by: Mallikarjuna Chilakala <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Sets default values for each queue cycle start and cycle end.
This allows some simplification in the handling of these
configurations as most TSN features in i225 require a cycle
to be configured.
In i225, cycle start and end time is required to be programmed
for CBS to work properly.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: Aravindhan Gunasekaran <[email protected]>
Signed-off-by: Mallikarjuna Chilakala <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver didn't take the lock while flushing the Tx tracker, which
could cause a race where one thread is trying to read timestamps out
while another thread is trying to read the tracker to check the
timestamps.
Avoid this by ensuring that flushing is locked against read accesses.
Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
We have code in the ice driver which allocates the pin_config structure
if n_pins is > 0, but we never set n_pins to be greater than zero.
There's no reason to keep this code until we actually have pin_config
support. Remove this. We can re-add it properly when we implement
support for pin_config for E810-T devices.
Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver accidentally copied the ice_for_each_rxq iterator when
implementing enablement of the ptp_tx bit for the Tx rings. We still
load the Tx rings and set the ptp_tx field, but we iterate over the
count of the num_rxq.
If the number of Tx and Rx queues differ, this could either cause
a buffer overrun when accessing the tx_rings list if num_txq is greater
than num_rxq, or it could cause us to fail to enable Tx timestamps for
some rings.
This was not noticed originally as we generally have the same number of
Tx and Rx queues.
Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
i225 supports PCIe Precision Time Measurement (PTM), allowing us to
support the PTP_SYS_OFFSET_PRECISE ioctl() in the driver via the
getcrosststamp() function.
The easiest way to expose the PTM registers would be to configure the PTM
dialogs to run periodically, but the PTP_SYS_OFFSET_PRECISE ioctl()
semantics are more aligned to using a kind of "one-shot" way of retrieving
the PTM timestamps. But this causes a bit more code to be written: the
trigger registers for the PTM dialogs are not cleared automatically.
i225 can be configured to send "fake" packets with the PTM
information, adding support for handling these types of packets is
left for the future.
PTM improves the accuracy of time synchronization, for example, using
phc2sys, while a simple application is sending packets as fast as
possible. First, without .getcrosststamp():
phc2sys[191.382]: enp4s0 sys offset -959 s2 freq -454 delay 4492
phc2sys[191.482]: enp4s0 sys offset 798 s2 freq +1015 delay 4069
phc2sys[191.583]: enp4s0 sys offset 962 s2 freq +1418 delay 3849
phc2sys[191.683]: enp4s0 sys offset 924 s2 freq +1669 delay 3753
phc2sys[191.783]: enp4s0 sys offset 664 s2 freq +1686 delay 3349
phc2sys[191.883]: enp4s0 sys offset 218 s2 freq +1439 delay 2585
phc2sys[191.983]: enp4s0 sys offset 761 s2 freq +2048 delay 3750
phc2sys[192.083]: enp4s0 sys offset 756 s2 freq +2271 delay 4061
phc2sys[192.183]: enp4s0 sys offset 809 s2 freq +2551 delay 4384
phc2sys[192.283]: enp4s0 sys offset -108 s2 freq +1877 delay 2480
phc2sys[192.383]: enp4s0 sys offset -1145 s2 freq +807 delay 4438
phc2sys[192.484]: enp4s0 sys offset 571 s2 freq +2180 delay 3849
phc2sys[192.584]: enp4s0 sys offset 241 s2 freq +2021 delay 3389
phc2sys[192.684]: enp4s0 sys offset 405 s2 freq +2257 delay 3829
phc2sys[192.784]: enp4s0 sys offset 17 s2 freq +1991 delay 3273
phc2sys[192.884]: enp4s0 sys offset 152 s2 freq +2131 delay 3948
phc2sys[192.984]: enp4s0 sys offset -187 s2 freq +1837 delay 3162
phc2sys[193.084]: enp4s0 sys offset -1595 s2 freq +373 delay 4557
phc2sys[193.184]: enp4s0 sys offset 107 s2 freq +1597 delay 3740
phc2sys[193.284]: enp4s0 sys offset 199 s2 freq +1721 delay 4010
phc2sys[193.385]: enp4s0 sys offset -169 s2 freq +1413 delay 3701
phc2sys[193.485]: enp4s0 sys offset -47 s2 freq +1484 delay 3581
phc2sys[193.585]: enp4s0 sys offset -65 s2 freq +1452 delay 3778
phc2sys[193.685]: enp4s0 sys offset 95 s2 freq +1592 delay 3888
phc2sys[193.785]: enp4s0 sys offset 206 s2 freq +1732 delay 4445
phc2sys[193.885]: enp4s0 sys offset -652 s2 freq +936 delay 2521
phc2sys[193.985]: enp4s0 sys offset -203 s2 freq +1189 delay 3391
phc2sys[194.085]: enp4s0 sys offset -376 s2 freq +955 delay 2951
phc2sys[194.185]: enp4s0 sys offset -134 s2 freq +1084 delay 3330
phc2sys[194.285]: enp4s0 sys offset -22 s2 freq +1156 delay 3479
phc2sys[194.386]: enp4s0 sys offset 32 s2 freq +1204 delay 3602
phc2sys[194.486]: enp4s0 sys offset 122 s2 freq +1303 delay 3731
Statistics for this run (total of 2179 lines), in nanoseconds:
average: -1.12
stdev: 634.80
max: 1551
min: -2215
With .getcrosststamp() via PCIe PTM:
phc2sys[367.859]: enp4s0 sys offset 6 s2 freq +1727 delay 0
phc2sys[367.959]: enp4s0 sys offset -2 s2 freq +1721 delay 0
phc2sys[368.059]: enp4s0 sys offset 5 s2 freq +1727 delay 0
phc2sys[368.160]: enp4s0 sys offset -1 s2 freq +1723 delay 0
phc2sys[368.260]: enp4s0 sys offset -4 s2 freq +1719 delay 0
phc2sys[368.360]: enp4s0 sys offset -5 s2 freq +1717 delay 0
phc2sys[368.460]: enp4s0 sys offset 1 s2 freq +1722 delay 0
phc2sys[368.560]: enp4s0 sys offset -3 s2 freq +1718 delay 0
phc2sys[368.660]: enp4s0 sys offset 5 s2 freq +1725 delay 0
phc2sys[368.760]: enp4s0 sys offset -1 s2 freq +1721 delay 0
phc2sys[368.860]: enp4s0 sys offset 0 s2 freq +1721 delay 0
phc2sys[368.960]: enp4s0 sys offset 0 s2 freq +1721 delay 0
phc2sys[369.061]: enp4s0 sys offset 4 s2 freq +1725 delay 0
phc2sys[369.161]: enp4s0 sys offset 1 s2 freq +1724 delay 0
phc2sys[369.261]: enp4s0 sys offset 4 s2 freq +1727 delay 0
phc2sys[369.361]: enp4s0 sys offset 8 s2 freq +1732 delay 0
phc2sys[369.461]: enp4s0 sys offset 7 s2 freq +1733 delay 0
phc2sys[369.561]: enp4s0 sys offset 4 s2 freq +1733 delay 0
phc2sys[369.661]: enp4s0 sys offset 1 s2 freq +1731 delay 0
phc2sys[369.761]: enp4s0 sys offset 1 s2 freq +1731 delay 0
phc2sys[369.861]: enp4s0 sys offset -5 s2 freq +1725 delay 0
phc2sys[369.961]: enp4s0 sys offset -4 s2 freq +1725 delay 0
phc2sys[370.062]: enp4s0 sys offset 2 s2 freq +1730 delay 0
phc2sys[370.162]: enp4s0 sys offset -7 s2 freq +1721 delay 0
phc2sys[370.262]: enp4s0 sys offset -3 s2 freq +1723 delay 0
phc2sys[370.362]: enp4s0 sys offset 1 s2 freq +1726 delay 0
phc2sys[370.462]: enp4s0 sys offset -3 s2 freq +1723 delay 0
phc2sys[370.562]: enp4s0 sys offset -1 s2 freq +1724 delay 0
phc2sys[370.662]: enp4s0 sys offset -4 s2 freq +1720 delay 0
phc2sys[370.762]: enp4s0 sys offset -7 s2 freq +1716 delay 0
phc2sys[370.862]: enp4s0 sys offset -2 s2 freq +1719 delay 0
Statistics for this run (total of 2179 lines), in nanoseconds:
average: 0.14
stdev: 5.03
max: 48
min: -27
For reference, the statistics for runs without PCIe congestion show
that the improvements from enabling PTM are less dramatic. For two
runs of 16466 entries:
without PTM: avg -0.04 stdev 10.57 max 39 min -42
with PTM: avg 0.01 stdev 4.20 max 19 min -16
One possible explanation is that when PTM is not enabled, and there's a lot
of traffic in the PCIe fabric, some register reads will take more time
than the others because of congestion on the PCIe fabric.
When PTM is enabled, even if the PTM dialogs take more time to
complete under heavy traffic, the time measurements do not depend on
the time to read the registers.
This was implemented following the i225 EAS version 0.993.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Enables PCIe PTM (Precision Time Measurement) support in the igc
driver. Notifies the PCI devices that PCIe PTM should be enabled.
PCIe PTM is similar protocol to PTP (Precision Time Protocol) running
in the PCIe fabric, it allows devices to report time measurements from
their internal clocks and the correlation with the PCIe root clock.
The i225 NIC exposes some registers that expose those time
measurements, those registers will be used, in later patches, to
implement the PTP_SYS_OFFSET_PRECISE ioctl().
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In order to support more coalesce parameters through netlink,
add two new parameter kernel_coal and extack for .set_coalesce
and .get_coalesce, then some extra info can return to user with
the netlink API.
Signed-off-by: Yufeng Mo <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
On new platforms, the NVM is read-only. Attempting to update the NVM
is causing a lockup to occur. Do not attempt to write to the NVM
on platforms where it's not supported.
Emit an error message when the NVM checksum is invalid.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213667
Fixes: fb776f5d57ee ("e1000e: Add support for Tiger Lake")
Suggested-by: Dima Ruinskiy <[email protected]>
Suggested-by: Vitaly Lifshits <[email protected]>
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
We should decode the latency and the max_latency before directly compare.
The latency should be presented as lat_enc = scale x value:
lat_enc_d = (lat_enc & 0x0x3ff) x (1U << (5*((max_ltr_enc & 0x1c00)
>> 10)))
Fixes: cf8fb73c23aa ("e1000e: add support for LTR on I217/I218")
Suggested-by: Yee Li <[email protected]>
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Use num_tx_queues rather than the IGC_MAX_TX_QUEUES fixed number 4 when
iterating over tx_ring queue since instantiated queue count could be
less than 4 where on-line cpu count is less than 4.
Fixes: ec50a9d437f0 ("igc: Add support for taprio offloading")
Signed-off-by: Toshiki Nishioka <[email protected]>
Signed-off-by: Muhammad Husaini Zulkifli <[email protected]>
Tested-by: Muhammad Husaini Zulkifli <[email protected]>
Acked-by: Sasha Neftin <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
After unplug thunderbolt dock with i225, pciehp interrupt is triggered,
remove call will read/write mmio address which is already disconnected,
then cause page fault and make system hang.
Check PCI state to remove device safely.
Trace:
BUG: unable to handle page fault for address: 000000000000b604
Oops: 0000 [#1] SMP NOPTI
RIP: 0010:igc_rd32+0x1c/0x90 [igc]
Call Trace:
igc_ptp_suspend+0x6c/0xa0 [igc]
igc_ptp_stop+0x12/0x50 [igc]
igc_remove+0x7f/0x1c0 [igc]
pci_device_remove+0x3e/0xb0
__device_release_driver+0x181/0x240
Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings")
Fixes: b03c49cde61f ("igc: Save PTP time before a reset")
Signed-off-by: Aaron Ma <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The devlink dev info command reports version information about the
device and firmware running on the board. This includes the "board.id"
field which is supposed to represent an identifier of the board design.
The ice driver uses the Product Board Assembly identifier for this.
In some cases, the PBA is not present in the NVM. If this happens,
devlink dev info will fail with an error. Instead, modify the
ice_info_pba function to just exit without filling in the context
buffer. This will cause the board.id field to be skipped. Log a dev_dbg
message in case someone wants to confirm why board.id is not showing up
for them.
Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
drivers/ptp/Kconfig:
55c8fca1dae1 ("ptp_pch: Restore dependency on PCI")
e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Make changes to MAC address dependent on the response of PF.
Disallow changes to HW MAC address and MAC filter from untrusted
VF, thanks to that ping is not lost if VF tries to change MAC.
Add a new field in iavf_mac_filter, to indicate whether there
was response from PF for given filter. Based on this field pass
or discard the filter.
If untrusted VF tried to change it's address, it's not changed.
Still filter was changed, because of that ping couldn't go through.
Fixes: c5c922b3e09b ("iavf: fix MAC address setting for VFs when filter is rejected")
Signed-off-by: Przemyslaw Patynowski <[email protected]>
Signed-off-by: Sylwester Dziedziuch <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Without this patch, ATR does not work. Receive/transmit uses queue
selection based on SW DCB hashing method.
If traffic classes are not configured for PF, then use
netdev_pick_tx function for selecting queue for packet transmission.
Instead of calling i40e_swdcb_skb_tx_hash, call netdev_pick_tx,
which ensures that packet is transmitted/received from CPU that is
running the application.
Reproduction steps:
1. Load i40e driver
2. Map each MSI interrupt of i40e port for each CPU
3. Disable ntuple, enable ATR i.e.:
ethtool -K $interface ntuple off
ethtool --set-priv-flags $interface flow-director-atr
4. Run application that is generating traffic and is bound to a
single CPU, i.e.:
taskset -c 9 netperf -H 1.1.1.1 -t TCP_RR -l 10
5. Observe behavior:
Application's traffic should be restricted to the CPU provided in
taskset.
Fixes: 89ec1f0886c1 ("i40e: Fix queue-to-TC mapping on Tx")
Signed-off-by: Przemyslaw Patynowski <[email protected]>
Signed-off-by: Arkadiusz Kubalewski <[email protected]>
Tested-by: Dave Switzer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In ixgbe_xsk_pool_enable(), if ixgbe_xsk_wakeup() fails,
We should restore the previous state and clean up the
resources. Add the missing clear af_xdp_zc_qps and unmap dma
to fix this bug.
Fixes: d49e286d354e ("ixgbe: add tracking of AF_XDP zero-copy state for each queue pair")
Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP")
Signed-off-by: Wang Hai <[email protected]>
Acked-by: Magnus Karlsson <[email protected]>
Tested-by: Sandeep Penigalapati <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|