blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2021-10-06	i40e: Fix freeing of uninitialized misc IRQ vector	Sylwester Dziedziuch	1	-1/+2
	When VSI set up failed in i40e_probe() as part of PF switch set up driver was trying to free misc IRQ vectors in i40e_clear_interrupt_scheme and produced a kernel Oops: Trying to free already-free IRQ 266 WARNING: CPU: 0 PID: 5 at kernel/irq/manage.c:1731 __free_irq+0x9a/0x300 Workqueue: events work_for_cpu_fn RIP: 0010:__free_irq+0x9a/0x300 Call Trace: ? synchronize_irq+0x3a/0xa0 free_irq+0x2e/0x60 i40e_clear_interrupt_scheme+0x53/0x190 [i40e] i40e_probe.part.108+0x134b/0x1a40 [i40e] ? kmem_cache_alloc+0x158/0x1c0 ? acpi_ut_update_ref_count.part.1+0x8e/0x345 ? acpi_ut_update_object_reference+0x15e/0x1e2 ? strstr+0x21/0x70 ? irq_get_irq_data+0xa/0x20 ? mp_check_pin_attr+0x13/0xc0 ? irq_get_irq_data+0xa/0x20 ? mp_map_pin_to_irq+0xd3/0x2f0 ? acpi_register_gsi_ioapic+0x93/0x170 ? pci_conf1_read+0xa4/0x100 ? pci_bus_read_config_word+0x49/0x70 ? do_pci_enable_device+0xcc/0x100 local_pci_probe+0x41/0x90 work_for_cpu_fn+0x16/0x20 process_one_work+0x1a7/0x360 worker_thread+0x1cf/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x112/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x1f/0x40 The problem is that at that point misc IRQ vectors were not allocated yet and we get a call trace that driver is trying to free already free IRQ vectors. Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED PF state before calling i40e_free_misc_vector. This state is set only if misc IRQ vectors were properly initialized. Fixes: c17401a1dd21 ("i40e: use separate state bit for miscellaneous IRQ setup") Reported-by: PJ Waskiewicz <[email protected]> Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Dave Switzer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-10-06	i40e: fix endless loop under rtnl	Jiri Benc	1	-1/+1
	The loop in i40e_get_capabilities can never end. The problem is that although i40e_aq_discover_capabilities returns with an error if there's a firmware problem, the returned error is not checked. There is a check for pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most firmware problems. When i40e_aq_discover_capabilities encounters a firmware problem, it will encounter the same problem on its next invocation. As the result, the loop becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking at the code, it can happen with a range of other firmware errors. I don't know what the correct behavior should be: whether the firmware should be retried a few times, or whether pf->hw.aq.asq_last_status should be always set to the encountered firmware error (but then it would be pointless and can be just replaced by the i40e_aq_discover_capabilities return value). However, the current behavior with an endless loop under the rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we explained the bug to them 7 months ago. This may not be the best possible fix but it's better than hanging the whole system on a firmware bug. Fixes: 56a62fc86895 ("i40e: init code and hardware support") Tested-by: Stefan Assmann <[email protected]> Signed-off-by: Jiri Benc <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Tested-by: Dave Switzer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-10-05	ethernet: use eth_hw_addr_set() for dev->addr_len cases	Jakub Kicinski	9	-18/+16
	Convert all Ethernet drivers from memcpy(... dev->addr_len) to eth_hw_addr_set(): @@ expression dev, np; @@ - memcpy(dev->dev_addr, np, dev->addr_len) + eth_hw_addr_set(dev, np) In theory addr_len may not be ETH_ALEN, but we don't expect non-Ethernet devices to live under this directory, and only the following cases of setting addr_len exist: - cxgb4 for mgmt device, and the drivers which set it to ETH_ALEN: s2io, mlx4, vxge. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-10-02	ethernet: use eth_hw_addr_set() - casts	Jakub Kicinski	1	-1/+1
	eth_hw_addr_set() takes a u8 pointer, like other etherdevice helpers. Convert the few drivers which require casts because they memcpy from "endian marked" types. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-10-02	ethernet: use eth_hw_addr_set() instead of ether_addr_copy()	Jakub Kicinski	7	-13/+13
	Convert Ethernet from ether_addr_copy() to eth_hw_addr_set(): @@ expression dev, np; @@ - ether_addr_copy(dev->dev_addr, np) + eth_hw_addr_set(dev, np) Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-10-01	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	Jakub Kicinski	3	-87/+73
	Daniel Borkmann says: ==================== bpf-next 2021-10-02 We've added 85 non-merge commits during the last 15 day(s) which contain a total of 132 files changed, 13779 insertions(+), 6724 deletions(-). The main changes are: 1) Massive update on test_bpf.ko coverage for JITs as preparatory work for an upcoming MIPS eBPF JIT, from Johan Almbladh. 2) Add a batched interface for RX buffer allocation in AF_XDP buffer pool, with driver support for i40e and ice from Magnus Karlsson. 3) Add legacy uprobe support to libbpf to complement recently merged legacy kprobe support, from Andrii Nakryiko. 4) Add bpf_trace_vprintk() as variadic printk helper, from Dave Marchevsky. 5) Support saving the register state in verifier when spilling <8byte bounded scalar to the stack, from Martin Lau. 6) Add libbpf opt-in for stricter BPF program section name handling as part of libbpf 1.0 effort, from Andrii Nakryiko. 7) Add a document to help clarifying BPF licensing, from Alexei Starovoitov. 8) Fix skel_internal.h to propagate errno if the loader indicates an internal error, from Kumar Kartikeya Dwivedi. 9) Fix build warnings with -Wcast-function-type so that the option can later be enabled by default for the kernel, from Kees Cook. 10) Fix libbpf to ignore STT_SECTION symbols in legacy map definitions as it otherwise errors out when encountering them, from Toke Høiland-Jørgensen. 11) Teach libbpf to recognize specialized maps (such as for perf RB) and internally remove BTF type IDs when creating them, from Hengqi Chen. 12) Various fixes and improvements to BPF selftests. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-09-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	3	-10/+22
	drivers/net/phy/bcm7xxx.c d88fd1b546ff ("net: phy: bcm7xxx: Fixed indirect MMD operations") f68d08c437f9 ("net: phy: bcm7xxx: Add EPHY entry for 72165") net/sched/sch_api.c b193e15ac69d ("net: prevent user from passing illegal stab size") 69508d43334e ("net_sched: Use struct_size() and flex_array_size() helpers") Both cases trivial - adjacent code additions. Signed-off-by: Jakub Kicinski <[email protected]>
2021-09-30	ixgbe: let the xdpdrv work with more than 64 cpus	Jason Xing	5	-24/+77
	Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the server is equipped with more than 64 cpus online. So it turns out that the loading of xdpdrv causes the "NOMEM" failure. Actually, we can adjust the algorithm and then make it work through mapping the current cpu to some xdp ring with the protect of @tx_lock. Here are some numbers before/after applying this patch with xdp-example loaded on the eth0X: As client (tx path): Before After TCP_STREAM send-64 734.14 714.20 TCP_STREAM send-128 1401.91 1395.05 TCP_STREAM send-512 5311.67 5292.84 TCP_STREAM send-1k 9277.40 9356.22 (not stable) TCP_RR send-1 22559.75 21844.22 TCP_RR send-128 23169.54 22725.13 TCP_RR send-512 21670.91 21412.56 As server (rx path): Before After TCP_STREAM send-64 1416.49 1383.12 TCP_STREAM send-128 3141.49 3055.50 TCP_STREAM send-512 9488.73 9487.44 TCP_STREAM send-1k 9491.17 9356.22 (not stable) TCP_RR send-1 23617.74 23601.60 ... Notice: the TCP_RR mode is unstable as the official document explains. I tested many times with different parameters combined through netperf. Though the result is not that accurate, I cannot see much influence on this patch. The static key is places on the hot path, but it actually shouldn't cause a huge regression theoretically. Co-developed-by: Shujin Li <[email protected]> Signed-off-by: Shujin Li <[email protected]> Signed-off-by: Jason Xing <[email protected]> Tested-by: Sandeep Penigalapati <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-29	ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup	Feng Zhou	2	-3/+7
	The ixgbe driver currently generates a NULL pointer dereference with some machine (online cpus < 63). This is due to the fact that the maximum value of num_xdp_queues is nr_cpu_ids. Code is in "ixgbe_set_rss_queues"". Here's how the problem repeats itself: Some machine (online cpus < 63), And user set num_queues to 63 through ethtool. Code is in the "ixgbe_set_channels", adapter->ring_feature[RING_F_FDIR].limit = count; It becomes 63. When user use xdp, "ixgbe_set_rss_queues" will set queues num. adapter->num_rx_queues = rss_i; adapter->num_tx_queues = rss_i; adapter->num_xdp_queues = ixgbe_xdp_queues(adapter); And rss_i's value is from f = &adapter->ring_feature[RING_F_FDIR]; rss_i = f->indices = f->limit; So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup", for (i = 0; i < adapter->num_rx_queues; i++) if (adapter->xdp_ring[i]->xsk_umem) It leads to panic. Call trace: [exception RIP: ixgbe_xdp+368] RIP: ffffffffc02a76a0 RSP: ffff9fe16202f8d0 RFLAGS: 00010297 RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 000000000000001c RDI: ffffffffa94ead90 RBP: ffff92f8f24c0c18 R8: 0000000000000000 R9: 0000000000000000 R10: ffff9fe16202f830 R11: 0000000000000000 R12: ffff92f8f24c0000 R13: ffff9fe16202fc01 R14: 000000000000000a R15: ffffffffc02a7530 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc 8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808 9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235 10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384 11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd 12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb 13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88 14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319 15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290 16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8 17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64 18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9 19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c So I fix ixgbe_max_channels so that it will not allow a setting of queues to be higher than the num_online_cpus(). And when run to ixgbe_xdp_setup, take the smaller value of num_rx_queues and num_xdp_queues. Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP") Signed-off-by: Feng Zhou <[email protected]> Tested-by: Sandeep Penigalapati <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-28	ice: Prefer kcalloc over open coded arithmetic	Len Baker	1	-1/+1
	As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. In this case this is not actually dynamic sizes: both sides of the multiplication are constant values. However it is best to refactor this anyway, just to keep the open-coded math idiom out of code. So, use the purpose specific kcalloc() function instead of the argument size * count in the kzalloc() function. [1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Signed-off-by: Len Baker <[email protected]> Reviewed-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-28	ice: Fix macro name for IPv4 fragment flag	Jeff Guo	2	-2/+2
	In IPv4 header, fragment flags indicate whether the packet needs to be fragmented or not. The value 0x20 represents MF (More Fragment); fix the macro name to match this. Signed-off-by: Ting Xu <[email protected]> Signed-off-by: Jeff Guo <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-28	ice: refactor devlink getter/fallback functions to void	Jacob Keller	1	-83/+50
	After commit a8f89fa27773 ("ice: do not abort devlink info if board identifier can't be found"), the getter/fallback() functions no longer report an error. Convert the interface to a void so that it is no longer possible to add a version field that is fatal. This makes sense, because we should not fail to report other versions just because one of the version pieces could not be found. Finally, clean up the getter functions line wrapping so that none of them take more than 80 columns, as is the usual style for networking files. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-28	ice: Fix link mode handling	Anirudh Venkataramanan	1	-2/+6
	The messaging for unsupported module detection is different for lenient mode and strict mode. Update the code to print the right messaging for a given link mode. Media topology conflict is not an error in lenient mode, so return an error code only if not in lenient mode. Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-28	ice: Add feature bitmap, helpers and a check for DSCP	Anirudh Venkataramanan	5	-2/+63
	DSCP a.k.a L3 QoS is only supported on certain devices. To enforce this, this patch introduces a bitmap of features and helper functions. The feature bitmap is set based on device IDs on driver init. Currently, DSCP is the only feature in this bitmap, but there will be more in the future. In the DCB netlink flow, check if the feature bit is set before exercising DSCP. Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-28	ice: Add DSCP support	Dave Ertman	11	-56/+464
	Implement code to handle submission of APP TLV's containing DSCP to TC mapping. The first such mapping received on an interface will cause that PF to switch to L3 DSCP QoS mode, apply the default config for that mode, and apply the received mapping. Only one such mapping will be allowed per DSCP value, and when the last DSCP mapping is deleted, the PF will switch back into L2 VLAN QoS mode, applying the appropriate default QoS settings. L3 DSCP QoS mode will only be allowed in SW DCBx mode, in other words, when the FW LLDP engine is disabled. Commands that break this mutual exclusivity will be blocked. Co-developed-by: Anirudh Venkataramanan <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-28	i40e: Use the xsk batched rx allocation interface	Magnus Karlsson	1	-27/+25
	Use the new xsk batched rx allocation interface for the zero-copy data path. As the array of struct xdp_buff pointers kept by the driver is really a ring that wraps, the allocation routine is modified to detect a wrap and in that case call the allocation function twice. The allocation function cannot deal with wrapped rings, only arrays. As we now know exactly how many buffers we get and that there is no wrapping, the allocation function can be simplified even more as all if-statements in the allocation loop can be removed, improving performance. Signed-off-by: Magnus Karlsson <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-09-28	ice: Use the xsk batched rx allocation interface	Magnus Karlsson	1	-25/+19
	Use the new xsk batched rx allocation interface for the zero-copy data path. As the array of struct xdp_buff pointers kept by the driver is really a ring that wraps, the allocation routine is modified to detect a wrap and in that case call the allocation function twice. The allocation function cannot deal with wrapped rings, only arrays. As we now know exactly how many buffers we get and that there is no wrapping, the allocation function can be simplified even more as all if-statements in the allocation loop can be removed, improving performance. Signed-off-by: Magnus Karlsson <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-09-28	ice: Use xdp_buf instead of rx_buf for xsk zero-copy	Magnus Karlsson	2	-39/+33
	In order to use the new xsk batched buffer allocation interface, a pointer to an array of struct xsk_buff pointers need to be provided so that the function can put the result of the allocation there. In the ice driver, we already have a ring that stores pointers to xdp_buffs. This is only used for the xsk zero-copy driver and is a union with the structure that is used for the regular non zero-copy path. Unfortunately, that structure is larger than the xdp_buffs pointers which mean that there will be a stride (of 20 bytes) between each xdp_buff pointer. And feeding this into the xsk_buff_alloc_batch interface will not work since it assumes a regular array of xdp_buff pointers (each 8 bytes with 0 bytes in-between them on a 64-bit system). To fix this, remove the xdp_buff pointer from the rx_buf union and move it one step higher to the union above which only has pointers to arrays in it. This solves the problem and we can directly feed the SW ring of xdp_buff pointers straight into the allocation function in the next patch when that interface is used. This will improve performance. Signed-off-by: Magnus Karlsson <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-09-27	e100: fix buffer overrun in e100_get_regs	Jacob Keller	1	-6/+10
	The e100_get_regs function is used to implement a simple register dump for the e100 device. The data is broken into a couple of MAC control registers, and then a series of PHY registers, followed by a memory dump buffer. The total length of the register dump is defined as (1 + E100_PHY_REGS) * sizeof(u32) + sizeof(nic->mem->dump_buf). The logic for filling in the PHY registers uses a convoluted inverted count for loop which counts from E100_PHY_REGS (0x1C) down to 0, and assigns the slots 1 + E100_PHY_REGS - i. The first loop iteration will fill in [1] and the final loop iteration will fill in [1 + 0x1C]. This is actually one more than the supposed number of PHY registers. The memory dump buffer is then filled into the space at [2 + E100_PHY_REGS] which will cause that memcpy to assign 4 bytes past the total size. The end result is that we overrun the total buffer size allocated by the kernel, which could lead to a panic or other issues due to memory corruption. It is difficult to determine the actual total number of registers here. The only 8255x datasheet I could find indicates there are 28 total MDI registers. However, we're reading 29 here, and reading them in reverse! In addition, the ethtool e100 register dump interface appears to read the first PHY register to determine if the device is in MDI or MDIx mode. This doesn't appear to be documented anywhere within the 8255x datasheet. I can only assume it must be in register 28 (the extra register we're reading here). Lets not change any of the intended meaning of what we copy here. Just extend the space by 4 bytes to account for the extra register and continue copying the data out in the same order. Change the E100_PHY_REGS value to be the correct total (29) so that the total register dump size is calculated properly. Fix the offset for where we copy the dump buffer so that it doesn't overrun the total size. Re-write the for loop to use counting up instead of the convoluted down-counting. Correct the mdio_read offset to use the 0-based register offsets, but maintain the bizarre reverse ordering so that we have the ABI expected by applications like ethtool. This requires and additional subtraction of 1. It seems a bit odd but it makes the flow of assignment into the register buffer easier to follow. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Felicitas Hetzelt <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Tested-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-27	e100: fix length calculation in e100_get_regs_len	Jacob Keller	1	-1/+5
	commit abf9b902059f ("e100: cleanup unneeded math") tried to simplify e100_get_regs_len and remove a double 'divide and then multiply' calculation that the e100_reg_regs_len function did. This change broke the size calculation entirely as it failed to account for the fact that the numbered registers are actually 4 bytes wide and not 1 byte. This resulted in a significant under allocation of the register buffer used by e100_get_regs. Fix this by properly multiplying the register count by u32 first before adding the size of the dump buffer. Fixes: abf9b902059f ("e100: cleanup unneeded math") Reported-by: Felicitas Hetzelt <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-09-27	ice: Open devlink when device is ready	Leon Romanovsky	1	-4/+2
	Move devlink_registration routine to be the last command, when the device is fully initialized. Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-24	ice: Delete always true check of PF pointer	Leon Romanovsky	1	-3/+0
	PF pointer is always valid when PCI core calls its .shutdown() and .remove() callbacks. There is no need to check it again. Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series") Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	1	-0/+1
	net/mptcp/protocol.c 977d293e23b4 ("mptcp: ensure tx skbs always have the MPTCP ext") efe686ffce01 ("mptcp: ensure tx skbs always have the MPTCP ext") same patch merged in both trees, keep net-next. Signed-off-by: Jakub Kicinski <[email protected]>
2021-09-22	devlink: Make devlink_register to be void	Leon Romanovsky	3	-16/+4
	devlink_register() can't fail and always returns success, but all drivers are obligated to check returned status anyway. This adds a lot of boilerplate code to handle impossible flow. Make devlink_register() void and simplify the drivers that use that API call. Signed-off-by: Leon Romanovsky <[email protected]> Acked-by: Simon Horman <[email protected]> Acked-by: Vladimir Oltean <[email protected]> # dsa Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-19	igc: fix build errors for PTP	Randy Dunlap	1	-0/+1
	When IGC=y and PTP_1588_CLOCK=m, the ptp_() interface family is not available to the igc driver. Make this driver depend on PTP_1588_CLOCK_OPTIONAL so that it will build without errors. Various igc commits have used ptp_() functions without checking that PTP_1588_CLOCK is enabled. Fix all of these here. Fixes these build errors: ld: drivers/net/ethernet/intel/igc/igc_main.o: in function `igc_msix_other': igc_main.c:(.text+0x6494): undefined reference to `ptp_clock_event' ld: igc_main.c:(.text+0x64ef): undefined reference to `ptp_clock_event' ld: igc_main.c:(.text+0x6559): undefined reference to `ptp_clock_event' ld: drivers/net/ethernet/intel/igc/igc_ethtool.o: in function `igc_ethtool_get_ts_info': igc_ethtool.c:(.text+0xc7a): undefined reference to `ptp_clock_index' ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_feature_enable_i225': igc_ptp.c:(.text+0x330): undefined reference to `ptp_find_pin' ld: igc_ptp.c:(.text+0x36f): undefined reference to `ptp_find_pin' ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_init': igc_ptp.c:(.text+0x11cd): undefined reference to `ptp_clock_register' ld: drivers/net/ethernet/intel/igc/igc_ptp.o: in function `igc_ptp_stop': igc_ptp.c:(.text+0x12dd): undefined reference to `ptp_clock_unregister' ld: drivers/platform/x86/dell/dell-wmi-privacy.o: in function `dell_privacy_wmi_probe': Fixes: 64433e5bf40ab ("igc: Enable internal i225 PPS") Fixes: 60dbede0c4f3d ("igc: Add support for ethtool GET_TS_INFO command") Fixes: 87938851b6efb ("igc: enable auxiliary PHC functions for the i225") Fixes: 5f2958052c582 ("igc: Add basic skeleton for PTP") Signed-off-by: Randy Dunlap <[email protected]> Cc: Ederson de Souza <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: Vinicius Costa Gomes <[email protected]> Cc: Jeff Kirsher <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Jesse Brandeburg <[email protected]> Cc: [email protected] Acked-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-17	net: e1000e: solve insmod 'Unknown symbol mutex_lock' error	Hao Chen	1	-0/+1
	After I turn on the CONFIG_LOCK_STAT=y, insmod e1000e.ko will report: [ 5.641579] e1000e: Unknown symbol mutex_lock (err -2) [ 90.775705] e1000e: Unknown symbol mutex_lock (err -2) [ 132.252339] e1000e: Unknown symbol mutex_lock (err -2) This problem fixed after include <linux/mutex.h>. Signed-off-by: Hao Chen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-16	igc: fix tunnel offloading	Paolo Abeni	1	-1/+3
	Checking tunnel offloading, it turns out that offloading doesn't work as expected. The following script allows to reproduce the issue. Call it as `testscript DEVICE LOCALIP REMOTEIP NETMASK' === SNIP === if [ $# -ne 4 ] then echo "Usage $0 DEVICE LOCALIP REMOTEIP NETMASK" exit 1 fi DEVICE="$1" LOCAL_ADDRESS="$2" REMOTE_ADDRESS="$3" NWMASK="$4" echo "Driver: $(ethtool -i ${DEVICE} \| awk '/^driver:/{print $2}') " ethtool -k "${DEVICE}" \| grep tx-udp echo echo "Set up NIC and tunnel..." ip addr add "${LOCAL_ADDRESS}/${NWMASK}" dev "${DEVICE}" ip link set "${DEVICE}" up sleep 2 ip link add vxlan1 type vxlan id 42 \ remote "${REMOTE_ADDRESS}" \ local "${LOCAL_ADDRESS}" \ dstport 0 \ dev "${DEVICE}" ip addr add fc00::1/64 dev vxlan1 ip link set vxlan1 up sleep 2 rm -f vxlan.pcap echo "Running tcpdump and iperf3..." ( nohup tcpdump -i any -w vxlan.pcap >/dev/null 2>&1 ) & sleep 2 iperf3 -c fc00::2 >/dev/null pkill tcpdump echo echo -n "Max. Paket Size: " tcpdump -r vxlan.pcap -nnle 2>/dev/null \ \| grep "${LOCAL_ADDRESS}.> ${REMOTE_ADDRESS}.OTV" \ \| awk '{print $8}' \| awk -F ':' '{print $1}' \ \| sort -n \| tail -1 echo ip link del vxlan1 ip addr del ${LOCAL_ADDRESS}/${NWMASK} dev "${DEVICE}" === SNAP === The expected outcome is Max. Paket Size: 64904 This is what you see on igb, the code igc has been taken from. However, on igc the output is Max. Paket Size: 1516 so the GSO aggregate packets are segmented by the kernel before calling igc_xmit_frame. Inside the subsequent call to igc_tso, the check for skb_is_gso(skb) fails and the function returns prematurely. It turns out that this occurs because the feature flags aren't set entirely correctly in igc_probe. In contrast to the original code from igb_probe, igc_probe neglects to set the flags required to allow tunnel offloading. Setting the same flags as igb fixes the issue on igc. Fixes: 34428dff3679 ("igc: Add GSO partial support") Signed-off-by: Paolo Abeni <[email protected]> Tested-by: Corinna Vinschen <[email protected]> Acked-by: Sasha Neftin <[email protected]> Tested-by: Nechama Kraus <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-10	ice: Correctly deal with PFs that do not support RDMA	Dave Ertman	2	-0/+8
	There are two cases where the current PF does not support RDMA functionality. The first is if the NVM loaded on the device is set to not support RDMA (common_caps.rdma is false). The second is if the kernel bonding driver has included the current PF in an active link aggregate. When the driver has determined that this PF does not support RDMA, then auxiliary devices should not be created on the auxiliary bus. Without a device on the auxiliary bus, even if the irdma driver is present, there will be no RDMA activity attempted on this PF. Currently, in the reset flow, an attempt to create auxiliary devices is performed without regard to the ability of the PF. There needs to be a check in ice_aux_plug_dev (as the central point that creates auxiliary devices) to see if the PF is in a state to support the functionality. When disabling and re-enabling RDMA due to the inclusion/removal of the PF in a link aggregate, we also need to set/clear the bit which controls auxiliary device creation so that a reset recovery in a link aggregate situation doesn't try to create auxiliary devices when it shouldn't. Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Reported-by: Yongxin Liu <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2	-16/+63
	include/linux/netdevice.h net/socket.c d0efb16294d1 ("net: don't unconditionally copy_from_user a struct ifreq for socket ioctls") 876f0bf9d0d5 ("net: socket: simplify dev_ifconf handling") 29c4964822aa ("net: socket: rework compat_ifreq_ioctl()") Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-27	ice: Only lock to update netdev dev_addr	Brett Creeley	1	-4/+9
	commit 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync list") introduced calls to netif_addr_lock_bh() and netif_addr_unlock_bh() in the driver's ndo_set_mac() callback. This is fine since the driver is updated the netdev's dev_addr, but since this is a spinlock, the driver cannot sleep when the lock is held. Unfortunately the functions to add/delete MAC filters depend on a mutex. This was causing a trace with the lock debug kernel config options enabled when changing the mac address via iproute. [ 203.273059] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [ 203.273065] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6698, name: ip [ 203.273068] Preemption disabled at: [ 203.273068] [<ffffffffc04aaeab>] ice_set_mac_address+0x8b/0x1c0 [ice] [ 203.273097] CPU: 31 PID: 6698 Comm: ip Tainted: G S W I 5.14.0-rc4 #2 [ 203.273100] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020 [ 203.273102] Call Trace: [ 203.273107] dump_stack_lvl+0x33/0x42 [ 203.273113] ? ice_set_mac_address+0x8b/0x1c0 [ice] [ 203.273124] ___might_sleep.cold.150+0xda/0xea [ 203.273131] mutex_lock+0x1c/0x40 [ 203.273136] ice_remove_mac+0xe3/0x180 [ice] [ 203.273155] ? ice_fltr_add_mac_list+0x20/0x20 [ice] [ 203.273175] ice_fltr_prepare_mac+0x43/0xa0 [ice] [ 203.273194] ice_set_mac_address+0xab/0x1c0 [ice] [ 203.273206] dev_set_mac_address+0xb8/0x120 [ 203.273210] dev_set_mac_address_user+0x2c/0x50 [ 203.273212] do_setlink+0x1dd/0x10e0 [ 203.273217] ? __nla_validate_parse+0x12d/0x1a0 [ 203.273221] __rtnl_newlink+0x530/0x910 [ 203.273224] ? __kmalloc_node_track_caller+0x17f/0x380 [ 203.273230] ? preempt_count_add+0x68/0xa0 [ 203.273236] ? _raw_spin_lock_irqsave+0x1f/0x30 [ 203.273241] ? kmem_cache_alloc_trace+0x4d/0x440 [ 203.273244] rtnl_newlink+0x43/0x60 [ 203.273245] rtnetlink_rcv_msg+0x13a/0x380 [ 203.273248] ? rtnl_calcit.isra.40+0x130/0x130 [ 203.273250] netlink_rcv_skb+0x4e/0x100 [ 203.273256] netlink_unicast+0x1a2/0x280 [ 203.273258] netlink_sendmsg+0x242/0x490 [ 203.273260] sock_sendmsg+0x58/0x60 [ 203.273263] ____sys_sendmsg+0x1ef/0x260 [ 203.273265] ? copy_msghdr_from_user+0x5c/0x90 [ 203.273268] ? ____sys_recvmsg+0xe6/0x170 [ 203.273270] ___sys_sendmsg+0x7c/0xc0 [ 203.273272] ? copy_msghdr_from_user+0x5c/0x90 [ 203.273274] ? ___sys_recvmsg+0x89/0xc0 [ 203.273276] ? __netlink_sendskb+0x50/0x50 [ 203.273278] ? mod_objcg_state+0xee/0x310 [ 203.273282] ? __dentry_kill+0x114/0x170 [ 203.273286] ? get_max_files+0x10/0x10 [ 203.273288] __sys_sendmsg+0x57/0xa0 [ 203.273290] do_syscall_64+0x37/0x80 [ 203.273295] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.273296] RIP: 0033:0x7f8edf96e278 [ 203.273298] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 63 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55 [ 203.273300] RSP: 002b:00007ffcb8bdac08 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 203.273303] RAX: ffffffffffffffda RBX: 000000006115e0ae RCX: 00007f8edf96e278 [ 203.273304] RDX: 0000000000000000 RSI: 00007ffcb8bdac70 RDI: 0000000000000003 [ 203.273305] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffcb8bda5b0 [ 203.273306] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 203.273306] R13: 0000555e10092020 R14: 0000000000000000 R15: 0000000000000005 Fix this by only locking when changing the netdev->dev_addr. Also, make sure to restore the old netdev->dev_addr on any failures. Fixes: 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync list") Signed-off-by: Brett Creeley <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27	ice: restart periodic outputs around time changes	Jacob Keller	1	-0/+49
	When we enabled auxiliary input/output support for the E810 device, we forgot to add logic to restart the output when we change time. This is important as the periodic output will be incorrect after a time change otherwise. This unfortunately includes the adjust time function, even though it uses an atomic hardware interface. The atomic adjustment can still cause the pin output to stall permanently, so we need to stop and restart it. Introduce wrapper functions to temporarily disable and then re-enable the clock outputs. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Sunitha D Mekala <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27	igc: Add support for CBS offloading	Aravindhan Gunasekaran	5	-1/+195
	Implement support for Credit-based shaper(CBS) Qdisc hardware offload mode in the driver. There are two sets of IEEE802.1Qav (CBS) HW logic in i225 controller and this patch supports enabling them in the top two priority TX queues. Driver implemented as recommended by Foxville External Architecture Specification v0.993. Idleslope and Hi-credit are the CBS tunable parameters for i225 NIC, programmed in TQAVCC and TQAVHC registers respectively. In-order for IEEE802.1Qav (CBS) algorithm to work as intended and provide BW reservation CBS should be enabled in highest priority queue first. If we enable CBS on any of low priority queues, the traffic in high priority queue does not allow low priority queue to be selected for transmission and bandwidth reservation is not guaranteed. Signed-off-by: Aravindhan Gunasekaran <[email protected]> Signed-off-by: Mallikarjuna Chilakala <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27	igc: Simplify TSN flags handling	Vinicius Costa Gomes	4	-27/+43
	Separates the procedure done during reset from applying a configuration, knowing when the code is executing allow us to separate the better what changes the hardware state from what changes only the driver state. Introduces a flag for bookkeeping the driver state of TSN features. When Qav and frame-preemption is also implemented this flag makes it easier to keep track on whether a TSN feature driver state is enabled or not though controller state changes, say, during a reset. Signed-off-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: Aravindhan Gunasekaran <[email protected]> Signed-off-by: Mallikarjuna Chilakala <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27	igc: Use default cycle 'start' and 'end' values for queues	Vinicius Costa Gomes	2	-22/+21
	Sets default values for each queue cycle start and cycle end. This allows some simplification in the handling of these configurations as most TSN features in i225 require a cycle to be configured. In i225, cycle start and end time is required to be programmed for CBS to work properly. Signed-off-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: Aravindhan Gunasekaran <[email protected]> Signed-off-by: Mallikarjuna Chilakala <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27	ice: add lock around Tx timestamp tracker flush	Jacob Keller	1	-0/+4
	The driver didn't take the lock while flushing the Tx tracker, which could cause a race where one thread is trying to read timestamps out while another thread is trying to read the tracker to check the timestamps. Avoid this by ensuring that flushing is locked against read accesses. Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27	ice: remove dead code for allocating pin_config	Jacob Keller	1	-11/+0
	We have code in the ice driver which allocates the pin_config structure if n_pins is > 0, but we never set n_pins to be greater than zero. There's no reason to keep this code until we actually have pin_config support. Remove this. We can re-add it properly when we implement support for pin_config for E810-T devices. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27	ice: fix Tx queue iteration for Tx timestamp enablement	Jacob Keller	1	-1/+1
	The driver accidentally copied the ice_for_each_rxq iterator when implementing enablement of the ptp_tx bit for the Tx rings. We still load the Tx rings and set the ptp_tx field, but we iterate over the count of the num_rxq. If the number of Tx and Rx queues differ, this could either cause a buffer overrun when accessing the tx_rings list if num_txq is greater than num_rxq, or it could cause us to fail to enable Tx timestamps for some rings. This was not noticed originally as we generally have the same number of Tx and Rx queues. Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-26	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	5	-25/+53
	drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg. Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-24	igc: Add support for PTP getcrosststamp()	Vinicius Costa Gomes	4	-0/+234
	i225 supports PCIe Precision Time Measurement (PTM), allowing us to support the PTP_SYS_OFFSET_PRECISE ioctl() in the driver via the getcrosststamp() function. The easiest way to expose the PTM registers would be to configure the PTM dialogs to run periodically, but the PTP_SYS_OFFSET_PRECISE ioctl() semantics are more aligned to using a kind of "one-shot" way of retrieving the PTM timestamps. But this causes a bit more code to be written: the trigger registers for the PTM dialogs are not cleared automatically. i225 can be configured to send "fake" packets with the PTM information, adding support for handling these types of packets is left for the future. PTM improves the accuracy of time synchronization, for example, using phc2sys, while a simple application is sending packets as fast as possible. First, without .getcrosststamp(): phc2sys[191.382]: enp4s0 sys offset -959 s2 freq -454 delay 4492 phc2sys[191.482]: enp4s0 sys offset 798 s2 freq +1015 delay 4069 phc2sys[191.583]: enp4s0 sys offset 962 s2 freq +1418 delay 3849 phc2sys[191.683]: enp4s0 sys offset 924 s2 freq +1669 delay 3753 phc2sys[191.783]: enp4s0 sys offset 664 s2 freq +1686 delay 3349 phc2sys[191.883]: enp4s0 sys offset 218 s2 freq +1439 delay 2585 phc2sys[191.983]: enp4s0 sys offset 761 s2 freq +2048 delay 3750 phc2sys[192.083]: enp4s0 sys offset 756 s2 freq +2271 delay 4061 phc2sys[192.183]: enp4s0 sys offset 809 s2 freq +2551 delay 4384 phc2sys[192.283]: enp4s0 sys offset -108 s2 freq +1877 delay 2480 phc2sys[192.383]: enp4s0 sys offset -1145 s2 freq +807 delay 4438 phc2sys[192.484]: enp4s0 sys offset 571 s2 freq +2180 delay 3849 phc2sys[192.584]: enp4s0 sys offset 241 s2 freq +2021 delay 3389 phc2sys[192.684]: enp4s0 sys offset 405 s2 freq +2257 delay 3829 phc2sys[192.784]: enp4s0 sys offset 17 s2 freq +1991 delay 3273 phc2sys[192.884]: enp4s0 sys offset 152 s2 freq +2131 delay 3948 phc2sys[192.984]: enp4s0 sys offset -187 s2 freq +1837 delay 3162 phc2sys[193.084]: enp4s0 sys offset -1595 s2 freq +373 delay 4557 phc2sys[193.184]: enp4s0 sys offset 107 s2 freq +1597 delay 3740 phc2sys[193.284]: enp4s0 sys offset 199 s2 freq +1721 delay 4010 phc2sys[193.385]: enp4s0 sys offset -169 s2 freq +1413 delay 3701 phc2sys[193.485]: enp4s0 sys offset -47 s2 freq +1484 delay 3581 phc2sys[193.585]: enp4s0 sys offset -65 s2 freq +1452 delay 3778 phc2sys[193.685]: enp4s0 sys offset 95 s2 freq +1592 delay 3888 phc2sys[193.785]: enp4s0 sys offset 206 s2 freq +1732 delay 4445 phc2sys[193.885]: enp4s0 sys offset -652 s2 freq +936 delay 2521 phc2sys[193.985]: enp4s0 sys offset -203 s2 freq +1189 delay 3391 phc2sys[194.085]: enp4s0 sys offset -376 s2 freq +955 delay 2951 phc2sys[194.185]: enp4s0 sys offset -134 s2 freq +1084 delay 3330 phc2sys[194.285]: enp4s0 sys offset -22 s2 freq +1156 delay 3479 phc2sys[194.386]: enp4s0 sys offset 32 s2 freq +1204 delay 3602 phc2sys[194.486]: enp4s0 sys offset 122 s2 freq +1303 delay 3731 Statistics for this run (total of 2179 lines), in nanoseconds: average: -1.12 stdev: 634.80 max: 1551 min: -2215 With .getcrosststamp() via PCIe PTM: phc2sys[367.859]: enp4s0 sys offset 6 s2 freq +1727 delay 0 phc2sys[367.959]: enp4s0 sys offset -2 s2 freq +1721 delay 0 phc2sys[368.059]: enp4s0 sys offset 5 s2 freq +1727 delay 0 phc2sys[368.160]: enp4s0 sys offset -1 s2 freq +1723 delay 0 phc2sys[368.260]: enp4s0 sys offset -4 s2 freq +1719 delay 0 phc2sys[368.360]: enp4s0 sys offset -5 s2 freq +1717 delay 0 phc2sys[368.460]: enp4s0 sys offset 1 s2 freq +1722 delay 0 phc2sys[368.560]: enp4s0 sys offset -3 s2 freq +1718 delay 0 phc2sys[368.660]: enp4s0 sys offset 5 s2 freq +1725 delay 0 phc2sys[368.760]: enp4s0 sys offset -1 s2 freq +1721 delay 0 phc2sys[368.860]: enp4s0 sys offset 0 s2 freq +1721 delay 0 phc2sys[368.960]: enp4s0 sys offset 0 s2 freq +1721 delay 0 phc2sys[369.061]: enp4s0 sys offset 4 s2 freq +1725 delay 0 phc2sys[369.161]: enp4s0 sys offset 1 s2 freq +1724 delay 0 phc2sys[369.261]: enp4s0 sys offset 4 s2 freq +1727 delay 0 phc2sys[369.361]: enp4s0 sys offset 8 s2 freq +1732 delay 0 phc2sys[369.461]: enp4s0 sys offset 7 s2 freq +1733 delay 0 phc2sys[369.561]: enp4s0 sys offset 4 s2 freq +1733 delay 0 phc2sys[369.661]: enp4s0 sys offset 1 s2 freq +1731 delay 0 phc2sys[369.761]: enp4s0 sys offset 1 s2 freq +1731 delay 0 phc2sys[369.861]: enp4s0 sys offset -5 s2 freq +1725 delay 0 phc2sys[369.961]: enp4s0 sys offset -4 s2 freq +1725 delay 0 phc2sys[370.062]: enp4s0 sys offset 2 s2 freq +1730 delay 0 phc2sys[370.162]: enp4s0 sys offset -7 s2 freq +1721 delay 0 phc2sys[370.262]: enp4s0 sys offset -3 s2 freq +1723 delay 0 phc2sys[370.362]: enp4s0 sys offset 1 s2 freq +1726 delay 0 phc2sys[370.462]: enp4s0 sys offset -3 s2 freq +1723 delay 0 phc2sys[370.562]: enp4s0 sys offset -1 s2 freq +1724 delay 0 phc2sys[370.662]: enp4s0 sys offset -4 s2 freq +1720 delay 0 phc2sys[370.762]: enp4s0 sys offset -7 s2 freq +1716 delay 0 phc2sys[370.862]: enp4s0 sys offset -2 s2 freq +1719 delay 0 Statistics for this run (total of 2179 lines), in nanoseconds: average: 0.14 stdev: 5.03 max: 48 min: -27 For reference, the statistics for runs without PCIe congestion show that the improvements from enabling PTM are less dramatic. For two runs of 16466 entries: without PTM: avg -0.04 stdev 10.57 max 39 min -42 with PTM: avg 0.01 stdev 4.20 max 19 min -16 One possible explanation is that when PTM is not enabled, and there's a lot of traffic in the PCIe fabric, some register reads will take more time than the others because of congestion on the PCIe fabric. When PTM is enabled, even if the PTM dialogs take more time to complete under heavy traffic, the time measurements do not depend on the time to read the registers. This was implemented following the i225 EAS version 0.993. Signed-off-by: Vinicius Costa Gomes <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-24	igc: Enable PCIe PTM	Vinicius Costa Gomes	1	-0/+6
	Enables PCIe PTM (Precision Time Measurement) support in the igc driver. Notifies the PCI devices that PCIe PTM should be enabled. PCIe PTM is similar protocol to PTP (Precision Time Protocol) running in the PCIe fabric, it allows devices to report time measurements from their internal clocks and the correlation with the PCIe root clock. The i225 NIC exposes some registers that expose those time measurements, those registers will be used, in later patches, to implement the PTP_SYS_OFFSET_PRECISE ioctl(). Signed-off-by: Vinicius Costa Gomes <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-24	ethtool: extend coalesce setting uAPI with CQE mode	Yufeng Mo	11	-24/+76
	In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-20	e1000e: Do not take care about recovery NVM checksum	Sasha Neftin	1	-7/+11
	On new platforms, the NVM is read-only. Attempting to update the NVM is causing a lockup to occur. Do not attempt to write to the NVM on platforms where it's not supported. Emit an error message when the NVM checksum is invalid. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213667 Fixes: fb776f5d57ee ("e1000e: Add support for Tiger Lake") Suggested-by: Dima Ruinskiy <[email protected]> Suggested-by: Vitaly Lifshits <[email protected]> Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-20	e1000e: Fix the max snoop/no-snoop latency for 10M	Sasha Neftin	2	-1/+16
	We should decode the latency and the max_latency before directly compare. The latency should be presented as lat_enc = scale x value: lat_enc_d = (lat_enc & 0x0x3ff) x (1U << (5*((max_ltr_enc & 0x1c00) >> 10))) Fixes: cf8fb73c23aa ("e1000e: add support for LTR on I217/I218") Suggested-by: Yee Li <[email protected]> Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-20	igc: Use num_tx_queues when iterating over tx_ring queue	Toshiki Nishioka	1	-2/+2
	Use num_tx_queues rather than the IGC_MAX_TX_QUEUES fixed number 4 when iterating over tx_ring queue since instantiated queue count could be less than 4 where on-line cpu count is less than 4. Fixes: ec50a9d437f0 ("igc: Add support for taprio offloading") Signed-off-by: Toshiki Nishioka <[email protected]> Signed-off-by: Muhammad Husaini Zulkifli <[email protected]> Tested-by: Muhammad Husaini Zulkifli <[email protected]> Acked-by: Sasha Neftin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-20	igc: fix page fault when thunderbolt is unplugged	Aaron Ma	2	-14/+21
	After unplug thunderbolt dock with i225, pciehp interrupt is triggered, remove call will read/write mmio address which is already disconnected, then cause page fault and make system hang. Check PCI state to remove device safely. Trace: BUG: unable to handle page fault for address: 000000000000b604 Oops: 0000 [#1] SMP NOPTI RIP: 0010:igc_rd32+0x1c/0x90 [igc] Call Trace: igc_ptp_suspend+0x6c/0xa0 [igc] igc_ptp_stop+0x12/0x50 [igc] igc_remove+0x7f/0x1c0 [igc] pci_device_remove+0x3e/0xb0 __device_release_driver+0x181/0x240 Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings") Fixes: b03c49cde61f ("igc: Save PTP time before a reset") Signed-off-by: Aaron Ma <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-19	ice: do not abort devlink info if board identifier can't be found	Jacob Keller	1	-1/+3
	The devlink dev info command reports version information about the device and firmware running on the board. This includes the "board.id" field which is supposed to represent an identifier of the board design. The ice driver uses the Product Board Assembly identifier for this. In some cases, the PBA is not present in the NVM. If this happens, devlink dev info will fail with an error. Instead, modify the ice_info_pba function to just exit without filling in the context buffer. This will cause the board.id field to be skipped. Log a dev_dbg message in case someone wants to confirm why board.id is not showing up for them. Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	6	-6/+53
	drivers/ptp/Kconfig: 55c8fca1dae1 ("ptp_pch: Restore dependency on PCI") e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies") Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-19	iavf: Fix ping is lost after untrusted VF had tried to change MAC	Sylwester Dziedziuch	3	-2/+47
	Make changes to MAC address dependent on the response of PF. Disallow changes to HW MAC address and MAC filter from untrusted VF, thanks to that ping is not lost if VF tries to change MAC. Add a new field in iavf_mac_filter, to indicate whether there was response from PF for given filter. Based on this field pass or discard the filter. If untrusted VF tried to change it's address, it's not changed. Still filter was changed, because of that ping couldn't go through. Fixes: c5c922b3e09b ("iavf: fix MAC address setting for VFs when filter is rejected") Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-19	i40e: Fix ATR queue selection	Arkadiusz Kubalewski	1	-2/+1
	Without this patch, ATR does not work. Receive/transmit uses queue selection based on SW DCB hashing method. If traffic classes are not configured for PF, then use netdev_pick_tx function for selecting queue for packet transmission. Instead of calling i40e_swdcb_skb_tx_hash, call netdev_pick_tx, which ensures that packet is transmitted/received from CPU that is running the application. Reproduction steps: 1. Load i40e driver 2. Map each MSI interrupt of i40e port for each CPU 3. Disable ntuple, enable ATR i.e.: ethtool -K $interface ntuple off ethtool --set-priv-flags $interface flow-director-atr 4. Run application that is generating traffic and is bound to a single CPU, i.e.: taskset -c 9 netperf -H 1.1.1.1 -t TCP_RR -l 10 5. Observe behavior: Application's traffic should be restricted to the CPU provided in taskset. Fixes: 89ec1f0886c1 ("i40e: Fix queue-to-TC mapping on Tx") Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Arkadiusz Kubalewski <[email protected]> Tested-by: Dave Switzer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-17	ixgbe, xsk: clean up the resources in ixgbe_xsk_pool_enable error path	Wang Hai	1	-1/+4
	In ixgbe_xsk_pool_enable(), if ixgbe_xsk_wakeup() fails, We should restore the previous state and clean up the resources. Add the missing clear af_xdp_zc_qps and unmap dma to fix this bug. Fixes: d49e286d354e ("ixgbe: add tracking of AF_XDP zero-copy state for each queue pair") Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP") Signed-off-by: Wang Hai <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Tested-by: Sandeep Penigalapati <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>