aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
2021-11-26net: mscc: ocelot: correctly report the timestamping RX filters in ethtoolVladimir Oltean1-1/+4
The driver doesn't support RX timestamping for non-PTP packets, but it declares that it does. Restrict the reported RX filters to PTP v2 over L2 and over L4. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: mscc: ocelot: set up traps for PTP packetsVladimir Oltean1-1/+240
IEEE 1588 support was declared too soon for the Ocelot switch. Out of reset, this switch does not apply any special treatment for PTP packets, i.e. when an event message is received, the natural tendency is to forward it by MAC DA/VLAN ID. This poses a problem when the ingress port is under a bridge, since user space application stacks (written primarily for endpoint ports, not switches) like ptp4l expect that PTP messages are always received on AF_PACKET / AF_INET sockets (depending on the PTP transport being used), and never being autonomously forwarded. Any forwarding, if necessary (for example in Transparent Clock mode) is handled in software by ptp4l. Having the hardware forward these packets too will cause duplicates which will confuse endpoints connected to these switches. So PTP over L2 barely works, in the sense that PTP packets reach the CPU port, but they reach it via flooding, and therefore reach lots of other unwanted destinations too. But PTP over IPv4/IPv6 does not work at all. This is because the Ocelot switch have a separate destination port mask for unknown IP multicast (which PTP over IP is) flooding compared to unknown non-IP multicast (which PTP over L2 is) flooding. Specifically, the driver allows the CPU port to be in the PGID_MC port group, but not in PGID_MCIPV4 and PGID_MCIPV6. There are several presentations from Allan Nielsen which explain that the embedded MIPS CPU on Ocelot switches is not very powerful at all, so every penny they could save by not allowing flooding to the CPU port module matters. Unknown IP multicast did not make it. The de facto consensus is that when a switch is PTP-aware and an application stack for PTP is running, switches should have some sort of trapping mechanism for PTP packets, to extract them from the hardware data path. This avoids both problems: (a) PTP packets are no longer flooded to unwanted destinations (b) PTP over IP packets are no longer denied from reaching the CPU since they arrive there via a trap and not via flooding It is not the first time when this change is attempted. Last time, the feedback from Allan Nielsen and Andrew Lunn was that the traps should not be installed by default, and that PTP-unaware switching may be desired for some use cases: https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/ To address that feedback, the present patch adds the necessary packet traps according to the RX filter configuration transmitted by user space through the SIOCSHWTSTAMP ioctl. Trapping is done via VCAP IS2, where we keep 5 filters, which are amended each time RX timestamping is enabled or disabled on a port: - 1 for PTP over L2 - 2 for PTP over IPv4 (UDP ports 319 and 320) - 2 for PTP over IPv6 (UDP ports 319 and 320) The cookie by which these filters (invisible to tc) are identified is strategically chosen such that it does not collide with the filters used for the ocelot-8021q tagging protocol by the Felix driver, or with the MRP traps set up by the Ocelot library. Other alternatives were considered, like patching user space to do something, but there are so many ways in which PTP packets could be made to reach the CPU, generically speaking, that "do what?" is a very valid question. The ptp4l program from the linuxptp stack already attempts to do something: it calls setsockopt(IP_ADD_MEMBERSHIP) (and PACKET_ADD_MEMBERSHIP, respectively) which translates in both cases into a dev_mc_add() on the interface, in the kernel: https://github.com/richardcochran/linuxptp/blob/v3.1.1/udp.c#L73 https://github.com/richardcochran/linuxptp/blob/v3.1.1/raw.c Reality shows that this is not sufficient in case the interface belongs to a switchdev driver, as dev_mc_add() does not show the intention to trap a packet to the CPU, but rather the intention to not drop it (it is strictly for RX filtering, same as promiscuous does not mean to send all traffic to the CPU, but to not drop traffic with unknown MAC DA). This topic is a can of worms in itself, and it would be great if user space could just stay out of it. On the other hand, setting up PTP traps privately within the driver is not new by any stretch of the imagination: https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c#L833 https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/dsa/hirschmann/hellcreek.c#L1050 https://elixir.bootlin.com/linux/v5.16-rc2/source/include/linux/dsa/sja1105.h#L21 So this is the approach taken here as well. The difference here being that we prepare and destroy the traps per port, dynamically at runtime, as opposed to driver init time, because apparently, PTP-unaware forwarding is a use case. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Reported-by: Po Liu <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Acked-by: Richard Cochran <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: mscc: ocelot: create a function that replaces an existing VCAP filterVladimir Oltean1-0/+16
VCAP (Versatile Content Aware Processor) is the TCAM-based engine behind tc flower offload on ocelot, among other things. The ingress port mask on which VCAP rules match is present as a bit field in the actual key of the rule. This means that it is possible for a rule to be shared among multiple source ports. When the rule is added one by one on each desired port, that the ingress port mask of the key must be edited and rewritten to hardware. But the API in ocelot_vcap.c does not allow for this. For one thing, ocelot_vcap_filter_add() and ocelot_vcap_filter_del() are not symmetric, because ocelot_vcap_filter_add() works with a preallocated and prepopulated filter and programs it to hardware, and ocelot_vcap_filter_del() does both the job of removing the specified filter from hardware, as well as kfreeing it. That is to say, the only option of editing a filter in place, which is to delete it, modify the structure and add it back, does not work because it results in use-after-free. This patch introduces ocelot_vcap_filter_replace, which trivially reprograms a VCAP entry to hardware, at the exact same index at which it existed before, without modifying any list or allocating any memory. Signed-off-by: Vladimir Oltean <[email protected]> Acked-by: Richard Cochran <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMPVladimir Oltean1-6/+0
The ocelot driver, when asked to timestamp all receiving packets, 1588 v1 or NTP, says "nah, here's 1588 v2 for you". According to this discussion: https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/#24577647 drivers that downgrade from a wider request to a narrower response (or even a response where the intersection with the request is empty) are buggy, and should return -ERANGE instead. This patch fixes that. Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support") Suggested-by: Richard Cochran <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Acked-by: Richard Cochran <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: hns3: fix incorrect components info of ethtool --reset commandJie Wang1-0/+4
Currently, HNS3 driver doesn't clear the reset flags of components after successfully executing reset, it causes userspace info of "Components reset" and "Components not reset" is incorrect. So fix this problem by clear corresponding reset flag after reset process. Fixes: ddccc5e368a3 ("net: hns3: add support for triggering reset by ethtool") Signed-off-by: Jie Wang <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: hns3: fix one incorrect value of page pool info when queried by debugfsHao Chen1-1/+2
Currently, when user queries page pool info by debugfs command "cat page_pool_info", the cnt of allocated page for page pool may be incorrect because of memory inconsistency problem caused by compiler optimization. So this patch uses READ_ONCE() to read value of pages_state_hold_cnt to fix this problem. Fixes: 850bfb912a6d ("net: hns3: debugfs add support dumping page pool info") Signed-off-by: Hao Chen <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: hns3: add check NULL address for page poolHao Chen1-0/+5
When page pool is not enabled, its address value is still NULL and page pool should not be accessed, so add a check for it. Fixes: 850bfb912a6d ("net: hns3: debugfs add support dumping page pool info") Signed-off-by: Hao Chen <[email protected]> Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: hns3: fix VF RSS failed problem after PF enable multi-TCsGuangbin Huang1-2/+2
When PF is set to multi-TCs and configured mapping relationship between priorities and TCs, the hardware will active these settings for this PF and its VFs. In this case when VF just uses one TC and its rx packets contain priority, and if the priority is not mapped to TC0, as other TCs of VF is not valid, hardware always put this kind of packets to the queue 0. It cause this kind of packets of VF can not be used RSS function. To fix this problem, set tc mode of all unused TCs of VF to the setting of TC0, then rx packet with priority which map to unused TC will be direct to TC0. Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Guangbin Huang <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: qed: fix the array may be out of boundzhangyue1-3/+3
If the variable 'p_bit->flags' is always 0, the loop condition is always 0. The variable 'j' may be greater than or equal to 32. At this time, the array 'p_aeu->bits[32]' may be out of bound. Signed-off-by: zhangyue <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-26net: stmmac: Disable Tx queues when reconfiguring the interfaceYannick Vignon1-0/+2
The Tx queues were not disabled in situations where the driver needed to stop the interface to apply a new configuration. This could result in a kernel panic when doing any of the 3 following actions: * reconfiguring the number of queues (ethtool -L) * reconfiguring the size of the ring buffers (ethtool -G) * installing/removing an XDP program (ip l set dev ethX xdp) Prevent the panic by making sure netif_tx_disable is called when stopping an interface. Without this patch, the following kernel panic can be observed when doing any of the actions above: Unable to handle kernel paging request at virtual address ffff80001238d040 [....] Call trace: dwmac4_set_addr+0x8/0x10 dev_hard_start_xmit+0xe4/0x1ac sch_direct_xmit+0xe8/0x39c __dev_queue_xmit+0x3ec/0xaf0 dev_queue_xmit+0x14/0x20 [...] [ end trace 0000000000000002 ]--- Fixes: 5fabb01207a2d ("net: stmmac: Add initial XDP support") Fixes: aa042f60e4961 ("net: stmmac: Add support to Ethtool get/set ring parameters") Fixes: 0366f7e06a6be ("net: stmmac: add ethtool support for get/set channels") Signed-off-by: Yannick Vignon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-25igb: fix netpoll exit with trafficJesse Brandeburg1-1/+1
Oleksandr brought a bug report where netpoll causes trace messages in the log on igb. Danielle brought this back up as still occurring, so we'll try again. [22038.710800] ------------[ cut here ]------------ [22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll [22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0 As Alex suggested, change the driver to return work_done at the exit of napi_poll, which should be safe to do in this driver because it is not polling multiple queues in this single napi context (multiple queues attached to one MSI-X vector). Several other drivers contain the same simple sequence, so I hope this will not create new problems. Fixes: 16eb8815c235 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance") Reported-by: Oleksandr Natalenko <[email protected]> Reported-by: Danielle Ratson <[email protected]> Suggested-by: Alexander Duyck <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Oleksandr Natalenko <[email protected]> Tested-by: Danielle Ratson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-24lan743x: fix deadlock in lan743x_phy_link_status_change()Heiner Kallweit1-9/+3
Usage of phy_ethtool_get_link_ksettings() in the link status change handler isn't needed, and in combination with the referenced change it results in a deadlock. Simply remove the call and replace it with direct access to phydev->speed. The duplex argument of lan743x_phy_update_flowcontrol() isn't used and can be removed. Fixes: c10a485c3de5 ("phy: phy_ethtool_ksettings_get: Lock the phy for consistency") Reported-by: Alessandro B Maurici <[email protected]> Tested-by: Alessandro B Maurici <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-23net: marvell: mvpp2: increase MTU limit when XDP enabledMarek Behún1-6/+8
Currently mvpp2_xdp_setup won't allow attaching XDP program if mtu > ETH_DATA_LEN (1500). The mvpp2_change_mtu on the other hand checks whether MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE. These two checks are semantically different. Moreover this limit can be increased to MVPP2_MAX_RX_BUF_SIZE, since in mvpp2_rx we have xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM; xdp.frame_sz = PAGE_SIZE; Change the checks to check whether mtu > MVPP2_MAX_RX_BUF_SIZE Fixes: 07dd0a7aae7f ("mvpp2: add basic XDP support") Signed-off-by: Marek Behún <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-23net: chelsio: cxgb4vf: Fix an error code in cxgb4vf_pci_probe()Zheyu Ma1-0/+1
During the process of driver probing, probe function should return < 0 for failure, otherwise kernel will treat value == 0 as success. Therefore, we should set err to -EINVAL when adapter->registered_device_map is NULL. Otherwise kernel will assume that driver has been successfully probed and will cause unexpected errors. Signed-off-by: Zheyu Ma <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-23r8169: fix incorrect mac address assignmentHeiner Kallweit1-2/+3
The original changes brakes MAC address assignment on older chip versions (see bug report [0]), and it brakes random MAC assignment. is_valid_ether_addr() requires that its argument is word-aligned. Add the missing alignment to array mac_addr. [0] https://bugzilla.kernel.org/show_bug.cgi?id=215087 Fixes: 1c5d09d58748 ("ethernet: r8169: use eth_hw_addr_set()") Reported-by: Richard Herbert <[email protected]> Tested-by: Richard Herbert <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-23Merge branch '100GbE' of ↵David S. Miller2-3/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-11-22 Maciej Fijalkowski says: Here are the two fixes for issues around ethtool's set_channels() callback for ice driver. Both are related to XDP resources. First one corrects the size of vsi->txq_map that is used to track the usage of Tx resources and the second one prevents the wrong refcounting of bpf_prog. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-11-23mlxsw: spectrum: Protect driver from buggy firmwareAmit Cohen1-1/+1
When processing port up/down events generated by the device's firmware, the driver protects itself from events reported for non-existent local ports, but not the CPU port (local port 0), which exists, but lacks a netdev. This can result in a NULL pointer dereference when calling netif_carrier_{on,off}(). Fix this by bailing early when processing an event reported for the CPU port. Problem was only observed when running on top of a buggy emulator. Fixes: 28b1987ef506 ("mlxsw: spectrum: Register CPU port with devlink") Signed-off-by: Amit Cohen <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-23mlxsw: spectrum: Allow driver to load with old firmware versionsDanielle Ratson1-4/+4
The driver fails to load with old firmware versions that cannot report the maximum number of RIF MAC profiles [1]. Fix this by defaulting to a maximum of a single profile in such situations, as multiple profiles are not supported by old firmware versions. [1] mlxsw_spectrum 0000:03:00.0: cannot register bus device mlxsw_spectrum: probe of 0000:03:00.0 failed with error -5 Fixes: 1c375ffb2efab ("mlxsw: spectrum_router: Expose RIF MAC profiles to devlink resource") Signed-off-by: Danielle Ratson <[email protected]> Reported-by: Vadim Pasternak <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-22ice: avoid bpf_prog refcount underflowMarta Plantykow1-1/+17
Ice driver has the routines for managing XDP resources that are shared between ndo_bpf op and VSI rebuild flow. The latter takes place for example when user changes queue count on an interface via ethtool's set_channels(). There is an issue around the bpf_prog refcounting when VSI is being rebuilt - since ice_prepare_xdp_rings() is called with vsi->xdp_prog as an argument that is used later on by ice_vsi_assign_bpf_prog(), same bpf_prog pointers are swapped with each other. Then it is also interpreted as an 'old_prog' which in turn causes us to call bpf_prog_put on it that will decrement its refcount. Below splat can be interpreted in a way that due to zero refcount of a bpf_prog it is wiped out from the system while kernel still tries to refer to it: [ 481.069429] BUG: unable to handle page fault for address: ffffc9000640f038 [ 481.077390] #PF: supervisor read access in kernel mode [ 481.083335] #PF: error_code(0x0000) - not-present page [ 481.089276] PGD 100000067 P4D 100000067 PUD 1001cb067 PMD 106d2b067 PTE 0 [ 481.097141] Oops: 0000 [#1] PREEMPT SMP PTI [ 481.101980] CPU: 12 PID: 3339 Comm: sudo Tainted: G OE 5.15.0-rc5+ #1 [ 481.110840] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016 [ 481.122021] RIP: 0010:dev_xdp_prog_id+0x25/0x40 [ 481.127265] Code: 80 00 00 00 00 0f 1f 44 00 00 89 f6 48 c1 e6 04 48 01 fe 48 8b 86 98 08 00 00 48 85 c0 74 13 48 8b 50 18 31 c0 48 85 d2 74 07 <48> 8b 42 38 8b 40 20 c3 48 8b 96 90 08 00 00 eb e8 66 2e 0f 1f 84 [ 481.148991] RSP: 0018:ffffc90007b63868 EFLAGS: 00010286 [ 481.155034] RAX: 0000000000000000 RBX: ffff889080824000 RCX: 0000000000000000 [ 481.163278] RDX: ffffc9000640f000 RSI: ffff889080824010 RDI: ffff889080824000 [ 481.171527] RBP: ffff888107af7d00 R08: 0000000000000000 R09: ffff88810db5f6e0 [ 481.179776] R10: 0000000000000000 R11: ffff8890885b9988 R12: ffff88810db5f4bc [ 481.188026] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 481.196276] FS: 00007f5466d5bec0(0000) GS:ffff88903fb00000(0000) knlGS:0000000000000000 [ 481.205633] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 481.212279] CR2: ffffc9000640f038 CR3: 000000014429c006 CR4: 00000000003706e0 [ 481.220530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 481.228771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 481.237029] Call Trace: [ 481.239856] rtnl_fill_ifinfo+0x768/0x12e0 [ 481.244602] rtnl_dump_ifinfo+0x525/0x650 [ 481.249246] ? __alloc_skb+0xa5/0x280 [ 481.253484] netlink_dump+0x168/0x3c0 [ 481.257725] netlink_recvmsg+0x21e/0x3e0 [ 481.262263] ____sys_recvmsg+0x87/0x170 [ 481.266707] ? __might_fault+0x20/0x30 [ 481.271046] ? _copy_from_user+0x66/0xa0 [ 481.275591] ? iovec_from_user+0xf6/0x1c0 [ 481.280226] ___sys_recvmsg+0x82/0x100 [ 481.284566] ? sock_sendmsg+0x5e/0x60 [ 481.288791] ? __sys_sendto+0xee/0x150 [ 481.293129] __sys_recvmsg+0x56/0xa0 [ 481.297267] do_syscall_64+0x3b/0xc0 [ 481.301395] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 481.307238] RIP: 0033:0x7f5466f39617 [ 481.311373] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb bd 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 481.342944] RSP: 002b:00007ffedc7f4308 EFLAGS: 00000246 ORIG_RAX: 000000000000002f [ 481.361783] RAX: ffffffffffffffda RBX: 00007ffedc7f5460 RCX: 00007f5466f39617 [ 481.380278] RDX: 0000000000000000 RSI: 00007ffedc7f5360 RDI: 0000000000000003 [ 481.398500] RBP: 00007ffedc7f53f0 R08: 0000000000000000 R09: 000055d556f04d50 [ 481.416463] R10: 0000000000000077 R11: 0000000000000246 R12: 00007ffedc7f5360 [ 481.434131] R13: 00007ffedc7f5350 R14: 00007ffedc7f5344 R15: 0000000000000e98 [ 481.451520] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mxm_wmi mei_me coretemp mei ipmi_si ipmi_msghandler wmi acpi_pad acpi_power_meter ip_tables x_tables autofs4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel ahci crypto_simd cryptd libahci lpc_ich [last unloaded: ice] [ 481.528558] CR2: ffffc9000640f038 [ 481.542041] ---[ end trace d1f24c9ecf5b61c1 ]--- Fix this by only calling ice_vsi_assign_bpf_prog() inside ice_prepare_xdp_rings() when current vsi->xdp_prog pointer is NULL. This way set_channels() flow will not attempt to swap the vsi->xdp_prog pointers with itself. Also, sprinkle around some comments that provide a reasoning about correlation between driver and kernel in terms of bpf_prog refcount. Fixes: efc2214b6047 ("ice: Add support for XDP") Reviewed-by: Alexander Lobakin <[email protected]> Signed-off-by: Marta Plantykow <[email protected]> Co-developed-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-22ice: fix vsi->txq_map sizingMaciej Fijalkowski1-2/+7
The approach of having XDP queue per CPU regardless of user's setting exposed a hidden bug that could occur in case when Rx queue count differ from Tx queue count. Currently vsi->txq_map's size is equal to the doubled vsi->alloc_txq, which is not correct due to the fact that XDP rings were previously based on the Rx queue count. Below splat can be seen when ethtool -L is used and XDP rings are configured: [ 682.875339] BUG: kernel NULL pointer dereference, address: 000000000000000f [ 682.883403] #PF: supervisor read access in kernel mode [ 682.889345] #PF: error_code(0x0000) - not-present page [ 682.895289] PGD 0 P4D 0 [ 682.898218] Oops: 0000 [#1] PREEMPT SMP PTI [ 682.903055] CPU: 42 PID: 2878 Comm: ethtool Tainted: G OE 5.15.0-rc5+ #1 [ 682.912214] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016 [ 682.923380] RIP: 0010:devres_remove+0x44/0x130 [ 682.928527] Code: 49 89 f4 55 48 89 fd 4c 89 ff 53 48 83 ec 10 e8 92 b9 49 00 48 8b 9d a8 02 00 00 48 8d 8d a0 02 00 00 49 89 c2 48 39 cb 74 0f <4c> 3b 63 10 74 25 48 8b 5b 08 48 39 cb 75 f1 4c 89 ff 4c 89 d6 e8 [ 682.950237] RSP: 0018:ffffc90006a679f0 EFLAGS: 00010002 [ 682.956285] RAX: 0000000000000286 RBX: ffffffffffffffff RCX: ffff88908343a370 [ 682.964538] RDX: 0000000000000001 RSI: ffffffff81690d60 RDI: 0000000000000000 [ 682.972789] RBP: ffff88908343a0d0 R08: 0000000000000000 R09: 0000000000000000 [ 682.981040] R10: 0000000000000286 R11: 3fffffffffffffff R12: ffffffff81690d60 [ 682.989282] R13: ffffffff81690a00 R14: ffff8890819807a8 R15: ffff88908343a36c [ 682.997535] FS: 00007f08c7bfa740(0000) GS:ffff88a03fd00000(0000) knlGS:0000000000000000 [ 683.006910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 683.013557] CR2: 000000000000000f CR3: 0000001080a66003 CR4: 00000000003706e0 [ 683.021819] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 683.030075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 683.038336] Call Trace: [ 683.041167] devm_kfree+0x33/0x50 [ 683.045004] ice_vsi_free_arrays+0x5e/0xc0 [ice] [ 683.050380] ice_vsi_rebuild+0x4c8/0x750 [ice] [ 683.055543] ice_vsi_recfg_qs+0x9a/0x110 [ice] [ 683.060697] ice_set_channels+0x14f/0x290 [ice] [ 683.065962] ethnl_set_channels+0x333/0x3f0 [ 683.070807] genl_family_rcv_msg_doit+0xea/0x150 [ 683.076152] genl_rcv_msg+0xde/0x1d0 [ 683.080289] ? channels_prepare_data+0x60/0x60 [ 683.085432] ? genl_get_cmd+0xd0/0xd0 [ 683.089667] netlink_rcv_skb+0x50/0xf0 [ 683.094006] genl_rcv+0x24/0x40 [ 683.097638] netlink_unicast+0x239/0x340 [ 683.102177] netlink_sendmsg+0x22e/0x470 [ 683.106717] sock_sendmsg+0x5e/0x60 [ 683.110756] __sys_sendto+0xee/0x150 [ 683.114894] ? handle_mm_fault+0xd0/0x2a0 [ 683.119535] ? do_user_addr_fault+0x1f3/0x690 [ 683.134173] __x64_sys_sendto+0x25/0x30 [ 683.148231] do_syscall_64+0x3b/0xc0 [ 683.161992] entry_SYSCALL_64_after_hwframe+0x44/0xae Fix this by taking into account the value that num_possible_cpus() yields in addition to vsi->alloc_txq instead of doubling the latter. Fixes: efc2214b6047 ("ice: Add support for XDP") Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path") Reviewed-by: Alexander Lobakin <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-22nixge: fix mac address error handling againArnd Bergmann1-2/+2
The change to eth_hw_addr_set() caused gcc to correctly spot a bug that was introduced in an earlier incorrect fix: In file included from include/linux/etherdevice.h:21, from drivers/net/ethernet/ni/nixge.c:7: In function '__dev_addr_set', inlined from 'eth_hw_addr_set' at include/linux/etherdevice.h:319:2, inlined from 'nixge_probe' at drivers/net/ethernet/ni/nixge.c:1286:3: include/linux/netdevice.h:4648:9: error: 'memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread] 4648 | memcpy(dev->dev_addr, addr, len); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As nixge_get_nvmem_address() can return either NULL or an error pointer, the NULL check is wrong, and we can end up reading from ERR_PTR(-EOPNOTSUPP), which gcc knows to contain zero readable bytes. Make the function always return an error pointer again but fix the check to match that. Fixes: f3956ebb3bf0 ("ethernet: use eth_hw_addr_set() instead of ether_addr_copy()") Fixes: abcd3d6fc640 ("net: nixge: Fix error path for obtaining mac address") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-22net: ax88796c: do not receive data in pointerNicolas Iooss1-1/+1
Function axspi_read_status calls: ret = spi_write_then_read(ax_spi->spi, ax_spi->cmd_buf, 1, (u8 *)&status, 3); status is a pointer to a struct spi_status, which is 3-byte wide: struct spi_status { u16 isr; u8 status; }; But &status is the pointer to this pointer, and spi_write_then_read does not dereference this parameter: int spi_write_then_read(struct spi_device *spi, const void *txbuf, unsigned n_tx, void *rxbuf, unsigned n_rx) Therefore axspi_read_status currently receive a SPI response in the pointer status, which overwrites 24 bits of the pointer. Thankfully, on Little-Endian systems, the pointer is only used in le16_to_cpus(&status->isr); ... which is a no-operation. So there, the overwritten pointer is not dereferenced. Nevertheless on Big-Endian systems, this can lead to dereferencing pointers after their 24 most significant bits were overwritten. And in all systems this leads to possible use of uninitialized value in functions calling spi_write_then_read which expect status to be initialized when the function returns. Moreover function axspi_read_status (and macro AX_READ_STATUS) do not seem to be used anywhere. So currently this seems to be dead code. Fix the issue anyway so that future code works properly when using function axspi_read_status. Fixes: a97c69ba4f30 ("net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver") Signed-off-by: Nicolas Iooss <[email protected]> Acked-by: Łukasz Stelmach <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-22net: stmmac: retain PTP clock time during SIOCSHWTSTAMP ioctlsHolger Assmann3-47/+81
Currently, when user space emits SIOCSHWTSTAMP ioctl calls such as enabling/disabling timestamping or changing filter settings, the driver reads the current CLOCK_REALTIME value and programming this into the NIC's hardware clock. This might be necessary during system initialization, but at runtime, when the PTP clock has already been synchronized to a grandmaster, a reset of the timestamp settings might result in a clock jump. Furthermore, if the clock is also controlled by phc2sys in automatic mode (where the UTC offset is queried from ptp4l), that UTC-to-TAI offset (currently 37 seconds in 2021) would be temporarily reset to 0, and it would take a long time for phc2sys to readjust so that CLOCK_REALTIME and the PHC are apart by 37 seconds again. To address the issue, we introduce a new function called stmmac_init_tstamp_counter(), which gets called during ndo_open(). It contains the code snippet moved from stmmac_hwtstamp_set() that manages the time synchronization. Besides, the sub second increment configuration is also moved here since the related values are hardware dependent and runtime invariant. Furthermore, the hardware clock must be kept running even when no time stamping mode is selected in order to retain the synchronized time base. That way, timestamping can be enabled again at any time only with the need to compensate the clock's natural drifting. As a side effect, this patch fixes the issue that ptp_clock_info::enable can be called before SIOCSHWTSTAMP and the driver (which looks at priv->systime_flags) was not prepared to handle that ordering. Fixes: 92ba6888510c ("stmmac: add the support for PTP hw clock driver") Reported-by: Michael Olbrich <[email protected]> Signed-off-by: Ahmad Fatoum <[email protected]> Signed-off-by: Holger Assmann <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-22nfp: checking parameter process for rx-usecs/tx-usecs is invalidDiana Wang2-4/+1
Use nn->tlv_caps.me_freq_mhz instead of nn->me_freq_mhz to check whether rx-usecs/tx-usecs is valid. This is because nn->tlv_caps.me_freq_mhz represents the clock_freq (MHz) of the flow processing cores (FPC) on the NIC. While nn->me_freq_mhz is not be set. Fixes: ce991ab6662a ("nfp: read ME frequency from vNIC ctrl memory") Signed-off-by: Diana Wang <[email protected]> Signed-off-by: Simon Horman <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-19iavf: Fix VLAN feature flags after VFRBrett Creeley3-23/+56
When a VF goes through a reset, it's possible for the VF's feature set to change. For example it may lose the VIRTCHNL_VF_OFFLOAD_VLAN capability after VF reset. Unfortunately, the driver doesn't correctly deal with this situation and errors are seen from downing/upping the interface and/or moving the interface in/out of a network namespace. When setting the interface down/up we see the following errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF: ice 0000:51:00.1: VF 1 failed opcode 12, retval: -64 iavf 0000:51:09.1: Failed to add VLAN filter, error IAVF_NOT_SUPPORTED ice 0000:51:00.1: VF 1 failed opcode 13, retval: -64 iavf 0000:51:09.1: Failed to delete VLAN filter, error IAVF_NOT_SUPPORTED These add/delete errors are happening because the VLAN filters are tracked internally to the driver and regardless of the VLAN_ALLOWED() setting the driver tries to delete/re-add them over virtchnl. Fix the delete failure by making sure to delete any VLAN filter tracking in the driver when a removal request is made, while preventing the virtchnl request. This makes it so the driver's VLAN list is up to date and the errors are Fix the add failure by making sure the check for VLAN_ALLOWED() during reset is done after the VF receives its capability list from the PF via VIRTCHNL_OP_GET_VF_RESOURCES. If VLAN functionality is not allowed, then prevent requesting re-adding the filters over virtchnl. When moving the interface into a network namespace we see the following errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF: iavf 0000:51:09.1 enp81s0f1v1: NIC Link is Up Speed is 25 Gbps Full Duplex iavf 0000:51:09.1 temp_27: renamed from enp81s0f1v1 iavf 0000:51:09.1 mgmt: renamed from temp_27 iavf 0000:51:09.1 dev27: set_features() failed (-22); wanted 0x020190001fd54833, left 0x020190001fd54bb3 These errors are happening because we aren't correctly updating the netdev capabilities and dealing with ndo_fix_features() and ndo_set_features() correctly. Fix this by only reporting errors in the driver's ndo_set_features() callback when VIRTCHNL_VF_OFFLOAD_VLAN is not allowed and any attempt to enable the VLAN features is made. Also, make sure to disable VLAN insertion, filtering, and stripping since the VIRTCHNL_VF_OFFLOAD_VLAN flag applies to all of them and not just VLAN stripping. Also, after we process the capabilities in the VF reset path, make sure to call netdev_update_features() in case the capabilities have changed in order to update the netdev's feature set to match the VF's actual capabilities. Lastly, make sure to always report success on VLAN filter delete when VIRTCHNL_VF_OFFLOAD_VLAN is not supported. The changed flow in iavf_del_vlans() allows the stack to delete previosly existing VLAN filters even if VLAN filtering is not allowed. This makes it so the VLAN filter list is up to date. Fixes: 8774370d268f ("i40e/i40evf: support for VF VLAN tag stripping control") Signed-off-by: Brett Creeley <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-19iavf: Fix refreshing iavf adapter stats on ethtool requestJedrzej Jagielski4-0/+25
Currently iavf adapter statistics are refreshed only in a watchdog task, triggered approximately every two seconds, which causes some ethtool requests to return outdated values. Add explicit statistics refresh when requested by ethtool -S. Fixes: b476b0030e61 ("iavf: Move commands processing to the separate function") Signed-off-by: Jan Sokolowski <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-19iavf: Fix deadlock occurrence during resetting VF interfaceJedrzej Jagielski1-2/+5
System hangs if close the interface is called from the kernel during the interface is in resetting state. During resetting operation the link is closing but kernel didn't know it and it tried to close this interface again what sometimes led to deadlock. Inform kernel about current state of interface and turn off the flag IFF_UP when interface is closing until reset is finished. Previously it was most likely to hang the system when kernel (network manager) tried to close the interface in the same time when interface was in resetting state because of deadlock. Fixes: 3c8e0b989aa1 ("i40vf: don't stop me now") Signed-off-by: Jaroslaw Gawin <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-19iavf: Prevent changing static ITR values if adaptive moderation is onNitesh B Venkatesh1-4/+26
Resolve being able to change static values on VF when adaptive interrupt moderation is enabled. This problem is fixed by checking the interrupt settings is not a combination of change of static value while adaptive interrupt moderation is turned on. Without this fix, the user would be able to change static values on VF with adaptive moderation enabled. Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation") Signed-off-by: Nitesh B Venkatesh <[email protected]> Tested-by: George Kuruvinakunnel <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-19stmmac_pci: Fix underflow size in stmmac_rxZekun Shen1-4/+5
This bug report came up when we were testing the device driver by fuzzing. It shows that buf1_len can get underflowed and be 0xfffffffc (4294967292). This bug is triggerable with a compromised/malfunctioning device. We found the bug through QEMU emulation tested the patch with emulation. We did NOT test it on real hardware. Attached is the bug report by fuzzing. BUG: KASAN: use-after-free in stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac] Read of size 4294967292 at addr ffff888016358000 by task ksoftirqd/0/9 CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G W 5.6.0 #1 Call Trace: dump_stack+0x76/0xa0 print_address_description.constprop.0+0x16/0x200 ? stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac] ? stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac] __kasan_report.cold+0x37/0x7c ? stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac] kasan_report+0xe/0x20 check_memory_region+0x15a/0x1d0 memcpy+0x20/0x50 stmmac_napi_poll_rx+0x1c08/0x36e0 [stmmac] ? stmmac_suspend+0x850/0x850 [stmmac] ? __next_timer_interrupt+0xba/0xf0 net_rx_action+0x363/0xbd0 ? call_timer_fn+0x240/0x240 ? __switch_to_asm+0x40/0x70 ? napi_busy_loop+0x520/0x520 ? __schedule+0x839/0x15a0 __do_softirq+0x18c/0x634 ? takeover_tasklets+0x5f0/0x5f0 run_ksoftirqd+0x15/0x20 smpboot_thread_fn+0x2f1/0x6b0 ? smpboot_unregister_percpu_thread+0x160/0x160 ? __kthread_parkme+0x80/0x100 ? smpboot_unregister_percpu_thread+0x160/0x160 kthread+0x2b5/0x3b0 ? kthread_create_on_node+0xd0/0xd0 ret_from_fork+0x22/0x40 Reported-by: Brendan Dolan-Gavitt <[email protected]> Signed-off-by: Zekun Shen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-19atlantic: fix double-free in aq_ring_tx_cleanZekun Shen1-1/+2
We found this bug while fuzzing the device driver. Using and freeing the dangling pointer buff->skb would cause use-after-free and double-free. This bug is triggerable with compromised/malfunctioning devices. We found the bug with QEMU emulation and tested the patch by emulation. We did NOT test on a real device. Attached is the bug report. BUG: KASAN: double-free or invalid-free in consume_skb+0x6c/0x1c0 Call Trace: dump_stack+0x76/0xa0 print_address_description.constprop.0+0x16/0x200 ? consume_skb+0x6c/0x1c0 kasan_report_invalid_free+0x61/0xa0 ? consume_skb+0x6c/0x1c0 __kasan_slab_free+0x15e/0x170 ? consume_skb+0x6c/0x1c0 kfree+0x8c/0x230 consume_skb+0x6c/0x1c0 aq_ring_tx_clean+0x5c2/0xa80 [atlantic] aq_vec_poll+0x309/0x5d0 [atlantic] ? _sub_I_65535_1+0x20/0x20 [atlantic] ? __next_timer_interrupt+0xba/0xf0 net_rx_action+0x363/0xbd0 ? call_timer_fn+0x240/0x240 ? __switch_to_asm+0x34/0x70 ? napi_busy_loop+0x520/0x520 ? net_tx_action+0x379/0x720 __do_softirq+0x18c/0x634 ? takeover_tasklets+0x5f0/0x5f0 run_ksoftirqd+0x15/0x20 smpboot_thread_fn+0x2f1/0x6b0 ? smpboot_unregister_percpu_thread+0x160/0x160 ? __kthread_parkme+0x80/0x100 ? smpboot_unregister_percpu_thread+0x160/0x160 kthread+0x2b5/0x3b0 ? kthread_create_on_node+0xd0/0xd0 ret_from_fork+0x22/0x40 Reported-by: Brendan Dolan-Gavitt <[email protected]> Signed-off-by: Zekun Shen <[email protected]> Reviewed-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-19net: marvell: prestera: fix double free issue on err pathVolodymyr Mytnyk1-4/+2
fix error path handling in prestera_bridge_port_join() that cases prestera driver to crash (see below). Trace: Internal error: Oops: 96000044 [#1] SMP Modules linked in: prestera_pci prestera uio_pdrv_genirq CPU: 1 PID: 881 Comm: ip Not tainted 5.15.0 #1 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : prestera_bridge_destroy+0x2c/0xb0 [prestera] lr : prestera_bridge_port_join+0x2cc/0x350 [prestera] sp : ffff800011a1b0f0 ... x2 : ffff000109ca6c80 x1 : dead000000000100 x0 : dead000000000122 Call trace: prestera_bridge_destroy+0x2c/0xb0 [prestera] prestera_bridge_port_join+0x2cc/0x350 [prestera] prestera_netdev_port_event.constprop.0+0x3c4/0x450 [prestera] prestera_netdev_event_handler+0xf4/0x110 [prestera] raw_notifier_call_chain+0x54/0x80 call_netdevice_notifiers_info+0x54/0xa0 __netdev_upper_dev_link+0x19c/0x380 Fixes: e1189d9a5fbe ("net: marvell: prestera: Add Switchdev driver implementation") Signed-off-by: Volodymyr Mytnyk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-19net: marvell: prestera: fix brige port operationVolodymyr Mytnyk1-1/+1
Return NOTIFY_DONE (dont't care) for switchdev notifications that prestera driver don't know how to handle them. With introduction of SWITCHDEV_BRPORT_[UN]OFFLOADED switchdev events, the driver rejects adding swport to bridge operation which is handled by prestera_bridge_port_join() func. The root cause of this is that prestera driver returns error (EOPNOTSUPP) in prestera_switchdev_blk_event() handler for unknown swdev events. This causes switchdev_bridge_port_offload() to fail when adding port to bridge in prestera_bridge_port_join(). Fixes: 957e2235e526 ("net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge") Signed-off-by: Volodymyr Mytnyk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18net: ethernet: dec: tulip: de4x5: fix possible array overflows in ↵Teng Qi1-0/+4
type3_infoblock() The definition of macro MOTO_SROM_BUG is: #define MOTO_SROM_BUG (lp->active == 8 && (get_unaligned_le32( dev->dev_addr) & 0x00ffffff) == 0x3e0008) and the if statement if (MOTO_SROM_BUG) lp->active = 0; using this macro indicates lp->active could be 8. If lp->active is 8 and the second comparison of this macro is false. lp->active will remain 8 in: lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].mc = get_unaligned_le16(p); p += 2; lp->phy[lp->active].ana = get_unaligned_le16(p); p += 2; lp->phy[lp->active].fdx = get_unaligned_le16(p); p += 2; lp->phy[lp->active].ttm = get_unaligned_le16(p); p += 2; lp->phy[lp->active].mci = *p; However, the length of array lp->phy is 8, so array overflows can occur. To fix these possible array overflows, we first check lp->active and then return -EINVAL if it is greater or equal to ARRAY_SIZE(lp->phy) (i.e. 8). Reported-by: TOTE Robot <[email protected]> Signed-off-by: Teng Qi <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of ↵zhangyue1-13/+17
bound In line 5001, if all id in the array 'lp->phy[8]' is not 0, when the 'for' end, the 'k' is 8. At this time, the array 'lp->phy[8]' may be out of bound. Signed-off-by: zhangyue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-David S. Miller3-136/+147
queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-11-17 This series contains updates to i40e driver only. Eryk adds accounting for VLAN header in packet size when VF port VLAN is configured. He also fixes TC queue distribution when the user has changed queue counts as well as for configuration of VF ADQ which caused dropped packets. Michal adds tracking for when a VSI is being released to prevent null pointer dereference when managing filters. Karen ensures PF successfully initiates VF requested reset which could cause a call trace otherwise. Jedrzej moves validation of channel queue value earlier to prevent partial configuration when the value is invalid. Grzegorz corrects the reported error when adding filter fails. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-11-18e100: fix device suspend/resumeJesse Brandeburg1-5/+13
As reported in [1], e100 was no longer working for suspend/resume cycles. The previous commit mentioned in the fixes appears to have broken things and this attempts to practice best known methods for device power management and keep wake-up working while allowing suspend/resume to work. To do this, I reorder a little bit of code and fix the resume path to make sure the device is enabled. [1] https://bugzilla.kernel.org/show_bug.cgi?id=214933 Fixes: 69a74aef8a18 ("e100: use generic power management") Cc: Vaibhav Gupta <[email protected]> Reported-by: Alexey Kuznetsov <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Alexey Kuznetsov <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-18ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in ↵Teng Qi1-0/+4
hns_dsaf_ge_srst_by_port() The if statement: if (port >= DSAF_GE_NUM) return; limits the value of port less than DSAF_GE_NUM (i.e., 8). However, if the value of port is 6 or 7, an array overflow could occur: port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off; because the length of dsaf_dev->mac_cb is DSAF_MAX_PORT_NUM (i.e., 6). To fix this possible array overflow, we first check port and if it is greater than or equal to DSAF_MAX_PORT_NUM, the function returns. Reported-by: TOTE Robot <[email protected]> Signed-off-by: Teng Qi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-17octeontx2-af: debugfs: don't corrupt user memoryDan Carpenter1-7/+10
The user supplies the "count" value to say how big its read buffer is. The rvu_dbg_lmtst_map_table_display() function does not take the "count" into account but instead just copies the whole table, potentially corrupting the user's data. Introduce the "ret" variable to store how many bytes we can copy. Also I changed the type of "off" to size_t to make using min() simpler. Fixes: 0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table") Signed-off-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/20211117073454.GD5237@kili Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-17i40e: Fix display error code in dmesgGrzegorz Szczurek1-3/+2
Fix misleading display error in dmesg if tc filter return fail. Only i40e status error code should be converted to string, not linux error code. Otherwise, we return false information about the error. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Dave Switzer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix creation of first queue by omitting it if is not power of twoJedrzej Jagielski1-40/+19
Reject TCs creation with proper message if the first queue assignment is not equal to the power of two. The first queue number was checked too late in the second queue iteration, if second queue was configured at all. Now if first queue value is not a power of two, then trying to create qdisc will be rejected. Fixes: 8f88b3034db3 ("i40e: Add infrastructure for queue channel support") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Jedrzej Jagielski <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix warning message and call stack during rmmod i40e driverKaren Sornek1-32/+21
Restore part of reset functionality used when reset is called from the VF to reset itself. Without this fix warning message is displayed when VF is being removed via sysfs. Fix the crash of the VF during reset by ensuring that the PF receives the reset message successfully. Refactor code to use one function instead of two. Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Karen Sornek <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix ping is lost after configuring ADq on VFEryk Rybak3-8/+74
Properly reconfigure VF VSIs after VF request ADQ. Created new function to update queue mapping and queue pairs per TC with AQ update VSI. This sets proper RSS size on NIC. VFs num_queue_pairs should not be changed during setup of queue maps. Previously, VF main VSI in ADQ had configured too many queues and had wrong RSS size, which lead to packets not being consumed and drops in connectivity. Fixes: bc6d33c8d93f ("i40e: Fix the number of queues available to be mapped for use") Co-developed-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Eryk Rybak <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix changing previously set num_queue_pairs for PFsEryk Rybak1-12/+23
Currently, the i40e_vsi_setup_queue_map is basing the count of queues in TCs on a VSI's alloc_queue_pairs member which is not changed throughout any user's action (for example via ethtool's set_channels callback). This implies that vsi->tc_config.tc_info[n].qcount value that is given to the kernel via netdev_set_tc_queue() that notifies about the count of queues per particular traffic class is constant even if user has changed the total count of queues. This in turn caused the kernel warning after setting the queue count to the lower value than the initial one: $ ethtool -l ens801f0 Channel parameters for ens801f0: Pre-set maximums: RX: 0 TX: 0 Other: 1 Combined: 64 Current hardware settings: RX: 0 TX: 0 Other: 1 Combined: 64 $ ethtool -L ens801f0 combined 40 [dmesg] Number of in use tx queues changed invalidating tc mappings. Priority traffic classification disabled! Reason was that vsi->alloc_queue_pairs stayed at 64 value which was used to set the qcount on TC0 (by default only TC0 exists so all of the existing queues are assigned to TC0). we update the offset/qcount via netdev_set_tc_queue() back to the old value but then the netif_set_real_num_tx_queues() is using the vsi->num_queue_pairs as a value which got set to 40. Fix it by using vsi->req_queue_pairs as a queue count that will be distributed across TCs. Do it only for non-zero values, which implies that user actually requested the new count of queues. For VSIs other than main, stay with the vsi->alloc_queue_pairs as we only allow manipulating the queue count on main VSI. Fixes: bc6d33c8d93f ("i40e: Fix the number of queues available to be mapped for use") Co-developed-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Co-developed-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Przemyslaw Patynowski <[email protected]> Signed-off-by: Eryk Rybak <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix NULL ptr dereference on VSI filter syncMichal Maloszewski2-2/+4
Remove the reason of null pointer dereference in sync VSI filters. Added new I40E_VSI_RELEASING flag to signalize deleting and releasing of VSI resources to sync this thread with sync filters subtask. Without this patch it is possible to start update the VSI filter list after VSI is removed, that's causing a kernel oops. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Grzegorz Szczurek <[email protected]> Signed-off-by: Michal Maloszewski <[email protected]> Reviewed-by: Przemyslaw Patynowski <[email protected]> Reviewed-by: Witold Fijalkowski <[email protected]> Reviewed-by: Jaroslaw Gawin <[email protected]> Reviewed-by: Aleksandr Loktionov <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17i40e: Fix correct max_pkt_size on VF RX queueEryk Rybak1-44/+9
Setting VLAN port increasing RX queue max_pkt_size by 4 bytes to take VLAN tag into account. Trigger the VF reset when setting port VLAN for VF to renegotiate its capabilities and reinitialize. Fixes: ba4e003d29c1 ("i40e: don't hold spinlock while resetting VF") Signed-off-by: Sylwester Dziedziuch <[email protected]> Signed-off-by: Aleksandr Loktionov <[email protected]> Signed-off-by: Eryk Rybak <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-11-17net: ax88796c: use bit numbers insetad of bit masksŁukasz Stelmach1-3/+3
Change the values of EVENT_* constants from bit masks to bit numbers as accepted by {clear,set,test}_bit() functions. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Łukasz Stelmach <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-17net: dpaa2-eth: fix use-after-free in dpaa2_eth_removePavel Skripkin1-2/+2
Access to netdev after free_netdev() will cause use-after-free bug. Move debug log before free_netdev() call to avoid it. Fixes: 7472dd9f6499 ("staging: fsl-dpaa2/eth: Move print message") Signed-off-by: Pavel Skripkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-17Merge tag 'mlx5-fixes-2021-11-16' of ↵David S. Miller16-89/+122
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2021-11-16 Please pull this mlx5 fixes series, or let me know in case of any problem. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-11-16net: stmmac: Fix signed/unsigned wreckageThomas Gleixner1-13/+10
The recent addition of timestamp correction to compensate the CDC error introduced a subtle signed/unsigned bug in stmmac_get_tx_hwtstamp() while it managed for some obscure reason to avoid that in stmmac_get_rx_hwtstamp(). The issue is: s64 adjust = 0; u64 ns; adjust += -(2 * (NSEC_PER_SEC / priv->plat->clk_ptp_rate)); ns += adjust; works by chance on 64bit, but falls apart on 32bit because the compiler knows that adjust fits into 32bit and then treats the addition as a u64 + u32 resulting in an off by ~2 seconds failure. The RX variant uses an u64 for adjust and does the adjustment via ns -= adjust; because consistency is obviously overrated. Get rid of the pointless zero initialized adjust variable and do: ns -= (2 * NSEC_PER_SEC) / priv->plat->clk_ptp_rate; which is obviously correct and spares the adjust obfuscation. Aside of that it yields a more accurate result because the multiplication takes place before the integer divide truncation and not afterwards. Stick the calculation into an inline so it can't be accidentally disimproved. Return an u32 from that inline as the result is guaranteed to fit which lets the compiler optimize the substraction. Cc: [email protected] Fixes: 3600be5f58c1 ("net: stmmac: add timestamp correction to rid CDC sync error") Reported-by: Benedikt Spranger <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Benedikt Spranger <[email protected]> Tested-by: Kurt Kanzenbach <[email protected]> # Intel EHL Link: https://lore.kernel.org/r/87mtm578cs.ffs@tglx Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-16bnxt_en: Fix compile error regression when CONFIG_BNXT_SRIOV is not setMichael Chan2-1/+11
bp->sriov_cfg is not defined when CONFIG_BNXT_SRIOV is not set. Fix it by adding a helper function bnxt_sriov_cfg() to handle the logic with or without the config option. Fixes: 46d08f55d24e ("bnxt_en: extend RTNL to VF check in devlink driver_reinit") Reported-by: kernel test robot <[email protected]> Reviewed-by: Edwin Peer <[email protected]> Reviewed-by: Andy Gospodarek <[email protected]> Signed-off-by: Michael Chan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>