path: root/drivers/net/ethernet
Age  Commit message  Author  Files  Lines
2021-02-16net: enetc: fix destroyed phylink dereference during unbindVladimir Oltean1-2/+3
The following call path suggests that calling unregister_netdev on an interface that is up will first bring it down. enetc_pf_remove -> unregister_netdev -> unregister_netdevice_queue -> unregister_netdevice_many -> dev_close_many -> __dev_close_many -> enetc_close -> enetc_stop -> phylink_stop However, enetc first destroys the phylink instance, then calls unregister_netdev. This is already dissimilar to the setup (and error path teardown path) from enetc_pf_probe, but more than that, it is buggy because it is invalid to call phylink_stop after phylink_destroy. So let's first unregister the netdev (and let the .ndo_stop events consume themselves), then destroy the phylink instance, then free the netdev. Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
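The ordering argument above can be illustrated with a small userspace model. This is only a sketch: every name in it (`model_port`, `model_unregister`, and so on) is hypothetical and merely stands in for the kernel calls named in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_port {
    bool netdev_up;         /* interface is administratively up */
    bool netdev_registered; /* netdev is still registered */
    bool phylink_alive;     /* false once the phylink instance is destroyed */
    int invalid_stops;      /* counts "stop after destroy" events */
};

/* Stands in for phylink_stop(): only valid while the instance exists. */
void model_phylink_stop(struct model_port *p)
{
    if (!p->phylink_alive)
        p->invalid_stops++;
}

/* Stands in for unregister_netdev(): an interface that is up is closed first. */
void model_unregister(struct model_port *p)
{
    assert(p->netdev_registered);
    if (p->netdev_up) {
        model_phylink_stop(p);
        p->netdev_up = false;
    }
    p->netdev_registered = false;
}

/* Stands in for phylink_destroy(). */
void model_phylink_destroy(struct model_port *p)
{
    p->phylink_alive = false;
}

/* Order before the fix: destroy phylink, then unregister the netdev. */
int remove_buggy(void)
{
    struct model_port p = { true, true, true, 0 };

    model_phylink_destroy(&p);
    model_unregister(&p);
    return p.invalid_stops;
}

/* Order after the fix: unregister the netdev, then destroy phylink. */
int remove_fixed(void)
{
    struct model_port p = { true, true, true, 0 };

    model_unregister(&p);
    model_phylink_destroy(&p);
    return p.invalid_stops;
}
```

In this model `remove_buggy()` records one stop on a destroyed instance, while `remove_fixed()` records none.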
2021-02-16net: mvneta: Implement mqprio supportMaxime Chevallier1-0/+61
Implement a basic MQPrio support, inserting rules in RX that translate the TC to prio mapping into vlan prio to queues. The TX logic stays the same as when we don't offload the qdisc. Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16net: mvneta: Remove per-cpu queue mapping for Armada 3700Maxime Chevallier1-1/+8
According to Errata #23 "The per-CPU GbE interrupt is limited to Core 0", we can't use the per-cpu interrupt mechanism on the Armada 3700 family. This is correctly checked for RSS configuration, but the initial queue mapping is still done by having the queues spread across all the CPUs in the system, both in the init path and in the cpu_hotplug path. Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 SoC") Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxDavid S. Miller8-117/+399
Saeed Mahameed says: ==================== pull-request: mlx5-next 2021-02-16 The patches in this pr are already submitted and reviewed through the netdev and rdma mailing lists. The series includes mlx5 HW bits and definitions for mlx5 real time clock translation and handling in the mlx5 driver clock module to enable and support such mode [1] [1] https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/ ==================== Signed-off-by: David S. Miller <[email protected]>
2021-02-16drivers: net: xilinx_emaclite: remove arch limitationGary Guo2-3/+2
The changes made in eccd540 are enough for xilinx_emaclite to run without problem on 64-bit systems. I have tested it on a Xilinx FPGA with RV64 softcore. The architecture limitation in Kconfig seems no longer necessary. A small change is included to print the address with %lx instead of casting to int and printing with %x. Signed-off-by: Gary Guo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16net: mscc: ocelot: Add support for MRPHoratiu Vultur4-1/+245
Add basic support for MRP. The HW will just trap all MRP frames on the ring ports to CPU and allow the SW to process them. In this way it is possible for this node to behave both as MRM and MRC. Current limitations are:
- it doesn't support Interconnect roles.
- it supports only a single ring.
- the HW should be able to do forwarding of MRP Test frames so the SW will not need to do this. So it would be able to have the role MRC without SW support.
Signed-off-by: Horatiu Vultur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16tg3: Remove unused PHY_BRCM flagsFlorian Fainelli1-6/+0
The tg3 driver tried to communicate towards the PHY driver whether it wanted RGMII in-band signaling enabled or disabled; however, nothing in drivers/net/phy/broadcom.c looks at those flags, so this does not do anything. Suggested-by: Vladimir Oltean <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFPShyam Sundar S K1-0/+3
Frequent link up/down events can happen when a Bel Fuse SFP part is connected to the amd-xgbe device. Try to avoid the frequent link issues by resetting the PHY as documented in Bel Fuse SFP datasheets. Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <[email protected]> Signed-off-by: Sudheesh Mavila <[email protected]> Signed-off-by: Shyam Sundar S K <[email protected]> Acked-by: Tom Lendacky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16net: amd-xgbe: Reset link when the link never comes backShyam Sundar S K2-1/+9
Normally, auto negotiation and reconnect should be automatically done by the hardware. But there seems to be an issue where auto negotiation has to be restarted manually. This happens because of link training and so even though still connected to the partner the link never "comes back". This needs an auto-negotiation restart. Also, a change in xgbe-mdio is needed to get ethtool to recognize the link down and get the link change message. This change is only required in a backplane connection mode. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Co-developed-by: Sudheesh Mavila <[email protected]> Signed-off-by: Sudheesh Mavila <[email protected]> Signed-off-by: Shyam Sundar S K <[email protected]> Acked-by: Tom Lendacky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warningShyam Sundar S K2-1/+1
The current driver calls netif_carrier_off() late in the link tear down which can result in a netdev watchdog timeout. Calling netif_carrier_off() immediately after netif_tx_stop_all_queues() avoids the warning.

 ------------[ cut here ]------------
 NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out
 WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220
 Modules linked in: amd_xgbe(E)
 amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down
 CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E
 Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019
 RIP: 0010:dev_watchdog+0x20d/0x220
 Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48
 RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0
 RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018
 R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000
 R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010
 FS: 0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0
 Call Trace:
 <IRQ>
 ? pfifo_fast_reset+0x100/0x100
 call_timer_fn+0x2b/0x130
 run_timer_softirq+0x3e8/0x440
 ? enqueue_hrtimer+0x39/0x90

Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <[email protected]> Signed-off-by: Sudheesh Mavila <[email protected]> Signed-off-by: Shyam Sundar S K <[email protected]> Acked-by: Tom Lendacky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16net: amd-xgbe: Reset the PHY rx data path when mailbox command timeoutShyam Sundar S K2-1/+41
Sometimes mailbox commands timeout when the RX data path becomes unresponsive. This prevents the submission of new mailbox commands to DXIO. This patch identifies the timeout and resets the RX data path so that the next message can be submitted properly. Fixes: 549b32af9f7c ("amd-xgbe: Simplify mailbox interface rate change code") Co-developed-by: Sudheesh Mavila <[email protected]> Signed-off-by: Sudheesh Mavila <[email protected]> Signed-off-by: Shyam Sundar S K <[email protected]> Acked-by: Tom Lendacky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16net/mlx5: Add cyc2time HW translation mode supportAya Levin8-38/+241
Device timestamp can be in real time mode (cycles to time translation is offloaded into the Hardware). With real time mode, HW provides timestamp which is already translated into nanoseconds. With this mode, driver adjusts both the HW and timecounter (to keep clock_info_page updated) using callbacks: adjfreq, adjtime and settime. HW clock modifications are done via MTUTC access reg commands. Driver is allowed to modify HW real time clock only if MCAM ptpcyc2realtime_modify capability is set. Add MTUTC set function to be used for configuring the HW real time clock. Modify existing code to support both internal timer (with conversion via timecounter_cyc2time()) and real time (no conversions). Align the signatures of the helpers converting from timestamp to nanoseconds. With that, when allocating a queue assign the corresponding callback with respect to the capability. Adjust 1PPS timestamp calculation flows based on the timestamp mode. Cyc2time offload brings two major advantages: - Improve MTAE (Max Time Absolute Error) for HW TS by up to 160 ns over a 100% loaded CPU. - Faster data-path timestamp to nanoseconds, as translation is lock-less and done in HW. On real time mode, timestamp format is 32 high bits of seconds and 32 low bits of nanoseconds. On some flows, driver shall convert this format into nanoseconds wall-clock with REAL_TIME_TO_NS macro. HW supports a single clock, and it is shared by all functions on a device. In case real time clock is used, it is recommended to use a single GM to all device's functions. Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Aya Levin <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
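The real time timestamp layout described above (seconds in the high 32 bits, nanoseconds in the low 32 bits) can be converted to a flat nanosecond count as sketched below. The function and constant names are hypothetical. This is not the driver's REAL_TIME_TO_NS macro, only a model of the conversion the message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant for this sketch. */
#define MODEL_NSEC_PER_SEC 1000000000ULL

/*
 * Convert a timestamp laid out as 32 high bits of seconds and
 * 32 low bits of nanoseconds into a single nanosecond count.
 */
uint64_t model_real_time_to_ns(uint64_t ts)
{
    uint64_t sec = ts >> 32;            /* high 32 bits */
    uint64_t nsec = ts & 0xffffffffULL; /* low 32 bits */

    return sec * MODEL_NSEC_PER_SEC + nsec;
}
```

For example, a timestamp holding 2 seconds and 5 nanoseconds maps to 2000000005 ns.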
2021-02-16net/mlx5: Move some PPS logic into helper functionsEran Ben Elisha1-40/+73
Some of PPS logic (timestamp calculations) fits only internal timer timestamp mode. Move these logics into helper functions. Later in the patchset cyc2time HW translation mode will expose its own PPS timestamp calculations. With this change, main flow will only hold calling PPS logic based on run time mode. Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2021-02-16net/mlx5: Move all internal timer metadata into a dedicated structEran Ben Elisha2-47/+63
Internal timer mode (SW clock) requires some PTP clock related metadata structs. Real time mode (HW clock) will not need these metadata structs. This separation emphasizes the different interfaces for HW clock and SW clock. Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Aya Levin <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2021-02-16net/mlx5: Refactor init clock functionEran Ben Elisha1-23/+53
Function mlx5_init_clock() is responsible for internal PTP related metadata initializations. Break mlx5_init_clock() into sub-functions, each taking care of its own logic. Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2021-02-16octeontx2-af: cn10k: Fixes CN10K RPM reference issueGeetha sowjanya4-6/+12
This patch fixes references to uninitialized variables and debugfs entry name for CN10K platform and HW_TSO flag check. Fixes: 3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support"). Signed-off-by: Geetha sowjanya <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> v1-v2 - Clear HW_TSO flag for 96xx B0 version. This patch fixes the bug introduced by the commit 3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support"). These changes are not yet merged into net branch, hence submitting to net-next. Signed-off-by: David S. Miller <[email protected]>
2021-02-16ionic: Remove unused function pointer typedef ionic_reset_cbChen Lin1-2/+0
Remove the 'ionic_reset_cb' typedef as it is not used. Signed-off-by: Chen Lin <[email protected]> Acked-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-16kbuild: simplify access to the kernel's versionSasha Levin1-2/+2
Instead of storing the version in a single integer and having various kernel (and userspace) code know how it's constructed, export individual (major, patchlevel, sublevel) components and simplify kernel code that uses it. This should also make it easier on userspace. Signed-off-by: Sasha Levin <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
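To make the "single integer" concrete, here is a minimal sketch of the traditional packing, in which the version code is built as `(major << 16) + (patchlevel << 8) + sublevel`. The `model_kver` names are hypothetical and exist only for this illustration. The sketch also assumes each of the two lower components fits in 8 bits.

```c
#include <assert.h>

/* Hypothetical container for the three version components. */
struct model_kver {
    unsigned major;
    unsigned patchlevel;
    unsigned sublevel;
};

/* Pack the components the way the single-integer version code is built. */
unsigned model_kver_pack(unsigned major, unsigned patchlevel, unsigned sublevel)
{
    /* Model precondition: the two lower components fit in one byte each. */
    assert(patchlevel <= 255 && sublevel <= 255);
    return (major << 16) + (patchlevel << 8) + sublevel;
}

/* What every consumer of the packed integer has to re-derive by hand. */
struct model_kver model_kver_unpack(unsigned code)
{
    struct model_kver v;

    v.major = code >> 16;
    v.patchlevel = (code >> 8) & 0xff;
    v.sublevel = code & 0xff;
    return v;
}
```

Exporting the components directly spares callers the shift-and-mask step in `model_kver_unpack()`.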
2021-02-15i40e: Fix uninitialized variable mfs_maxColin Ian King1-1/+1
The variable mfs_max is not initialized and is being compared to find the maximum value. Fix this by initializing it to 0. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 90bc8e003be2 ("i40e: Add hardware configuration for software based DCB") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
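The class of bug described above can be shown with a short max-search loop. This is a sketch with hypothetical names, not the i40e code: the running maximum must start from a defined value, here 0, before the first comparison.

```c
#include <assert.h>

/*
 * Return the largest entry of mfs[0..n-1], or 0 when n is 0.
 * Leaving mfs_max uninitialized would make the first comparison
 * read an indeterminate value.
 */
unsigned model_max_frame_size(const unsigned *mfs, int n)
{
    unsigned mfs_max = 0; /* the fix: start from a known value */
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (mfs[i] > mfs_max)
            mfs_max = mfs[i];
    }
    return mfs_max;
}
```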
2021-02-15net: phy: rename PHY_IGNORE_INTERRUPT to PHY_MAC_INTERRUPTHeiner Kallweit3-4/+4
Some internal PHYs have their events like link change reported by the MAC interrupt. We have PHY_IGNORE_INTERRUPT to deal with this scenario. I'm not too happy with this name. We don't ignore interrupts, typically there is no interrupt exposed at a PHY level. So let's rename it to PHY_MAC_INTERRUPT. This is in line with phy_mac_interrupt(), which is called from the MAC interrupt handler to handle PHY events. Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Acked-by: Florian Fainelli <[email protected]> Reviewed-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: serialize access to work queue on removeSukadev Bhattiprolu2-8/+24
The work queue is used to queue reset requests like CHANGE-PARAM or FAILOVER resets for the worker thread. When the adapter is being removed the adapter state is set to VNIC_REMOVING and the work queue is flushed so no new work is added. However the check for adapter being removed is racy in that the adapter can go into REMOVING state just after we check and we might end up adding work just as it is being flushed (or after). The ->rwi_lock is already being used to serialize queue/dequeue work. Extend its usage to ensure there is no race when scheduling/flushing work. Fixes: 6954a9e4192b ("ibmvnic: Flush existing work items before device removal") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Cc:Uwe Kleine-König <[email protected]> Cc:Saeed Mahameed <[email protected]> Reviewed-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: skip send_request_unmap for timeout resetLijun Pan1-1/+6
Timeout reset will trigger the VIOS to unmap it automatically, similar to FAILOVER and MOBILITY events. If we unmap it on the Linux side, we will see errors like "30000003: Error 4 in REQUEST_UNMAP_RSP". So, don't call send_request_unmap for timeout reset. Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: add memory barrier to protect long term bufferLijun Pan1-0/+5
A dma_rmb() barrier is added to load the long term buffer before copying it to the socket buffer, and a dma_wmb() barrier is added to update the long term buffer before it is accessed by VIOS (virtual i/o server). Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <[email protected]> Acked-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: substitute mb() with dma_wmb() for send_*crq* functionsLijun Pan1-2/+2
The CRQ and subCRQ descriptors are DMA mapped, so dma_wmb(), though weaker, is good enough to protect the data structures. Signed-off-by: Lijun Pan <[email protected]> Acked-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: simplify reset_long_term_buff functionLijun Pan1-38/+8
The only thing reset_long_term_buff() should do is set buffer to zero. After doing that, it is not necessary to send_request_map again to VIOS since it actually does not change the mapping. So, keep memset function and remove all others. Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15i40e: Fix incorrect argument in call to ipv6_addr_any()Gustavo A. R. Silva1-1/+1
It seems that the right argument to be passed is &tcp_ip6_spec->ip6dst, not &tcp_ip6_spec->ip6src, when calling function ipv6_addr_any(). Addresses-Coverity-ID: 1501734 ("Copy-paste error") Fixes: efca91e89b67 ("i40e: Add flow director support for IPv6") Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15net: broadcom: bcm4908_enet: set MTU on open & on requestRafał Miłecki1-6/+25
Hardware comes up with default max frame size set to 1518. When using it with switch it results in actual Ethernet MTU 1492: 1518 - 14 (Ethernet header) - 4 (Broadcom's tag) - 4 (802.1q) - 4 (FCS) Above means hardware in its default state can't handle standard Ethernet traffic (MTU 1500). Define maximum possible Ethernet overhead and always set MAC max frame length accordingly. This change fixes handling Ethernet frames of length 1506 - 1514. Signed-off-by: Rafał Miłecki <[email protected]> Signed-off-by: David S. Miller <[email protected]>
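The arithmetic in the message can be written out as a small sketch. The constant and function names below are hypothetical and chosen for this illustration. Only the byte counts come from the text above.

```c
#include <assert.h>

/* Per-frame overhead components listed in the commit message, in bytes. */
enum {
    MODEL_ETH_HDR_LEN = 14, /* Ethernet header */
    MODEL_BRCM_TAG_LEN = 4, /* Broadcom's tag */
    MODEL_VLAN_TAG_LEN = 4, /* 802.1q */
    MODEL_FCS_LEN = 4,      /* frame check sequence */
    MODEL_MAX_OVERHEAD = MODEL_ETH_HDR_LEN + MODEL_BRCM_TAG_LEN +
                         MODEL_VLAN_TAG_LEN + MODEL_FCS_LEN
};

/* Largest MTU that fits in a given MAC max frame length. */
unsigned model_mtu_for_max_frame(unsigned max_frame)
{
    assert(max_frame >= MODEL_MAX_OVERHEAD);
    return max_frame - MODEL_MAX_OVERHEAD;
}

/* MAC max frame length needed to carry a given MTU. */
unsigned model_max_frame_for_mtu(unsigned mtu)
{
    return mtu + MODEL_MAX_OVERHEAD;
}
```

With the default limit of 1518 this gives an MTU of 1492, and carrying a standard MTU of 1500 needs a max frame length of 1526.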
2021-02-15net: stmmac: Add Toshiba Visconti SoCs glue driverNobuhiro Iwamatsu3-0/+294
Add dwmac-visconti to the stmmac driver in Toshiba Visconti ARM SoCs. This patch contains only the basic function of the device. There is no clock control, PM, etc. yet. These will be added in the future. Signed-off-by: Nobuhiro Iwamatsu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15r8169: fix resuming from suspend on RTL8105e if machine runs on batteryHeiner Kallweit1-0/+2
Armin reported that after referenced commit his RTL8105e is dead when resuming from suspend and machine runs on battery. This patch has been confirmed to fix the issue. Fixes: e80bd76fbf56 ("r8169: work around power-saving bug on some chip versions") Reported-by: Armin Wolf <[email protected]> Tested-by: Armin Wolf <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15net: mvpp2: Add TX flow control support for jumbo framesStefan Chulski1-0/+26
With MTU less than 1500B on all ports, the driver uses per CPU pool mode. If one of the ports is set to jumbo frame MTU size, all ports move to shared pools mode. Here, buffer manager TX Flow Control is reconfigured on all ports. Signed-off-by: Stefan Chulski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15net: mvpp2: reduce tx-fifo for loopback portStefan Chulski2-7/+7
1KB is enough for loopback port, so 2KB can be distributed between other ports. Signed-off-by: Stefan Chulski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15net: mscc: ocelot: avoid type promotion when calling ocelot_ifh_set_destVladimir Oltean1-1/+1
Smatch is confused by the fact that a 32-bit BIT(port) macro is passed as argument to the ocelot_ifh_set_dest function and warns: ocelot_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type? seville_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type? The destination port mask is copied into a 12-bit field of the packet, starting at bit offset 67 and ending at 56. So this DSA tagging protocol supports at most 12 bits, which is clearly less than 32. Attempting to send to a port number > 12 will cause the packing() call to truncate way before there will be 32-bit truncation due to type promotion of the BIT(port) argument towards u64. Therefore, smatch's fears that BIT(port) will do the wrong thing and cause unexpected truncation for "port" values >= 32 are unfounded. Nonetheless, let's silence the warning by explicitly passing an u64 value to ocelot_ifh_set_dest, such that the compiler does not need to do a questionable type promotion. Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
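The reasoning about the destination mask can be sketched in plain C. The names below are hypothetical. The 12-bit field width is taken from the message, and the masking only models the effect of packing into that field; it is not the kernel's packing() API.

```c
#include <assert.h>
#include <stdint.h>

/* Width of the destination port mask field, per the commit message. */
#define MODEL_DEST_BITS 12

/* Build the port mask directly as a 64-bit value, so no promotion is needed. */
uint64_t model_dest_mask(unsigned port)
{
    assert(port < 64); /* shifting a 64-bit value by 64 or more is undefined */
    return (uint64_t)1 << port;
}

/* Model of storing the mask in the 12-bit header field: high bits are lost. */
uint64_t model_pack_dest(uint64_t mask)
{
    return mask & (((uint64_t)1 << MODEL_DEST_BITS) - 1);
}
```

Ports 0 to 11 survive the 12-bit field, while any higher port is truncated away by the field width long before a 32-bit shift could matter.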
2021-02-15cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and uldsAyush Sawal3-6/+11
The max imm data size in cxgb4 is not the same as the max imm data size in chtls. This caused a mismatch in the output of is_ofld_imm() between cxgb4 and chtls. Fix this by keeping the max wreq size of imm data the same in both chtls and cxgb4, as MAX_IMM_OFLD_TX_DATA_WR_LEN. Since cxgb4's max imm data value for ofld packets is changed to MAX_IMM_OFLD_TX_DATA_WR_LEN, use the same value in cxgbit as well. Fixes: 36bedb3f2e5b8 ("crypto: chtls - Inline TLS record Tx") Signed-off-by: Ayush Sawal <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15r8169: fix resuming from suspend on RTL8105e if machine runs on batteryHeiner Kallweit1-0/+1
Armin reported that after referenced commit his RTL8105e is dead when resuming from suspend and machine runs on battery. This patch has been confirmed to fix the issue. Fixes: e80bd76fbf56 ("r8169: work around power-saving bug on some chip versions") Reported-by: Armin Wolf <[email protected]> Tested-by: Armin Wolf <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mvpp2: improve Networking Complex Control register namingStefan Chulski2-7/+7
GENCONF_CTRL0_PORTX naming improved. Non functional change. Signed-off-by: Stefan Chulski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mvpp2: improve mvpp2_get_sram returnStefan Chulski1-3/+1
Use PTR_ERR_OR_ZERO instead of IS_ERR and PTR_ERR. Non functional change. Signed-off-by: Stefan Chulski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
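The equivalence behind this cleanup can be modeled in userspace. The `model_*` helpers below are hypothetical stand-ins that mimic the kernel convention of encoding a small negative errno in a pointer value. They are not the kernel helpers themselves and rely on the usual implementation-defined integer/pointer casts.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *model_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* A pointer is an error if it falls in the top MODEL_MAX_ERRNO addresses. */
int model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
long model_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Open-coded form: test for an error, then extract it. */
int model_open_coded(const void *p)
{
    if (model_is_err(p))
        return (int)model_ptr_err(p);
    return 0;
}

/* Single-helper form with the same result. */
int model_ptr_err_or_zero(const void *p)
{
    return model_is_err(p) ? (int)model_ptr_err(p) : 0;
}
```

Both forms return 0 for a valid pointer and the encoded errno for an error pointer, which is why the change is non-functional.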
2021-02-14net: mvpp2: improve Packet Processor version checkStefan Chulski1-18/+18
Use >= MVPP22 instead of != MVPP21. Non functional change. Signed-off-by: Stefan Chulski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mvpp2: simplify PPv2 version ID readStefan Chulski1-4/+2
PPv2.1 contains 0 in the Version ID register, so the priv->hw_version check can be removed. Signed-off-by: Stefan Chulski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: dsa: tag_ocelot_8021q: add support for PTP timestampingVladimir Oltean2-2/+8
For TX timestamping, we use the felix_txtstamp method which is common with the regular (non-8021q) ocelot tagger. This method says that skb deferral is needed, prepares a timestamp request ID, and puts a clone of the skb in a queue waiting for the timestamp IRQ. felix_txtstamp is called by dsa_skb_tx_timestamp() just before the tagger's xmit method. In the tagger xmit, we divert the packets classified by dsa_skb_tx_timestamp() as PTP towards the MMIO-based injection registers, and we declare them as dead towards dsa_slave_xmit. If not PTP, we proceed with normal tag_8021q stuff. Then the timestamp IRQ fires, the clone queued up from felix_txtstamp is matched to the TX timestamp retrieved from the switch's FIFO based on the timestamp request ID, and the clone is delivered to the stack. On RX, thanks to the VCAP IS2 rule that redirects the frames with an EtherType for 1588 towards two destinations: - the CPU port module (for MMIO based extraction) and - if the "no XTR IRQ" workaround is in place, the dsa_8021q CPU port the relevant data path processing starts in the ptp_classify_raw BPF classifier installed by DSA in the RX data path (post tagger, which is completely unaware that it saw a PTP packet). This time we can't reuse the same implementation of .port_rxtstamp that also works with the default ocelot tagger. That is because felix_rxtstamp is given an skb with a freshly stripped DSA header, and it says "I don't need deferral for its RX timestamp, it's right in it, let me show you"; and it just points to the header right behind skb->data, from where it unpacks the timestamp and annotates the skb with it. The same thing cannot happen with tag_ocelot_8021q, because for one thing, the skb did not have an extraction frame header in the first place, but a VLAN tag with no timestamp information. So the code paths in felix_rxtstamp for the regular and 8021q tagger are completely independent. 
With tag_8021q, the timestamp must come from the packet's duplicate delivered to the CPU port module, but there is potentially complex logic to be handled [ and prone to reordering ] if we were to just start reading packets from the CPU port module, and try to match them to the one we received over Ethernet and which needs an RX timestamp. So we do something simple: we tell DSA "give me some time to think" (we request skb deferral by returning false from .port_rxtstamp) and we just drop the frame we got over Ethernet with no attempt to match it to anything - we just treat it as a notification that there's data to be processed from the CPU port module's queues. Then we proceed to read the packets from those, one by one, which we deliver up the stack, timestamped, using netif_rx - the same function that any driver would use anyway if it needed RX timestamp deferral. So the assumption is that we'll come across the PTP packet that triggered the CPU extraction notification eventually, but we don't know when exactly. Thanks to the VCAP IS2 trap/redirect rule and the exclusion of the CPU port module from the flooding replicators, only PTP frames should be present in the CPU port module's RX queues anyway. There is just one conflict between the VCAP IS2 trapping rule and the semantics of the BPF classifier. Namely, ptp_classify_raw() deems general messages as non-timestampable, but still, those are trapped to the CPU port module since they have an EtherType of ETH_P_1588. So, if the "no XTR IRQ" workaround is in place, we need to run another BPF classifier on the frames extracted over MMIO, to avoid duplicates being sent to the stack (once over Ethernet, once over MMIO). It doesn't look like it's possible to install VCAP IS2 rules based on keys extracted from the 1588 frame headers. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mscc: ocelot: refactor ocelot_xtr_irq_handler into ocelot_xtr_pollVladimir Oltean2-135/+156
Since the felix DSA driver will need to poll the CPU port module for extracted frames as well, let's create some common functions that read an Extraction Frame Header, and then an skb, from a CPU extraction group. We abuse the struct ocelot_ops :: port_to_netdev function a little bit, in order to retrieve the DSA port net_device or the ocelot switchdev net_device based on the source port information from the Extraction Frame Header, but it's all in the benefit of code simplification - netdev_alloc_skb needs it. Originally, the port_to_netdev method was intended for parsing act->dev from tc flower offload code. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mscc: ocelot: use common tag parsing code with DSAVladimir Oltean3-77/+24
The Injection Frame Header and Extraction Frame Header that the switch prepends to frames over the NPI port are also prepended to frames delivered over the CPU port module's queues. Let's unify the handling of the frame headers by making the ocelot driver call some helpers exported by the DSA tagger. Among other things, this allows us to get rid of the strange cpu_to_be32 when transmitting the Injection Frame Header on ocelot, since the packing API uses network byte order natively (when "quirks" is 0). The comments above ocelot_gen_ifh talk about setting pop_cnt to 3, and the cpu extraction queue mask to something, but the code doesn't do it, so we don't do it either. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mscc: ocelot: refactor ocelot_port_inject_frame out of ocelot_port_xmitVladimir Oltean2-74/+87
The felix DSA driver will inject some frames through register MMIO, same as ocelot switchdev currently does. So we need to be able to reuse the common code. Also create some shim definitions, since the DSA tagger can be compiled without support for the switch driver. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mscc: ocelot: use DIV_ROUND_UP helper in ocelot_port_inject_frameVladimir Oltean1-1/+1
This looks a bit nicer than the open-coded "(x + 3) / 4" idiom. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
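For reference, rounding a byte length up to a count of 4-byte words is the same as computing `(x + 3) / 4` by hand. A minimal sketch follows. The macro is a local copy of the usual round-up definition and the function name is hypothetical.

```c
#include <assert.h>

/* Local copy of the usual round-up division: ceil(n / d) for positive d. */
#define MODEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of 32-bit words needed to carry len bytes. */
unsigned model_words_for_len(unsigned len)
{
    return MODEL_DIV_ROUND_UP(len, 4);
}
```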
2021-02-14net: mscc: ocelot: better error handling in ocelot_xtr_irq_handlerVladimir Oltean1-10/+12
The ocelot_rx_frame_word() function can return a negative error code, however this isn't being checked for consistently. Errors being ignored have not been seen in practice though. Also, some constructs can be simplified by using "goto" instead of repeated "break" statements. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14net: mscc: ocelot: only drain extraction queue on errorVladimir Oltean1-1/+1
It appears that the intention of this snippet of code is to not exit ocelot_xtr_irq_handler() while in the middle of extracting a frame. The problem in extracting it word by word is that future extraction attempts are really easy to get desynchronized, since the IRQ handler assumes that the first 16 bytes are the IFH, which give further information about the frame, such as frame length. But during normal operation, "err" will not be 0, but 4, set from here:

	for (i = 0; i < OCELOT_TAG_LEN / 4; i++) {
		err = ocelot_rx_frame_word(ocelot, grp, true, &ifh[i]);
		if (err != 4)
			break;
	}

	if (err != 4)
		break;

In that case, draining the extraction queue is a no-op. So explicitly make this code execute only on negative err. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
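The distinction between "non-zero" and "negative" can be sketched with a userspace model. All names here are hypothetical. The sketch assumes, as the message implies, that a successful word read returns 4 (the number of bytes read) and that failures return a negative value.

```c
#include <assert.h>

/*
 * Model of the header read loop: results[i] is what the i-th word read
 * returned (4 on success, negative on failure). Returns the last result.
 */
int model_read_header(const int *results, int nwords)
{
    int err = 0;
    int i;

    for (i = 0; i < nwords; i++) {
        err = results[i];
        if (err != 4)
            break;
    }
    return err;
}

/* Drain the extraction queue only on a real error, not on the success value 4. */
int model_should_drain(int err)
{
    return err < 0;
}
```

A plain non-zero test would also trigger on the success value 4, which is the case the commit rules out.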
2021-02-14net: mscc: ocelot: stop returning IRQ_NONE in ocelot_xtr_irq_handlerVladimir Oltean1-5/+2
Since the xtr (extraction) IRQ of the ocelot switch is not shared, then if it fired, it means that some data must be present in the queues of the CPU port module. So simplify the code. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14bnxt_en: Improve logging of error recovery settings information.Michael Chan1-7/+8
We currently only log the error recovery settings if it is enabled. In some cases, firmware disables error recovery after it was initially enabled. Without logging anything, the user will not be aware of this change in setting. Log it when error recovery is disabled. Also, change the reset count value from hexadecimal to decimal. Reviewed-by: Edwin Peer <[email protected]> Reviewed-by: Pavan Chebbi <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14bnxt_en: Reply to firmware's echo request async message.Michael Chan2-0/+32
This is a new async message that the firmware can send to check if it can communicate with the driver. This is an added error detection scheme that firmware can use if it suspects errors in the PCIe interface. When the driver receives this async message, it will reply back echoing some data in the async message. If the firmware is not getting the reply with the proper data after some retries, error recovery will kick in. Reviewed-by: Andy Gospodarek <[email protected]> Reviewed-by: Edwin Peer <[email protected]> Reviewed-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-14bnxt_en: Initialize "context kind" field for context memory blocks.Michael Chan1-5/+42
If firmware provides the offset to the "context kind" field of the relevant context memory blocks, we'll initialize just that field for each block instead of initializing all of context memory. Populate the bnxt_mem_init structure with the proper offset returned by firmware. If it is older firmware and the information is not available, we set the offset to an invalid value and fall back to the old behavior of initializing every byte. Otherwise, we initialize only the "context kind" byte at the offset. Reviewed-by: Edwin Peer <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
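The two initialization strategies can be compared with a small sketch. Everything here is hypothetical (the names, the sentinel value, and the block layout). It only models the idea of writing one "context kind" byte per block when a valid offset is known, and falling back to a full fill otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel meaning "firmware did not report an offset". */
#define MODEL_OFFSET_INVALID 0xffff

/*
 * Initialize context memory made of fixed-size blocks.
 * With a valid offset, touch only one byte per block; otherwise fall back
 * to initializing every byte.
 */
void model_init_ctx_mem(uint8_t *mem, size_t total, size_t block_size,
                        uint16_t offset, uint8_t init_val)
{
    size_t i;

    assert(mem != NULL);
    if (offset == MODEL_OFFSET_INVALID || block_size == 0 ||
        offset >= block_size) {
        memset(mem, init_val, total); /* old behavior: every byte */
        return;
    }
    for (i = 0; i + block_size <= total; i += block_size)
        mem[i + offset] = init_val;   /* new behavior: one byte per block */
}
```

The fast path writes `total / block_size` bytes instead of `total` bytes, which is where the time saving described in the series would come from.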
2021-02-14bnxt_en: Add context memory initialization infrastructure.Michael Chan2-18/+53
Currently, the driver calls memset() to set all relevant context memory used by the chip to the initial value. This can take many milliseconds with the potentially large number of context pages allocated for the chip. To make this faster, we only need to initialize the "context kind" field of each block of context memory. This patch sets up the infrastructure to do that with the bnxt_mem_init structure. In the next patch, we'll add the logic to obtain the offset of the "context kind" from the firmware. This patch is not changing the current behavior of calling memset() to initialize all relevant context memory. Reviewed-by: Pavan Chebbi <[email protected]> Reviewed-by: Edwin Peer <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>