aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice
AgeCommit message (Collapse)AuthorFilesLines
2021-09-27ice: Open devlink when device is readyLeon Romanovsky1-4/+2
Move devlink_registration routine to be the last command, when the device is fully initialized. Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-24ice: Delete always true check of PF pointerLeon Romanovsky1-3/+0
PF pointer is always valid when PCI core calls its .shutdown() and .remove() callbacks. There is no need to check it again. Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series") Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-22devlink: Make devlink_register to be voidLeon Romanovsky3-16/+4
devlink_register() can't fail and always returns success, but all drivers are obligated to check returned status anyway. This adds a lot of boilerplate code to handle impossible flow. Make devlink_register() void and simplify the drivers that use that API call. Signed-off-by: Leon Romanovsky <[email protected]> Acked-by: Simon Horman <[email protected]> Acked-by: Vladimir Oltean <[email protected]> # dsa Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-09-10ice: Correctly deal with PFs that do not support RDMADave Ertman2-0/+8
There are two cases where the current PF does not support RDMA functionality. The first is if the NVM loaded on the device is set to not support RDMA (common_caps.rdma is false). The second is if the kernel bonding driver has included the current PF in an active link aggregate. When the driver has determined that this PF does not support RDMA, then auxiliary devices should not be created on the auxiliary bus. Without a device on the auxiliary bus, even if the irdma driver is present, there will be no RDMA activity attempted on this PF. Currently, in the reset flow, an attempt to create auxiliary devices is performed without regard to the ability of the PF. There needs to be a check in ice_aux_plug_dev (as the central point that creates auxiliary devices) to see if the PF is in a state to support the functionality. When disabling and re-enabling RDMA due to the inclusion/removal of the PF in a link aggregate, we also need to set/clear the bit which controls auxiliary device creation so that a reset recovery in a link aggregate situation doesn't try to create auxiliary devices when it shouldn't. Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Reported-by: Yongxin Liu <[email protected]> Signed-off-by: Dave Ertman <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-16/+63
include/linux/netdevice.h net/socket.c d0efb16294d1 ("net: don't unconditionally copy_from_user a struct ifreq for socket ioctls") 876f0bf9d0d5 ("net: socket: simplify dev_ifconf handling") 29c4964822aa ("net: socket: rework compat_ifreq_ioctl()") Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-27ice: Only lock to update netdev dev_addrBrett Creeley1-4/+9
commit 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync list") introduced calls to netif_addr_lock_bh() and netif_addr_unlock_bh() in the driver's ndo_set_mac() callback. This is fine since the driver is updated the netdev's dev_addr, but since this is a spinlock, the driver cannot sleep when the lock is held. Unfortunately the functions to add/delete MAC filters depend on a mutex. This was causing a trace with the lock debug kernel config options enabled when changing the mac address via iproute. [ 203.273059] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [ 203.273065] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6698, name: ip [ 203.273068] Preemption disabled at: [ 203.273068] [<ffffffffc04aaeab>] ice_set_mac_address+0x8b/0x1c0 [ice] [ 203.273097] CPU: 31 PID: 6698 Comm: ip Tainted: G S W I 5.14.0-rc4 #2 [ 203.273100] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020 [ 203.273102] Call Trace: [ 203.273107] dump_stack_lvl+0x33/0x42 [ 203.273113] ? ice_set_mac_address+0x8b/0x1c0 [ice] [ 203.273124] ___might_sleep.cold.150+0xda/0xea [ 203.273131] mutex_lock+0x1c/0x40 [ 203.273136] ice_remove_mac+0xe3/0x180 [ice] [ 203.273155] ? ice_fltr_add_mac_list+0x20/0x20 [ice] [ 203.273175] ice_fltr_prepare_mac+0x43/0xa0 [ice] [ 203.273194] ice_set_mac_address+0xab/0x1c0 [ice] [ 203.273206] dev_set_mac_address+0xb8/0x120 [ 203.273210] dev_set_mac_address_user+0x2c/0x50 [ 203.273212] do_setlink+0x1dd/0x10e0 [ 203.273217] ? __nla_validate_parse+0x12d/0x1a0 [ 203.273221] __rtnl_newlink+0x530/0x910 [ 203.273224] ? __kmalloc_node_track_caller+0x17f/0x380 [ 203.273230] ? preempt_count_add+0x68/0xa0 [ 203.273236] ? _raw_spin_lock_irqsave+0x1f/0x30 [ 203.273241] ? kmem_cache_alloc_trace+0x4d/0x440 [ 203.273244] rtnl_newlink+0x43/0x60 [ 203.273245] rtnetlink_rcv_msg+0x13a/0x380 [ 203.273248] ? rtnl_calcit.isra.40+0x130/0x130 [ 203.273250] netlink_rcv_skb+0x4e/0x100 [ 203.273256] netlink_unicast+0x1a2/0x280 [ 203.273258] netlink_sendmsg+0x242/0x490 [ 203.273260] sock_sendmsg+0x58/0x60 [ 203.273263] ____sys_sendmsg+0x1ef/0x260 [ 203.273265] ? copy_msghdr_from_user+0x5c/0x90 [ 203.273268] ? ____sys_recvmsg+0xe6/0x170 [ 203.273270] ___sys_sendmsg+0x7c/0xc0 [ 203.273272] ? copy_msghdr_from_user+0x5c/0x90 [ 203.273274] ? ___sys_recvmsg+0x89/0xc0 [ 203.273276] ? __netlink_sendskb+0x50/0x50 [ 203.273278] ? mod_objcg_state+0xee/0x310 [ 203.273282] ? __dentry_kill+0x114/0x170 [ 203.273286] ? get_max_files+0x10/0x10 [ 203.273288] __sys_sendmsg+0x57/0xa0 [ 203.273290] do_syscall_64+0x37/0x80 [ 203.273295] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.273296] RIP: 0033:0x7f8edf96e278 [ 203.273298] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 63 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55 [ 203.273300] RSP: 002b:00007ffcb8bdac08 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 203.273303] RAX: ffffffffffffffda RBX: 000000006115e0ae RCX: 00007f8edf96e278 [ 203.273304] RDX: 0000000000000000 RSI: 00007ffcb8bdac70 RDI: 0000000000000003 [ 203.273305] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffcb8bda5b0 [ 203.273306] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 203.273306] R13: 0000555e10092020 R14: 0000000000000000 R15: 0000000000000005 Fix this by only locking when changing the netdev->dev_addr. Also, make sure to restore the old netdev->dev_addr on any failures. Fixes: 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync list") Signed-off-by: Brett Creeley <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27ice: restart periodic outputs around time changesJacob Keller1-0/+49
When we enabled auxiliary input/output support for the E810 device, we forgot to add logic to restart the output when we change time. This is important as the periodic output will be incorrect after a time change otherwise. This unfortunately includes the adjust time function, even though it uses an atomic hardware interface. The atomic adjustment can still cause the pin output to stall permanently, so we need to stop and restart it. Introduce wrapper functions to temporarily disable and then re-enable the clock outputs. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Sunitha D Mekala <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27ice: add lock around Tx timestamp tracker flushJacob Keller1-0/+4
The driver didn't take the lock while flushing the Tx tracker, which could cause a race where one thread is trying to read timestamps out while another thread is trying to read the tracker to check the timestamps. Avoid this by ensuring that flushing is locked against read accesses. Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27ice: remove dead code for allocating pin_configJacob Keller1-11/+0
We have code in the ice driver which allocates the pin_config structure if n_pins is > 0, but we never set n_pins to be greater than zero. There's no reason to keep this code until we actually have pin_config support. Remove this. We can re-add it properly when we implement support for pin_config for E810-T devices. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-27ice: fix Tx queue iteration for Tx timestamp enablementJacob Keller1-1/+1
The driver accidentally copied the ice_for_each_rxq iterator when implementing enablement of the ptp_tx bit for the Tx rings. We still load the Tx rings and set the ptp_tx field, but we iterate over the count of the num_rxq. If the number of Tx and Rx queues differ, this could either cause a buffer overrun when accessing the tx_rings list if num_txq is greater than num_rxq, or it could cause us to fail to enable Tx timestamps for some rings. This was not noticed originally as we generally have the same number of Tx and Rx queues. Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+3
drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg. Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-24ethtool: extend coalesce setting uAPI with CQE modeYufeng Mo1-4/+8
In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: Yufeng Mo <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-19ice: do not abort devlink info if board identifier can't be foundJacob Keller1-1/+3
The devlink dev info command reports version information about the device and firmware running on the board. This includes the "board.id" field which is supposed to represent an identifier of the board design. The ice driver uses the Product Board Assembly identifier for this. In some cases, the PBA is not present in the NVM. If this happens, devlink dev info will fail with an error. Instead, modify the ice_info_pba function to just exit without filling in the context buffer. This will cause the board.id field to be skipped. Log a dev_dbg message in case someone wants to confirm why board.id is not showing up for them. Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+1
drivers/ptp/Kconfig: 55c8fca1dae1 ("ptp_pch: Restore dependency on PCI") e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies") Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-13ice: Fix perout start time roundingMaciej Machnikowski1-1/+1
Internal tests found out that the latest code doesn't bring up 1PPS out as expected. As a result of incorrect define used to round the time up the time was round down to the past second boundary. Fix define used for rounding to properly round up to the next Top of second in ice_ptp_cfg_clkout to fix it. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Signed-off-by: Maciej Machnikowski <[email protected]> Tested-by: Sunitha Mekala <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-8/+28
Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h 9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") 099fdeda659d ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c 5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") 2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS 7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-09ice: don't remove netdev->dev_addr from uc sync listBrett Creeley1-8/+15
In some circumstances, such as with bridging, it's possible that the stack will add the device's own MAC address to its unicast address list. If, later, the stack deletes this address, the driver will receive a request to remove this address. The driver stores its current MAC address as part of the VSI MAC filter list instead of separately. So, this causes a problem when the device's MAC address is deleted unexpectedly, which results in traffic failure in some cases. The following configuration steps will reproduce the previously mentioned problem: > ip link set eth0 up > ip link add dev br0 type bridge > ip link set br0 up > ip addr flush dev eth0 > ip link set eth0 master br0 > echo 1 > /sys/class/net/br0/bridge/vlan_filtering > modprobe -r veth > modprobe -r bridge > ip addr add 192.168.1.100/24 dev eth0 The following ping command fails due to the netdev->dev_addr being deleted when removing the bridge module. > ping <link partner> Fix this by making sure to not delete the netdev->dev_addr during MAC address sync. After fixing this issue it was noticed that the netdev_warn() in .set_mac was overly verbose, so make it at netdev_dbg(). Also, there is a possibility of a race condition between .set_mac and .set_rx_mode. Fix this by calling netif_addr_lock_bh() and netif_addr_unlock_bh() on the device's netdev when the netdev->dev_addr is going to be updated in .set_mac. Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by: Brett Creeley <[email protected]> Tested-by: Liang Li <[email protected]> Tested-by: Gurucharan G <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-09ice: Stop processing VF messages during teardownAnirudh Venkataramanan2-0/+8
When VFs are setup and torn down in quick succession, it is possible that a VF is torn down by the PF while the VF's virtchnl requests are still in the PF's mailbox ring. Processing the VF's virtchnl request when the VF itself doesn't exist results in undefined behavior. Fix this by adding a check to stop processing virtchnl requests when VF teardown is in progress. Fixes: ddf30f7ff840 ("ice: Add handler to configure SR-IOV") Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-09ice: Prevent probing virtual functionsAnirudh Venkataramanan1-0/+5
The userspace utility "driverctl" can be used to change/override the system's default driver choices. This is useful in some situations (buggy driver, old driver missing a device ID, trying a workaround, etc.) where the user needs to load a different driver. However, this is also prone to user error, where a driver is mapped to a device it's not designed to drive. For example, if the ice driver is mapped to driver iavf devices, the ice driver crashes. Add a check to return an error if the ice driver is being used to probe a virtual function. Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series") Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Gurucharan G <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-08-09devlink: Set device as early as possibleLeon Romanovsky1-2/+2
All kernel devlink implementations call to devlink_alloc() during initialization routine for specific device which is used later as a parent device for devlink_register(). Such late device assignment causes to the situation which requires us to call to device_register() before setting other parameters, but that call opens devlink to the world and makes accessible for the netlink users. Any attempt to move devlink_register() to be the last call generates the following error due to access to the devlink->dev pointer. [ 8.758862] devlink_nl_param_fill+0x2e8/0xe50 [ 8.760305] devlink_param_notify+0x6d/0x180 [ 8.760435] __devlink_params_register+0x2f1/0x670 [ 8.760558] devlink_params_register+0x1e/0x20 The simple change of API to set devlink device in the devlink_alloc() instead of devlink_register() fixes all this above and ensures that prior to call to devlink_register() everything already set. Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-27dev_ioctl: split out ndo_eth_ioctlArnd Bergmann1-3/+3
Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jay Vosburgh <[email protected]> Cc: Veaceslav Falico <[email protected]> Cc: Andy Gospodarek <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Vivien Didelot <[email protected]> Cc: Florian Fainelli <[email protected]> Cc: Vladimir Oltean <[email protected]> Cc: Leon Romanovsky <[email protected]> Cc: [email protected] Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Jason Gunthorpe <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2-8/+1
Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your *net-next* tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-06-25ice: Fix a memory leak in an error handling path in 'ice_pf_dcb_cfg()'Christophe JAILLET1-2/+4
If this 'kzalloc()' fails we must free some resources as in all the other error handling paths of this function. Fixes: 348048e724a0 ("ice: Implement iidc operations") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-25ice: remove unnecessary VSI assignmentTony Nguyen1-1/+0
ice_get_vf_vsi() is being called twice for the same VSI. Remove the unnecessary call/assignment. Signed-off-by: Tony Nguyen <[email protected]> Tested-by: Tony Brelinski <[email protected]>
2021-06-25ice: remove the VSI info from previous aggVictor Raj1-2/+22
Remove the VSI info from previous aggregator after moving the VSI to a new aggregator. Signed-off-by: Victor Raj <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-25ice: add support for auxiliary input/output pinsMaciej Machnikowski4-0/+366
The E810 device supports programmable pins for enabling both input and output events related to the PTP hardware clock. This includes both output signals with programmable period, as well as timestamping of events on input pins. Add support for enabling these using the CONFIG_PTP_1588_CLOCK interface. This allows programming the software defined pins to take advantage of the hardware clock features. Signed-off-by: Maciej Machnikowski <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-25ice: add tracepointsJesse Brandeburg3-0/+249
This patch is modeled after one by Scott Peterson for i40e. Add tracepoints to the driver, via a new file ice_trace.h and some new trace calls added in interesting places in the driver. Add some tracing for DIMLIB to help debug interrupt moderation problems. Performance should not be affected, and this can be very useful for debugging and adding new trace events to paths in the future. Note eBPF programs can attach to these events, as well as perf can count them since we're attaching to the events subsystem in the kernel. Co-developed-by: Ben Shelton <[email protected]> Signed-off-by: Ben Shelton <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-24intel: Remove rcu_read_lock() around XDP program invocationToke Høiland-Jørgensen2-8/+1
The Intel drivers all have rcu_read_lock()/rcu_read_unlock() pairs around XDP program invocations. However, the actual lifetime of the objects referred by the XDP program invocation is longer, all the way through to the call to xdp_do_flush(), making the scope of the rcu_read_lock() too small. This turns out to be harmless because it all happens in a single NAPI poll cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock() misleading. Rather than extend the scope of the rcu_read_lock(), just get rid of it entirely. With the addition of RCU annotations to the XDP_REDIRECT map types that take bh execution into account, lockdep even understands this to be safe, so there's really no reason to keep it around. Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Jesper Dangaard Brouer <[email protected]> # i40e Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/bpf/[email protected]
2021-06-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-8/+25
Trivial conflicts in net/can/isotp.c and tools/testing/selftests/net/mptcp/mptcp_connect.sh scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c to include/linux/ptp_clock_kernel.h in -next so re-apply the fix there. Signed-off-by: Jakub Kicinski <[email protected]>
2021-06-18ice: report hash type such as L2/L3/L4Jesse Brandeburg5-126/+50
The hardware is reporting the type of the hash used for RSS as a PTYPE field in the receive descriptor. Use this value to set the skb packet hash type by extending the hash type table to cover all 10-bits of possible values (requiring some variables to be changed from u8 to u16), and then use that table to convert to one of the possible values in enum pkt_hash_types. While we're here, remove the unused ptype struct value, which makes table init easier for the zero entries, and use ranged initializer to remove a bunch of code (works with gcc and clang). Without this change, the kernel will recalculate the hash in software, which can consume extra CPU cycles. Co-developed-by: Kiran Patil <[email protected]> Signed-off-by: Kiran Patil <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17ice: remove redundant continue statement in a for-loopColin Ian King1-6/+4
The continue statement in the for-loop is redundant. Re-work the hw_lock check to remove it. Addresses-Coverity: ("Continue has no effect") Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17net: ice: ptp: fix compilation warning if PTP_1588_CLOCK is disabledLorenzo Bianconi1-1/+1
Fix the following compilation warning if PTP_1588_CLOCK is not enabled drivers/net/ethernet/intel/ice/ice_ptp.h:149:1: error: return type defaults to ‘int’ [-Werror=return-type] ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb) Fixes: ea9b847cda647 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Lorenzo Bianconi <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17ice: remove unnecessary NULL checks before ptp_read_system_*Jacob Keller1-8/+4
The ptp_read_system_prets and ptp_read_system_postts functions already check for the NULL value of the ptp_system_timestamp structure pointer. There is no need to check this manually in the ice driver code. Remove the checks. Reported-by: Jakub Kicinski <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17ice: Remove the repeated declarationShaokun Zhang1-1/+0
Function 'ice_is_vsi_valid' is declared twice, remove the repeated declaration. Cc: Jesse Brandeburg <[email protected]> Cc: Tony Nguyen <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Jakub Kicinski <[email protected]> Signed-off-by: Shaokun Zhang <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17ice: remove local variablePaul M Stillwell Jr1-2/+1
Remove the local variable since it's only used once. Instead, use it directly. Signed-off-by: Paul M Stillwell Jr <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17ice: reduce scope of variablesPaul M Stillwell Jr2-6/+6
There are some places where the scope of a variable can be reduced so do that. Signed-off-by: Paul M Stillwell Jr <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17ice: mark PTYPE 2 as reservedJacob Keller1-1/+1
The entry for PTYPE 2 in the ice_ptype_lkup table incorrectly states that this is an L2 packet with no payload. According to the datasheet, this PTYPE is actually unused and reserved. Fix the lookup entry to indicate this is an unused entry that is reserved. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-17ice: fix incorrect payload indicator on PTYPEJacob Keller1-1/+1
The entry for PTYPE 90 indicates that the payload is layer 3. This does not match the specification in the datasheet which indicates the packet is a MAC, IPv6, UDP packet, with a payload in layer 4. Fix the lookup table to match the data sheet. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: enable transmit timestamps for E810 devicesJacob Keller9-4/+518
Add support for enabling Tx timestamp requests for outgoing packets on E810 devices. The ice hardware can support multiple outstanding Tx timestamp requests. When sending a descriptor to hardware, a Tx timestamp request is made by setting a request bit, and assigning an index that represents which Tx timestamp index to store the timestamp in. Hardware makes no effort to synchronize the index use, so it is up to software to ensure that Tx timestamp indexes are not re-used before the timestamp is reported back. To do this, introduce a Tx timestamp tracker which will keep track of currently in-use indexes. In the hot path, if a packet has a timestamp request, an index will be requested from the tracker. Unfortunately, this does require a lock as the indexes are shared across all queues on a PHY. There are not enough indexes to reliably assign only 1 to each queue. For the E810 devices, the timestamp indexes are not shared across PHYs, so each port can have its own tracking. Once hardware captures a timestamp, an interrupt is fired. In this interrupt, trigger a new work item that will figure out which timestamp was completed, and report the timestamp back to the stack. This function loops through the Tx timestamp indexes and checks whether there is now a valid timestamp. If so, it clears the PHY timestamp indication in the PHY memory, locks and removes the SKB and bit in the tracker, then reports the timestamp to the stack. It is possible in some cases that a timestamp request will be initiated but never completed. This might occur if the packet is dropped by software or hardware before it reaches the PHY. Add a task to the periodic work function that will check whether a timestamp request is more than a few seconds old. If so, the timestamp index is cleared in the PHY, and the SKB is released. Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and use the same overall logic for extending to 64 bits of nanoseconds. With this change, E810 devices should be able to perform basic PTP functionality. Future changes will extend the support to cover the E822-based devices. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: enable receive hardware timestampingJacob Keller9-6/+410
Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to requests to enable timestamping support. If the request is for enabling Rx timestamps, set a bit in the Rx descriptors to indicate that receive timestamps should be reported. Hardware captures receive timestamps in the PHY which only captures part of the timer, and reports only 40 bits into the Rx descriptor. The upper 32 bits represent the contents of GLTSYN_TIME_L at the point of packet reception, while the lower 8 bits represent the upper 8 bits of GLTSYN_TIME_0. The networking and PTP stack expect 64 bit timestamps in nanoseconds. To support this, implement some logic to extend the timestamps by using the full PHC time. If the Rx timestamp was captured prior to the PHC time, then the real timestamp is PHC - (lower_32_bits(PHC) - timestamp) If the Rx timestamp was captured after the PHC time, then the real timestamp is PHC + (timestamp - lower_32_bits(PHC)) These calculations are correct as long as neither the PHC timestamp nor the Rx timestamps are more than 2^32-1 nanseconds old. Further, we can detect when the Rx timestamp is before or after the PHC as long as the PHC timestamp is no more than 2^31-1 nanoseconds old. In that case, we calculate the delta between the lower 32 bits of the PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx timestamp must have been captured in the past. If it's smaller, then the Rx timestamp must have been captured after PHC time. Add an ice_ptp_extend_32b_ts function that relies on a cached copy of the PHC time and implements this algorithm to calculate the proper upper 32bits of the Rx timestamps. Cache the PHC time periodically in all of the Rx rings. This enables each Rx ring to simply call the extension function with a recent copy of the PHC time. By ensuring that the PHC time is kept up to date periodically, we ensure this algorithm doesn't use stale data and produce incorrect results. To cache the time, introduce a kworker and a kwork item to periodically store the Rx time. It might seem like we should use the .do_aux_work interface of the PTP clock. This doesn't work because all PFs must cache this time, but only one PF owns the PTP clock device. Thus, the ice driver will manage its own kthread instead of relying on the PTP do_aux_work handler. With this change, the driver can now report Rx timestamps on all incoming packets. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: report the PTP clock index in ethtool .get_ts_infoJacob Keller3-1/+155
Now that the driver registers a PTP clock device that represents the clock hardware, it is important that the clock index is reported via the ethtool .get_ts_info callback. The underlying hardware resource is shared between multiple PF functions. Only one function owns the hardware resources associated with a timer, but multiple functions may be associated with it for the purposes of timestamping. To support this, the owning PF will store the clock index into the driver shared parameters buffer in firmware. Other PFs will look up the clock index by reading the driver shared parameter on demand when requested via the .get_ts_info ethtool function. In this way, all functions which are tied to the same timer are able to report the clock index. Userspace software such as ptp4l performs a look up on the netdev to determine the associated clock, and all commands to control or configure the clock will be handled through the controlling PF. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: register 1588 PTP clock device object for E810 devicesJacob Keller7-0/+512
Add a new ice_ptp.c file for holding the basic PTP clock interface functions. If the device supports PTP, call the new ice_ptp_init and ice_ptp_release functions where appropriate. If the function owns the hardware resource associated with the PTP hardware clock, register with the PTP_1588_CLOCK infrastructure to allocate a new clock object that represents the device hardware clock. Implement basic functionality for reading and setting the clock time, performing clock adjustments, and adjusting the clock frequency. Future changes will introduce functionality for handling related features including Tx and Rx timestamps. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: add low level PTP clock access functionsJacob Keller4-0/+758
Add the ice_ptp_hw.c file and some associated definitions to the ice driver folder. This file contains basic low level definitions for functions that interact with the device hardware. For now, only E810-based devices are supported. The ice hardware supports 2 major variants which have different PHYs with different procedures necessary for interacting with the device clock. Because the device captures timestamps in the PHY, each PHY has its own internal timer. The timers are synchronized in hardware by first preparing the source timer and the PHY timer shadow registers, and then issuing a synchronization command. This ensures that both the source timer and PHY timers are programmed simultaneously. The timers themselves are all driven from the same oscillator source. The functions in ice_ptp_hw.c abstract over the differences between how the PHYs in E810 are programmed vs how the PHYs in E822 devices are programmed. This series only implements E810 support, but E822 support will be added in a future change. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: add support for set/get of driver-stored firmware parametersJacob Keller3-0/+108
Depending on the device configuration, the ice hardware may share the PTP hardware clock timer between multiple PFs. Each PF is informed by firmware during initialization of the PTP timer association. When bringing up PTP, only the PFs which own the timer shall allocate a PTP hardware clock. Other PFs associated with that timer must report the correct PTP clock index in order to allow userspace software the ability to know which ports are connected to the same clock. To support this, the firmware has driver shared parameters. These parameters enable one PF to write the clock index into firmware, and have other PFs read the associated value out. This enables the driver to have only a single PF allocate and control the device timer registers, while other PFs associated with that timer can report the correct clock in the ETHTOOL_GET_TS_INFO report. Add support for the necessary admin queue commands to enable reading and writing of the driver shared parameters. This will be used in a future change to enable sharing the PTP clock index between PF drivers. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: process 1588 PTP capabilities during initializationJacob Keller3-0/+151
The device firmware reports PTP clock capabilities to each PF during initialization. This includes various information for both the overall device and the individual function, including For functions: * whether this function has timesync enabled * whether this function owns one of the 2 possible clock timers, and which one * which timer the function is associated with * the clock frequency, if the device supports multiple clock frequencies * The GPIO pin association for the timer owned by this PF, if any For the device: * Which PF owns timer 0, if any * Which PF owns timer 1, if any * whether timer 0 is enabled * whether timer 1 is enabled Extract the bits from the capabilities information reported by firmware and store them in the device and function capability structures.o This information will be used in a future change to have the function driver enable PTP hardware clock support. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-11ice: add support for sideband messagesJacob Keller11-2/+343
In order to support certain device features, including enabling the PTP hardware clock, the ice driver needs to control some registers on the device PHY. These registers are accessed by sending sideband messages. For some hardware, these messages must be sent over the device admin queue, while other hardware has a dedicated control queue for the sideband messages. Add the neighbor device message structure for sending a message to the neighboring device. Where supported, initialize the sideband control queue and handle cleanup. Add a wrapper function for sending sideband control queue messages that read or write a neighboring device register. Because some devices send sideband messages over the AdminQ, also increase the length of the admin queue to allow more messages to be queued up. This is important because the sideband messages add additional pressure on the AQ usage. This support will be used in following patches to enable support for CONFIG_1588_PTP_CLOCK. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-09ice: parameterize functions responsible for Tx ring managementMaciej Fijalkowski1-8/+10
Commit ae15e0ba1b33 ("ice: Change number of XDP Tx queues to match number of Rx queues") tried to address the incorrect setting of XDP queue count that was based on the Tx queue count, whereas in theory we should provide the XDP queue per Rx queue. However, the routines that setup and destroy the set of Tx resources are still based on the vsi->num_txq. Ice supports the asynchronous Tx/Rx queue count, so for a setup where vsi->num_txq > vsi->num_rxq, ice_vsi_stop_tx_rings and ice_vsi_cfg_txqs will be accessing the vsi->xdp_rings out of the bounds. Parameterize two mentioned functions so they get the size of Tx resources array as the input. Fixes: ae15e0ba1b33 ("ice: Change number of XDP Tx queues to match number of Rx queues") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-09ice: add ndo_bpf callback for safe mode netdev opsMaciej Fijalkowski1-0/+15
ice driver requires a programmable pipeline firmware package in order to have a support for advanced features. Otherwise, driver falls back to so called 'safe mode'. For that mode, ndo_bpf callback is not exposed and when user tries to load XDP program, the following happens: $ sudo ./xdp1 enp179s0f1 libbpf: Kernel error message: Underlying driver does not support XDP in native mode link set xdp fd failed which is sort of confusing, as there is a native XDP support, but not in the current mode. Improve the user experience by providing the specific ndo_bpf callback dedicated for safe mode which will make use of extack to explicitly let the user know that the DDP package is missing and that's the reason that the XDP can't be loaded onto interface currently. Cc: Jamal Hadi Salim <[email protected]> Fixes: efc2214b6047 ("ice: Add support for XDP") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-06-07Merge branch '100GbE' of ↵David S. Miller19-151/+446
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-06-07 This series contains updates to virtchnl header file and ice driver. Brett adds capability bits to virtchnl to specify whether a primary or secondary MAC address is being requested and adds the implementation to ice. He also adds storing of VF MAC address so that it will be preserved across reboots of VM and refactors VF queue configuration to remove the expectation that configuration be done all at once. Krzysztof refactors ice_setup_rx_ctx() to remove configuration not related to Rx context into a new function, ice_vsi_cfg_rxq(). Liwei Song extends the wait time for the global config timeout. Salil Mehta refactors code in ice_vsi_set_num_qs() to remove an unnecessary call when the user has requested specific number of Rx or Tx queues. Jesse converts define macros to static inlines for NOP configurations. Jake adds messaging when devlink fails to read device capabilities and when pldmfw cannot find the requested firmware. Adds a wait for reset completion when reporting devlink info and reinitializes NVM during rebuild to ensure values are current. Ani adds detection and reporting of modules exceeding supported power levels and changes an error message to a debug message. Paul fixes a clang warning for deadcode.DeadStores. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-06-07Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller7-60/+59
Bug fixes overlapping feature additions and refactoring, mostly. Signed-off-by: David S. Miller <[email protected]>