Age | Commit message (Collapse) | Author | Files | Lines |
|
Move devlink_registration routine to be the last command, when the
device is fully initialized.
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
PF pointer is always valid when PCI core calls its .shutdown() and
.remove() callbacks. There is no need to check it again.
Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
devlink_register() can't fail and always returns success, but all drivers
are obligated to check returned status anyway. This adds a lot of boilerplate
code to handle impossible flow.
Make devlink_register() void and simplify the drivers that use that
API call.
Signed-off-by: Leon Romanovsky <[email protected]>
Acked-by: Simon Horman <[email protected]>
Acked-by: Vladimir Oltean <[email protected]> # dsa
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There are two cases where the current PF does not support RDMA
functionality. The first is if the NVM loaded on the device is set
to not support RDMA (common_caps.rdma is false). The second is if
the kernel bonding driver has included the current PF in an active
link aggregate.
When the driver has determined that this PF does not support RDMA, then
auxiliary devices should not be created on the auxiliary bus. Without
a device on the auxiliary bus, even if the irdma driver is present, there
will be no RDMA activity attempted on this PF.
Currently, in the reset flow, an attempt to create auxiliary devices is
performed without regard to the ability of the PF. There needs to be a
check in ice_aux_plug_dev (as the central point that creates auxiliary
devices) to see if the PF is in a state to support the functionality.
When disabling and re-enabling RDMA due to the inclusion/removal of the PF
in a link aggregate, we also need to set/clear the bit which controls
auxiliary device creation so that a reset recovery in a link aggregate
situation doesn't try to create auxiliary devices when it shouldn't.
Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
Reported-by: Yongxin Liu <[email protected]>
Signed-off-by: Dave Ertman <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
include/linux/netdevice.h
net/socket.c
d0efb16294d1 ("net: don't unconditionally copy_from_user a struct ifreq for socket ioctls")
876f0bf9d0d5 ("net: socket: simplify dev_ifconf handling")
29c4964822aa ("net: socket: rework compat_ifreq_ioctl()")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
commit 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync
list") introduced calls to netif_addr_lock_bh() and
netif_addr_unlock_bh() in the driver's ndo_set_mac() callback. This is
fine since the driver is updated the netdev's dev_addr, but since this
is a spinlock, the driver cannot sleep when the lock is held.
Unfortunately the functions to add/delete MAC filters depend on a mutex.
This was causing a trace with the lock debug kernel config options
enabled when changing the mac address via iproute.
[ 203.273059] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
[ 203.273065] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6698, name: ip
[ 203.273068] Preemption disabled at:
[ 203.273068] [<ffffffffc04aaeab>] ice_set_mac_address+0x8b/0x1c0 [ice]
[ 203.273097] CPU: 31 PID: 6698 Comm: ip Tainted: G S W I 5.14.0-rc4 #2
[ 203.273100] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ 203.273102] Call Trace:
[ 203.273107] dump_stack_lvl+0x33/0x42
[ 203.273113] ? ice_set_mac_address+0x8b/0x1c0 [ice]
[ 203.273124] ___might_sleep.cold.150+0xda/0xea
[ 203.273131] mutex_lock+0x1c/0x40
[ 203.273136] ice_remove_mac+0xe3/0x180 [ice]
[ 203.273155] ? ice_fltr_add_mac_list+0x20/0x20 [ice]
[ 203.273175] ice_fltr_prepare_mac+0x43/0xa0 [ice]
[ 203.273194] ice_set_mac_address+0xab/0x1c0 [ice]
[ 203.273206] dev_set_mac_address+0xb8/0x120
[ 203.273210] dev_set_mac_address_user+0x2c/0x50
[ 203.273212] do_setlink+0x1dd/0x10e0
[ 203.273217] ? __nla_validate_parse+0x12d/0x1a0
[ 203.273221] __rtnl_newlink+0x530/0x910
[ 203.273224] ? __kmalloc_node_track_caller+0x17f/0x380
[ 203.273230] ? preempt_count_add+0x68/0xa0
[ 203.273236] ? _raw_spin_lock_irqsave+0x1f/0x30
[ 203.273241] ? kmem_cache_alloc_trace+0x4d/0x440
[ 203.273244] rtnl_newlink+0x43/0x60
[ 203.273245] rtnetlink_rcv_msg+0x13a/0x380
[ 203.273248] ? rtnl_calcit.isra.40+0x130/0x130
[ 203.273250] netlink_rcv_skb+0x4e/0x100
[ 203.273256] netlink_unicast+0x1a2/0x280
[ 203.273258] netlink_sendmsg+0x242/0x490
[ 203.273260] sock_sendmsg+0x58/0x60
[ 203.273263] ____sys_sendmsg+0x1ef/0x260
[ 203.273265] ? copy_msghdr_from_user+0x5c/0x90
[ 203.273268] ? ____sys_recvmsg+0xe6/0x170
[ 203.273270] ___sys_sendmsg+0x7c/0xc0
[ 203.273272] ? copy_msghdr_from_user+0x5c/0x90
[ 203.273274] ? ___sys_recvmsg+0x89/0xc0
[ 203.273276] ? __netlink_sendskb+0x50/0x50
[ 203.273278] ? mod_objcg_state+0xee/0x310
[ 203.273282] ? __dentry_kill+0x114/0x170
[ 203.273286] ? get_max_files+0x10/0x10
[ 203.273288] __sys_sendmsg+0x57/0xa0
[ 203.273290] do_syscall_64+0x37/0x80
[ 203.273295] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 203.273296] RIP: 0033:0x7f8edf96e278
[ 203.273298] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 63 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
[ 203.273300] RSP: 002b:00007ffcb8bdac08 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 203.273303] RAX: ffffffffffffffda RBX: 000000006115e0ae RCX: 00007f8edf96e278
[ 203.273304] RDX: 0000000000000000 RSI: 00007ffcb8bdac70 RDI: 0000000000000003
[ 203.273305] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffcb8bda5b0
[ 203.273306] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[ 203.273306] R13: 0000555e10092020 R14: 0000000000000000 R15: 0000000000000005
Fix this by only locking when changing the netdev->dev_addr. Also, make
sure to restore the old netdev->dev_addr on any failures.
Fixes: 3ba7f53f8bf1 ("ice: don't remove netdev->dev_addr from uc sync list")
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
When we enabled auxiliary input/output support for the E810 device, we
forgot to add logic to restart the output when we change time. This is
important as the periodic output will be incorrect after a time change
otherwise.
This unfortunately includes the adjust time function, even though it
uses an atomic hardware interface. The atomic adjustment can still cause
the pin output to stall permanently, so we need to stop and restart it.
Introduce wrapper functions to temporarily disable and then re-enable
the clock outputs.
Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Sunitha D Mekala <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver didn't take the lock while flushing the Tx tracker, which
could cause a race where one thread is trying to read timestamps out
while another thread is trying to read the tracker to check the
timestamps.
Avoid this by ensuring that flushing is locked against read accesses.
Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
We have code in the ice driver which allocates the pin_config structure
if n_pins is > 0, but we never set n_pins to be greater than zero.
There's no reason to keep this code until we actually have pin_config
support. Remove this. We can re-add it properly when we implement
support for pin_config for E810-T devices.
Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The driver accidentally copied the ice_for_each_rxq iterator when
implementing enablement of the ptp_tx bit for the Tx rings. We still
load the Tx rings and set the ptp_tx field, but we iterate over the
count of the num_rxq.
If the number of Tx and Rx queues differ, this could either cause
a buffer overrun when accessing the tx_rings list if num_txq is greater
than num_rxq, or it could cause us to fail to enable Tx timestamps for
some rings.
This was not noticed originally as we generally have the same number of
Tx and Rx queues.
Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
drivers/net/wwan/mhi_wwan_mbim.c - drop the extra arg.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In order to support more coalesce parameters through netlink,
add two new parameter kernel_coal and extack for .set_coalesce
and .get_coalesce, then some extra info can return to user with
the netlink API.
Signed-off-by: Yufeng Mo <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The devlink dev info command reports version information about the
device and firmware running on the board. This includes the "board.id"
field which is supposed to represent an identifier of the board design.
The ice driver uses the Product Board Assembly identifier for this.
In some cases, the PBA is not present in the NVM. If this happens,
devlink dev info will fail with an error. Instead, modify the
ice_info_pba function to just exit without filling in the context
buffer. This will cause the board.id field to be skipped. Log a dev_dbg
message in case someone wants to confirm why board.id is not showing up
for them.
Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get")
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
drivers/ptp/Kconfig:
55c8fca1dae1 ("ptp_pch: Restore dependency on PCI")
e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Internal tests found out that the latest code doesn't bring up 1PPS out
as expected. As a result of incorrect define used to round the time up
the time was round down to the past second boundary.
Fix define used for rounding to properly round up to the next Top of
second in ice_ptp_cfg_clkout to fix it.
Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins")
Signed-off-by: Maciej Machnikowski <[email protected]>
Tested-by: Sunitha Mekala <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp")
9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins")
099fdeda659d ("bnxt_en: Event handler for PPS events")
kernel/bpf/helpers.c
include/linux/bpf-cgroup.h
a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()")
c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current")
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray")
2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer")
MAINTAINERS
7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo")
7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In some circumstances, such as with bridging, it's possible that the
stack will add the device's own MAC address to its unicast address list.
If, later, the stack deletes this address, the driver will receive a
request to remove this address.
The driver stores its current MAC address as part of the VSI MAC filter
list instead of separately. So, this causes a problem when the device's
MAC address is deleted unexpectedly, which results in traffic failure in
some cases.
The following configuration steps will reproduce the previously
mentioned problem:
> ip link set eth0 up
> ip link add dev br0 type bridge
> ip link set br0 up
> ip addr flush dev eth0
> ip link set eth0 master br0
> echo 1 > /sys/class/net/br0/bridge/vlan_filtering
> modprobe -r veth
> modprobe -r bridge
> ip addr add 192.168.1.100/24 dev eth0
The following ping command fails due to the netdev->dev_addr being
deleted when removing the bridge module.
> ping <link partner>
Fix this by making sure to not delete the netdev->dev_addr during MAC
address sync. After fixing this issue it was noticed that the
netdev_warn() in .set_mac was overly verbose, so make it at
netdev_dbg().
Also, there is a possibility of a race condition between .set_mac and
.set_rx_mode. Fix this by calling netif_addr_lock_bh() and
netif_addr_unlock_bh() on the device's netdev when the netdev->dev_addr
is going to be updated in .set_mac.
Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version")
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Liang Li <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
When VFs are setup and torn down in quick succession, it is possible
that a VF is torn down by the PF while the VF's virtchnl requests are
still in the PF's mailbox ring. Processing the VF's virtchnl request
when the VF itself doesn't exist results in undefined behavior. Fix
this by adding a check to stop processing virtchnl requests when VF
teardown is in progress.
Fixes: ddf30f7ff840 ("ice: Add handler to configure SR-IOV")
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The userspace utility "driverctl" can be used to change/override the
system's default driver choices. This is useful in some situations
(buggy driver, old driver missing a device ID, trying a workaround,
etc.) where the user needs to load a different driver.
However, this is also prone to user error, where a driver is mapped
to a device it's not designed to drive. For example, if the ice driver
is mapped to driver iavf devices, the ice driver crashes.
Add a check to return an error if the ice driver is being used to
probe a virtual function.
Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Gurucharan G <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
All kernel devlink implementations call to devlink_alloc() during
initialization routine for specific device which is used later as
a parent device for devlink_register().
Such late device assignment causes to the situation which requires us to
call to device_register() before setting other parameters, but that call
opens devlink to the world and makes accessible for the netlink users.
Any attempt to move devlink_register() to be the last call generates the
following error due to access to the devlink->dev pointer.
[ 8.758862] devlink_nl_param_fill+0x2e8/0xe50
[ 8.760305] devlink_param_notify+0x6d/0x180
[ 8.760435] __devlink_params_register+0x2f1/0x670
[ 8.760558] devlink_params_register+0x1e/0x20
The simple change of API to set devlink device in the devlink_alloc()
instead of devlink_register() fixes all this above and ensures that
prior to call to devlink_register() everything already set.
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Most users of ndo_do_ioctl are ethernet drivers that implement
the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware
timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP.
Separate these from the few drivers that use ndo_do_ioctl to
implement SIOCBOND, SIOCBR and SIOCWANDEV commands.
This is a purely cosmetic change intended to help readers find
their way through the implementation.
Cc: Doug Ledford <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Veaceslav Falico <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Cc: Andrew Lunn <[email protected]>
Cc: Vivien Didelot <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: Vladimir Oltean <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: [email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Jason Gunthorpe <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-06-28
The following pull-request contains BPF updates for your *net-next* tree.
We've added 37 non-merge commits during the last 12 day(s) which contain
a total of 56 files changed, 394 insertions(+), 380 deletions(-).
The main changes are:
1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney.
2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski.
3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev.
4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria.
5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards.
6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa.
7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi.
8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim.
9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young.
10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer.
11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski.
12) Fix up bpfilter startup log-level to info level, from Gary Lin.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
If this 'kzalloc()' fails we must free some resources as in all the other
error handling paths of this function.
Fixes: 348048e724a0 ("ice: Implement iidc operations")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
ice_get_vf_vsi() is being called twice for the same VSI. Remove the
unnecessary call/assignment.
Signed-off-by: Tony Nguyen <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
|
|
Remove the VSI info from previous aggregator after moving the VSI to a
new aggregator.
Signed-off-by: Victor Raj <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The E810 device supports programmable pins for enabling both input and
output events related to the PTP hardware clock. This includes both
output signals with programmable period, as well as timestamping of
events on input pins.
Add support for enabling these using the CONFIG_PTP_1588_CLOCK
interface.
This allows programming the software defined pins to take advantage of
the hardware clock features.
Signed-off-by: Maciej Machnikowski <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
This patch is modeled after one by Scott Peterson for i40e.
Add tracepoints to the driver, via a new file ice_trace.h and some new
trace calls added in interesting places in the driver. Add some tracing
for DIMLIB to help debug interrupt moderation problems.
Performance should not be affected, and this can be very useful
for debugging and adding new trace events to paths in the future.
Note eBPF programs can attach to these events, as well as perf
can count them since we're attaching to the events subsystem
in the kernel.
Co-developed-by: Ben Shelton <[email protected]>
Signed-off-by: Ben Shelton <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The Intel drivers all have rcu_read_lock()/rcu_read_unlock() pairs around
XDP program invocations. However, the actual lifetime of the objects
referred by the XDP program invocation is longer, all the way through to
the call to xdp_do_flush(), making the scope of the rcu_read_lock() too
small. This turns out to be harmless because it all happens in a single
NAPI poll cycle (and thus under local_bh_disable()), but it makes the
rcu_read_lock() misleading.
Rather than extend the scope of the rcu_read_lock(), just get rid of it
entirely. With the addition of RCU annotations to the XDP_REDIRECT map
types that take bh execution into account, lockdep even understands this to
be safe, so there's really no reason to keep it around.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Tested-by: Jesper Dangaard Brouer <[email protected]> # i40e
Cc: Jesse Brandeburg <[email protected]>
Cc: Tony Nguyen <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Trivial conflicts in net/can/isotp.c and
tools/testing/selftests/net/mptcp/mptcp_connect.sh
scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c
to include/linux/ptp_clock_kernel.h in -next so re-apply
the fix there.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The hardware is reporting the type of the hash used for RSS
as a PTYPE field in the receive descriptor. Use this value to set
the skb packet hash type by extending the hash type table to
cover all 10-bits of possible values (requiring some variables
to be changed from u8 to u16), and then use that table to convert
to one of the possible values in enum pkt_hash_types.
While we're here, remove the unused ptype struct value, which
makes table init easier for the zero entries, and use ranged
initializer to remove a bunch of code (works with gcc and clang).
Without this change, the kernel will recalculate the hash in software,
which can consume extra CPU cycles.
Co-developed-by: Kiran Patil <[email protected]>
Signed-off-by: Kiran Patil <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The continue statement in the for-loop is redundant. Re-work the hw_lock
check to remove it.
Addresses-Coverity: ("Continue has no effect")
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Fix the following compilation warning if PTP_1588_CLOCK is not enabled
drivers/net/ethernet/intel/ice/ice_ptp.h:149:1:
error: return type defaults to ‘int’ [-Werror=return-type]
ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb)
Fixes: ea9b847cda647 ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The ptp_read_system_prets and ptp_read_system_postts functions already
check for the NULL value of the ptp_system_timestamp structure pointer.
There is no need to check this manually in the ice driver code. Remove
the checks.
Reported-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Function 'ice_is_vsi_valid' is declared twice, remove the
repeated declaration.
Cc: Jesse Brandeburg <[email protected]>
Cc: Tony Nguyen <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Signed-off-by: Shaokun Zhang <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Remove the local variable since it's only used once. Instead, use it
directly.
Signed-off-by: Paul M Stillwell Jr <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
There are some places where the scope of a variable can
be reduced so do that.
Signed-off-by: Paul M Stillwell Jr <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The entry for PTYPE 2 in the ice_ptype_lkup table incorrectly states
that this is an L2 packet with no payload. According to the datasheet,
this PTYPE is actually unused and reserved.
Fix the lookup entry to indicate this is an unused entry that is
reserved.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The entry for PTYPE 90 indicates that the payload is layer 3. This does
not match the specification in the datasheet which indicates the packet
is a MAC, IPv6, UDP packet, with a payload in layer 4.
Fix the lookup table to match the data sheet.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add support for enabling Tx timestamp requests for outgoing packets on
E810 devices.
The ice hardware can support multiple outstanding Tx timestamp requests.
When sending a descriptor to hardware, a Tx timestamp request is made by
setting a request bit, and assigning an index that represents which Tx
timestamp index to store the timestamp in.
Hardware makes no effort to synchronize the index use, so it is up to
software to ensure that Tx timestamp indexes are not re-used before the
timestamp is reported back.
To do this, introduce a Tx timestamp tracker which will keep track of
currently in-use indexes.
In the hot path, if a packet has a timestamp request, an index will be
requested from the tracker. Unfortunately, this does require a lock as
the indexes are shared across all queues on a PHY. There are not enough
indexes to reliably assign only 1 to each queue.
For the E810 devices, the timestamp indexes are not shared across PHYs,
so each port can have its own tracking.
Once hardware captures a timestamp, an interrupt is fired. In this
interrupt, trigger a new work item that will figure out which timestamp
was completed, and report the timestamp back to the stack.
This function loops through the Tx timestamp indexes and checks whether
there is now a valid timestamp. If so, it clears the PHY timestamp
indication in the PHY memory, locks and removes the SKB and bit in the
tracker, then reports the timestamp to the stack.
It is possible in some cases that a timestamp request will be initiated
but never completed. This might occur if the packet is dropped by
software or hardware before it reaches the PHY.
Add a task to the periodic work function that will check whether
a timestamp request is more than a few seconds old. If so, the timestamp
index is cleared in the PHY, and the SKB is released.
Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and
use the same overall logic for extending to 64 bits of nanoseconds.
With this change, E810 devices should be able to perform basic PTP
functionality.
Future changes will extend the support to cover the E822-based devices.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to
requests to enable timestamping support. If the request is for enabling
Rx timestamps, set a bit in the Rx descriptors to indicate that receive
timestamps should be reported.
Hardware captures receive timestamps in the PHY which only captures part
of the timer, and reports only 40 bits into the Rx descriptor. The upper
32 bits represent the contents of GLTSYN_TIME_L at the point of packet
reception, while the lower 8 bits represent the upper 8 bits of
GLTSYN_TIME_0.
The networking and PTP stack expect 64 bit timestamps in nanoseconds. To
support this, implement some logic to extend the timestamps by using the
full PHC time.
If the Rx timestamp was captured prior to the PHC time, then the real
timestamp is
PHC - (lower_32_bits(PHC) - timestamp)
If the Rx timestamp was captured after the PHC time, then the real
timestamp is
PHC + (timestamp - lower_32_bits(PHC))
These calculations are correct as long as neither the PHC timestamp nor
the Rx timestamps are more than 2^32-1 nanseconds old. Further, we can
detect when the Rx timestamp is before or after the PHC as long as the
PHC timestamp is no more than 2^31-1 nanoseconds old.
In that case, we calculate the delta between the lower 32 bits of the
PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx
timestamp must have been captured in the past. If it's smaller, then the
Rx timestamp must have been captured after PHC time.
Add an ice_ptp_extend_32b_ts function that relies on a cached copy of
the PHC time and implements this algorithm to calculate the proper upper
32bits of the Rx timestamps.
Cache the PHC time periodically in all of the Rx rings. This enables
each Rx ring to simply call the extension function with a recent copy of
the PHC time. By ensuring that the PHC time is kept up to date
periodically, we ensure this algorithm doesn't use stale data and
produce incorrect results.
To cache the time, introduce a kworker and a kwork item to periodically
store the Rx time. It might seem like we should use the .do_aux_work
interface of the PTP clock. This doesn't work because all PFs must cache
this time, but only one PF owns the PTP clock device.
Thus, the ice driver will manage its own kthread instead of relying on
the PTP do_aux_work handler.
With this change, the driver can now report Rx timestamps on all
incoming packets.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Now that the driver registers a PTP clock device that represents the
clock hardware, it is important that the clock index is reported via the
ethtool .get_ts_info callback.
The underlying hardware resource is shared between multiple PF
functions. Only one function owns the hardware resources associated with
a timer, but multiple functions may be associated with it for the
purposes of timestamping.
To support this, the owning PF will store the clock index into the
driver shared parameters buffer in firmware. Other PFs will look up the
clock index by reading the driver shared parameter on demand when
requested via the .get_ts_info ethtool function.
In this way, all functions which are tied to the same timer are able to
report the clock index. Userspace software such as ptp4l performs
a look up on the netdev to determine the associated clock, and all
commands to control or configure the clock will be handled through the
controlling PF.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add a new ice_ptp.c file for holding the basic PTP clock interface
functions. If the device supports PTP, call the new ice_ptp_init and
ice_ptp_release functions where appropriate.
If the function owns the hardware resource associated with the PTP
hardware clock, register with the PTP_1588_CLOCK infrastructure to
allocate a new clock object that represents the device hardware clock.
Implement basic functionality for reading and setting the clock time,
performing clock adjustments, and adjusting the clock frequency.
Future changes will introduce functionality for handling related
features including Tx and Rx timestamps.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Add the ice_ptp_hw.c file and some associated definitions to the ice
driver folder. This file contains basic low level definitions for
functions that interact with the device hardware.
For now, only E810-based devices are supported. The ice hardware
supports 2 major variants which have different PHYs with different
procedures necessary for interacting with the device clock.
Because the device captures timestamps in the PHY, each PHY has its own
internal timer. The timers are synchronized in hardware by first
preparing the source timer and the PHY timer shadow registers, and then
issuing a synchronization command. This ensures that both the source
timer and PHY timers are programmed simultaneously. The timers
themselves are all driven from the same oscillator source.
The functions in ice_ptp_hw.c abstract over the differences between how
the PHYs in E810 are programmed vs how the PHYs in E822 devices are
programmed. This series only implements E810 support, but E822 support
will be added in a future change.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Depending on the device configuration, the ice hardware may share the
PTP hardware clock timer between multiple PFs. Each PF is informed by
firmware during initialization of the PTP timer association.
When bringing up PTP, only the PFs which own the timer shall allocate
a PTP hardware clock. Other PFs associated with that timer must report
the correct PTP clock index in order to allow userspace software the
ability to know which ports are connected to the same clock.
To support this, the firmware has driver shared parameters. These
parameters enable one PF to write the clock index into firmware, and
have other PFs read the associated value out. This enables the driver to
have only a single PF allocate and control the device timer registers,
while other PFs associated with that timer can report the correct clock
in the ETHTOOL_GET_TS_INFO report.
Add support for the necessary admin queue commands to enable reading and
writing of the driver shared parameters. This will be used in a future
change to enable sharing the PTP clock index between PF drivers.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The device firmware reports PTP clock capabilities to each PF during
initialization. This includes various information for both the overall
device and the individual function, including
For functions:
* whether this function has timesync enabled
* whether this function owns one of the 2 possible clock timers, and
which one
* which timer the function is associated with
* the clock frequency, if the device supports multiple clock frequencies
* The GPIO pin association for the timer owned by this PF, if any
For the device:
* Which PF owns timer 0, if any
* Which PF owns timer 1, if any
* whether timer 0 is enabled
* whether timer 1 is enabled
Extract the bits from the capabilities information reported by firmware
and store them in the device and function capability structures.o
This information will be used in a future change to have the function
driver enable PTP hardware clock support.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In order to support certain device features, including enabling the PTP
hardware clock, the ice driver needs to control some registers on the
device PHY.
These registers are accessed by sending sideband messages. For some
hardware, these messages must be sent over the device admin queue, while
other hardware has a dedicated control queue for the sideband messages.
Add the neighbor device message structure for sending a message to the
neighboring device. Where supported, initialize the sideband control
queue and handle cleanup.
Add a wrapper function for sending sideband control queue messages that
read or write a neighboring device register.
Because some devices send sideband messages over the AdminQ, also
increase the length of the admin queue to allow more messages to be
queued up. This is important because the sideband messages add
additional pressure on the AQ usage.
This support will be used in following patches to enable support for
CONFIG_1588_PTP_CLOCK.
Signed-off-by: Jacob Keller <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Commit ae15e0ba1b33 ("ice: Change number of XDP Tx queues to match
number of Rx queues") tried to address the incorrect setting of XDP
queue count that was based on the Tx queue count, whereas in theory we
should provide the XDP queue per Rx queue. However, the routines that
setup and destroy the set of Tx resources are still based on the
vsi->num_txq.
Ice supports the asynchronous Tx/Rx queue count, so for a setup where
vsi->num_txq > vsi->num_rxq, ice_vsi_stop_tx_rings and ice_vsi_cfg_txqs
will be accessing the vsi->xdp_rings out of the bounds.
Parameterize two mentioned functions so they get the size of Tx resources
array as the input.
Fixes: ae15e0ba1b33 ("ice: Change number of XDP Tx queues to match number of Rx queues")
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Kiran Bhandare <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
ice driver requires a programmable pipeline firmware package in order to
have a support for advanced features. Otherwise, driver falls back to so
called 'safe mode'. For that mode, ndo_bpf callback is not exposed and
when user tries to load XDP program, the following happens:
$ sudo ./xdp1 enp179s0f1
libbpf: Kernel error message: Underlying driver does not support XDP in native mode
link set xdp fd failed
which is sort of confusing, as there is a native XDP support, but not in
the current mode. Improve the user experience by providing the specific
ndo_bpf callback dedicated for safe mode which will make use of extack
to explicitly let the user know that the DDP package is missing and
that's the reason that the XDP can't be loaded onto interface currently.
Cc: Jamal Hadi Salim <[email protected]>
Fixes: efc2214b6047 ("ice: Add support for XDP")
Signed-off-by: Maciej Fijalkowski <[email protected]>
Tested-by: Kiran Bhandare <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-06-07
This series contains updates to virtchnl header file and ice driver.
Brett adds capability bits to virtchnl to specify whether a primary or
secondary MAC address is being requested and adds the implementation to
ice. He also adds storing of VF MAC address so that it will be preserved
across reboots of VM and refactors VF queue configuration to remove the
expectation that configuration be done all at once.
Krzysztof refactors ice_setup_rx_ctx() to remove configuration not
related to Rx context into a new function, ice_vsi_cfg_rxq().
Liwei Song extends the wait time for the global config timeout.
Salil Mehta refactors code in ice_vsi_set_num_qs() to remove an
unnecessary call when the user has requested specific number of Rx or Tx
queues.
Jesse converts define macros to static inlines for NOP configurations.
Jake adds messaging when devlink fails to read device capabilities and
when pldmfw cannot find the requested firmware. Adds a wait for reset
completion when reporting devlink info and reinitializes NVM during
rebuild to ensure values are current.
Ani adds detection and reporting of modules exceeding supported power
levels and changes an error message to a debug message.
Paul fixes a clang warning for deadcode.DeadStores.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Bug fixes overlapping feature additions and refactoring, mostly.
Signed-off-by: David S. Miller <[email protected]>
|