Age | Commit message (Collapse) | Author | Files | Lines |
|
Triggering the reset while in switchdev mode causes
errors[1]. Rules are already removed by this time
because switch content is flushed in case of the reset.
This means that rules were deleted from HW but SW
still thinks they exist so when we get
SWITCHDEV_FDB_DEL_TO_DEVICE notification we try to
delete not existing rule.
We can avoid these errors by clearing the rules
early in the reset flow before they are removed from HW.
Switchdev API will get notified that the rule was removed
so we won't get SWITCHDEV_FDB_DEL_TO_DEVICE notification.
Remove unnecessary ice_clear_sw_switch_recipes.
[1]
ice 0000:01:00.0: Failed to delete FDB forward rule, err: -2
ice 0000:01:00.0: Failed to delete FDB guard rule, err: -2
Fixes: 7c945a1a8e5f ("ice: Switchdev FDB events support")
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
netif_is_ice() works by checking the pointer to netdev ops. However, it
only checks for the default ice_netdev_ops, not ice_netdev_safe_mode_ops,
so in Safe Mode it always returns false, which is unintuitive. While it
doesn't look like netif_is_ice() is currently being called anywhere in Safe
Mode, this could change and potentially lead to unexpected behaviour.
Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
If DDP package is missing or corrupted, the driver should enter Safe Mode.
Instead, an error is returned and probe fails.
To fix this, don't exit init if ice_init_ddp_config() returns an error.
Repro:
* Remove or rename DDP package (/lib/firmware/intel/ice/ddp/ice.pkg)
* Load ice
Fixes: cc5776fe1832 ("ice: Enable switching default Tx scheduler topology")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix leak of the FW blob (DDP pkg).
Make ice_cfg_tx_topo() const-correct, so ice_init_tx_topology() can avoid
copying whole FW blob. Copy just the topology section, and only when
needed. Reuse the buffer allocated for the read of the current topology.
This was found by kmemleak, with the following trace for each PF:
[<ffffffff8761044d>] kmemdup_noprof+0x1d/0x50
[<ffffffffc0a0a480>] ice_init_ddp_config+0x100/0x220 [ice]
[<ffffffffc0a0da7f>] ice_init_dev+0x6f/0x200 [ice]
[<ffffffffc0a0dc49>] ice_init+0x29/0x560 [ice]
[<ffffffffc0a10c1d>] ice_probe+0x21d/0x310 [ice]
Constify ice_cfg_tx_topo() @buf parameter.
This cascades further down to few more functions.
Fixes: cc5776fe1832 ("ice: Enable switching default Tx scheduler topology")
CC: Larysa Zaremba <larysa.zaremba@intel.com>
CC: Jacob Keller <jacob.e.keller@intel.com>
CC: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
CC: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts (sort of) and no adjacent changes.
This merge reverts commit b3c9e65eb227 ("net: hsr: remove seqnr_lock")
from net, as it was superseded by
commit 430d67bdcb04 ("net: hsr: Use the seqnr lock for frames received via interlink port.")
in net-next.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Our driver uses devres to manage resources, in particular we call
pcim_enable_device(), what also means we express the intent to get
automatic pci_disable_device() call at driver removal. Manual calls to
pci_disable_device() misuse the API.
Recent commit (see "Fixes" tag) has changed the removal action from
conditional (silent ignore of double call to pci_disable_device()) to
unconditional, but able to catch unwanted redundant calls; see cited
"Fixes" commit for details.
Since that, unloading the driver yields following warn+splat:
[70633.628490] ice 0000:af:00.7: disabling already-disabled device
[70633.628512] WARNING: CPU: 52 PID: 33890 at drivers/pci/pci.c:2250 pci_disable_device+0xf4/0x100
...
[70633.628744] ? pci_disable_device+0xf4/0x100
[70633.628752] release_nodes+0x4a/0x70
[70633.628759] devres_release_all+0x8b/0xc0
[70633.628768] device_unbind_cleanup+0xe/0x70
[70633.628774] device_release_driver_internal+0x208/0x250
[70633.628781] driver_detach+0x47/0x90
[70633.628786] bus_remove_driver+0x80/0x100
[70633.628791] pci_unregister_driver+0x2a/0xb0
[70633.628799] ice_module_exit+0x11/0x3a [ice]
Note that this is the only Intel ethernet driver that needs such fix.
Fixes: f748a07a0b64 ("PCI: Remove legacy pcim_release()")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Implement subfunction driver. It is probe when subfunction port is
activated.
VSI is already created. During the probe VSI is being configured.
MAC unicast and broadcast filter is added to allow traffic to pass.
Store subfunction pointer in VSI struct. The same is done for VF
pointer. Make union of subfunction and VF pointer as only one of them
can be set with one VSI.
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When subfunction VSI is open the same code as for PF VSI should be
executed. Also when up is complete. Reflect that in code by adding
subfunction VSI to consideration.
In case of stopping, PF doesn't have additional tasks, so the same
is with subfunction VSI.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Implement devlink port handlers responsible for ethernet type devlink
subfunctions. Create subfunction devlink port and setup all resources
needed for a subfunction netdev to operate. Configure new VSI for each
new subfunction, initialize and configure interrupts and Tx/Rx resources.
Set correct MAC filters and create new netdev.
For now, subfunction is limited to only one Tx/Rx queue pair.
Only allocate new subfunction VSI with devlink port new command.
Allocate and free subfunction MSIX interrupt vectors using new API
calls with pci_msix_alloc_irq_at and pci_msix_free_irq.
Support both automatic and manual subfunction numbers. If no subfunction
number is provided, use xa_alloc to pick a number automatically. This
will find the first free index and use that as the number. This reduces
burden on users in the simple case where a specific number is not
required. It may also be slightly faster to check that a number exists
since xarray lookup should be faster than a linear scan of the dyn_ports
xarray.
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Make some of the netdevice_ops functions visible from outside for
another VSI type created netdev.
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add required plumbing for new VSI type dedicated to devlink subfunctions.
Make sure that the vsi is properly configured and destroyed. Also allow
loading XDP and AF_XDP sockets.
The first implementation of devlink subfunctions supports only one Tx/Rx
queue pair per given subfunction.
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
After XDP configuration is completed, we bring the interface up
unconditionally, regardless of its state before the call to .ndo_bpf().
Preserve the information whether the interface had to be brought down and
later bring it up only in such case.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
If VSI rebuild is pending, .ndo_bpf() can attach/detach the XDP program on
VSI without applying new ring configuration. When unconfiguring the VSI, we
can encounter the state in which there is an XDP program but no XDP rings
to destroy or there will be XDP rings that need to be destroyed, but no XDP
program to indicate their presence.
When unconfiguring, rely on the presence of XDP rings rather then XDP
program, as they better represent the current state that has to be
destroyed.
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The main threat to data consistency in ice_xdp() is a possible asynchronous
PF reset. It can be triggered by a user or by TX timeout handler.
XDP setup and PF reset code access the same resources in the following
sections:
* ice_vsi_close() in ice_prepare_for_reset() - already rtnl-locked
* ice_vsi_rebuild() for the PF VSI - not protected
* ice_vsi_open() - already rtnl-locked
With an unfortunate timing, such accesses can result in a crash such as the
one below:
[ +1.999878] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 14
[ +2.002992] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 18
[Mar15 18:17] ice 0000:b1:00.0 ens801f0np0: NETDEV WATCHDOG: CPU: 38: transmit queue 14 timed out 80692736 ms
[ +0.000093] ice 0000:b1:00.0 ens801f0np0: tx_timeout: VSI_num: 6, Q 14, NTC: 0x0, HW_HEAD: 0x0, NTU: 0x0, INT: 0x4000001
[ +0.000012] ice 0000:b1:00.0 ens801f0np0: tx_timeout recovery level 1, txqueue 14
[ +0.394718] ice 0000:b1:00.0: PTP reset successful
[ +0.006184] BUG: kernel NULL pointer dereference, address: 0000000000000098
[ +0.000045] #PF: supervisor read access in kernel mode
[ +0.000023] #PF: error_code(0x0000) - not-present page
[ +0.000023] PGD 0 P4D 0
[ +0.000018] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000023] CPU: 38 PID: 7540 Comm: kworker/38:1 Not tainted 6.8.0-rc7 #1
[ +0.000031] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021
[ +0.000036] Workqueue: ice ice_service_task [ice]
[ +0.000183] RIP: 0010:ice_clean_tx_ring+0xa/0xd0 [ice]
[...]
[ +0.000013] Call Trace:
[ +0.000016] <TASK>
[ +0.000014] ? __die+0x1f/0x70
[ +0.000029] ? page_fault_oops+0x171/0x4f0
[ +0.000029] ? schedule+0x3b/0xd0
[ +0.000027] ? exc_page_fault+0x7b/0x180
[ +0.000022] ? asm_exc_page_fault+0x22/0x30
[ +0.000031] ? ice_clean_tx_ring+0xa/0xd0 [ice]
[ +0.000194] ice_free_tx_ring+0xe/0x60 [ice]
[ +0.000186] ice_destroy_xdp_rings+0x157/0x310 [ice]
[ +0.000151] ice_vsi_decfg+0x53/0xe0 [ice]
[ +0.000180] ice_vsi_rebuild+0x239/0x540 [ice]
[ +0.000186] ice_vsi_rebuild_by_type+0x76/0x180 [ice]
[ +0.000145] ice_rebuild+0x18c/0x840 [ice]
[ +0.000145] ? delay_tsc+0x4a/0xc0
[ +0.000022] ? delay_tsc+0x92/0xc0
[ +0.000020] ice_do_reset+0x140/0x180 [ice]
[ +0.000886] ice_service_task+0x404/0x1030 [ice]
[ +0.000824] process_one_work+0x171/0x340
[ +0.000685] worker_thread+0x277/0x3a0
[ +0.000675] ? preempt_count_add+0x6a/0xa0
[ +0.000677] ? _raw_spin_lock_irqsave+0x23/0x50
[ +0.000679] ? __pfx_worker_thread+0x10/0x10
[ +0.000653] kthread+0xf0/0x120
[ +0.000635] ? __pfx_kthread+0x10/0x10
[ +0.000616] ret_from_fork+0x2d/0x50
[ +0.000612] ? __pfx_kthread+0x10/0x10
[ +0.000604] ret_from_fork_asm+0x1b/0x30
[ +0.000604] </TASK>
The previous way of handling this through returning -EBUSY is not viable,
particularly when destroying AF_XDP socket, because the kernel proceeds
with removal anyway.
There is plenty of code between those calls and there is no need to create
a large critical section that covers all of them, same as there is no need
to protect ice_vsi_rebuild() with rtnl_lock().
Add xdp_state_lock mutex to protect ice_vsi_rebuild() and ice_xdp().
Leaving unprotected sections in between would result in two states that
have to be considered:
1. when the VSI is closed, but not yet rebuild
2. when VSI is already rebuild, but not yet open
The latter case is actually already handled through !netif_running() case,
we just need to adjust flag checking a little. The former one is not as
trivial, because between ice_vsi_close() and ice_vsi_rebuild(), a lot of
hardware interaction happens, this can make adding/deleting rings exit
with an error. Luckily, VSI rebuild is pending and can apply new
configuration for us in a managed fashion.
Therefore, add an additional VSI state flag ICE_VSI_REBUILD_PENDING to
indicate that ice_xdp() can just hot-swap the program.
Also, as ice_vsi_rebuild() flow is touched in this patch, make it more
consistent by deconfiguring VSI when coalesce allocation fails.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently, netif_queue_set_napi() is called from ice_vsi_rebuild() that is
not rtnl-locked when called from the reset. This creates the need to take
the rtnl_lock just for a single function and complicates the
synchronization with .ndo_bpf. At the same time, there no actual need to
fill napi-to-queue information at this exact point.
Fill napi-to-queue information when opening the VSI and clear it when the
VSI is being closed. Those routines are already rtnl-locked.
Also, rewrite napi-to-queue assignment in a way that prevents inclusion of
XDP queues, as this leads to out-of-bounds writes, such as one below.
[ +0.000004] BUG: KASAN: slab-out-of-bounds in netif_queue_set_napi+0x1c2/0x1e0
[ +0.000012] Write of size 8 at addr ffff889881727c80 by task bash/7047
[ +0.000006] CPU: 24 PID: 7047 Comm: bash Not tainted 6.10.0-rc2+ #2
[ +0.000004] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021
[ +0.000003] Call Trace:
[ +0.000003] <TASK>
[ +0.000002] dump_stack_lvl+0x60/0x80
[ +0.000007] print_report+0xce/0x630
[ +0.000007] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000007] ? __virt_addr_valid+0x1c9/0x2c0
[ +0.000005] ? netif_queue_set_napi+0x1c2/0x1e0
[ +0.000003] kasan_report+0xe9/0x120
[ +0.000004] ? netif_queue_set_napi+0x1c2/0x1e0
[ +0.000004] netif_queue_set_napi+0x1c2/0x1e0
[ +0.000005] ice_vsi_close+0x161/0x670 [ice]
[ +0.000114] ice_dis_vsi+0x22f/0x270 [ice]
[ +0.000095] ice_pf_dis_all_vsi.constprop.0+0xae/0x1c0 [ice]
[ +0.000086] ice_prepare_for_reset+0x299/0x750 [ice]
[ +0.000087] pci_dev_save_and_disable+0x82/0xd0
[ +0.000006] pci_reset_function+0x12d/0x230
[ +0.000004] reset_store+0xa0/0x100
[ +0.000006] ? __pfx_reset_store+0x10/0x10
[ +0.000002] ? __pfx_mutex_lock+0x10/0x10
[ +0.000004] ? __check_object_size+0x4c1/0x640
[ +0.000007] kernfs_fop_write_iter+0x30b/0x4a0
[ +0.000006] vfs_write+0x5d6/0xdf0
[ +0.000005] ? fd_install+0x180/0x350
[ +0.000005] ? __pfx_vfs_write+0x10/0xA10
[ +0.000004] ? do_fcntl+0x52c/0xcd0
[ +0.000004] ? kasan_save_track+0x13/0x60
[ +0.000003] ? kasan_save_free_info+0x37/0x60
[ +0.000006] ksys_write+0xfa/0x1d0
[ +0.000003] ? __pfx_ksys_write+0x10/0x10
[ +0.000002] ? __x64_sys_fcntl+0x121/0x180
[ +0.000004] ? _raw_spin_lock+0x87/0xe0
[ +0.000005] do_syscall_64+0x80/0x170
[ +0.000007] ? _raw_spin_lock+0x87/0xe0
[ +0.000004] ? __pfx__raw_spin_lock+0x10/0x10
[ +0.000003] ? file_close_fd_locked+0x167/0x230
[ +0.000005] ? syscall_exit_to_user_mode+0x7d/0x220
[ +0.000005] ? do_syscall_64+0x8c/0x170
[ +0.000004] ? do_syscall_64+0x8c/0x170
[ +0.000003] ? do_syscall_64+0x8c/0x170
[ +0.000003] ? fput+0x1a/0x2c0
[ +0.000004] ? filp_close+0x19/0x30
[ +0.000004] ? do_dup2+0x25a/0x4c0
[ +0.000004] ? __x64_sys_dup2+0x6e/0x2e0
[ +0.000002] ? syscall_exit_to_user_mode+0x7d/0x220
[ +0.000004] ? do_syscall_64+0x8c/0x170
[ +0.000003] ? __count_memcg_events+0x113/0x380
[ +0.000005] ? handle_mm_fault+0x136/0x820
[ +0.000005] ? do_user_addr_fault+0x444/0xa80
[ +0.000004] ? clear_bhb_loop+0x25/0x80
[ +0.000004] ? clear_bhb_loop+0x25/0x80
[ +0.000002] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000005] RIP: 0033:0x7f2033593154
Fixes: 080b0c8d6d26 ("ice: Fix ASSERT_RTNL() warning during certain scenarios")
Fixes: 91fdbce7e8d6 ("ice: Add support in the driver for associating queue with napi")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Ethtool callbacks can be executed while reset is in progress and try to
access deleted resources, e.g. getting coalesce settings can result in a
NULL pointer dereference seen below.
Reproduction steps:
Once the driver is fully initialized, trigger reset:
# echo 1 > /sys/class/net/<interface>/device/reset
when reset is in progress try to get coalesce settings using ethtool:
# ethtool -c <interface>
BUG: kernel NULL pointer dereference, address: 0000000000000020
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 11 PID: 19713 Comm: ethtool Tainted: G S 6.10.0-rc7+ #7
RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice]
RSP: 0018:ffffbab1e9bcf6a8 EFLAGS: 00010206
RAX: 000000000000000c RBX: ffff94512305b028 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff9451c3f2e588 RDI: ffff9451c3f2e588
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffff9451c3f2e580 R11: 000000000000001f R12: ffff945121fa9000
R13: ffffbab1e9bcf760 R14: 0000000000000013 R15: ffffffff9e65dd40
FS: 00007faee5fbe740(0000) GS:ffff94546fd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 0000000106c2e005 CR4: 00000000001706f0
Call Trace:
<TASK>
ice_get_coalesce+0x17/0x30 [ice]
coalesce_prepare_data+0x61/0x80
ethnl_default_doit+0xde/0x340
genl_family_rcv_msg_doit+0xf2/0x150
genl_rcv_msg+0x1b3/0x2c0
netlink_rcv_skb+0x5b/0x110
genl_rcv+0x28/0x40
netlink_unicast+0x19c/0x290
netlink_sendmsg+0x222/0x490
__sys_sendto+0x1df/0x1f0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x82/0x160
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7faee60d8e27
Calling netif_device_detach() before reset makes the net core not call
the driver when ethtool command is issued, the attempt to execute an
ethtool command during reset will result in the following message:
netlink error: No such device
instead of NULL pointer dereference. Once reset is done and
ice_rebuild() is executing, the netif_device_attach() is called to allow
for ethtool operations to occur again in a safe manner.
Fixes: fcea6f3da546 ("ice: Add stats and ethtool support")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Synchronize OICR IRQ when preparing for reset to avoid potential
race conditions between the reset procedure and OICR
Fixes: 4aad5335969f ("ice: add individual interrupt allocation")
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
xsk_buff_pool pointers that ice ring structs hold are updated via
ndo_bpf that is executed in process context while it can be read by
remote CPU at the same time within NAPI poll. Use synchronize_net()
after pointer update and {READ,WRITE}_ONCE() when working with mentioned
pointer.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Since the port representors are added one by one there is no need to do
eswitch rebuild. Each port representor is detached and attached in VF
reset path.
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
We are moving away from the Sourceforge email address. Rather than
removing or updating the email for the affected entries, remove the
MODULE_AUTHOR altogether as its usage is incorrect [1].
Link: https://lore.kernel.org/netdev/20200626115236.7f36d379@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/ [1]
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> # libeth, libie
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Admin queue command for shutdown AQ contains a flag to indicate driver
unload. However, the flag is always set in the driver, even for resets. It
can cause the firmware to consider driver as unloaded once the PF reset is
triggered on all ports of device, which could lead to unexpected results.
Add an additional function parameter to functions that shutdown AQ,
indicating whether the driver is actually unloading.
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Check the return value from ice_vsi_rebuild() and prevent the usage of
incorrectly configured VSI.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Eric Joyner <eric.joyner@intel.com>
Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
e3f02f32a050 ("ionic: fix kernel panic due to multi-buffer handling")
d9c04209990b ("ionic: Mark error paths in the data path as unlikely")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TC queues needs to be correctly updated when the number of queues on
a VSI is reconfigured, so netdev's queue and TC settings will be
dynamically adjusted and could accurately represent the underlying
hardware state after changes to the VSI queue counts.
Fixes: 0754d65bd4be ("ice: Add infrastructure for mqprio support via ndo_setup_tc")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
1e7962114c10 ("bnxt_en: Restore PTP tx_avail count in case of skb_pad() error")
165f87691a89 ("bnxt_en: add timestamping statistics support")
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 24407a01e57c ("ice: Add 200G speed/phy type use") added support
for 200G PHY speeds, but did not include 200G link speed message
support. As a result the driver incorrectly reports Unknown for 200G
link speed.
Fix this by adding 200G support to ice_print_link_msg().
Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
A bug in https://bugzilla.kernel.org/show_bug.cgi?id=218906 describes
that irdma would break and report hardware initialization failed after
suspend/resume with Intel E810 NIC (tested on 6.9.0-rc5).
The problem is caused due to the collision between the irq numbers
requested in irdma and the irq numbers requested in other drivers
after suspend/resume.
The irq numbers used by irdma are derived from ice's ice_pf->msix_entries
which stores mappings between MSI-X index and Linux interrupt number.
It's supposed to be cleaned up when suspend and rebuilt in resume but
it's not, causing irdma using the old irq numbers stored in the old
ice_pf->msix_entries to request_irq() when resume. And eventually
collide with other drivers.
This patch fixes this problem. On suspend, we call ice_deinit_rdma() to
clean up the ice_pf->msix_entries (and free the MSI-X vectors used by
irdma if we've dynamically allocated them). On resume, we call
ice_init_rdma() to rebuild the ice_pf->msix_entries (and allocate the
MSI-X vectors if we would like to dynamically allocate them).
Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
Tested-by: Cyrus Lien <cyrus.lien@canonical.com>
Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
irq_set_affinity_hint() is deprecated. Use irq_update_affinity_hint()
instead. This removes the side-effect of actually applying the affinity.
The driver does not really need to worry about spreading its IRQs across
CPUs. The core code already takes care of that.
On the contrary, when the driver applies affinities by itself, it breaks
the users' expectations:
1. The user configures irqbalance with IRQBALANCE_BANNED_CPULIST in
order to prevent IRQs from being moved to certain CPUs that run a
real-time workload.
2. ice reconfigures VSIs at runtime due to a MIB change
(ice_dcb_process_lldp_set_mib_change). Reopening a VSI resets the
affinity in ice_vsi_req_irq_msix().
3. ice has no idea about irqbalance's config, so it may move an IRQ to
a banned CPU. The real-time workload suffers unacceptable latency.
I am not sure if updating the affinity hints is at all useful, because
irqbalance ignores them since 2016 ([1]), but at least it's harmless.
This ice change is similar to i40e commit d34c54d1739c ("i40e: Use
irq_update_affinity_hint()").
[1] https://github.com/Irqbalance/irqbalance/commit/dcc411e7bfdd
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-3-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ice_pf_dcb_recfg() re-maps queues to vectors with
ice_vsi_map_rings_to_vectors(), which does not restore the previous
state for XDP queues. This leads to no AF_XDP traffic after rebuild.
Map XDP queues to vectors in ice_vsi_map_rings_to_vectors().
Also, move the code around, so XDP queues are mapped independently only
through .ndo_bpf().
Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-5-e3563aa89b0c@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions")
has placed ice_vsi_free_q_vectors() after ice_destroy_xdp_rings() in
the rebuild process. The behaviour of the XDP rings config functions is
context-dependent, so the change of order has led to
ice_destroy_xdp_rings() doing additional work and removing XDP prog, when
it was supposed to be preserved.
Also, dependency on the PF state reset flags creates an additional,
fortunately less common problem:
* PFR is requested e.g. by tx_timeout handler
* .ndo_bpf() is asked to delete the program, calls ice_destroy_xdp_rings(),
but reset flag is set, so rings are destroyed without deleting the
program
* ice_vsi_rebuild tries to delete non-existent XDP rings, because the
program is still on the VSI
* system crashes
With a similar race, when requested to attach a program,
ice_prepare_xdp_rings() can actually skip setting the program in the VSI
and nevertheless report success.
Instead of reverting to the old order of function calls, add an enum
argument to both ice_prepare_xdp_rings() and ice_destroy_xdp_rings() in
order to distinguish between calls from rebuild and .ndo_bpf().
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-4-e3563aa89b0c@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-05-06 (ice)
This series contains updates to ice driver only.
Paul adds support for additional E830 devices and adjusts naming for
existing E830 devices.
Marcin commonizes a couple of TC setup calls to reduce duplicated code.
Mateusz adds ice_vsi_cfg_params into ice_vsi to consolidate info.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsi
ice: Deduplicate tc action setup
ice: update E830 device ids and comments
ice: add additional E830 device ids
====================
Link: https://lore.kernel.org/r/20240506170827.948682-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Simon reported that ndo_change_mtu() methods were never
updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted
in commit 501a90c94510 ("inet: protect against too small
mtu values.")
We read dev->mtu without holding RTNL in many places,
with READ_ONCE() annotations.
It is time to take care of ndo_change_mtu() methods
to use corresponding WRITE_ONCE()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Simon Horman <horms@kernel.org>
Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor struct ice_vsi_cfg_params to be embedded into struct ice_vsi.
Prior to that the members of the struct were scattered around ice_vsi,
and were copy-pasted for purposes of reinit.
Now we have struct handy, and it is easier to have something sticky
in the flags field.
Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vaishnavi Tipireddy <vaishnavi.tipireddy@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Update existing E830 device ids and comments to align with new naming 'C'
for 100G and 'CC' for 200G.
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add support for additional E830 device ids which are supported by the
driver:
- 0x12D5: Intel(R) Ethernet Controller E830-C for backplane
- 0x12D8: Intel(R) Ethernet Controller E830-C for QSFP
- 0x12DA: Intel(R) Ethernet Controller E830-C for SFP
- 0x12DC: Intel(R) Ethernet Controller E830-XXV for backplane
- 0x12DD: Intel(R) Ethernet Controller E830-XXV for QSFP
- 0x12DE: Intel(R) Ethernet Controller E830-XXV for SFP
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
net: intel: start The Great Code Dedup + Page Pool for iavf
Alexander Lobakin says:
Here's a two-shot: introduce {,Intel} Ethernet common library (libeth and
libie) and switch iavf to Page Pool. Details are in the commit messages;
here's a summary:
Not a secret there's a ton of code duplication between two and more Intel
ethernet modules. Before introducing new changes, which would need to be
copied over again, start decoupling the already existing duplicate
functionality into a new module, which will be shared between several
Intel Ethernet drivers. The first name that came to my mind was
"libie" -- "Intel Ethernet common library". Also this sounds like
"lovelie" (-> one word, no "lib I E" pls) and can be expanded as
"lib Internet Explorer" :P
The "generic", pure-software part is placed separately, so that it can be
easily reused in any driver by any vendor without linking to the Intel
pre-200G guts. In a few words, it's something any modern driver does the
same way, but nobody moved it level up (yet).
The series is only the beginning. From now on, adding every new feature
or doing any good driver refactoring will remove much more lines than add
for quite some time. There's a basic roadmap with some deduplications
planned already, not speaking of that touching every line now asks:
"can I share this?". The final destination is very ambitious: have only
one unified driver for at least i40e, ice, iavf, and idpf with a struct
ops for each generation. That's never gonna happen, right? But you still
can at least try.
PP conversion for iavf lands within the same series as these two are tied
closely. libie will support Page Pool model only, so that a driver can't
use much of the lib until it's converted. iavf is only the example, the
rest will eventually be converted soon on a per-driver basis. That is
when it gets really interesting. Stay tech.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
MAINTAINERS: add entry for libeth and libie
iavf: switch to Page Pool
iavf: pack iavf_ring more efficiently
libeth: add Rx buffer management
page_pool: add DMA-sync-for-CPU inline helper
page_pool: constify some read-only function arguments
slab: introduce kvmalloc_array_node() and kvcalloc_node()
iavf: drop page splitting and recycling
iavf: kill "legacy-rx" for good
net: intel: introduce {, Intel} Ethernet common library
====================
Link: https://lore.kernel.org/r/20240424203559.3420468-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Not a secret there's a ton of code duplication between two and more Intel
ethernet modules.
Before introducing new changes, which would need to be copied over again,
start decoupling the already existing duplicate functionality into a new
module, which will be shared between several Intel Ethernet drivers.
Add the lookup table which converts 8/10-bit hardware packet type into
a parsed bitfield structure for easy checking packet format parameters,
such as payload level, IP version, etc. This is currently used by i40e,
ice and iavf and it's all the same in all three drivers.
The only difference introduced in this implementation is that instead of
defining a 256 (or 1024 in case of ice) element array, add unlikely()
condition to limit the input to 154 (current maximum non-reserved packet
type). There's no reason to waste 600 (or even 3600) bytes only to not
hurt very unlikely exception packets.
The hash computation function now takes payload level directly as a
pkt_hash_type. There's a couple cases when non-IP ptypes are marked as
L3 payload and in the previous versions their hash level would be 2, not
3. But skb_set_hash() only sees difference between L4 and non-L4, thus
this won't change anything at all.
The module is behind the hidden Kconfig symbol, which the drivers will
select when needed. The exports are behind 'LIBIE' namespace to limit
the scope of the functions.
Not that non-HW-specific symbols will live in yet another module,
libeth. This is done to easily distinguish pretty generic code ready
for reusing by any other vendor and/or for moving the layer up from
the code useful in Intel's 1-100G drivers only.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Introduce support for Tx scheduler topology change, based on user
selection, from default 9-layer to 5-layer.
Change requires NVM (version 3.20 or newer) and DDP package (OS Package
1.3.30 or newer - available for over a year in linux-firmware, since
commit aed71f296637 in linux-firmware ("ice: Update package to 1.3.30.0"))
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=aed71f296637
Enable 5-layer topology switch in init path of the driver. To accomplish
that upload of the DDP package needs to be delayed, until change in Tx
topology is finished. To trigger the Tx change user selection should be
changed in NVM using devlink. Then the platform should be rebooted.
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In cases when VF sends malformed packets that are classified as malicious,
it can cause Tx queue to freeze as a result of Malicious Driver Detection
event. Such malformed packets can appear as a result of a faulty userspace
app running on VF. This frozen queue can be stuck for several minutes being
unusable.
User might prefer to immediately bring the VF back to operational state
after such event, which can be done by automatically resetting the VF which
caused MDD. This is already implemented for Rx events (mdd-auto-reset-vf
flag private flag needs to be set).
Extend the VF auto reset to also cover Tx MDD events. When any MDD event
occurs on VF (Tx or Rx) and the mdd-auto-reset-vf private flag is set,
perform a graceful VF reset to quickly bring it back to operational state.
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Co-developed-by: Liang-Min Wang <liang-min.wang@intel.com>
Signed-off-by: Liang-Min Wang <liang-min.wang@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-04-01 (ice)
This series contains updates to ice driver only.
Michal Schmidt changes flow for gettimex64 to use host-side spinlock
rather than hardware semaphore for lighter-weight locking.
Steven adds ability for switch recipes to be re-used when firmware
supports it.
Thorsten Blum removes unwanted newlines in netlink messaging.
Michal Swiatkowski and Piotr re-organize devlink related code; renaming,
moving, and consolidating it to a single location. Michal also
simplifies the devlink init and cleanup path to occur under a single
lock call.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: hold devlink lock for whole init/cleanup
ice: move devlink port code to a separate file
ice: move ice_devlink.[ch] to devlink folder
ice: Remove newlines in NL_SET_ERR_MSG_MOD
ice: Add switch recipe reusing feature
ice: fold ice_ptp_read_time into ice_ptp_gettimex64
ice: avoid the PTP hardware semaphore in gettimex64 path
ice: add ice_adapter for shared data across PFs on the same NIC
====================
Link: https://lore.kernel.org/r/20240401172421.1401696-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-03-29 (net: intel)
This series contains updates to most Intel drivers.
Jesse moves declaration of pci_driver struct to remove need for forward
declarations in igb and converts Intel drivers to user newer power
management ops.
Sasha reworks power management flow on igc to avoid using rtnl_lock()
during those flows.
Maciej reorganizes i40e_nvm file to avoid forward declarations.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
i40e: avoid forward declarations in i40e_nvm.c
igc: Refactor runtime power management flow
net: intel: implement modern PM ops declarations
igb: simplify pci ops declaration
====================
Link: https://lore.kernel.org/r/20240329175632.211340-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Simplify devlink lock code in driver by taking it for whole init/cleanup
path. Instead of calling devlink functions that taking lock call the
lockless versions.
Suggested-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Keep devlink related code in a separate file. More devlink port code and
handlers will be added here for other port operations.
Remove no longer needed include of our devlink.h in ice_lib.c.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Only moving whole files, fixing Makefile and bunch of includes.
Some changes to ice_devlink file was done even in representor part (Tx
topology), so keep it as final patch to not mess up with rebasing.
After moving to devlink folder there is no need to have such long name
for these files. Rename them to simple devlink.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
There is a need for synchronization between ice PFs on the same physical
adapter.
Add a "struct ice_adapter" for holding data shared between PFs of the
same multifunction PCI device. The struct is refcounted - each ice_pf
holds a reference to it.
Its first use will be for PTP. I expect it will be useful also to
improve the ugliness that is ice_prot_id_tbl.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
There are, especially with multi-attr arrays, many cases
of needing to iterate all attributes of a specific type
in a netlink message or a nested attribute. Add specific
macros to support that case.
Also convert many instances using this spatch:
@@
iterator nla_for_each_attr;
iterator name nla_for_each_attr_type;
identifier nla;
expression head, len, rem;
expression ATTR;
type T;
identifier x;
@@
-nla_for_each_attr(nla, head, len, rem)
+nla_for_each_attr_type(nla, ATTR, head, len, rem)
{
<... T x; ...>
-if (nla_type(nla) == ATTR) {
...
-}
}
@@
identifier nla;
iterator nla_for_each_nested;
iterator name nla_for_each_nested_type;
expression attr, rem;
expression ATTR;
type T;
identifier x;
@@
-nla_for_each_nested(nla, attr, rem)
+nla_for_each_nested_type(nla, ATTR, attr, rem)
{
<... T x; ...>
-if (nla_type(nla) == ATTR) {
...
-}
}
@@
iterator nla_for_each_attr;
iterator name nla_for_each_attr_type;
identifier nla;
expression head, len, rem;
expression ATTR;
type T;
identifier x;
@@
-nla_for_each_attr(nla, head, len, rem)
+nla_for_each_attr_type(nla, ATTR, head, len, rem)
{
<... T x; ...>
-if (nla_type(nla) != ATTR) continue;
...
}
@@
identifier nla;
iterator nla_for_each_nested;
iterator name nla_for_each_nested_type;
expression attr, rem;
expression ATTR;
type T;
identifier x;
@@
-nla_for_each_nested(nla, attr, rem)
+nla_for_each_nested_type(nla, ATTR, attr, rem)
{
<... T x; ...>
-if (nla_type(nla) != ATTR) continue;
...
}
Although I had to undo one bad change this made, and
I also adjusted some other code for whitespace and to
use direct variable initialization now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20240328203144.b5a6c895fb80.I1869b44767379f204998ff44dd239803f39c23e0@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Switch the Intel networking drivers to use the new power management ops
declaration formats and macros, which allows us to drop __maybe_unused,
as well as a bunch of ifdef checking CONFIG_PM.
This is safe to do because the compiler drops the unused functions,
verified by checking for any of the power management function symbols
being present in System.map for a build without CONFIG_PM.
If a driver has runtime PM, define the ops with pm_ptr(), and if the
driver has Simple PM, use pm_sleep_ptr(), as well as the new versions of
the macros for declaring the members of the pm_ops structs.
Checked with network-enabled allnoconfig, allyesconfig, allmodconfig on
x64_64.
Reviewed-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
For slow-path Rx and Tx PF VSI is used. There is no need to have control
plane VSI. Remove all code related to it.
Eswitch rebuild can't fail without rebuilding control plane VSI. Return
void from ice_eswitch_rebuild().
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Merge in late fixes to prepare for the 6.9 net-next PR.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
net/core/page_pool_user.c
0b11b1c5c320 ("netdev: let netlink core handle -EMSGSIZE errors")
429679dcf7d9 ("page_pool: fix netlink dump stop/resume")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|