aboutsummaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2022-02-28net: sparx5: Add #include to remove warningCasper Andersson1-0/+2
main.h uses NUM_TARGETS from main_regs.h, but the missing include never causes any errors because everywhere main.h is (currently) included, main_regs.h is included before. But since it is dependent on main_regs.h it should always be included. Signed-off-by: Casper Andersson <[email protected]> Reviewed-by: Joacim Zetterling <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-26Merge branch '40GbE' of ↵David S. Miller3-75/+114
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-02-25 This series contains updates to iavf driver only. Slawomir fixes stability issues that can be seen when stressing the driver using a large number of VFs with a multitude of operations. Among the fixes are reworking mutexes to provide more effective locking, ensuring initialization is complete before teardown, preventing operations which could race while removing the driver, stopping certain tasks from being queued when the device is down, and adding a missing mutex unlock. ==================== Signed-off-by: David S. Miller <[email protected]>
2022-02-25Merge tag 'linux-can-fixes-for-5.17-20220225' of ↵Jakub Kicinski4-15/+18
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2022-02-25 The first 2 patches are by Vincent Mailhol and fix the error handling of the ndo_open callbacks of the etas_es58x and the gs_usb CAN USB drivers. The last patch is by Lad Prabhakar and fixes a small race condition in the rcar_canfd's rcar_canfd_channel_probe() function. * tag 'linux-can-fixes-for-5.17-20220225' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when fully ready can: gs_usb: change active_channels's type from atomic_t to u8 can: etas_es58x: change opened_channel_cnt's type from atomic_t to u8 ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-25iavf: Fix __IAVF_RESETTING state usageSlawomir Laba1-7/+6
The setup of __IAVF_RESETTING state in watchdog task had no effect and could lead to slow resets in the driver as the task for __IAVF_RESETTING state only requeues watchdog. Till now the __IAVF_RESETTING was interpreted by reset task as running state which could lead to errors with allocating and resources disposal. Make watchdog_task queue the reset task when it's necessary. Do not update the state to __IAVF_RESETTING so the reset task knows exactly what is the current state of the adapter. Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25iavf: Fix missing check for running netdevSlawomir Laba1-2/+5
The driver was queueing reset_task regardless of the netdev state. Do not queue the reset task in iavf_change_mtu if netdev is not running. Fixes: fdd4044ffdc8 ("iavf: Remove timer for work triggering, use delaying work instead") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25iavf: Fix deadlock in iavf_reset_taskSlawomir Laba1-0/+1
There exists a missing mutex_unlock call on crit_lock in iavf_reset_task call path. Unlock the crit_lock before returning from reset task. Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25iavf: Fix race in init stateSlawomir Laba1-1/+2
When iavf_init_version_check sends VIRTCHNL_OP_GET_VF_RESOURCES message, the driver will wait for the response after requeueing the watchdog task in iavf_init_get_resources call stack. The logic is implemented this way that iavf_init_get_resources has to be called in order to allocate adapter->vf_res. It is polling for the AQ response in iavf_get_vf_config function. Expect a call trace from kernel when adminq_task worker handles this message first. adapter->vf_res will be NULL in iavf_virtchnl_completion. Make the watchdog task not queue the adminq_task if the init process is not finished yet. Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25iavf: Fix locking for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPSSlawomir Laba3-24/+23
iavf_virtchnl_completion is called under crit_lock but when the code for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is called, this lock is released in order to obtain rtnl_lock to avoid ABBA deadlock with unregister_netdev. Along with the new way iavf_remove behaves, there exist many risks related to the lock release and attmepts to regrab it. The driver faces crashes related to races between unregister_netdev and netdev_update_features. Yet another risk is that the driver could already obtain the crit_lock in order to destroy it and iavf_virtchnl_completion could crash or block forever. Make iavf_virtchnl_completion never relock crit_lock in it's call paths. Extract rtnl_lock locking logic to the driver for unregister_netdev in order to set the netdev_registered flag inside the lock. Introduce a new flag that will inform adminq_task to perform the code from VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS right after it finishes processing messages. Guard this code with remove flags so it's never called when the driver is in remove state. Fixes: 5951a2b9812d ("iavf: Fix VLAN feature flags after VFR") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25iavf: Fix init state closure on removeSlawomir Laba2-1/+27
When init states of the adapter work, the errors like lack of communication with the PF might hop in. If such events occur the driver restores previous states in order to retry initialization in a proper way. When remove task kicks in, this situation could lead to races with unregistering the netdevice as well as resources cleanup. With the commit introducing the waiting in remove for init to complete, this problem turns into an endless waiting if init never recovers from errors. Introduce __IAVF_IN_REMOVE_TASK bit to indicate that the remove thread has started. Make __IAVF_COMM_FAILED adapter state respect the __IAVF_IN_REMOVE_TASK bit and set the __IAVF_INIT_FAILED state and return without any action instead of trying to recover. Make __IAVF_INIT_FAILED adapter state respect the __IAVF_IN_REMOVE_TASK bit and return without any further actions. Make the loop in the remove handler break when adapter has __IAVF_INIT_FAILED state set. Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25iavf: Add waiting so the port is initialized in removeSlawomir Laba1-11/+16
There exist races when port is being configured and remove is triggered. unregister_netdev is not and can't be called under crit_lock mutex since it is calling ndo_stop -> iavf_close which requires this lock. Depending on init state the netdev could be still unregistered so unregister_netdev never cleans up, when shortly after that the device could become registered. Make iavf_remove wait until port finishes initialization. All critical state changes are atomic (under crit_lock). Crashes that come from iavf_reset_interrupt_capability and iavf_free_traffic_irqs should now be solved in a graceful manner. Fixes: 605ca7c5c6707 ("iavf: Fix kernel BUG in free_msi_irqs") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25iavf: Rework mutexes for better synchronisationSlawomir Laba2-31/+36
The driver used to crash in multiple spots when put to stress testing of the init, reset and remove paths. The user would experience call traces or hangs when creating, resetting, removing VFs. Depending on the machines, the call traces are happening in random spots, like reset restoring resources racing with driver remove. Make adapter->crit_lock mutex a mandatory lock for guarding the operations performed on all workqueues and functions dealing with resource allocation and disposal. Make __IAVF_REMOVE a final state of the driver respected by workqueues that shall not requeue, when they fail to obtain the crit_lock. Make the IRQ handler not to queue the new work for adminq_task when the __IAVF_REMOVE state is set. Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Phani Burra <[email protected]> Signed-off-by: Jacob Keller <[email protected]> Signed-off-by: Mateusz Palczewski <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2022-02-25net: stmmac: fix return value of __setup handlerRandy Dunlap1-3/+3
__setup() handlers should return 1 on success, i.e., the parameter has been handled. A return of 0 causes the "option=value" string to be added to init's environment strings, polluting it. Fixes: 47dd7a540b8a ("net: add support for STMicroelectronics Ethernet controllers.") Fixes: f3240e2811f0 ("stmmac: remove warning when compile as built-in (V2)") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: Igor Zhbanov <[email protected]> Link: lore.kernel.org/r/[email protected] Cc: Giuseppe Cavallaro <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Jose Abreu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-25net: sxgbe: fix return value of __setup handlerRandy Dunlap1-3/+3
__setup() handlers should return 1 on success, i.e., the parameter has been handled. A return of 0 causes the "option=value" string to be added to init's environment strings, polluting it. Fixes: acc18c147b22 ("net: sxgbe: add EEE(Energy Efficient Ethernet) for Samsung sxgbe") Fixes: 1edb9ca69e8a ("net: sxgbe: add basic framework for Samsung 10Gb ethernet driver") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: Igor Zhbanov <[email protected]> Link: lore.kernel.org/r/[email protected] Cc: Siva Reddy <[email protected]> Cc: Girish K S <[email protected]> Cc: Byungho An <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-25can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when ↵Lad Prabhakar1-3/+3
fully ready Register the CAN device only when all the necessary initialization is completed. This patch makes sure all the data structures and locks are initialized before registering the CAN device. Link: https://lore.kernel.org/all/[email protected] Reported-by: Pavel Machek <[email protected]> Signed-off-by: Lad Prabhakar <[email protected]> Reviewed-by: Pavel Machek <[email protected]> Reviewed-by: Ulrich Hecht <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-02-25net: sparx5: Fix add vlan when invalid operationCasper Andersson1-10/+10
Check if operation is valid before changing any settings in hardware. Otherwise it results in changes being made despite it not being a valid operation. Fixes: 78eab33bb68b ("net: sparx5: add vlan support") Signed-off-by: Casper Andersson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25net: chelsio: cxgb3: check the return value of pci_find_capability()Jia-Ju Bai1-0/+2
The function pci_find_capability() in t3_prep_adapter() can fail, so its return value should be checked. Fixes: 4d22de3e6cc4 ("Add support for the latest 1G/10G Chelsio adapter, T3") Reported-by: TOTE Robot <[email protected]> Signed-off-by: Jia-Ju Bai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: Allow queueing resets during probeSukadev Bhattiprolu2-10/+98
We currently don't allow queuing resets when adapter is in VNIC_PROBING state - instead we throw away the reset and return EBUSY. The reasoning is probably that during ibmvnic_probe() the ibmvnic_adapter itself is being initialized so performing a reset during this time can lead us to accessing fields in the ibmvnic_adapter that are not fully initialized. A review of the code shows that all the adapter state neede to process a reset is initialized before registering the CRQ so that should no longer be a concern. Further the expectation is that if we do get a reset (transport event) during probe, the do..while() loop in ibmvnic_probe() will handle this by reinitializing the CRQ. While that is true to some extent, it is possible that the reset might occur _after_ the CRQ is registered and CRQ_INIT message was exchanged but _before_ the adapter state is set to VNIC_PROBED. As mentioned above, such a reset will be thrown away. While the client assumes that the adapter is functional, the vnic server will wait for the client to reinit the adapter. This disconnect between the two leaves the adapter down needing manual intervention. Because ibmvnic_probe() has other work to do after initializing the CRQ (such as registering the netdev at a minimum) and because the reset event can occur at any instant after the CRQ is initialized, there will always be a window between initializing the CRQ and considering the adapter ready for resets (ie state == PROBED). So rather than discarding resets during this window, allow queueing them - but only process them after the adapter is fully initialized. To do this, introduce a new completion state ->probe_done and have the reset worker thread wait on this before processing resets. This change brings up two new situations in or just after ibmvnic_probe(). First after one or more resets were queued, we encounter an error and decide to retry the initialization. At that point the queued resets are no longer relevant since we could be talking to a new vnic server. So we must purge/flush the queued resets before restarting the initialization. As a side note, since we are still in the probing stage and we have not registered the netdev, it will not be CHANGE_PARAM reset. Second this change opens up a potential race between the worker thread in __ibmvnic_reset(), the tasklet and the ibmvnic_open() due to the following sequence of events: 1. Register CRQ 2. Get transport event before CRQ_INIT completes. 3. Tasklet schedules reset: a) add rwi to list b) schedule_work() to start worker thread which runs and waits for ->probe_done. 4. ibmvnic_probe() decides to retry, purges rwi_list 5. Re-register crq and this time rest of probe succeeds - register netdev and complete(->probe_done). 6. Worker thread resumes in __ibmvnic_reset() from 3b. 7. Worker thread sets ->resetting bit 8. ibmvnic_open() comes in, notices ->resetting bit, sets state to IBMVNIC_OPEN and returns early expecting worker thread to finish the open. 9. Worker thread finds rwi_list empty and returns without opening the interface. If this happens, the ->ndo_open() call is effectively lost and the interface remains down. To address this, ensure that ->rwi_list is not empty before setting the ->resetting bit. See also comments in __ibmvnic_reset(). Fixes: 6a2fb0e99f9c ("ibmvnic: driver initialization for kdump/kexec") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: clear fop when retrying probeSukadev Bhattiprolu1-0/+5
Clear ->failover_pending flag that may have been set in the previous pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open() call would be misled into thinking a failover is pending and assuming that the reset worker thread would open the adapter. If this pass of registering the CRQ succeeds (i.e there is no transport event), there wouldn't be a reset worker thread. This would leave the adapter unconfigured and require manual intervention to bring it up during boot. Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: init init_done_rc earlierSukadev Bhattiprolu1-5/+21
We currently initialize the ->init_done completion/return code fields before issuing a CRQ_INIT command. But if we get a transport event soon after registering the CRQ the taskslet may already have recorded the completion and error code. If we initialize here, we might overwrite/ lose that and end up issuing the CRQ_INIT only to timeout later. If that timeout happens during probe, we will leave the adapter in the DOWN state rather than retrying to register/init the CRQ. Initialize the completion before registering the CRQ so we don't lose the notification. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: register netdev after init of adapterSukadev Bhattiprolu1-6/+8
Finish initializing the adapter before registering netdev so state is consistent. Fixes: c26eba03e407 ("ibmvnic: Update reset infrastructure to support tunable parameters") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: complete init_done on transport eventsSukadev Bhattiprolu1-0/+7
If we get a transport event, set the error and mark the init as complete so the attempt to send crq-init or login fail sooner rather than wait for the timeout. Fixes: bbd669a868bb ("ibmvnic: Fix completion structure initialization") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: define flush_reset_queue helperSukadev Bhattiprolu1-8/+16
Define and use a helper to flush the reset queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: initialize rc before completing waitSukadev Bhattiprolu1-1/+1
We should initialize ->init_done_rc before calling complete(). Otherwise the waiting thread may see ->init_done_rc as 0 before we have updated it and may assume that the CRQ was successful. Fixes: 6b278c0cb378 ("ibmvnic delay complete()") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25ibmvnic: free reset-work-item when flushingSukadev Bhattiprolu1-1/+3
Fix a tiny memory leak when flushing the reset work queue. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25net: stmmac: only enable DMA interrupts when readyVincent Whitchurch1-2/+26
In this driver's ->ndo_open() callback, it enables DMA interrupts, starts the DMA channels, then requests interrupts with request_irq(), and then finally enables napi. If RX DMA interrupts are received before napi is enabled, no processing is done because napi_schedule_prep() will return false. If the network has a lot of broadcast/multicast traffic, then the RX ring could fill up completely before napi is enabled. When this happens, no further RX interrupts will be delivered, and the driver will fail to receive any packets. Fix this by only enabling DMA interrupts after all other initialization is complete. Fixes: 523f11b5d4fd72efb ("net: stmmac: move hardware setup for stmmac_open to new function") Reported-by: Lars Persson <[email protected]> Signed-off-by: Vincent Whitchurch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25xen/netfront: destroy queues before real_num_tx_queues is zeroedMarek Marczykowski-Górecki1-16/+23
xennet_destroy_queues() relies on info->netdev->real_num_tx_queues to delete queues. Since d7dac083414eb5bb99a6d2ed53dc2c1b405224e5 ("net-sysfs: update the queue counts in the unregistration path"), unregister_netdev() indirectly sets real_num_tx_queues to 0. Those two facts together means, that xennet_destroy_queues() called from xennet_remove() cannot do its job, because it's called after unregister_netdev(). This results in kfree-ing queues that are still linked in napi, which ultimately crashes: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 52 Comm: xenwatch Tainted: G W 5.16.10-1.32.fc32.qubes.x86_64+ #226 RIP: 0010:free_netdev+0xa3/0x1a0 Code: ff 48 89 df e8 2e e9 00 00 48 8b 43 50 48 8b 08 48 8d b8 a0 fe ff ff 48 8d a9 a0 fe ff ff 49 39 c4 75 26 eb 47 e8 ed c1 66 ff <48> 8b 85 60 01 00 00 48 8d 95 60 01 00 00 48 89 ef 48 2d 60 01 00 RSP: 0000:ffffc90000bcfd00 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88800edad000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffc90000bcfc30 RDI: 00000000ffffffff RBP: fffffffffffffea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800edad050 R13: ffff8880065f8f88 R14: 0000000000000000 R15: ffff8880066c6680 FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000e998c006 CR4: 00000000003706e0 Call Trace: <TASK> xennet_remove+0x13d/0x300 [xen_netfront] xenbus_dev_remove+0x6d/0xf0 __device_release_driver+0x17a/0x240 device_release_driver+0x24/0x30 bus_remove_device+0xd8/0x140 device_del+0x18b/0x410 ? _raw_spin_unlock+0x16/0x30 ? klist_iter_exit+0x14/0x20 ? xenbus_dev_request_and_reply+0x80/0x80 device_unregister+0x13/0x60 xenbus_dev_changed+0x18e/0x1f0 xenwatch_thread+0xc0/0x1a0 ? do_wait_intr_irq+0xa0/0xa0 kthread+0x16b/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 </TASK> Fix this by calling xennet_destroy_queues() from xennet_uninit(), when real_num_tx_queues is still available. This ensures that queues are destroyed when real_num_tx_queues is set to 0, regardless of how unregister_netdev() was called. Originally reported at https://github.com/QubesOS/qubes-issues/issues/7257 Fixes: d7dac083414eb5bb9 ("net-sysfs: update the queue counts in the unregistration path") Cc: [email protected] Signed-off-by: Marek Marczykowski-Górecki <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-02-25can: gs_usb: change active_channels's type from atomic_t to u8Vincent Mailhol1-5/+5
The driver uses an atomic_t variable: gs_usb:active_channels to keep track of the number of opened channels in order to only allocate memory for the URBs when this count changes from zero to one. However, the driver does not decrement the counter when an error occurs in gs_can_open(). This issue is fixed by changing the type from atomic_t to u8 and by simplifying the logic accordingly. It is safe to use an u8 here because the network stack big kernel lock (a.k.a. rtnl_mutex) is being hold. For details, please refer to [1]. [1] https://lore.kernel.org/linux-can/CAMZ6Rq+sHpiw34ijPsmp7vbUpDtJwvVtdV7CvRZJsLixjAFfrg@mail.gmail.com/T/#t Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Vincent Mailhol <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-02-25can: etas_es58x: change opened_channel_cnt's type from atomic_t to u8Vincent Mailhol2-7/+10
The driver uses an atomic_t variable: struct es58x_device::opened_channel_cnt to keep track of the number of opened channels in order to only allocate memory for the URBs when this count changes from zero to one. While the intent was to prevent race conditions, the choice of an atomic_t turns out to be a bad idea for several reasons: - implementation is incorrect and fails to decrement opened_channel_cnt when the URB allocation fails as reported in [1]. - even if opened_channel_cnt were to be correctly decremented, atomic_t is insufficient to cover edge cases: there can be a race condition in which 1/ a first process fails to allocate URBs memory 2/ a second process enters es58x_open() before the first process does its cleanup and decrements opened_channed_cnt. In which case, the second process would successfully return despite the URBs memory not being allocated. - actually, any kind of locking mechanism was useless here because it is redundant with the network stack big kernel lock (a.k.a. rtnl_lock) which is being hold by all the callers of net_device_ops:ndo_open() and net_device_ops:ndo_close(). c.f. the ASSERST_RTNL() calls in __dev_open() [2] and __dev_close_many() [3]. The atmomic_t is thus replaced by a simple u8 type and the logic to increment and decrement es58x_device:opened_channel_cnt is simplified accordingly fixing the bug reported in [1]. We do not check again for ASSERST_RTNL() as this is already done by the callers. [1] https://lore.kernel.org/linux-can/20220201140351.GA2548@kili/T/#u [2] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1463 [3] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1541 Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces") Link: https://lore.kernel.org/all/[email protected] Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Vincent Mailhol <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2022-02-24net: mv643xx_eth: process retval from of_get_mac_addressMauri Sandberg1-10/+14
Obtaining a MAC address may be deferred in cases when the MAC is stored in an NVMEM block, for example, and it may not be ready upon the first retrieval attempt and return EPROBE_DEFER. It is also possible that a port that does not rely on NVMEM has been already created when getting the defer request. Thus, also the resources allocated previously must be freed when doing a roll-back. Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Mauri Sandberg <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-24Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"Mateusz Palczewski1-11/+1
Revert of a patch that instead of fixing a AQ error when trying to reset BW limit introduced several regressions related to creation and managing TC. Currently there are errors when creating a TC on both PF and VF. Error log: [17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14 [17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391 [17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14 [17428.783259] i40e 0000:3b:00.1: Unable to configure TC map 0 for VSI 391 This reverts commit 3d2504663c41104b4359a15f35670cfa82de1bbf. Fixes: 3d2504663c41 (i40e: Fix reset bw limit when DCB enabled with 1 TC) Signed-off-by: Mateusz Palczewski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-24bnx2x: fix driver load from initrdManish Chopra1-0/+3
Commit b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") added new firmware support in the driver with maintaining older firmware compatibility. However, older firmware was not added in MODULE_FIRMWARE() which caused missing firmware files in initrd image leading to driver load failure from initrd. This patch adds MODULE_FIRMWARE() for older firmware version to have firmware files included in initrd. Fixes: b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215627 Signed-off-by: Manish Chopra <[email protected]> Signed-off-by: Alok Prasad <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-24Revert "xen-netback: Check for hotplug-status existence before watching"Marek Marczykowski-Górecki1-8/+4
This reverts commit 2afeec08ab5c86ae21952151f726bfe184f6b23d. The reasoning in the commit was wrong - the code expected to setup the watch even if 'hotplug-status' didn't exist. In fact, it relied on the watch being fired the first time - to check if maybe 'hotplug-status' is already set to 'connected'. Not registering a watch for non-existing path (which is the case if hotplug script hasn't been executed yet), made the backend not waiting for the hotplug script to execute. This in turns, made the netfront think the interface is fully operational, while in fact it was not (the vif interface on xen-netback side might not be configured yet). This was a workaround for 'hotplug-status' erroneously being removed. But since that is reverted now, the workaround is not necessary either. More discussion at https://lore.kernel.org/xen-devel/[email protected]/T/#u Signed-off-by: Marek Marczykowski-Górecki <[email protected]> Reviewed-by: Paul Durrant <[email protected]> Reviewed-by: Michael Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-24Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"Marek Marczykowski-Górecki1-1/+1
This reverts commit 1f2565780e9b7218cf92c7630130e82dcc0fe9c2. The 'hotplug-status' node should not be removed as long as the vif device remains configured. Otherwise the xen-netback would wait for re-running the network script even if it was already called (in case of the frontent re-connecting). But also, it _should_ be removed when the vif device is destroyed (for example when unbinding the driver) - otherwise hotplug script would not configure the device whenever it re-appear. Moving removal of the 'hotplug-status' node was a workaround for nothing calling network script after xen-netback module is reloaded. But when vif interface is re-created (on xen-netback unbind/bind for example), the script should be called, regardless of who does that - currently this case is not handled by the toolstack, and requires manual script call. Keeping hotplug-status=connected to skip the call is wrong and leads to not configured interface. More discussion at https://lore.kernel.org/xen-devel/[email protected]/T/#u Signed-off-by: Marek Marczykowski-Górecki <[email protected]> Reviewed-by: Paul Durrant <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-23net/mlx5e: Fix VF min/max rate parameters interchange mistakeGal Pressman1-1/+1
The VF min and max rate were passed incorrectly and resulted in wrongly interchanging them. Fix the order of parameters in mlx5_esw_qos_set_vport_rate(). Fixes: d7df09f5e7b4 ("net/mlx5: E-switch, Enable vport QoS on demand") Signed-off-by: Gal Pressman <[email protected]> Reviewed-by: Aya Levin <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: Add missing increment of countLama Kayal1-0/+1
Add mistakenly missing increment of count variable when looping over output buffer in mlx5e_self_test(). This resolves the issue of garbage values output when querying with self test via ethtool. before: $ ethtool -t eth2 The test result is PASS The test extra info: Link Test 0 Speed Test 1768697188 Health Test 758528120 Loopback Test 3288687 after: $ ethtool -t eth2 The test result is PASS The test extra info: Link Test 0 Speed Test 0 Health Test 0 Loopback Test 0 Fixes: 7990b1b5e8bd ("net/mlx5e: loopback test is not supported in switchdev mode") Signed-off-by: Lama Kayal <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: MPLSoUDP decap, fix check for unsupported matchesMaor Dickman1-17/+11
Currently offload of rule on bareudp device require tunnel key in order to match on mpls fields and without it the mpls fields are ignored, this is incorrect due to the fact udp tunnel doesn't have key to match on. Fix by returning error in case flow is matching on tunnel key. Fixes: 72046a91d134 ("net/mlx5e: Allow to match on mpls parameters") Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: Fix MPLSoUDP encap to use MPLS action informationMaor Dickman7-3/+32
Currently the MPLSoUDP encap builds the MPLS header using encap action information (tunnel id, ttl and tos) instead of the MPLS action information (label, ttl, tc and bos) which is wrong. Fix by storing the MPLS action information during the flow action parse and later using it to create the encap MPLS header. Fixes: f828ca6a2fb6 ("net/mlx5e: Add support for hw encapsulation of MPLS over UDP") Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: Add feature check for set fec countersLama Kayal1-3/+3
Fec counters support is checked via the PCAM feature_cap_mask, bit 0: PPCNT_counter_group_Phy_statistical_counter_group. Add feature check to avoid faulty behavior. Fixes: 0a1498ebfa55 ("net/mlx5e: Expose FEC counters via ethtool") Signed-off-by: Lama Kayal <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: TC, Skip redundant ct clear actionsRoi Dayan2-0/+8
Offload of ct clear action is just resetting the reg_c register. It's done by allocating modify hdr resources which is limited. Doing it multiple times is redundant and wasting modify hdr resources and if resources depleted the driver will fail offloading the rule. Ignore redundant ct clear actions after the first one. Fixes: 806401c20a0f ("net/mlx5e: CT, Fix multiple allocations and memleak of mod acts") Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Ariel Levkovich <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: TC, Reject rules with forward and drop actionsRoi Dayan1-0/+6
Such rules are redundant but allowed and passed to the driver. The driver does not support offloading such rules so return an error. Fixes: 03a9d11e6eeb ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads") Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Oz Shlomo <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: TC, Reject rules with drop and modify hdr actionRoi Dayan1-0/+6
This kind of action is not supported by firmware and generates a syndrome. kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3) Fixes: d7e75a325cb2 ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions") Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Reviewed-by: Oz Shlomo <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packetsTariq Toukan1-1/+2
For RX TLS device-offloaded packets, the HW spec guarantees checksum validation for the offloaded packets, but does not define whether the CQE.checksum field matches the original packet (ciphertext) or the decrypted one (plaintext). This latitude allows architetctural improvements between generations of chips, resulting in different decisions regarding the value type of CQE.checksum. Hence, for these packets, the device driver should not make use of this CQE field. Here we block CHECKSUM_COMPLETE usage for RX TLS device-offloaded packets, and use CHECKSUM_UNNECESSARY instead. Value of the packet's tcp_hdr.csum is not modified by the HW, and it always matches the original ciphertext. Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5e: Fix wrong return value on ioctl EEPROM query failureGal Pressman1-1/+1
The ioctl EEPROM query wrongly returns success on read failures, fix that by returning the appropriate error code. Fixes: bb64143eee8c ("net/mlx5e: Add ethtool support for dump module EEPROM") Signed-off-by: Gal Pressman <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5: Fix possible deadlock on rule deletionMaor Gottlieb1-0/+2
Add missing call to up_write_ref_node() which releases the semaphore in case the FTE doesn't have destinations, such in drop rule case. Fixes: 465e7baab6d9 ("net/mlx5: Fix deletion of duplicate rules") Signed-off-by: Maor Gottlieb <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5: Fix tc max supported prio for nic modeChris Mi1-0/+3
Only prio 1 is supported if firmware doesn't support ignore flow level for nic mode. The offending commit removed the check wrongly. Add it back. Fixes: 9a99c8f1253a ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported") Signed-off-by: Chris Mi <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5: Fix wrong limitation of metadata match on ecpfAriel Levkovich1-4/+0
Match metadata support check returns false for ecpf device. However, this support does exist for ecpf and therefore this limitation should be removed to allow feature such as stacked devices and internal port offloaded to be supported. Fixes: 92ab1eb392c6 ("net/mlx5: E-Switch, Enable vport metadata matching if firmware supports it") Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Maor Dickman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5: Update log_max_qp value to be 17 at mostMaher Sanalla1-1/+1
Currently, log_max_qp value is dependent on what FW reports as its max capability. In reality, due to a bug, some FWs report a value greater than 17, even though they don't support log_max_qp > 17. This FW issue led the driver to exhaust memory on startup. Thus, log_max_qp value is set to be no more than 17 regardless of what FW reports, as it was before the cited commit. Fixes: f79a609ea6bf ("net/mlx5: Update log_max_qp value to FW max capability") Signed-off-by: Maher Sanalla <[email protected]> Reviewed-by: Avihai Horon <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5: DR, Fix the threshold that defines when pool sync is initiatedYevgeny Kliteynik1-4/+7
When deciding whether to start syncing and actually free all the "hot" ICM chunks, we need to consider the type of the ICM chunks that we're dealing with. For instance, the amount of available ICM for MODIFY_ACTION is significantly lower than the usual STE ICM, so the threshold should account for that - otherwise we can deplete MODIFY_ACTION memory just by creating and deleting the same modify header action in a continuous loop. This patch replaces the hard-coded threshold with a dynamic value. Fixes: 1c58651412bb ("net/mlx5: DR, ICM memory pools sync optimization") Signed-off-by: Yevgeny Kliteynik <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5: DR, Don't allow match on IP w/o matching on full ethertype/ip_versionYevgeny Kliteynik3-17/+45
Currently SMFS allows adding rule with matching on src/dst IP w/o matching on full ethertype or ip_version, which is not supported by HW. This patch fixes this issue and adds the check as it is done in DMFS. Fixes: 26d688e33f88 ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Yevgeny Kliteynik <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2022-02-23net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fteYevgeny Kliteynik1-7/+26
When adding a rule with 32 destinations, we hit the following out-of-band access issue: BUG: KASAN: slab-out-of-bounds in mlx5_cmd_dr_create_fte+0x18ee/0x1e70 This patch fixes the issue by both increasing the allocated buffers to accommodate for the needed actions and by checking the number of actions to prevent this issue when a rule with too many actions is provided. Fixes: 1ffd498901c1 ("net/mlx5: DR, Increase supported num of actions to 32") Signed-off-by: Yevgeny Kliteynik <[email protected]> Reviewed-by: Alex Vesker <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>