aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/ibm
AgeCommit message (Collapse)AuthorFilesLines
2021-02-16net: re-solve some conflicts after net -> net-next mergeJakub Kicinski2-6/+9
Signed-off-by: Jakub Kicinski <[email protected]>
2021-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller2-13/+35
2021-02-15ibmvnic: serialize access to work queue on removeSukadev Bhattiprolu2-8/+24
The work queue is used to queue reset requests like CHANGE-PARAM or FAILOVER resets for the worker thread. When the adapter is being removed the adapter state is set to VNIC_REMOVING and the work queue is flushed so no new work is added. However the check for adapter being removed is racy in that the adapter can go into REMOVING state just after we check and we might end up adding work just as it is being flushed (or after). The ->rwi_lock is already being used to serialize queue/dequeue work. Extend its usage ensure there is no race when scheduling/flushing work. Fixes: 6954a9e4192b ("ibmvnic: Flush existing work items before device removal") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Cc:Uwe Kleine-König <[email protected]> Cc:Saeed Mahameed <[email protected]> Reviewed-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: skip send_request_unmap for timeout resetLijun Pan1-1/+6
Timeout reset will trigger the VIOS to unmap it automatically, similarly as FAILVOER and MOBILITY events. If we unmap it in the linux side, we will see errors like "30000003: Error 4 in REQUEST_UNMAP_RSP". So, don't call send_request_unmap for timeout reset. Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: add memory barrier to protect long term bufferLijun Pan1-0/+5
dma_rmb() barrier is added to load the long term buffer before copying it to socket buffer; and dma_wmb() barrier is added to update the long term buffer before it being accessed by VIOS (virtual i/o server). Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <[email protected]> Acked-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: substitute mb() with dma_wmb() for send_*crq* functionsLijun Pan1-2/+2
The CRQ and subCRQ descriptors are DMA mapped, so dma_wmb(), though weaker, is good enough to protect the data structures. Signed-off-by: Lijun Pan <[email protected]> Acked-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-15ibmvnic: simplify reset_long_term_buff functionLijun Pan1-38/+8
The only thing reset_long_term_buff() should do is set buffer to zero. After doing that, it is not necessary to send_request_map again to VIOS since it actually does not change the mapping. So, keep memset function and remove all others. Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-12of: Remove of_dev_{get,put}()Rob Herring1-7/+8
of_dev_get() and of_dev_put are just wrappers for get_device()/put_device() on a platform_device. There's also already platform_device_{get,put}() wrappers for this purpose. Let's update the few users and remove of_dev_{get,put}(). Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Frank Rowand <[email protected]> Cc: Patrice Chotard <[email protected]> Cc: Felipe Balbi <[email protected]> Cc: Julia Lawall <[email protected]> Cc: Gilles Muller <[email protected]> Cc: Nicolas Palix <[email protected]> Cc: Michal Marek <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-02-12ibmvnic: change IBMVNIC_MAX_IND_DESCS to 16Dany Madden1-1/+1
The supported indirect subcrq entries on Power8 is 16. Power9 supports 128. Redefined this value to 16 to minimize the driver from having to reset when migrating between Power9 and Power8. In our rx/tx performance testing, we found no performance difference between 16 and 128 at this time. Fixes: f019fb6392e5 ("ibmvnic: Introduce indirect subordinate Command Response Queue buffer") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: Set to CLOSED state even on errorSukadev Bhattiprolu1-3/+1
If set_link_state() fails for any reason, we still cleanup the adapter state and cannot recover from a partial close anyway. So set the adapter to CLOSED state. That way if a new soft/hard reset is processed, the adapter will remain in the CLOSED state until the next ibmvnic_open(). Fixes: 01d9bd792d16 ("ibmvnic: Reorganize device close") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Reported-by: Abdul Haleem <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: prefer strscpy over strlcpyLijun Pan1-3/+3
Fix this warning: WARNING: Prefer strscpy over strlcpy - see: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: remove unused spinlock_t stats_lock definitionLijun Pan2-3/+0
stats_lock is no longer used. So remove it. Signed-off-by: Lijun Pan <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: add comments for spinlock_t definitionsLijun Pan1-4/+6
There are several spinlock_t definitions without comments. Add them. Signed-off-by: Lijun Pan <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: fix miscellaneous checksLijun Pan1-8/+9
Fix the following checkpatch checks: CHECK: Macro argument 'off' may be better as '(off)' to avoid precedence issues CHECK: Alignment should match open parenthesis CHECK: multiple assignments should be avoided CHECK: Blank lines aren't necessary before a close brace '}' CHECK: Please use a blank line after function/struct/union/enum declarations CHECK: Unnecessary parentheses around 'rc != H_FUNCTION' Signed-off-by: Lijun Pan <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: avoid multiple line dereferenceLijun Pan1-9/+8
Fix the following checkpatch warning: WARNING: Avoid multiple line dereference Signed-off-by: Lijun Pan <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: fix bracesLijun Pan1-2/+1
Fix the following checkpatch warning: WARNING: braces {} are not necessary for single statement blocks Signed-off-by: Lijun Pan <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: fix block commentsLijun Pan1-8/+4
Fix the following checkpatch warning: WARNING: networking block comments don't use an empty /* line, use /* Comment... Signed-off-by: Lijun Pan <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-11ibmvnic: prefer 'unsigned long' over 'unsigned long int'Lijun Pan1-8/+8
Fix the following checkpatch warnings: WARNING: Prefer 'unsigned long' over 'unsigned long int' as the int is unnecessary WARNING: Prefer 'long' over 'long int' as the int is unnecessary Signed-off-by: Lijun Pan <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+16
2021-02-06ibmvnic: Clear failover_pending if unable to scheduleSukadev Bhattiprolu1-1/+16
Normally we clear the failover_pending flag when processing the reset. But if we are unable to schedule a failover reset we must clear the flag ourselves. We could fail to schedule the reset if we are in PROBING state (eg: when booting via kexec) or because we could not allocate memory. Thanks to Cris Forno for helping isolate the problem and for testing. Fixes: 1d8504937478 ("powerpc/vnic: Extend "failover pending" window") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Tested-by: Cristobal Forno <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-5/+0
Signed-off-by: Jakub Kicinski <[email protected]>
2021-02-01ibmvnic: remove unnecessary rmb() inside ibmvnic_pollLijun Pan1-1/+0
rmb() can be removed since: 1. pending_scrq() has dma_rmb() at the function end; 2. dma_rmb(), though weaker, is enough here. Signed-off-by: Lijun Pan <[email protected]> Acked-by: Dwip Banerjee <[email protected]> Acked-by: Thomas Falcon <[email protected]> Reviewed-by: Brian King <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-02-01ibmvnic: rework to ensure SCRQ entry reads are properly orderedLijun Pan1-19/+11
Move the dma_rmb() between pending_scrq() and ibmvnic_next_scrq() into the end of pending_scrq() to save the duplicated code since this dma_rmb will be used 3 times. Signed-off-by: Lijun Pan <[email protected]> Acked-by: Thomas Falcon <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-02-01ibmvnic: device remove has higher precedence over resetLijun Pan1-5/+0
Returning -EBUSY in ibmvnic_remove() does not actually hold the removal procedure since driver core doesn't care for the return value (see __device_release_driver() in drivers/base/dd.c calling dev->bus->remove()) though vio_bus_remove (in arch/powerpc/platforms/pseries/vio.c) records the return value and passes it on. [1] During the device removal precedure, checking for resetting bit is dropped so that we can continue executing all the cleanup calls in the rest of the remove function. Otherwise, it can cause latent memory leaks and kernel crashes. [1] https://lore.kernel.org/linuxppc-dev/[email protected]/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53 Reported-by: Uwe Kleine-König <[email protected]> Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset") Signed-off-by: Lijun Pan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+6
drivers/net/can/dev.c b552766c872f ("can: dev: prevent potential information leak in can_fill_info()") 3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir") 0a042c6ec991 ("can: dev: move netlink related code into seperate file") Code move. drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 57ac4a31c483 ("net/mlx5e: Correctly handle changing the number of queues when the interface is down") 214baf22870c ("net/mlx5e: Support HTB offload") Adjacent code changes net/switchdev/switchdev.c 20776b465c0c ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP") ffb68fc58e96 ("net: switchdev: remove the transaction structure from port object notifiers") bae33f2b5afe ("net: switchdev: remove the transaction structure from port attributes") Transaction parameter gets dropped otherwise keep the fix. Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-27ibmvnic: Ensure that CRQ entry read are correctly orderedLijun Pan1-0/+6
Ensure that received Command-Response Queue (CRQ) entries are properly read in order by the driver. dma_rmb barrier has been added before accessing the CRQ descriptor to ensure the entire descriptor is read before processing. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-16net: ethernet: ibm: ibmvnic: Fix some kernel-doc misdemeanoursLee Jones1-14/+13
Fixes the following W=1 kernel build warning(s): from drivers/net/ethernet/ibm/ibmvnic.c:35: inlined from ‘handle_vpd_rsp’ at drivers/net/ethernet/ibm/ibmvnic.c:4124:3: drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_field' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'skb' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_len' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_data' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_field' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_data' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'len' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_len' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'scrq_arr' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'txbuff' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'num_entries' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'hdr_field' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'adapter' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'rwi' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'reset_state' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'adapter' not described in 'do_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'rwi' not described in 'do_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'reset_state' not described in 'do_reset' Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-09ibmvnic: merge do_change_param_reset into do_resetLijun Pan1-110/+44
Commit b27507bb59ed ("net/ibmvnic: unlock rtnl_lock in reset so linkwatch_event can run") introduced do_change_param_reset function to solve the rtnl lock issue. Majority of the code in do_change_param_reset duplicates do_reset. Also, we can handle the rtnl lock issue in do_reset itself. Hence merge do_change_param_reset back into do_reset to clean up the code. Signed-off-by: Lijun Pan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-04ibmvnic: fix: NULL pointer dereference.YANG LI1-3/+1
The error is due to dereference a null pointer in function reset_one_sub_crq_queue(): if (!scrq) { netdev_dbg(adapter->netdev, "Invalid scrq reset. irq (%d) or msgs(%p).\n", scrq->irq, scrq->msgs); return -EINVAL; } If the expression is true, scrq must be a null pointer and cannot dereference. Fixes: 9281cf2d5840 ("ibmvnic: avoid memset null scrq msgs") Signed-off-by: YANG LI <[email protected]> Reported-by: Abaci <[email protected]> Acked-by: Lijun Pan <[email protected]> Link: https://lore.kernel.org/r/1609312994-121032-1-git-send-email-abaci-bugfix@linux.alibaba.com Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-23ibmvnic: continue fatal error reset after passive initLijun Pan1-2/+1
Commit f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init") says "If the passive CRQ initialization occurs before the FATAL reset task is processed, the FATAL error reset task would try to access a CRQ message queue that was freed, causing an oops. The problem may be most likely to occur during DLPAR add vNIC with a non-default MTU, because the DLPAR process will automatically issue a change MTU request. Fix this by not processing fatal error reset if CRQ is passively initialized after client-driven CRQ initialization fails." The original commit skips a specific reset condition, but that does not fix the problem it claims to fix, and misses a reset condition. The effective fix is commit 0e435befaea4 ("ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq") and commit a0faaa27c716 ("ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues"). With above two fixes, there are no more crashes seen as described even without the original commit, so I would like to revert the original commit. Fixes: f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init") Signed-off-by: Lijun Pan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-22ibmvnic: fix login buffer memory leakLijun Pan1-0/+3
Commit 34f0f4e3f488 ("ibmvnic: Fix login buffer memory leaks") frees login_rsp_buffer in release_resources() and send_login() because handle_login_rsp() does not free it. Commit f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in ibmvnic_adapter struct") frees login_rsp_buffer in handle_login_rsp(). It seems unnecessary to free it in release_resources() and send_login(). There are chances that handle_login_rsp returns earlier without freeing buffers. Double-checking the buffer is harmless since release_login_buffer and release_login_rsp_buffer will do nothing if buffer is already freed. Fixes: f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in ibmvnic_adapter struct") Fixes: 34f0f4e3f488 ("ibmvnic: Fix login buffer memory leaks") Signed-off-by: Lijun Pan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-16use __netdev_notify_peers in ibmvnicLijun Pan1-6/+3
Start to use the lockless version of netdev_notify_peers. Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-09ibmvnic: fix rx buffer tracking and index management in replenish_rx_pool ↵Dwip N. Banerjee1-0/+2
partial success We observed that in the error case for batched send_subcrq_indirect() the driver does not account for the partial success case. This caused Linux to crash when free_map and pool index are inconsistent. Driver needs to update the rx pools "available" count when some batched sends worked but an error was encountered as part of the whole operation. Also track replenish_add_buff_failure for statistic purposes. Fixes: 4f0b6812e9b9a ("ibmvnic: Introduce batched RX buffer descriptor transmission") Signed-off-by: Dwip N. Banerjee <[email protected]> Reviewed-by: Dany Madden <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-12-07ibmvnic: add some debugsSukadev Bhattiprolu1-3/+22
We sometimes run into situations where a soft/hard reset of the adapter takes a long time or fails to complete. Having additional messages that include important adapter state info will hopefully help understand what is happening, reduce the guess work and minimize requests to reproduce problems with debug patches. Signed-off-by: Sukadev Bhattiprolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-68/+126
Conflicts: drivers/net/ethernet/ibm/ibmvnic.c Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-01ibmvnic: Fix TX completion error handlingThomas Falcon1-3/+1
TX completions received with an error return code are not being processed properly. When an error code is seen, do not proceed to the next completion before cleaning up the existing entry's data structures. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-12-01ibmvnic: Ensure that SCRQ entry reads are correctly orderedThomas Falcon1-0/+18
Ensure that received Subordinate Command-Response Queue (SCRQ) entries are properly read in order by the driver. These queues are used in the ibmvnic device to process RX buffer and TX completion descriptors. dma_rmb barriers have been added after checking for a pending descriptor to ensure the correct descriptor entry is checked and after reading the SCRQ descriptor to ensure the entire descriptor is read before processing. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-11-28ibmvnic: reduce wait for completion timeDany Madden1-3/+3
Reduce the wait time for Command Response Queue response from 30 seconds to 20 seconds, as recommended by VIOS and Power Hypervisor teams. Fixes: bd0b672313941 ("ibmvnic: Move login and queue negotiation into ibmvnic_open") Fixes: 53da09e92910f ("ibmvnic: Add set_link_state routine for setting adapter link state") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: no reset timeout for 5 seconds after resetDany Madden2-2/+11
Reset timeout is going off right after adapter reset. This patch ensures that timeout is scheduled if it has been 5 seconds since the last reset. 5 seconds is the default watchdog timeout. Fixes: ed651a10875f1 ("ibmvnic: Updated reset handling") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: send_login should check for crq errorsDany Madden1-6/+12
send_login() does not check for the result of ibmvnic_send_crq() of the login request. This results in the driver needlessly retrying the login 10 times even when CRQ is no longer active. Check the return code and give up in case of errors in sending the CRQ. The only time we want to retry is if we get a PARITALSUCCESS response from the partner. Fixes: 032c5e82847a2 ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: track pending loginSukadev Bhattiprolu2-0/+18
If after ibmvnic sends a LOGIN it gets a FAILOVER, it is possible that the worker thread will start reset process and free the login response buffer before it gets a (now stale) LOGIN_RSP. The ibmvnic tasklet will then try to access the login response buffer and crash. Have ibmvnic track pending logins and discard any stale login responses. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: delay next reset if hard reset failsSukadev Bhattiprolu1-0/+8
If auto-priority failover is enabled, the backing device needs time to settle if hard resetting fails for any reason. Add a delay of 60 seconds before retrying the hard-reset. Fixes: 2770a7984db5 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: restore adapter state on failed resetDany Madden1-31/+36
In a failed reset, driver could end up in VNIC_PROBED or VNIC_CLOSED state and cannot recover in subsequent resets, leaving it offline. This patch restores the adapter state to reset_state, the original state when reset was called. Fixes: b27507bb59ed5 ("net/ibmvnic: unlock rtnl_lock in reset so linkwatch_event can run") Fixes: 2770a7984db58 ("ibmvnic: Introduce hard reset recovery") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: avoid memset null scrq msgsDany Madden1-4/+15
scrq->msgs could be NULL during device reset, causing Linux to crash. So, check before memset scrq->msgs. Fixes: c8b2ad0a4a901 ("ibmvnic: Sanitize entire SCRQ buffer on reset") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: stop free_all_rwi on failed resetDany Madden1-19/+3
When ibmvnic fails to reset, it breaks out of the reset loop and frees all of the remaining resets from the workqueue. Doing so prevents the adapter from recovering if no reset is scheduled after that. Instead, have the driver continue to process resets on the workqueue. Remove the no longer need free_all_rwi(). Fixes: ed651a10875f1 ("ibmvnic: Updated reset handling") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-28ibmvnic: handle inconsistent login with resetDany Madden1-1/+1
Inconsistent login with the vnicserver is causing the device to be removed. This does not give the device a chance to recover from error state. This patch schedules a FATAL reset instead to bring the adapter up. Fixes: 032c5e82847a2 ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Dany Madden <[email protected]> Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-5/+21
Trivial conflict in CAN, keep the net-next + the byteswap wrapper. Conflicts: drivers/net/can/usb/gs_usb.c Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-24ibmvnic: enhance resetting status check during module exitLijun Pan2-4/+2
Based on the discussion with Sukadev Bhattiprolu and Dany Madden, we believe that checking adapter->resetting bit is preferred since RESETTING state flag is not as strict as resetting bit. RESETTING state flag is removed since it is verbose now. Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset") Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-24ibmvnic: fix NULL pointer dereference in ibmvic_reset_crqLijun Pan1-0/+3
crq->msgs could be NULL if the previous reset did not complete after freeing crq->msgs. Check for NULL before dereferencing them. Snippet of call trace: ... ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ ibmvnic 30000003 env3 (unregistering): Releasing CRQ BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0000000000c1a30 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic] CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G E 5.10.0-rc1+ #12 Workqueue: events __ibmvnic_reset [ibmvnic] NIP: c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400 REGS: c00000000d05b7a0 TRAP: 0380 Tainted: G E (5.10.0-rc1+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44002480 XER: 20040000 CFAR: c0000000000c19ec IRQMASK: 0 GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2 GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280 GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002 GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550 GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800 NIP [c0000000000c1a30] memset+0x68/0x104 LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic] Call Trace: [c00000000d05ba30] [0000000000000800] 0x800 (unreliable) [c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic] [c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic] [c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800 [c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520 [c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0 [c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-24ibmvnic: fix NULL pointer dereference in reset_sub_crq_queuesLijun Pan1-0/+3
adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset did not complete after freeing sub crqs. Check for NULL before dereferencing them. Snippet of call trace: ibmvnic 30000006 env6: Releasing sub-CRQ ibmvnic 30000006 env6: Releasing CRQ ... ibmvnic 30000006 env6: Got Control IP offload Response ibmvnic 30000006 env6: Re-setting tx_scrq[0] BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc008000003dea7cc Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmvnic ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod CPU: 80 PID: 1856 Comm: kworker/80:2 Tainted: G W 5.8.0+ #4 Workqueue: events __ibmvnic_reset [ibmvnic] NIP: c008000003dea7cc LR: c008000003dea7bc CTR: 0000000000000000 REGS: c0000007ef7db860 TRAP: 0380 Tainted: G W (5.8.0+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28002422 XER: 0000000d CFAR: c000000000bd9520 IRQMASK: 0 GPR00: c008000003dea7bc c0000007ef7dbaf0 c008000003df7400 c0000007fa26ec00 GPR04: c0000007fcd0d008 c0000007fcd96350 0000000000000027 c0000007fcd0d010 GPR08: 0000000000000023 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000002000 c00000001ec18e00 c0000000001982f8 c0000007bad6e840 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 fffffffffffffef7 GPR24: 0000000000000402 c0000007fa26f3a8 0000000000000003 c00000016f8ec048 GPR28: 0000000000000000 0000000000000000 0000000000000000 c0000007fa26ec00 NIP [c008000003dea7cc] ibmvnic_reset_init+0x15c/0x258 [ibmvnic] LR [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] Call Trace: [c0000007ef7dbaf0] [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] (unreliable) [c0000007ef7dbb80] [c008000003de8860] __ibmvnic_reset+0x408/0x970 [ibmvnic] [c0000007ef7dbc50] [c00000000018b7cc] process_one_work+0x2cc/0x800 [c0000007ef7dbd20] [c00000000018bd78] worker_thread+0x78/0x520 [c0000007ef7dbdb0] [c0000000001984c4] kthread+0x1d4/0x1e0 [c0000007ef7dbe20] [c00000000000cea8] ret_from_kernel_thread+0x5c/0x74 Fixes: 57a49436f4e8 ("ibmvnic: Reset sub-crqs during driver reset") Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>