aboutsummaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2020-08-19Merge tag 'amd-drm-fixes-5.9-2020-08-12' of ↵Dave Airlie12-45/+114
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.9-2020-08-12: amdgpu: - Fix allocation size - SR-IOV fixes - Vega20 SMU feature state caching fix - Fix custom pptable handling - Arcturus golden settings update - Several display fixes Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-08-19Merge tag 'drm-misc-fixes-2020-08-12' of ↵Dave Airlie2-0/+2
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.9-rc1: - Add missing dma_fence_put() in virtio_gpu_execbuffer_ioctl(). - Fix memory leak in virtio_gpu_cleanup_object(). Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-08-18net: mscc: ocelot: remove duplicate "the the" phrase in Kconfig textColin Ian King1-1/+1
The Kconfig help text contains the phrase "the the" in the help text. Fix this. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18bonding: fix active-backup failover for current ARP slaveJiri Wiesner1-2/+16
When the ARP monitor is used for link detection, ARP replies are validated for all slaves (arp_validate=3) and fail_over_mac is set to active, two slaves of an active-backup bond may get stuck in a state where both of them are active and pass packets that they receive to the bond. This state makes IPv6 duplicate address detection fail. The state is reached thus: 1. The current active slave goes down because the ARP target is not reachable. 2. The current ARP slave is chosen and made active. 3. A new slave is enslaved. This new slave becomes the current active slave and can reach the ARP target. As a result, the current ARP slave stays active after the enslave action has finished and the log is littered with "PROBE BAD" messages: > bond0: PROBE: c_arp ens10 && cas ens11 BAD The workaround is to remove the slave with "going back" status from the bond and re-enslave it. This issue was encountered when DPDK PMD interfaces were being enslaved to an active-backup bond. I would be possible to fix the issue in bond_enslave() or bond_change_active_slave() but the ARP monitor was fixed instead to keep most of the actions changing the current ARP slave in the ARP monitor code. The current ARP slave is set as inactive and backup during the commit phase. A new state, BOND_LINK_FAIL, has been introduced for slaves in the context of the ARP monitor. This allows administrators to see how slaves are rotated for sending ARP requests and attempts are made to find a new active slave. Fixes: b2220cad583c9 ("bonding: refactor ARP active-backup monitor") Signed-off-by: Jiri Wiesner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18drm/amd/display: fix pow() crashing when given base 0Krunoslav Kovac1-0/+3
[Why&How] pow(a,x) is implemented as exp(x*log(a)). log(0) will crash. So return 0^x = 0, unless x=0, convention seems to be 0^0 = 1. Cc: [email protected] Signed-off-by: Krunoslav Kovac <[email protected]> Reviewed-by: Anthony Koo <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amd/display: Reset scrambling on Test PatternChris Park1-0/+1
[Why] Programming is missing the sequence where for eDP the scrambling is reset when testing for eye diagram test pattern. [How] Include the required register in the definition Signed-off-by: Chris Park <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amd/display: fix dcn3 wide timing dsc validationDmytro Laktyushkin1-0/+4
Wide timing DSC requires odm. Since spreadsheet is missing this dsc validation we have to modify DML vba code ourselves. Signed-off-by: Dmytro Laktyushkin <[email protected]> Reviewed-by: Wesley Chalmers <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amd/display: Fix DFPstate hang due to view port changedPaul Hsieh1-2/+2
[Why] Place the cursor in the center of screen between two pipes then adjusting the viewport but cursour doesn't update cause DFPstate hang. [How] If viewport changed, update cursor as well. Cc: [email protected] Signed-off-by: Paul Hsieh <[email protected]> Reviewed-by: Aric Cyr <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amd/display: Assign correct left shiftChris Park2-2/+7
[Why] Reading for DP alt registers return incorrect values due to LE_SF definition missing. [How] Define correct LE_SF or DP alt registers. Signed-off-by: Chris Park <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amd/display: Call DMUB for eDP power controlChris Park6-2/+80
[Why] If DMUB is used, LVTMA VBIOS call can be used to control eDP instead of tranditional transmitter control. Interface is agreed with VBIOS for eDP to use this new path to program LVTMA registers. [How] Create DAL interface to send DMUB command for LVTMA as currently implemented in VBIOS. Signed-off-by: Chris Park <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amdkfd: fix the wrong sdma instance query for renoirHuang Rui1-9/+22
Renoir only has one sdma instance, it will get failed once query the sdma1 registers. So use switch-case instead of static register array. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amdgpu: parse ta firmware for navy_flounderBhawanpreet Lakha1-2/+1
Use the same case as sienna_cichlid Signed-off-by: Bhawanpreet Lakha <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18Merge tag 'spi-fix-v5.9-rc1' of ↵Linus Torvalds3-39/+85
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A bunch of fixes that came in for SPI during the merge window. Some from ST and others for their controller, one from Lukas for a race between device addition and controller unregistration and one from fix from Geert for the DT bindings which unbreaks validation" * tag 'spi-fix-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: dt-bindings: lpspi: Add missing boolean type for fsl,spi-only-use-cs1-sel spi: stm32: always perform registers configuration prior to transfer spi: stm32: fixes suspend/resume management spi: stm32: fix stm32_spi_prepare_mbr in case of odd clk_rate spi: stm32: fix fifo threshold level in case of short transfer spi: stm32h7: fix race condition at end of transfer spi: stm32: clear only asserted irq flags on interrupt spi: Prevent adding devices below an unregistering controller
2020-08-18drm/amdgpu: fix NULL pointer access issue when unloading driverGuchun Chen1-2/+0
When unloading driver by "modprobe -r amdgpu", one NULL pointer dereference bug occurs in ras debugfs releasing. The cause is the duplicated debugfs_remove, as drm debugfs_root dir has been cleaned up already by drm_minor_unregister. BUG: kernel NULL pointer dereference, address: 00000000000000a0 PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 11 PID: 1526 Comm: modprobe Tainted: G OE 5.6.0-guchchen #1 Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 RIP: 0010:down_write+0x15/0x40 Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92 d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3 RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0 RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0 R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40 R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000 FS: 00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: simple_recursive_removal+0x63/0x370 ? debugfs_remove+0x60/0x60 debugfs_remove+0x40/0x60 amdgpu_ras_fini+0x82/0x230 [amdgpu] ? __kernfs_remove.part.17+0x101/0x1f0 ? kernfs_name_hash+0x12/0x80 amdgpu_device_fini+0x1c0/0x580 [amdgpu] amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu] amdgpu_pci_remove+0x36/0x60 [amdgpu] pci_device_remove+0x3b/0xb0 device_release_driver_internal+0xe5/0x1c0 driver_detach+0x46/0x90 bus_remove_driver+0x58/0xd0 pci_unregister_driver+0x29/0x90 amdgpu_exit+0x11/0x25 [amdgpu] __x64_sys_delete_module+0x13d/0x210 do_syscall_64+0x5f/0x250 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amdgpu: fix uninit-value in arcturus_log_thermal_throttling_event()Kevin Wang1-3/+6
when function arcturus_get_smu_metrics_data() call failed, it will cause the variable "throttler_status" isn't initialized before use. warning: powerplay/arcturus_ppt.c:2268:24: warning: ‘throttler_status’ may be used uninitialized in this function [-Wmaybe-uninitialized] 2268 | if (throttler_status & logging_label[throttler_idx].feature_mask) { Signed-off-by: Kevin Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/amdgpu: disable gfxoff for navy_flounderJiansong Chen1-0/+3
gfxoff is temporarily disabled for navy_flounder, since at present the feature has broken some basic amdgpu test. Signed-off-by: Jiansong Chen <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18net: gianfar: Add of_node_put() before goto statementSumera Priyadarsini1-1/+3
Every iteration of for_each_available_child_of_node() decrements reference count of the previous node, however when control is transferred from the middle of the loop, as in the case of a return or break or goto, there is no decrement thus ultimately resulting in a memory leak. Fix a potential memory leak in gianfar.c by inserting of_node_put() before the goto statement. Issue found with Coccinelle. Signed-off-by: Sumera Priyadarsini <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18cxgb4: Fix race between loopback and normal Tx pathGanji Aravind1-1/+5
Even after Tx queues are marked stopped, there exists a small window where the current packet in the normal Tx path is still being sent out and loopback selftest ends up corrupting the same Tx ring. So, ensure selftest takes the Tx lock to synchronize access the Tx ring. Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test") Signed-off-by: Ganji Aravind <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18cxgb4: Fix work request size calculation for loopback testGanji Aravind1-2/+2
Work request used for sending loopback packet needs to add the firmware work request only once. So, fix by using correct structure size. Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test") Signed-off-by: Ganji Aravind <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18sfc: don't free_irq()s if they were never requestedEdward Cree2-0/+6
If efx_nic_init_interrupt fails, or was never run (e.g. due to an earlier failure in ef100_net_open), freeing irqs in efx_nic_fini_interrupt is not needed and will cause error messages and stack traces. So instead, only do this if efx_nic_init_interrupt successfully completed, as indicated by the new efx->irqs_hooked flag. Fixes: 965b549f3c20 ("sfc_ef100: implement ndo_open/close and EVQ probing") Signed-off-by: Edward Cree <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18sfc: null out channel->rps_flow_id after freeing itEdward Cree1-0/+1
If an ef100_net_open() fails, ef100_net_stop() may be called without channel->rps_flow_id having been written; thus it may hold the address freed by a previous ef100_net_stop()'s call to efx_remove_filters(). This then causes a double-free when efx_remove_filters() is called again, leading to a panic. To prevent this, after freeing it, overwrite it with NULL. Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins") Signed-off-by: Edward Cree <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18sfc: take correct lock in ef100_reset()Edward Cree1-4/+4
When downing and upping the ef100 filter table, we need to take a write lock on efx->filter_sem, not just a read lock, because we may kfree() the table pointers. Without this, resets cause a WARN_ON from efx_rwsem_assert_write_locked(). Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins") Signed-off-by: Edward Cree <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18sfc: really check hash is valid before using itEdward Cree1-0/+2
Actually hook up the .rx_buf_hash_valid method in EF100's nic_type. Fixes: 068885434ccb ("sfc: check hash is valid before using it") Reported-by: Martin Habets <[email protected]> Signed-off-by: Edward Cree <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18macvlan: validate setting of multiple remote source MAC addressesAlvin Šipraga1-4/+17
Remote source MAC addresses can be set on a 'source mode' macvlan interface via the IFLA_MACVLAN_MACADDR_DATA attribute. This commit tightens the validation of these MAC addresses to match the validation already performed when setting or adding a single MAC address via the IFLA_MACVLAN_MACADDR attribute. iproute2 uses IFLA_MACVLAN_MACADDR_DATA for its 'macvlan macaddr set' command, and IFLA_MACVLAN_MACADDR for its 'macvlan macaddr add' command, which demonstrates the inconsistent behaviour that this commit addresses: # ip link add link eth0 name macvlan0 type macvlan mode source # ip link set link dev macvlan0 type macvlan macaddr add 01:00:00:00:00:00 RTNETLINK answers: Cannot assign requested address # ip link set link dev macvlan0 type macvlan macaddr set 01:00:00:00:00:00 # ip -d link show macvlan0 5: macvlan0@eth0: <BROADCAST,MULTICAST,DYNAMIC,UP,LOWER_UP> mtu 1500 ... link/ether 2e:ac:fd:2d:69:f8 brd ff:ff:ff:ff:ff:ff promiscuity 0 macvlan mode source remotes (1) 01:00:00:00:00:00 numtxqueues 1 ... With this change, the 'set' command will (rightly) fail in the same way as the 'add' command. Signed-off-by: Alvin Šipraga <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-18RDMA/usnic: Fix spelling mistake "transistion" -> "transition"Colin Ian King1-1/+1
There is a spelling mistake in a usnic_err error message. Fix it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-08-18RDMA/hns: Fix spelling mistake "epmty" -> "empty"Colin Ian King1-1/+1
There is a spelling mistake in a dev_dbg message. Fix it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-08-18EDAC/{i7core,sb,pnd2,skx}: Fix error event severityTony Luck4-7/+7
IA32_MCG_STATUS.RIPV indicates whether the return RIP value pushed onto the stack as part of machine check delivery is valid or not. Various drivers copied a code fragment that uses the RIPV bit to determine the severity of the error as either HW_EVENT_ERR_UNCORRECTED or HW_EVENT_ERR_FATAL, but this check is reversed (marking errors where RIPV is set as "FATAL"). Reverse the tests so that the error is marked fatal when RIPV is not set. Reported-by: Gabriele Paoloni <[email protected]> Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-08-17Revert "scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe"Quinn Tran1-4/+0
FCP T10-PI and NVMe features are independent of each other. This patch allows both features to co-exist. This reverts commit 5da05a26b8305a625bc9d537671b981795b46dab. Link: https://lore.kernel.org/r/[email protected] Fixes: 5da05a26b830 ("scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe") Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command"Saurav Kashyap1-8/+0
FCoE adapter initialization failed for ISP8021 with the following patch applied. In addition, reproduction of the issue the patch originally tried to address has been unsuccessful. This reverts commit 3cb182b3fa8b7a61f05c671525494697cba39c6a. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Fix null pointer access during disconnect from subsystemQuinn Tran1-0/+5
NVMEAsync command is being submitted to QLA while the same NVMe controller is in the middle of reset. The reset path has deleted the association and freed aen_op->fcp_req.private. Add a check for this private pointer before issuing the command. ... 6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e [exception RIP: qla_nvme_post_cmd+394] RIP: ffffffffc0d012ba RSP: ffffb656ca11fd98 RFLAGS: 00010206 RAX: ffff8fb039eda228 RBX: ffff8fb039eda200 RCX: 00000000000da161 RDX: ffffffffc0d4d0f0 RSI: ffffffffc0d26c9b RDI: ffff8fb039eda220 RBP: 0000000000000013 R8: ffff8fb47ff6aa80 R9: 0000000000000002 R10: 0000000000000000 R11: ffffb656ca11fdc8 R12: ffff8fb27d04a3b0 R13: ffff8fc46dd98a58 R14: 0000000000000000 R15: ffff8fc4540f0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc] 8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc] 9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core] 10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437 11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef 12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402 13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255 -- PID: 37824 TASK: ffff8fb033063d80 CPU: 20 COMMAND: "kworker/u97:451" 0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3 1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8 2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed 3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf 4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5 5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core] 6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc] 7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437 8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50 9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402 10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255 Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Check if FW supports MQ before enablingSaurav Kashyap1-0/+5
OS boot during Boot from SAN was stuck at dracut emergency shell after enabling NVMe driver parameter. For non-MQ support the driver was enabling MQ. Add a check to confirm if FW supports MQ. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hbaArun Easi2-1/+10
qla_nvme_register_hba() puts out a warning when there are not enough queue pairs available for FC-NVME. Just fail the NVME registration rather than a WARNING + call Trace. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Arun Easi <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set ↵Arun Easi1-0/+3
anytime ql2xextended_error_logging can now be set to 1 to get the default mask value, as opposed to at module load time only. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Arun Easi <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Reduce noisy debug messageQuinn Tran1-2/+2
Update debug level and message for ELS IOCB done. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Fix login timeoutQuinn Tran2-4/+16
Multipath errors were seen during failback due to login timeout. The remote device sent LOGO, the local host tore down the session and did relogin. The RSCN arrived indicates remote device is going through failover after which the relogin is in a 20s timeout phase. At this point the driver is stuck in the relogin process. Add a fix to delete the session as part of abort/flush the login. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Indicate correct supported speeds for Mezz cardQuinn Tran1-5/+12
Correct the supported speeds for 16G Mezz card. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Flush I/O on zone disableQuinn Tran1-1/+0
Perform implicit logout to flush I/O on zone disable. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Flush all sessions on zone disableQuinn Tran1-0/+12
On Zone Disable, certain switches would ignore all commands. This causes timeout for both switch scan command and abort of that command. On detection of this condition, all sessions will be shutdown. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout valuesEnzo Matsumiya1-7/+7
Improves readability of qla_mbx.c. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Reviewed-by: Roman Bolshakov <[email protected]> Signed-off-by: Enzo Matsumiya <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: scsi_debug: Fix scp is NULL errorsDouglas Gilbert1-0/+2
John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures that were mainly seen on aarch64 machines (e.g. RPi 4 with four A72 CPUs). The problem was tracked down to a missing critical section on a "short circuit" path. Namely, the time to process the current command so far has already exceeded the requested command duration (i.e. the number of nanoseconds in the ndelay parameter). The random=1 parameter setting was pivotal in finding this error. The failure scenario involved first taking that "short circuit" path (due to a very short command duration) and then taking the more likely hrtimer_start() path (due to a longer command duration). With random=1 each command's duration is taken from the uniformly distributed [0..ndelay) interval. The fio utility also helped by reliably generating the error scenario at about once per minute on a RPi 4 (64 bit OS). Link: https://lore.kernel.org/r/[email protected] Reported-by: John Garry <[email protected]> Reviewed-by: Lee Duncan <[email protected]> Signed-off-by: Douglas Gilbert <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: zfcp: Fix use-after-free in request timeout handlersSteffen Maier1-2/+2
Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use timer_setup()"), we intentionally only passed zfcp_adapter as context argument to zfcp_fsf_request_timeout_handler(). Since we only trigger adapter recovery, it was unnecessary to sync against races between timeout and (late) completion. Likewise, we only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action, it was unnecessary to sync against races between timeout and (late) completion. Meanwhile the timeout handlers get timer_list as context argument and do a timer-specific container-of to zfcp_fsf_req which can have been freed. Fix it by making sure that any request timeout handlers, that might just have started before del_timer(), are completed by using del_timer_sync() instead. This ensures the request free happens afterwards. Space time diagram of potential use-after-free: Basic idea is to have 2 or more pending requests whose timeouts run out at almost the same time. req 1 timeout ERP thread req 2 timeout ---------------- ---------------- --------------------------------------- zfcp_fsf_request_timeout_handler fsf_req = from_timer(fsf_req, t, timer) adapter = fsf_req->adapter zfcp_qdio_siosl(adapter) zfcp_erp_adapter_reopen(adapter,...) zfcp_erp_strategy ... zfcp_fsf_req_dismiss_all list_for_each_entry_safe zfcp_fsf_req_complete 1 del_timer 1 zfcp_fsf_req_free 1 zfcp_fsf_req_complete 2 zfcp_fsf_request_timeout_handler del_timer 2 fsf_req = from_timer(fsf_req, t, timer) zfcp_fsf_req_free 2 adapter = fsf_req->adapter ^^^^^^^ already freed Link: https://lore.kernel.org/r/[email protected] Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()") Cc: <[email protected]> #4.15+ Suggested-by: Julian Wiedmann <[email protected]> Reviewed-by: Julian Wiedmann <[email protected]> Signed-off-by: Steffen Maier <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs: No need to send Abort Task if the task in DB was clearedBean Huo1-7/+8
If the bit corresponding to a task in the Doorbell register has been cleared, no need to poll the status of the task on the device side and to send an Abort Task TM. Instead, let it directly goto cleanup. In addition, to keep original debug output, move the goto below the debug print. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Stanley Chu <[email protected]> Reviewed-by: Can Guo <[email protected]> Signed-off-by: Bean Huo <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs: Clean up completed request without interrupt notificationStanley Chu1-1/+2
If somehow no interrupt notification is raised for a completed request and its doorbell bit is cleared by host, UFS driver needs to cleanup its outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally in the following scenario: After ufshcd_abort() returns, this request will be requeued by SCSI layer with its outstanding bit set. Any future completed request will trigger ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At this time the "abnormal outstanding bit" will be detected and the "requeued request" will be chosen to execute request post-processing flow. This is wrong because this request is still "alive". Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Can Guo <[email protected]> Acked-by: Avri Altman <[email protected]> Signed-off-by: Stanley Chu <[email protected]> Signed-off-by: Bean Huo <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs: Improve interrupt handling for shared interruptsAdrian Hunter1-3/+3
For shared interrupts, the interrupt status might be zero, so check that first. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Avri Altman <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs: Fix interrupt error message for shared interruptsAdrian Hunter1-1/+1
The interrupt might be shared, in which case it is not an error for the interrupt handler to be called when the interrupt status is zero, so don't print the message unless there was enabled interrupt status. Link: https://lore.kernel.org/r/[email protected] Fixes: 9333d7757348 ("scsi: ufs: Fix irq return code") Reviewed-by: Avri Altman <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHLAdrian Hunter2-3/+22
Intel EHL UFS host controller advertises auto-hibernate capability but it does not work correctly. Add a quirk for that. [mkp: checkpatch fix] Link: https://lore.kernel.org/r/[email protected] Fixes: 8c09d7527697 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL") Acked-by: Stanley Chu <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs-mediatek: Fix incorrect time to wait link statusStanley Chu1-1/+1
Fix incorrect calculation of "ms" based waiting time in function ufs_mtk_setup_clocks(). Link: https://lore.kernel.org/r/[email protected] Fixes: 9006e3986f66 ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet") Reviewed-by: Avri Altman <[email protected]> Signed-off-by: Stanley Chu <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs: Fix possible infinite loop in ufshcd_holdStanley Chu1-1/+4
In ufshcd_suspend(), after clk-gating is suspended and link is set as Hibern8 state, ufshcd_hold() is still possibly invoked before ufshcd_suspend() returns. For example, MediaTek's suspend vops may issue UIC commands which would call ufshcd_hold() during the command issuing flow. Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled, then ufshcd_hold() may enter infinite loops because there is no clk-ungating work scheduled or pending. In this case, ufshcd_hold() shall just bypass, and keep the link as Hibern8 state. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Avri Altman <[email protected]> Co-developed-by: Andy Teng <[email protected]> Signed-off-by: Andy Teng <[email protected]> Signed-off-by: Stanley Chu <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: fcoe: Fix I/O path allocationMike Christie1-1/+1
ixgbe_fcoe_ddp_setup() can be called from the main I/O path and is called with a spin_lock held, so we have to use GFP_ATOMIC allocation instead of GFP_KERNEL. Link: https://lore.kernel.org/r/[email protected] cc: Hannes Reinecke <[email protected]> Reviewed-by: Lee Duncan <[email protected]> Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2020-08-17scsi: ufs: ti-j721e-ufs: Fix error return in ti_j721e_ufs_probe()Jing Xiangfeng1-0/+1
Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/[email protected] Fixes: 22617e216331 ("scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes") Reviewed-by: Avri Altman <[email protected]> Signed-off-by: Jing Xiangfeng <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>