aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-09-05scsi: megaraid_sas: Fix deadlock on firmware crashdumpJunxiao Bi2-13/+10
The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Cc: [email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: ufs: core: No need to update UPIU.header.flags and lun in advanced ↵Bean Huo1-3/+1
RPMB handler For advanced RPMB requests, its UPIU package should be fully initialized in its ufs-bsg-based application, except for task tag. in ufshcd.c, we just copy UPIU (with CDB) request as-is. Signed-off-by: Bean Huo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: ufs: core: Add advanced RPMB support where UFSHCI 4.0 does not support ↵Bean Huo2-3/+10
EHS length in UTRD According to UFSHCI 4.0 specification: 5.2 Host Controller Capabilities Registers 5.2.1 Offset 00h: CAP – Controller Capabilities: "EHS Length in UTRD Supported (EHSLUTRDS): Indicates whether the host controller supports EHS Length field in UTRD. 0 – Host controller takes EHS length from CMD UPIU, and SW driver use EHS Length field in CMD UPIU. 1 – HW controller takes EHS length from UTRD, and SW driver use EHS Length field in UTRD. NOTE Recommend Host controllers move to taking EHS length from UTRD, and in UFS-5, it will be mandatory." So, when UFSHCI 4.0 doesn't support EHS Length field in UTRD, we could use EHS Length field in CMD UPIU. Remove the limitation that advanced RPMB only works when EHS length is supported in UTRD. Fixes: 6ff265fc5ef6 ("scsi: ufs: core: bsg: Add advanced RPMB support in ufs_bsg") Co-developed-by: "jonghwi.rha" <[email protected]> Signed-off-by: "jonghwi.rha" <[email protected]> Signed-off-by: Bean Huo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: mpt3sas: Remove volatile qualifierRanjan Kumar3-6/+6
Remove reduntant volatile qualifier. Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: mpt3sas: Perform additional retries if doorbell read returns 0Ranjan Kumar2-13/+34
The driver retries certain register reads 3 times if the returned value is 0. This was done because the controller could return 0 for certain registers if other registers were being accessed concurrently by the BMC. In certain systems with increased BMC interactions, the register values returned can be 0 for longer than 3 retries. Change the retry count from 3 to 30 for the affected registers to prevent problems with out-of-band management. Fixes: b899202901a8 ("scsi: mpt3sas: Add separate function for aero doorbell reads") Cc: [email protected] Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: libsas: Simplify sas_queue_reset() and remove unused codeWenchao Hao1-38/+3
sas_queue_reset() is always called with param "wait" set to 0, so remove it from this function's parameter list. Also remove unused function sas_wait_eh(). Signed-off-by: Wenchao Hao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jason Yan <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: ufs: Fix the build for the old ARM OABIBart Van Assche1-1/+1
All structs and unions are word aligned when using the OABI. Mark the union in struct utp_upiu_header as packed to prevent that the compiler inserts padding bytes. Cc: Arnd Bergmann <[email protected]> Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Fixes: 617bfaa8dd50 ("scsi: ufs: Simplify response header parsing") Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Arnd Bergmann <[email protected]> Reviewed-by: Bean Huo <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: qla2xxx: Fix unused variable warning in qla2xxx_process_purls_pkt()Nathan Chancellor1-2/+1
When CONFIG_NVME_FC is not set, fcport is unused: drivers/scsi/qla2xxx/qla_nvme.c: In function 'qla2xxx_process_purls_pkt': drivers/scsi/qla2xxx/qla_nvme.c:1183:20: warning: unused variable 'fcport' [-Wunused-variable] 1183 | fc_port_t *fcport = uctx->fcport; | ^~~~~~ While this preprocessor usage could be converted to a normal if statement to allow the compiler to always see fcport as used, it is equally easy to just eliminate the fcport variable and use uctx->fcport directly. Fixes: 27177862de96 ("scsi: qla2xxx: Fix nvme_fc_rcv_ls_req() undefined error") Reported-by: Stephen Rothwell <[email protected]> Closes: https://lore.kernel.org/linux-next/[email protected]/ Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Nathan Chancellor <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Nilesh Javali <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: fnic: Remove unused functions fnic_scsi_host_start/end_tag()Yang Li1-33/+0
The functions fnic_scsi_host_start_tag() and fnic_scsi_host_end_tag() are not used anywhere, so remove them. silence the warnings: drivers/scsi/fnic/fnic_scsi.c:2175:1: warning: unused function 'fnic_scsi_host_start_tag' drivers/scsi/fnic/fnic_scsi.c:2196:1: warning: unused function 'fnic_scsi_host_end_tag' Signed-off-by: Yang Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Karan Tilak Kumar <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30scsi: qla2xxx: Fix spelling mistake "tranport" -> "transport"Colin Ian King1-1/+1
There is a spelling mistake in a ql_dbg message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-30Merge branch '6.5/scsi-fixes' into 6.6/scsi-stagingMartin K. Petersen33-235/+282
Pull in the fixes tree for a commit that missed 6.5. Also resolve a trivial merge conflict in fnic. * 6.5/scsi-fixes: (36 commits) scsi: storvsc: Handle additional SRB status values scsi: snic: Fix double free in snic_tgt_create() scsi: core: raid_class: Remove raid_component_add() scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW major version > 5 scsi: ufs: mcq: Fix the search/wrap around logic scsi: qedf: Fix firmware halt over suspend and resume scsi: qedi: Fix firmware halt over suspend and resume scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock scsi: lpfc: Remove reftag check in DIF paths scsi: ufs: renesas: Fix private allocation scsi: snic: Fix possible memory leak if device_add() fails scsi: core: Fix possible memory leak if device_add() fails scsi: core: Fix legacy /proc parsing buffer overflow scsi: 53c700: Check that command slot is not NULL scsi: fnic: Replace return codes in fnic_clean_pending_aborts() scsi: storvsc: Fix handling of virtual Fibre Channel timeouts scsi: pm80xx: Fix error return code in pm8001_pci_probe() scsi: zfcp: Defer fc_rport blocking until after ADISC response scsi: storvsc: Limit max_sectors for virtual Fibre Channel devices scsi: sg: Fix checking return value of blk_get_queue() ... Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-25scsi: fnic: Replace sgreset tag with max_tag_idKaran Tilak Kumar2-12/+11
sgreset is issued with a SCSI command pointer. The device reset code assumes that it was issued on a hardware queue, and calls block multiqueue layer. However, the assumption is broken, and there is no hardware queue associated with the sgreset, and this leads to a crash due to a null pointer exception. Fix the code to use the max_tag_id as a tag which does not overlap with the other tags issued by mid layer. Tested by running FC traffic for a few minutes, and by issuing sgreset on the device in parallel. Without the fix, the crash is observed right away. With this fix, no crash is observed. Reviewed-by: Sesidhar Baddela <[email protected]> Tested-by: Karan Tilak Kumar <[email protected]> Signed-off-by: Karan Tilak Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-25scsi: storvsc: Handle additional SRB status valuesMichael Kelley1-0/+7
Testing of virtual Fibre Channel devices under Hyper-V has shown additional SRB status values being returned for various error cases. Because these SRB status values are not recognized by storvsc, the I/O operations are not flagged as an error. Requests are treated as if they completed normally but with zero data transferred, which can cause a flood of retries. Add definitions for these SRB status values and handle them like other error statuses from the Hyper-V host. Signed-off-by: Michael Kelley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-25Merge patch series "qla2xxx driver misc features"Martin K. Petersen17-96/+1088
Nilesh Javali <[email protected]> says: Martin, Please apply the qla2xxx driver miscellaneous features and bug fixes to the scsi tree at your earliest convenience. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-25scsi: qla2xxx: Remove unused variables in qla24xx_build_scsi_type_6_iocbs()Nilesh Javali1-5/+0
Sparse warning reported, drivers/scsi/qla2xxx/qla_iocb.c: In function 'qla24xx_build_scsi_type_6_iocbs': >> drivers/scsi/qla2xxx/qla_iocb.c:594:29: warning: variable 'ha' set but not used [-Wunused-but-set-variable] 594 | struct qla_hw_data *ha; | ^~ Remove unused variables 'vha' and 'ha'. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-25scsi: qla2xxx: Fix nvme_fc_rcv_ls_req() undefined errorNilesh Javali1-1/+3
The kernel robot reported below build error, >> ERROR: modpost: "nvme_fc_rcv_ls_req" [drivers/scsi/qla2xxx/qla2xxx.ko] undefined! Use CONFIG_NVME_FC enabled check to fix the build error. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24Merge patch series "smartpqi updates"Martin K. Petersen2-55/+215
Don Brace <[email protected]> says: cat smartpqi_6.6_cover_letter These patches are based on Martin Petersen's 6.6/scsi-queue tree https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git 6.6/scsi-queue The biggest functional change to smartpqi is the addition of an abort handler. Some customers were complaining about I/O stalls to all devices when only one device is reset. Adding an abort handler helps to prevent I/O stalls to all devices. All of the reset of the patches are small changes to logging messages, MACRO and variable name changes, and one minor change for LUN assignments. This set of changes consists of: * smartpqi-add-abort-handler When a device reset occurs, the SML pauses I/O to all devices presented by a controller instance causing some performance issues. To only affect device with a problematic request, we added an abort handler. The abort handler is implemented by using a device reset, but I/O to the other devices is no longer affected. * smartpqi-refactor-rename-MACRO-to-clarify-purpose The MACRO SOP_RC_INCORRECT_LOGICAL_UNIT was used to check for a condition where a TMF was sent an incorrect LUN. We renamed this MACRO to SOP_TMF_INCORRECT_LOGICAL_UNIT for clarity. * smartpqi-refactor-rename-pciinfo-to-pci_info Change the pciinfo variable to pci_info to make more readable code. No functional changes. * smartpqi-simplify-lun_number-assignment We simplified the conditional expression used to populate LUN numbers for requests. * smartpqi-enhance-shutdown-notification Clarify controller cache flush errors. We added in more precise information to the cache flush informational message. No functional changes. * smartpqi-enhance-controller-offline-notification The driver can offline a controller for multiple reasons. We added a description of why these rare offline actions are taken. And a function to provide the specific details of the shutdown. * smartpqi-enhance-error-messages We added host:bus:target:lun to messages emitted in our reset/abort handlers. No functional changes. * smartpqi-change-driver-version-to-2.1.24-046 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Change driver version to 2.1.24-046Don Brace1-3/+3
Reviewed-by: Gerry Morong <[email protected]> Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Scott Teel <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Reviewed-by: Kevin Barnett <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Enhance error messagesMahesh Rajashekhara1-8/+9
Add more detail to some TMF messages. Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Reviewed-by: Kevin Barnett <[email protected]> Signed-off-by: Mahesh Rajashekhara <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Enhance controller offline notificationDavid Strahan1-1/+49
Add a description for the reason the controller has been taken off-line. Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Scott Teel <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Signed-off-by: David Strahan <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Enhance shutdown notificationDavid Strahan1-1/+1
Provide more detailed information about cache flush errors during shutdown. Reviewed-by: Mahesh Rajashekhara <[email protected]> Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Scott Teel <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Reviewed-by: Kevin Barnett <[email protected]> Signed-off-by: David Strahan <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Simplify lun_number assignmentDavid Strahan1-4/+2
Simplify lun_number assignment. lun_number assignment is only required for non-AIO requests. Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Reviewed-by: Kevin Barnett <[email protected]> Signed-off-by: David Strahan <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Rename pciinfo to pci_infoKevin Barnett1-6/+6
Make pci device structure names consistent and readable. Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Signed-off-by: Kevin Barnett <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Rename MACRO to clarify purposeKevin Barnett2-2/+2
Rename SOP_RC_INCORRECT_LOGICAL_UNIT to SOP_TMF_INCORRECT_LOGICAL_UNIT to clarify the intended purpose. Reviewed-by: Mahesh Rajashekhara <[email protected]> Reviewed-by: Scott Teel <[email protected]> Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Signed-off-by: Kevin Barnett <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: smartpqi: Add abort handlerKevin Barnett2-36/+149
Implement aborts as resets. Avoid I/O stalls across all devices attached to a controller when device I/O requests time out. Reviewed-by: Mahesh Rajashekhara <[email protected]> Reviewed-by: Scott Teel <[email protected]> Reviewed-by: Scott Benesh <[email protected]> Reviewed-by: Mike McGowen <[email protected]> Signed-off-by: Kevin Barnett <[email protected]> Signed-off-by: Don Brace <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: snic: Fix double free in snic_tgt_create()Zhu Wang1-2/+1
Commit 41320b18a0e0 ("scsi: snic: Fix possible memory leak if device_add() fails") fixed the memory leak caused by dev_set_name() when device_add() failed. However, it did not consider that 'tgt' has already been released when put_device(&tgt->dev) is called. Remove kfree(tgt) in the error path to avoid double free of 'tgt' and move put_device(&tgt->dev) after the removed kfree(tgt) to avoid a use-after-free. Fixes: 41320b18a0e0 ("scsi: snic: Fix possible memory leak if device_add() fails") Signed-off-by: Zhu Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: sd: Remove the number of forward declarationsBart Van Assche1-39/+27
Move the sd_pm_ops and sd_template data structures to just above init_sd() such that the number of forward function declarations can be reduced. Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: st: Add third party poweron reset handlingJohn Meneghini1-0/+2
Many tape devices will automatically rewind following a poweron/reset. This can result in data loss as other operations in the driver can write to the tape when the position is unknown. E.g. MTEOM can write a filemark at the beginning of the tape. This patch adds code to detect poweron/reset unit attentions and prevents the driver from writing to the tape when the position could be unknown. Customer reported problem description: We have experienced an issue with the SCSI tape driver (st) which has led to data loss for us on two separate occasions in production, as well as in a third case in which we were able to reproduce the failure in our test environment. The tape device involved is an Amazon Tape Gateway, a virtual tape library (VTL) appliance which presents as a series of iSCSI targets (multiple tape drives and a changer) and is backed by storage in Amazon S3. The problem is a general one and not limited to any particular SCSI transport or tape device, though the nature of both iSCSI and the VTL make data loss somewhat more likely with this combination than with a physical tape drive. The observed behavior occurs when an error causes the VTL tape gateway process (on the appliance) to crash and restart. This interrupts the iSCSI TCP connections and, when it occurs during a write, causes the write to fail with EIO. However, we then find that the virtual tape in question is now completely blank. We raised this issue with AWS support, thinking this must be a bug in the VTL appliance, but that turns out not to be the case. Per AWS support, when the gateway crashes in this manner, its notion of the current tape position is reset to the beginning of the tape. It also sets a unit attention condition, such that the next request results in a CHECK CONDITION status with sense key UNIT ATTENTION and asc/ascq indicating a device reset. According to their logs the next command being sent is WRITE FILEMARK, which results in writing an FM at the beginning of the tape, effectively discarding its contents. In fact, once the write fails with EIO, our software attempts to recover by rewinding and repositioning the tape, then resuming operation. If this fails, it attempts to rewind and reposition again, write a marker at the end of the tape, and then unmount. It does not under any circumstances write either data or filemarks without having successfully positioned the tape to a known point. What actually happens is that, since the last operation was a write, the kernel executes an implied MTWEOF operation (which translate to a Write Filemarks command) before the rewind that was actually requested. This seems not entirely unreasonable, provided the tape position is known. However, once this request fails (due to the unit attention condition), our next rewind attempt also triggers an implied MTWEOF, which does _not_ fail (the unit attention condition persists only until the initiator has been notified); this is the command that unexpectedly erases the tape. Our analysis is that the st driver is in fact completely ignoring the UNIT ATTENTION and associated reset notification from the device. This is not a condition that can be detected in the transport or mid-layer, as it occurs entirely within the target and is reported only via the UNIT ATTENTION sense key. The upper driver (i.e. st) needs to detect this indication and reset its internal model of the device to an unknown state. Suggested-by: Jeffrey Hutzelman <[email protected]> Signed-off-by: John Meneghini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Kai Mäkisara <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: core: Report error list information in debugfsBart Van Assche1-3/+23
Provide information in debugfs about SCSI error handling to make it easier to debug the SCSI error handler. Additionally, report the maximum number of retries in debugfs (.allowed). Reviewed-by: John Garry <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Mike Christie <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: core: Improve type safety of scsi_rescan_device()Bart Van Assche11-13/+12
Most callers of scsi_rescan_device() have the scsi_device pointer readily available. Pass a struct scsi_device pointer to scsi_rescan_device() instead of a struct device pointer. This change prevents that a pointer to another struct device would be passed accidentally to scsi_rescan_device(). Remove the scsi_rescan_device() declaration from the scsi_priv.h header file since it duplicates the declaration in <scsi/scsi_host.h>. Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Mike Christie <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: qedi: Remove unused declarationsYue Haibing1-2/+0
These declarations were never implemented, remove them. Signed-off-by: Yue Haibing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: qedf: Remove unused declarationYue Haibing1-1/+0
This declaration was never implemented, remove it. Signed-off-by: Yue Haibing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: xen-scsifront: shost_priv() can never return NULLJuergen Gross1-3/+3
There is no need to check whether shost_priv() returns a non-NULL value, as the pointer returned is just an offset to the passed in parameter. While at it replace an open coded shost_priv() instance. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24scsi: core: raid_class: Remove raid_component_add()Zhu Wang2-52/+0
The raid_component_add() function was added to the kernel tree via patch "[SCSI] embryonic RAID class" (2005). Remove this function since it never has had any callers in the Linux kernel. And also raid_component_release() is only used in raid_component_add(), so it is also removed. Signed-off-by: Zhu Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Fixes: 04b5b5cb0136 ("scsi: core: Fix possible memory leak if device_add() fails") Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24Merge patch series "libsas: Some tidy-up"Martin K. Petersen25-131/+59
John Garry <[email protected]> says: This series tidies-up libsas a bit, including: - delete structure(s) with only one member - delete structure members which are only ever set - delete structure members which are never set and code which relies on that member being set This conflicts with the following series: https://lore.kernel.org/linux-scsi/[email protected]/ Any conflict should be trivial to resolve. Based on mkp-scsi staging at a18e81d17a7e ("scsi: ufs: ufs-pci: Add support for QEMU") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-24Merge patch series "Returning FIS on success for CDL"Martin K. Petersen6-18/+23
Igor Pylypiv <[email protected]> says: This patch series plumbs libata's request for a result taskfile (ATA_QCFLAG_RESULT_TF) through libsas to pm80xx LLDD. Other libsas LLDDs can start using the newly added return_fis_on_success as well, if needed. For Command Duration Limits policy 0xD (command completes without an error) libata needs FIS in order to detect the ATA_SENSE bit and read the Sense Data for Successful NCQ Commands log (0Fh). pm80xx HBAs do not return FIS on success by default, hence, the driver is updated to set the RETFIS bit (Return FIS on good completion) when requested by libsas. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete sas_ata_task.retry_countJohn Garry2-3/+1
Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ata_task.retry_count is never set, so delete it and the reference in asd_build_ata_ascb(). Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete sas_ata_task.stp_affil_polJohn Garry2-6/+1
Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ata_task.stp_affil_pol is never set, so delete it and the reference in asd_build_ata_ascb(). Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete sas_ata_task.set_affil_polJohn Garry2-5/+3
Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ata_task.set_affil_pol is never set, so delete it and the reference in asd_build_ata_ascb(). Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete sas_ssp_task.task_prioJohn Garry9-12/+5
Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ssp_task.task_prio is never set, so delete it and any references which depend on it being set (all of them). Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete sas_ssp_task.enable_first_burstJohn Garry6-19/+4
Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ssp_task.enable_first_burst is never set, so delete it and any references. Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete sas_ssp_task.retry_countJohn Garry3-4/+0
Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ssp_task.retry_count is only ever set, so delete it. The aic94xx driver also had its own retry_count definition in struct scb sub-structs, which may have caused a mix-up. Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jason Yan <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete struct scsi_coreJohn Garry17-55/+50
Since commit 79855d178557 ("libsas: remove task_collector mode"), struct scsi_core only contains a reference to the shost. struct scsi_core is only used in sas_ha_struct.core, so delete scsi_core and replace with a reference to the shost there. Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jason Yan <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete enum sas_phy_typeJohn Garry6-11/+0
enum sas_phy_type is used for asd_sas_phy.type, which is only ever set, so delete this member and the enum. Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete enum sas_classJohn Garry8-15/+0
enum sas_class prob would have been useful if function sas_show_class() was ever implemented, which it wasn't. enum sas_class is used as asd_sas_port.class and asd_sas_phy.class, which are only ever set, so delete these members and the enum. Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: libsas: Delete sas_ha_struct.lldd_moduleJohn Garry7-7/+0
Since libsas was introduced in commit 2908d778ab3e ("[SCSI] aic94xx: new driver"), sas_ha_struct.lldd_module has only ever been set, so remove it. Struct scsi_host_template already has a reference to the LLD driver module as to stop the driver being removed unexpectedly. Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Jason Yan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: qla2xxx: Update version to 10.02.09.100-kNilesh Javali1-3/+3
Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21Revert "scsi: qla2xxx: Fix buffer overrun"Nilesh Javali1-1/+1
Revert due to Get PLOGI Template failed. This reverts commit b68710a8094fdffe8dd4f7a82c82649f479bb453. Cc: [email protected] Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit()Nilesh Javali1-1/+1
Fix indentation for warning reported by smatch: drivers/scsi/qla2xxx/qla_init.c:4199 qla_init_iocb_limit() warn: inconsistent indenting Fixes: efa74a62aaa2 ("scsi: qla2xxx: Adjust IOCB resource on qpair create") Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-21scsi: qla2xxx: Remove unsupported ql2xenabledif optionManish Rangankar3-5/+8
User accidently passed module parameter ql2xenabledif=1 which is unsupported. However, driver still initialized which lead to guard tag errors during device discovery. Remove unsupported ql2xenabledif=1 option and validate the user input. Cc: [email protected] Signed-off-by: Manish Rangankar <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>