aboutsummaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)AuthorFilesLines
2023-08-07scsi: mpi3mr: Update driver version to 8.5.0.0.0Ranjan Kumar1-2/+2
Update driver version to 8.5.0.0.0 Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: mpi3mr: Enhance handling of devices removed after controller resetRanjan Kumar1-0/+13
Mark all of the devices that are exposed to the OS prior to a controller reset and not detected by the controller after the reset as removed devices and the I/Os to those devices are unblocked (and returned with DID_NO_CONNECT) prior to removing the devices one after the other. Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: mpi3mr: WRITE SAME implementationRanjan Kumar2-24/+105
Enhance driver to divert the WRITE SAME commands that are issued with UNMAP=1 and NDOB=1 and with the transfer length greater than the max WRITE SAME length specified by the firmware for the particular drive to the controller firmware. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: mpi3mr: Add support for more than 1MB I/ORanjan Kumar3-11/+50
Enhance the driver to get the maximum data length per I/O request from IOC Facts data and report that to the upper layers. If the IOC facts data is not reported then a default I/O size of 1MB is reported to the OS. Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: mpi3mr: Update MPI Headers to version 3.00.28Ranjan Kumar3-2/+3
Updated MPI Headers to version 3.00.28. Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: mpi3mr: Invoke soft reset upon TSU or event ack time outRanjan Kumar1-4/+4
When a timestamp update or an event acknowledgment command times out, the driver invokes the soft reset handler to recover the controller while holding a mutex lock. The soft reset handler also tries to acquire the same mutex to send initialization commands to the controller which leads to a deadlock scenario. To resolve the issue the driver will check thestatus and if this indicates the controller is operational, the driver will issue a diagnostic fault reset and exit out of the command processing function. If the controller is already faulted or asynchronously reset, then the driver will just exit the command processing function. Signed-off-by: Ranjan Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: qedf: Fix firmware halt over suspend and resumeNilesh Javali1-0/+18
While performing certain power-off sequences, PCI drivers are called to suspend and resume their underlying devices through PCI PM (power management) interface. However the hardware does not support PCI PM suspend/resume operations so system wide suspend/resume leads to bad MFW (management firmware) state which causes various follow-up errors in driver when communicating with the device/firmware. To fix this driver implements PCI PM suspend handler to indicate unsupported operation to the PCI subsystem explicitly, thus avoiding system to go into suspended/standby mode. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: qedi: Fix firmware halt over suspend and resumeNilesh Javali1-0/+18
While performing certain power-off sequences, PCI drivers are called to suspend and resume their underlying devices through PCI PM (power management) interface. However the hardware does not support PCI PM suspend/resume operations so system wide suspend/resume leads to bad MFW (management firmware) state which causes various follow-up errors in driver when communicating with the device/firmware. To fix this driver implements PCI PM suspend handler to indicate unsupported operation to the PCI subsystem explicitly, thus avoiding system to go into suspended/standby mode. Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lockChengfeng Ye1-2/+3
As &qedi_percpu->p_work_lock is acquired by hard IRQ qedi_msix_handler(), other acquisitions of the same lock under process context should disable IRQ, otherwise deadlock could happen if the IRQ preempts the execution while the lock is held in process context on the same CPU. qedi_cpu_offline() is one such function which acquires the lock in process context. [Deadlock Scenario] qedi_cpu_offline() ->spin_lock(&p->p_work_lock) <irq> ->qedi_msix_handler() ->edi_process_completions() ->spin_lock_irqsave(&p->p_work_lock, flags); (deadlock here) This flaw was found by an experimental static analysis tool I am developing for IRQ-related deadlocks. The tentative patch fix the potential deadlock by spin_lock_irqsave() under process context. Signed-off-by: Chengfeng Ye <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Manish Rangankar <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: lpfc: Remove reftag check in DIF pathsJustin Tee1-17/+3
When preparing protection DIF I/O for DMA, the driver obtains reference tags from scsi_prot_ref_tag(). Previously, there was a wrong assumption that an all 0xffffffff value meant error and thus the driver failed the I/O. This patch removes the evaluation code and accepts whatever the upper layer returns. Signed-off-by: Justin Tee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: snic: Fix possible memory leak if device_add() failsZhu Wang1-0/+1
If device_add() returns error, the name allocated by dev_set_name() needs be freed. As the comment of device_add() says, put_device() should be used to give up the reference in the error path. So fix this by calling put_device(), then the name can be freed in kobject_cleanp(). Fixes: c8806b6c9e82 ("snic: driver for Cisco SCSI HBA") Signed-off-by: Zhu Wang <[email protected]> Acked-by: Narsimhulu Musini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: core: Fix possible memory leak if device_add() failsZhu Wang1-0/+1
If device_add() returns error, the name allocated by dev_set_name() needs be freed. As the comment of device_add() says, put_device() should be used to decrease the reference count in the error path. So fix this by calling put_device(), then the name can be freed in kobject_cleanp(). Fixes: ee959b00c335 ("SCSI: convert struct class_device to struct device") Signed-off-by: Zhu Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-07scsi: lpfc: Modify when a node should be put in device recovery mode during RSCNJustin Tee1-1/+1
Only nodes whose state is at least past a PLOGI issue and strictly less than a PRLI issue should be put into device recovery mode upon RSCN receipt. Previously, the allowance of LOGO and PRLI completion states did not make sense because those nodes should be allowed to flow through and marked as NPort dissappeared as is normally done. A follow up RSCN GID_FT would recover those nodes in such cases. Signed-off-by: Justin Tee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-08-05Merge tag 'ata-6.5-rc5' of ↵Linus Torvalds1-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fix from Damien Le Moal: - Prevent the scsi disk driver from issuing a START STOP UNIT command for ATA devices during system resume as this causes various issues reported by multiple users. * tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata,scsi: do not issue START STOP UNIT on resume
2023-08-02ata,scsi: remove ata_sas_port_init()Niklas Cassel1-3/+0
ata_sas_port_init() now only contains a single initialization. Move this single initialization to ata_sas_port_alloc(), since: 1) ata_sas_port_alloc() already initializes some of the struct members. 2) ata_sas_port_alloc() is only used by libsas. Suggested-by: John Garry <[email protected]> Signed-off-by: Niklas Cassel <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-08-02ata,scsi: cleanup __ata_port_probe()Hannes Reinecke2-2/+2
Rename __ata_port_probe() to ata_port_probe() and drop the wrapper ata_sas_async_probe(). Signed-off-by: Hannes Reinecke <[email protected]> Signed-off-by: Niklas Cassel <[email protected]> Reviewed-by: Jason Yan <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-08-02ata,scsi: remove ata_sas_port_destroy()Hannes Reinecke2-2/+2
Is now a wrapper around kfree(), so call it directly. Signed-off-by: Hannes Reinecke <[email protected]> Signed-off-by: Niklas Cassel <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Jason Yan <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-08-02ata,scsi: remove ata_sas_port_{start,stop} callbacksHannes Reinecke1-2/+0
Callbacks are empty now, so remove them. Also, remove the call to ap->ops->port_start() in ata_sas_port_init(), as this would otherwise cause a NULL pointer dereference, now when the callback is gone. Signed-off-by: Hannes Reinecke <[email protected]> [niklas: remove the call to ap->ops->port_start() in ata_sas_port_init()] Signed-off-by: Niklas Cassel <[email protected]> Reviewed-by: Jason Yan <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2023-08-02ata,scsi: do not issue START STOP UNIT on resumeDamien Le Moal1-3/+6
During system resume, ata_port_pm_resume() triggers ata EH to 1) Resume the controller 2) Reset and rescan the ports 3) Revalidate devices This EH execution is started asynchronously from ata_port_pm_resume(), which means that when sd_resume() is executed, none or only part of the above processing may have been executed. However, sd_resume() issues a START STOP UNIT to wake up the drive from sleep mode. This command is translated to ATA with ata_scsi_start_stop_xlat() and issued to the device. However, depending on the state of execution of the EH process and revalidation triggerred by ata_port_pm_resume(), two things may happen: 1) The START STOP UNIT fails if it is received before the controller has been reenabled at the beginning of the EH execution. This is visible with error messages like: ata10.00: device reported invalid CHS sector 0 sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK sd 9:0:0:0: [sdc] Sense Key : Illegal Request [current] sd 9:0:0:0: [sdc] Add. Sense: Unaligned write command sd 9:0:0:0: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x90 returns -5 sd 9:0:0:0: PM: failed to resume async: error -5 2) The START STOP UNIT command is received while the EH process is on-going, which mean that it is stopped and must wait for its completion, at which point the command is rather useless as the drive is already fully spun up already. This case results also in a significant delay in sd_resume() which is observable by users as the entire system resume completion is delayed. Given that ATA devices will be woken up by libata activity on resume, sd_resume() has no need to issue a START STOP UNIT command, which solves the above mentioned problems. Do not issue this command by introducing the new scsi_device flag no_start_on_resume and setting this flag to 1 in ata_scsi_dev_config(). sd_resume() is modified to issue a START STOP UNIT command only if this flag is not set. Reported-by: Paul Ausbeck <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215880 Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management") Signed-off-by: Damien Le Moal <[email protected]> Tested-by: Tanner Watkins <[email protected]> Tested-by: Paul Ausbeck <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Bart Van Assche <[email protected]>
2023-07-31scsi: core: Fix legacy /proc parsing buffer overflowTony Battersby1-13/+17
(lightly modified commit message mostly by Linus Torvalds) The parsing code for /proc/scsi/scsi is disgusting and broken. We should have just used 'sscanf()' or something simple like that, but the logic may actually predate our kernel sscanf library routine for all I know. It certainly predates both git and BK histories. And we can't change it to be something sane like that now, because the string matching at the start is done case-insensitively, and the separator parsing between numbers isn't done at all, so *any* separator will work, including a possible terminating NUL character. This interface is root-only, and entirely for legacy use, so there is absolutely no point in trying to tighten up the parsing. Because any separator has traditionally worked, it's entirely possible that people have used random characters rather than the suggested space. So don't bother to try to pretty it up, and let's just make a minimal patch that can be back-ported and we can forget about this whole sorry thing for another two decades. Just make it at least not read past the end of the supplied data. Link: https://lore.kernel.org/linux-scsi/[email protected]/ Cc: Linus Torvalds <[email protected]> Cc: Martin K Petersen <[email protected]> Cc: James Bottomley <[email protected]> Cc: Willy Tarreau <[email protected]> Cc: [email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Tony Battersby <[email protected]> Signed-off-by: Martin K Petersen <[email protected]>
2023-07-31scsi: qedf: Do not touch __user pointer in qedf_dbg_fp_int_cmd_read() directlyOleksandr Natalenko2-6/+17
The qedf_dbg_fp_int_cmd_read() function invokes sprintf() directly on a __user pointer, which may crash the kernel. Avoid doing that by vmalloc()'ating a buffer for scnprintf() and then calling simple_read_from_buffer() which does a proper copy_to_user() call. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/linux-scsi/[email protected]/ Cc: Saurav Kashyap <[email protected]> Cc: Rob Evers <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: David Laight <[email protected]> Cc: Jozef Bacik <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Laurence Oberman <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Tested-by: Laurence Oberman <[email protected]> Acked-by: Saurav Kashyap <[email protected]> Signed-off-by: Oleksandr Natalenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-31scsi: qedf: Do not touch __user pointer in qedf_dbg_debug_cmd_read() directlyOleksandr Natalenko1-4/+3
The qedf_dbg_debug_cmd_read() function invokes sprintf() directly on a __user pointer, which may crash the kernel. Avoid doing that by using a small on-stack buffer for scnprintf() and then calling simple_read_from_buffer() which does a proper copy_to_user() call. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/linux-scsi/[email protected]/ Cc: Saurav Kashyap <[email protected]> Cc: Rob Evers <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: David Laight <[email protected]> Cc: Jozef Bacik <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Laurence Oberman <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Tested-by: Laurence Oberman <[email protected]> Acked-by: Saurav Kashyap <[email protected]> Signed-off-by: Oleksandr Natalenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-31scsi: qedf: Do not touch __user pointer in ↵Oleksandr Natalenko1-4/+3
qedf_dbg_stop_io_on_error_cmd_read() directly The qedf_dbg_stop_io_on_error_cmd_read() function invokes sprintf() directly on a __user pointer, which may crash the kernel. Avoid doing that by using a small on-stack buffer for scnprintf() and then calling simple_read_from_buffer() which does a proper copy_to_user() call. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/linux-scsi/[email protected]/ Cc: Saurav Kashyap <[email protected]> Cc: Rob Evers <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: David Laight <[email protected]> Cc: Jozef Bacik <[email protected]> Cc: Laurence Oberman <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Laurence Oberman <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Tested-by: Laurence Oberman <[email protected]> Acked-by: Saurav Kashyap <[email protected]> Signed-off-by: Oleksandr Natalenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-31scsi: 53c700: Check that command slot is not NULLAlexandra Diupina1-1/+1
Add a check for the command slot value to avoid dereferencing a NULL pointer. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Co-developed-by: Vladimir Telezhnikov <[email protected]> Signed-off-by: Vladimir Telezhnikov <[email protected]> Signed-off-by: Alexandra Diupina <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-31scsi: fnic: Replace return codes in fnic_clean_pending_aborts()Karan Tilak Kumar2-3/+5
fnic_clean_pending_aborts() was returning a non-zero value irrespective of failure or success. This caused the caller of this function to assume that the device reset had failed, even though it would succeed in most cases. As a consequence, a successful device reset would escalate to host reset. Reviewed-by: Sesidhar Baddela <[email protected]> Tested-by: Karan Tilak Kumar <[email protected]> Signed-off-by: Karan Tilak Kumar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-31scsi: storvsc: Fix handling of virtual Fibre Channel timeoutsMichael Kelley1-4/+0
Hyper-V provides the ability to connect Fibre Channel LUNs to the host system and present them in a guest VM as a SCSI device. I/O to the vFC device is handled by the storvsc driver. The storvsc driver includes a partial integration with the FC transport implemented in the generic portion of the Linux SCSI subsystem so that FC attributes can be displayed in /sys. However, the partial integration means that some aspects of vFC don't work properly. Unfortunately, a full and correct integration isn't practical because of limitations in what Hyper-V provides to the guest. In particular, in the context of Hyper-V storvsc, the FC transport timeout function fc_eh_timed_out() causes a kernel panic because it can't find the rport and dereferences a NULL pointer. The original patch that added the call from storvsc_eh_timed_out() to fc_eh_timed_out() is faulty in this regard. In many cases a timeout is due to a transient condition, so the situation can be improved by just continuing to wait like with other I/O requests issued by storvsc, and avoiding the guaranteed panic. For a permanent failure, continuing to wait may result in a hung thread instead of a panic, which again may be better. So fix the panic by removing the storvsc call to fc_eh_timed_out(). This allows storvsc to keep waiting for a response. The change has been tested by users who experienced a panic in fc_eh_timed_out() due to transient timeouts, and it solves their problem. In the future we may want to deprecate the vFC functionality in storvsc since it can't be fully fixed. But it has current users for whom it is working well enough, so it should probably stay for a while longer. Fixes: 3930d7309807 ("scsi: storvsc: use default I/O timeout handler for FC devices") Cc: [email protected] Signed-off-by: Michael Kelley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-31scsi: hisi_sas: Fix warning detected by sparseSunil V L1-1/+1
LKP reports below warning when building for RISC-V with randconfig configuration. drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:4567:35: sparse: sparse: incorrect type in argument 4 (different base types) @@ expected restricted __le32 [usertype] *[assigned] ptr @@ got unsigned int * @@ Type cast to fix this warning. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Sunil V L <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Xiang Chen <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-31scsi: aic7xxx: Fix firmware build fatal errorWang Jinchao2-15/+4
When building with CONFIG_AIC7XXX_BUILD_FIRMWARE=y, two fatal errors are reported as shown below: aicasm_gram.tab.c:203:10: fatal error: aicasm_gram.tab.h: No such file or directory aicasm_macro_gram.tab.c:167:10: fatal error: aicasm_macro_gram.tab.h: No such file or directory Fix these issues to make randconfig builds more reliable. [mkp: add missing include] Signed-off-by: Wang Jinchao <[email protected]> Link: https://lore.kernel.org/r/ZK0XIj6XzY5MCvtd@fedora Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-25scsi: pm80xx: Fix error return code in pm8001_pci_probe()Yang Yingliang1-1/+2
If pm8001_init_sas_add() fails, return error code in pm8001_pci_probe(). Fixes: 14a8f116cdc0 ("scsi: pm80xx: Add GET_NVMD timeout during probe") Signed-off-by: Yang Yingliang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Igor Pylypiv <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-25scsi: qla4xxx: Add length check when parsing nlattrsLin Ma1-0/+15
There are three places that qla4xxx parses nlattrs: - qla4xxx_set_chap_entry() - qla4xxx_iface_set_param() - qla4xxx_sysfs_ddb_set_param() and each of them directly converts the nlattr to specific pointer of structure without length checking. This could be dangerous as those attributes are not validated and a malformed nlattr (e.g., length 0) could result in an OOB read that leaks heap dirty data. Add the nla_len check before accessing the nlattr data and return EINVAL if the length check fails. Fixes: 26ffd7b45fe9 ("[SCSI] qla4xxx: Add support to set CHAP entries") Fixes: 1e9e2be3ee03 ("[SCSI] qla4xxx: Add flash node mgmt support") Fixes: 00c31889f751 ("[SCSI] qla4xxx: fix data alignment and use nl helpers") Signed-off-by: Lin Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chris Leech <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-25scsi: be2iscsi: Add length check when parsing nlattrsLin Ma1-0/+4
beiscsi_iface_set_param() parses nlattr with nla_for_each_attr and assumes every attributes can be viewed as struct iscsi_iface_param_info. This is not true because there is no any nla_policy to validate the attributes passed from the upper function iscsi_set_iface_params(). Add the nla_len check before accessing the nlattr data and return EINVAL if the length check fails. Fixes: 0e43895ec1f4 ("[SCSI] be2iscsi: adding functionality to change network settings using iscsiadm") Signed-off-by: Lin Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chris Leech <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-25scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param()Lin Ma1-0/+8
The functions iscsi_if_set_param() and iscsi_if_set_host_param() convert an nlattr payload to type char* and then call C string handling functions like sscanf and kstrdup: char *data = (char*)ev + sizeof(*ev); ... sscanf(data, "%d", &value); However, since the nlattr is provided by the user-space program and the nlmsg skb is allocated with GFP_KERNEL instead of GFP_ZERO flag (see netlink_alloc_large_skb() in netlink_sendmsg()), dirty data on the heap can lead to an OOB access for those string handling functions. By investigating how the bug is introduced, we find it is really interesting as the old version parsing code starting from commit fd7255f51a13 ("[SCSI] iscsi: add sysfs attrs for uspace sync up") treated the nlattr as integer bytes instead of string and had length check in iscsi_copy_param(): if (ev->u.set_param.len != sizeof(uint32_t)) BUG(); But, since the commit a54a52caad4b ("[SCSI] iscsi: fixup set/get param functions"), the code treated the nlattr as C string while forgetting to add any strlen checks(), opening the possibility of an OOB access. Fix the potential OOB by adding the strlen() check before accessing the buf. If the data passes this check, all low-level set_param handlers can safely treat this buf as legal C string. Fixes: fd7255f51a13 ("[SCSI] iscsi: add sysfs attrs for uspace sync up") Fixes: 1d9bf13a9cf9 ("[SCSI] iscsi class: add iscsi host set param event") Signed-off-by: Lin Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chris Leech <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-25scsi: iscsi: Add length check for nlattr payloadLin Ma1-29/+43
The current NETLINK_ISCSI netlink parsing loop checks every nlmsg to make sure the length is bigger than sizeof(struct iscsi_uevent) and then calls iscsi_if_recv_msg(). nlh = nlmsg_hdr(skb); if (nlh->nlmsg_len < sizeof(*nlh) + sizeof(*ev) || skb->len < nlh->nlmsg_len) { break; } ... err = iscsi_if_recv_msg(skb, nlh, &group); Hence, in iscsi_if_recv_msg() the nlmsg_data can be safely converted to iscsi_uevent as the length is already checked. However, in other cases the length of nlattr payload is not checked before the payload is converted to other data structures. One example is iscsi_set_path() which converts the payload to type iscsi_path without any checks: params = (struct iscsi_path *)((char *)ev + sizeof(*ev)); Whereas iscsi_if_transport_conn() correctly checks the pdu_len: pdu_len = nlh->nlmsg_len - sizeof(*nlh) - sizeof(*ev); if ((ev->u.send_pdu.hdr_size > pdu_len) .. err = -EINVAL; To sum up, some code paths called in iscsi_if_recv_msg() do not check the length of the data (see below picture) and directly convert the data to another data structure. This could result in an out-of-bound reads and heap dirty data leakage. _________ nlmsg_len(nlh) _______________ / \ +----------+--------------+---------------------------+ | nlmsghdr | iscsi_uevent | data | +----------+--------------+---------------------------+ \ / iscsi_uevent->u.set_param.len Fix the issue by adding the length check before accessing it. To clean up the code, an additional parameter named rlen is added. The rlen is calculated at the beginning of iscsi_if_recv_msg() which avoids duplicated calculation. Fixes: ac20c7bf070d ("[SCSI] iscsi_transport: Added Ping support") Fixes: 43514774ff40 ("[SCSI] iscsi class: Add new NETLINK_ISCSI messages for cnic/bnx2i driver.") Fixes: 1d9bf13a9cf9 ("[SCSI] iscsi class: add iscsi host set param event") Fixes: 01cb225dad8d ("[SCSI] iscsi: add target discvery event to transport class") Fixes: 264faaaa1254 ("[SCSI] iscsi: add transport end point callbacks") Fixes: fd7255f51a13 ("[SCSI] iscsi: add sysfs attrs for uspace sync up") Signed-off-by: Lin Ma <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chris Leech <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-24block: Improve performance for BLK_MQ_F_BLOCKING driversBart Van Assche1-1/+2
blk_mq_run_queue() runs the queue asynchronously if BLK_MQ_F_BLOCKING has been set. This is suboptimal since running the queue asynchronously is slower than running the queue synchronously. This patch modifies blk_mq_run_queue() as follows if BLK_MQ_F_BLOCKING has been set: - Run the queue synchronously if it is allowed to sleep. - Run the queue asynchronously if it is not allowed to sleep. Additionally, blk_mq_run_hw_queue(hctx, false) calls are modified into blk_mq_run_hw_queue(hctx, hctx->flags & BLK_MQ_F_BLOCKING) if the caller may be invoked from atomic context. The following caller chains have been reviewed: blk_mq_run_hw_queue(hctx, false) blk_mq_get_tag() /* may sleep, hence the functions it calls may also sleep */ blk_execute_rq() /* may sleep */ blk_mq_run_hw_queues(q, async=false) blk_freeze_queue_start() /* may sleep */ blk_mq_requeue_work() /* may sleep */ scsi_kick_queue() scsi_requeue_run_queue() /* may sleep */ scsi_run_host_queues() scsi_ioctl_reset() /* may sleep */ blk_mq_insert_requests(hctx, ctx, list, run_queue_async=false) blk_mq_dispatch_plug_list(plug, from_sched=false) blk_mq_flush_plug_list(plug, from_schedule=false) __blk_flush_plug(plug, from_schedule=false) blk_add_rq_to_plug() blk_mq_submit_bio() /* may sleep if REQ_NOWAIT has not been set */ blk_mq_plug_issue_direct() blk_mq_flush_plug_list() /* see above */ blk_mq_dispatch_plug_list(plug, from_sched=false) blk_mq_flush_plug_list() /* see above */ blk_mq_try_issue_directly() blk_mq_submit_bio() /* may sleep if REQ_NOWAIT has not been set */ blk_mq_try_issue_list_directly(hctx, list) blk_mq_insert_requests() /* see above */ Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-07-24scsi: Remove a blk_mq_run_hw_queues() callBart Van Assche1-1/+1
blk_mq_kick_requeue_list() calls blk_mq_run_hw_queues() asynchronously. Leave out the direct blk_mq_run_hw_queues() call. This patch causes scsi_run_queue() to call blk_mq_run_hw_queues() asynchronously instead of synchronously. Since scsi_run_queue() is not called from the hot I/O submission path, this patch does not affect the hot path. This patch prepares for allowing blk_mq_run_hw_queue() to sleep if BLK_MQ_F_BLOCKING has been set. scsi_run_queue() may be called from atomic context and must not sleep. Hence the removal of the blk_mq_run_hw_queues(q, false) call. See also scsi_unblock_requests(). Cc: "Martin K. Petersen" <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: "Martin K. Petersen" <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-07-24scsi: Inline scsi_kick_queue()Bart Van Assche1-7/+2
Inline scsi_kick_queue() to prepare for modifying the second argument passed to blk_mq_run_hw_queues(). Reviewed-by: Christoph Hellwig <[email protected]> Cc: "Martin K. Petersen" <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: "Martin K. Petersen" <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-07-23Merge patch series: "qla2xxx driver bug fixes"Martin K. Petersen10-102/+168
Nilesh Javali <[email protected]> says: Martin, Please apply the qla2xxx driver bug fixes to the scsi tree at your earliest convenience. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Update version to 10.02.08.500-kNilesh Javali1-2/+2
Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: fix inconsistent TMF timeoutQuinn Tran1-1/+0
Different behavior were experienced of session being torn down vs not when TMF is timed out. When FW detects the time out, the session is torn down. When driver detects the time out, the session is not torn down. Allow TMF error to return to upper layer without session tear down. Cc: [email protected] Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Fix TMF leak throughQuinn Tran1-6/+14
Task management can retry up to 5 times when FW resource becomes bottle neck. Between the retries, there is a short sleep. Current code assumes the chip has not reset or session has not changed. Check for chip reset or session change before sending Task management. Cc: [email protected] Fixes: 9803fb5d2759 ("scsi: qla2xxx: Fix task management cmd failure") Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Turn off noisy message logQuinn Tran1-1/+1
Some consider noisy log as test failure. Turn off noisy message log. Cc: [email protected] Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Fix session hang in gnlQuinn Tran1-2/+3
Connection does not resume after a host reset / chip reset. The cause of the blockage is due to the FCF_ASYNC_ACTIVE left on. The gnl command was interrupted by the chip reset. On exiting the command, this flag should be turn off to allow relogin to reoccur. Clear this flag to prevent blockage. Cc: [email protected] Fixes: 17e64648aa47 ("scsi: qla2xxx: Correct fcport flags handling") Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Fix erroneous link up failureQuinn Tran2-2/+7
Link up failure occurred where driver failed to see certain events from FW indicating link up (AEN 8011) and fabric login completion (AEN 8014). Without these 2 events, driver would not proceed forward to scan the fabric. The cause of this is due to delay in the receive of interrupt for Mailbox 60 that causes qla to set the fw_started flag late. The late setting of this flag causes other interrupts to be dropped. These dropped interrupts happen to be the link up (AEN 8011) and fabric login completion (AEN 8014). Set fw_started flag early to prevent interrupts being dropped. Cc: [email protected] Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Fix command flush during TMFQuinn Tran3-39/+45
For each TMF request, driver iterates through each qpair and flushes commands associated to the TMF. At the end of the qpair flush, a Marker is used to complete the flush transaction. This process was repeated for each qpair. The multiple flush and marker for this TMF request seems to cause confusion for FW. Instead, 1 flush is sent to FW. Driver would wait for FW to go through all the I/Os on each qpair to be read then return. Driver then closes out the transaction with a Marker. Cc: [email protected] Fixes: d90171dd0da5 ("scsi: qla2xxx: Multi-que support for TMF") Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Limit TMF to 8 per functionQuinn Tran3-25/+41
Per FW recommendation, 8 TMF's can be outstanding for each function. Previously, it allowed 8 per target. Limit TMF to 8 per function. Cc: [email protected] Fixes: 6a87679626b5 ("scsi: qla2xxx: Fix task management cmd fail due to unavailable resource") Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Adjust IOCB resource on qpair createQuinn Tran4-19/+38
During NVMe queue creation, a new qpair is created. FW resource limit needs to be re-adjusted to take into account the new qpair. Otherwise, NVMe command can not go through. This issue was discovered while testing/forcing FW execution to fail at load time. Add call to readjust IOCB and exchange limit. In addition, get FW state command and require FW to be running. Otherwise, error is generated. Cc: [email protected] Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: qla2xxx: Fix deletion race conditionQuinn Tran2-9/+21
System crash when using debug kernel due to link list corruption. The cause of the link list corruption is due to session deletion was allowed to queue up twice. Here's the internal trace that show the same port was allowed to double queue for deletion on different cpu. 20808683956 015 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1 20808683957 027 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1 Move the clearing/setting of deleted flag lock. Cc: [email protected] Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23Merge patch series "lpfc: Update lpfc to revision 14.2.0.14"Martin K. Petersen12-167/+324
Justin Tee <[email protected]> says: Update lpfc to revision 14.2.0.14 This patch set contains logging improvements, kref handling fixes, discovery bug fixes, and refactoring of repeated code. The patches were cut against Martin's 6.6/scsi-queue tree. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: lpfc: Copyright updates for 14.2.0.14 patchesJustin Tee1-1/+1
Update copyrights to 2023 for files modified in the 14.2.0.14 patch set. Signed-off-by: Justin Tee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>
2023-07-23scsi: lpfc: Update lpfc version to 14.2.0.14Justin Tee1-1/+1
Update lpfc version to 14.2.0.14 Signed-off-by: Justin Tee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin K. Petersen <[email protected]>