path: root/drivers/scsi
Age | Commit message | Author | Files | Lines
2018-04-20 | scsi: storvsc: Set up correct queue depth values for IDE devices | Long Li | 1 | -2/+5
Unlike SCSI and FC, we don't use multiple channels for IDE. Also fix the calculation for sub-channels. Signed-off-by: Long Li <[email protected]> Reviewed-by: Michael Kelley <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-19 | scsi: sd_zbc: Avoid that resetting a zone fails sporadically | Bart Van Assche | 1 | -58/+82
Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is called from sd_probe_async() and since sd_revalidate_disk() calls sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called concurrently with blkdev_report_zones() and/or blkdev_reset_zones(). That can cause these functions to fail with -EIO because sd_zbc_read_zones() e.g. sets q->nr_zones to zero before restoring it to the actual value, even if no drive characteristics have changed. Avoid that this can happen by making the following changes:
- Protect the code that updates zone information with blk_queue_enter() and blk_queue_exit().
- Modify sd_zbc_setup_seq_zones_bitmap() and sd_zbc_setup() such that these functions do not modify struct scsi_disk before all zone information has been obtained.
Note: since commit 055f6e18e08f ("block: Make q_usage_counter also track legacy requests"; kernel v4.15) the request queue freezing mechanism also affects legacy request queues.
Fixes: 89d947561077 ("sd: Implement support for ZBC devices") Signed-off-by: Bart Van Assche <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: [email protected] # v4.16 Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-19 | scsi: sd_zbc: Let the SCSI core handle ILLEGAL REQUEST / ASC 0x21 | Bart Van Assche | 1 | -10/+0
scsi_io_completion() translates the sense key ILLEGAL REQUEST / ASC 0x21 into ACTION_FAIL. That means that setting cmd->allowed to zero in sd_zbc_complete() for this sense code / ASC combination is not necessary. Hence remove the code that resets cmd->allowed from sd_zbc_complete(). Signed-off-by: Bart Van Assche <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-19 | scsi: sd_zbc: Change the type of the ZBC fields into u32 | Bart Van Assche | 1 | -6/+6
This patch does not change any functionality but makes it clear that it is on purpose that these fields are 32 bits wide. Signed-off-by: Bart Van Assche <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-19 | scsi: storsvc: don't set a bounce limit | Christoph Hellwig | 1 | -3/+0
The default already is to never bounce, so the call is a no-op. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-19 | scsi: iscsi_tcp: don't set a bounce limit | Christoph Hellwig | 1 | -1/+0
The default already is to never bounce, so the call is a no-op. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-19 | scsi: sg: Change return type to vm_fault_t | Souptick Joarder | 1 | -1/+1
Use new return type vm_fault_t for fault handler in struct vm_operations_struct. Signed-off-by: Souptick Joarder <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Acked-by: Douglas Gilbert <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-19 | scsi: zorro_esp: New driver for Amiga Zorro NCR53C9x boards | Michael Schmitz | 3 | -0/+1187
New combined SCSI driver for all ESP based Zorro SCSI boards for m68k Amiga. Code largely based on board specific parts of the old drivers (blz1230.c, blz2060.c, cyberstorm.c, cyberstormII.c, fastlane.c which were removed after the 2.6 kernel series for lack of maintenance) with contributions by Tuomas Vainikka (TCQ bug tests and workaround) and Finn Thain (TCQ bugfix by use of PIO in extended message in transfer). New Kconfig option and Makefile entries for new Amiga Zorro ESP SCSI driver included in this patch. Use DMA transfers wherever possible, with board-specific DMA set-up functions copied from the old driver code. Three byte reselection messages do appear to cause DMA timeouts. So wire up a PIO transfer routine for these instead. esp_reselect_with_tag explicitly sets esp->cmd_block_dma as target address for the message bytes but PIO requires a virtual address. Substitute kernel virtual address esp->cmd_block in PIO transfer call if DMA address is esp->cmd_block_dma and phase is message in. PIO code taken from mac_esp.c where the reselection timeout issue was debugged and fixed first, with minor macro and function rename. Signed-off-by: Michael Schmitz <[email protected]> Reviewed-by: Finn Thain <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Tested-by: Christian T. Steigies <[email protected]> Tested-by: John Paul Adrian Glaubitz <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: sd: Defer spinning up drive while SANITIZE is in progress | Mahesh Rajashekhara | 1 | -0/+2
A drive being sanitized will return NOT READY / ASC 0x4 / ASCQ 0x1b ("LOGICAL UNIT NOT READY. SANITIZE IN PROGRESS"). Prevent spinning up the drive until this condition clears. [mkp: tweaked commit message] Signed-off-by: Mahesh Rajashekhara <[email protected]> Cc: <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: megaraid_sas: Do not log an error if FW successfully initializes. | Vinson Lee | 1 | -3/+3
Fixes: 2d2c2331673c ("scsi: megaraid_sas: modified few prints in OCR and IOC INIT path") Signed-off-by: Vinson Lee <[email protected]> Acked-by: Shivasharan S <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: ufs: add trace event for ufs upiu | Ohad Sharabi | 1 | -0/+40
Add UFS Protocol Information Units (upiu) trace events for ufs driver, used to trace various ufs transaction types: command, task-management and device management. The trace-point format is generic and can be easily adapted to trace other upius if needed. Currently tracing ufs transaction of type 'device management', which this patch introduces, cannot be obtained from any other trace. Device management transactions are used for communication with the device such as reading and writing descriptor or attributes etc. Signed-off-by: Ohad Sharabi <[email protected]> Reviewed-by: Stanislav Nijnikov <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: fnic: fix spelling mistake in fnic stats "Abord" -> "Abort" | Colin Ian King | 1 | -1/+1
Trivial fix to spelling mistake in fnic stats message text. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: scsi_debug: IMMED related delay adjustments | Douglas Gilbert | 1 | -9/+24
A patch titled: "[PATCH v2] scsi_debug: implement IMMED bit" introduced long delays to the Start stop unit (SSU) and Synchronize cache (SC) commands when the IMMED bit is clear. This patch makes those delays more realistic. It causes SSU to only delay when the start stop state is changed; SC only delays when there's been a write since the previous SC. It also reduced the SC delay from 1 second to 50 milliseconds. Signed-off-by: Douglas Gilbert <[email protected]> Tested-by: Ming Lei <[email protected]> Reported-by: Ming Lei <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
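The delay rules in this commit message can be modelled in a few lines of Python. This is an illustrative sketch, not scsi_debug code: the class, its fields and the caller-supplied START STOP UNIT delay are assumptions, and only the 50 ms SYNCHRONIZE CACHE figure comes from the commit message.

```python
# Illustrative model only; names and structure are assumptions.
SYNC_CACHE_DELAY_MS = 50  # the commit message reduces this from 1 second


class FakeUnit:
    """Tracks just enough state to decide whether a command should delay."""

    def __init__(self):
        self.stopped = False
        self.write_since_sync = False

    def write(self):
        self.write_since_sync = True

    def start_stop_unit(self, stop, immed, ssu_delay_ms):
        """Return the simulated delay (ms) for START STOP UNIT."""
        changed = stop != self.stopped
        self.stopped = stop
        # Delay only when IMMED is clear and the start/stop state changed.
        return ssu_delay_ms if (changed and not immed) else 0

    def synchronize_cache(self, immed):
        """Return the simulated delay (ms) for SYNCHRONIZE CACHE."""
        dirty = self.write_since_sync
        self.write_since_sync = False
        # Delay only when IMMED is clear and something was written
        # since the previous SYNCHRONIZE CACHE.
        return SYNC_CACHE_DELAY_MS if (dirty and not immed) else 0
```

With this model, a second SYNCHRONIZE CACHE issued without an intervening write returns a delay of zero, which is the behaviour the commit describes.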
2018-04-18 | scsi: iscsi: respond to netlink with unicast when appropriate | Chris Leech | 1 | -11/+18
Instead of always multicasting responses, send a unicast netlink message directed at the correct pid. This will be needed if we ever want to support multiple userspace processes interacting with the kernel over iSCSI netlink simultaneously. Limitations can currently be seen if you attempt to run multiple iscsistart commands in parallel. We've fixed up the userspace issues in iscsistart that prevented multiple instances from running, so now attempts to speed up booting by bringing up multiple iscsi sessions at once in the initramfs are just running into misrouted responses that this fixes. Signed-off-by: Chris Leech <[email protected]> Reviewed-by: Lee Duncan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: scsi_dh: replace too broad "TP9" string with the exact models | Xose Vazquez Perez | 1 | -1/+4
SGI/TP9100 is not an RDAC array: ^^^ https://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=libmultipath/hwtable.c;h=88b4700beb1d8940008020fbe4c3cd97d62f4a56;hb=HEAD#l235 This partially reverts commit 35204772ea03 ("[SCSI] scsi_dh_rdac : Consolidate rdac strings together") [mkp: fixed up the new entries to align with rest of struct] Cc: NetApp RDAC team <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: James E.J. Bottomley <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: SCSI ML <[email protected]> Cc: DM ML <[email protected]> Signed-off-by: Xose Vazquez Perez <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: devinfo: delete duplicate "Generic"/"USB Storage-SMC" device | Xose Vazquez Perez | 1 | -2/+1
The revision field is currently unused by the devinfo pattern matching code. Combine two blacklist entries into one. $ egrep "Generic.*Storage-SMC" /proc/scsi/device_info 'Generic' 'USB Storage-SMC' 0x402 'Generic' 'USB Storage-SMC' 0x402 [mkp: tweaked commit desc] Cc: Hannes Reinecke <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: James E.J. Bottomley <[email protected]> Cc: SCSI ML <[email protected]> Signed-off-by: Xose Vazquez Perez <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
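Why the two entries are redundant can be shown with a small Python sketch. It assumes, as the commit message states, that matching ignores the revision field. The tuple layout, the revision strings and the choice to OR the flags together are illustrative assumptions; only the vendor, model and 0x402 flag value come from the commit message.

```python
def dedup_devinfo(entries):
    """Collapse entries that differ only in revision.

    entries: iterable of (vendor, model, revision, flags) tuples.
    Returns a list of (vendor, model, flags) tuples.
    """
    merged = {}
    for vendor, model, _revision, flags in entries:
        key = (vendor, model)  # revision is not part of the match key
        merged[key] = merged.get(key, 0) | flags
    return [(v, m, f) for (v, m), f in merged.items()]


# Hypothetical revisions; both entries carry the same flags.
table = [
    ("Generic", "USB Storage-SMC", "rev-a", 0x402),
    ("Generic", "USB Storage-SMC", "rev-b", 0x402),
]
print(dedup_devinfo(table))
```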
2018-04-18 | scsi: lpfc: update driver version to 12.0.0.2 | James Smart | 1 | -1/+1
Update the driver version to 12.0.0.2 Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Correct missing remoteport registration during link bounces | James Smart | 2 | -4/+5
Remote port disappearance/reappearances would cause a series of RSCN events to be delivered to the driver. During the resulting GID_FT handling, the driver clears the fc4 settings on the remote port, which makes it skip registration. As such, the nvme associations eventually fail and return io errors to the applications. Correct by not clearing the nlp_fc4_types for all nodes in lpfc_issue_gidft. Instead, when the GID_FT response is handled, clear the nlp_fc4_types of FCP and NVME prior to evaluating the fc4_type returned by the GID_FT response. This approach leaves "skipped" nodes with their nlp_fc4_types intact. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix NULL pointer reference when resetting adapter | James Smart | 1 | -16/+20
Points referencing local port structures didn't accommodate cases where the localport may not be registered yet. Add NULL pointer checks to logic. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix nvme remoteport registration race conditions | James Smart | 1 | -2/+14
On tests adding and removing a remote port, calls to nvme_info would eventually show fewer target ports discovered than were present in the SAN. Additionally, the following error messages were seen: 6031 RemotePort Registration failed err: -116, DID x471301 There is a race condition that exists between the driver and the nvme transport on remote port unregister vs the confirmed deletion. It's possible that the driver may rediscover the remote port and reregister the remote port before a prior unregister delete callback was made (as it rebound to the prior remoteport structure). However, the driver was coded to expect the callback before seeing the remote port again thus a new registration. The logic results in the driver having an invalid remoteport pointer set. Correct by tracking when waiting for the delete callback. In cases where the ndlp remoteport pointer is updated, it is only cleared when the wait has not been superseded by a prior registration. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix driver not recovering NVME rports during target link faults | James Smart | 3 | -3/+15
During target-side port faults, the driver would not recover all target port logins. This resulted in a loss of nvme device discovery. The driver is coded to wait for all GID_FT requests to complete before restarting discovery. A fault is seen where the outstanding GID_FT counts are not properly decremented, thus discovery would never start. Another fault was found in the clearing of the gidft_inp counter that would be skipped in this condition. And a third fault found with lpfc_nvme_register_port that would remove a reference on the ndlp which then allows a node swap on a port address change to prematurely remove the reference and release the ndlp. The following changes are made:
- Correct the decrementing of the outstanding GID_FT counters.
- In RSCN handling, no longer zero the counter before calling to issue another GID_FT.
- No longer remove the reference on the ndlp when the ndlp->nrport value is not yet null.
Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix WQ/CQ creation for older asic's. | James Smart | 3 | -0/+28
The patch to enlarge WQ/CQ creation keys off of an adapter response that indicates support for the larger values. Older adapters return an incorrect response and are limited in size. Thus the adapters fail the WQ creation steps. Augment the WQ sizing checks with a check on the older adapter types and limit them to the restricted sizes. Fixes: c176ffa0841c ("scsi: lpfc: Increase CQ and WQ sizes for SCSI") Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix NULL pointer access in lpfc_nvme_info_show | James Smart | 4 | -10/+31
After making remoteport unregister requests, the ndlp nrport pointer was stale. Track when waiting for the unregister completion callback and adjust the ndlp pointer assignment. Add a few safety checks for NULL pointer values. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix lingering lpfc_wq resource after driver unload | James Smart | 1 | -3/+8
After driver unloads, lpfc_wq remains active. The destroy_workqueue calls were not being made in driver unload. Additionally, SLI3 is allocating lpfc_wq resources, but never uses them. Make the destroy_workqueue calls on driver unload. Modify the SLI3 code path to no longer allocate lpfc_wq resources. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix Abort request WQ selection | James Smart | 2 | -13/+13
When running loads that generated aborts, io errors were seen. It turns out the abort requests were not placed on the proper WQ, resulting in the errors. Closer inspection of this error also showed improper spinlock api use. Correct the WQ selection policy for the abort requests. Correct spin_lock/spin_lock_irq/spin_lock_irqsave usage. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Enlarge nvmet asynchronous receive buffer counts | James Smart | 4 | -3/+17
Under large io load, the current sizing of asynchronous buffer counts could be exceeded, indicated by a 2885 log message: 2885 Port Status Event: port status reg 0x81800000, port smphr reg 0xc000, error 1=0x52004a01, error 2=0x0 Enlarge the async receive queue size. Allow for a configurable number of buffers to be posted to each RQ, using the new attribute lpfc_nvmet_mrq_post. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Add per io channel NVME IO statistics | James Smart | 6 | -91/+173
When debugging various issues, per IO channel IO statistics were useful to understand what was happening. However, many of the stats were on a port basis rather than an io channel basis. Move statistics to an io channel basis. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Correct target queue depth application changes | James Smart | 4 | -20/+66
The max_scsicmpl_time parameter can be used to perform scsi cmd queue depth mgmt based on io completion time: the queue depth is reduced to make completion time shorter. However, as soon as an io completes and the completion time is within limits, the code immediately bumps the queue depth limit back up to the target queue depth. Thus the procedure restarts, effectively limiting the usefulness of adjusting queue depth to help completion time. This patch makes the following changes:
- Removes the code at io completion that resets the queue depth as soon as within limits.
- As the code removed was where the target queue depth was first applied, change target queue depth application so that it occurs when the parameter is changed.
- Makes target queue depth a standard parameter: both a module parameter and a sysfs parameter.
- Optimizes the command pending count by using atomics rather than locks.
- Updates the debugfs nodelist stats to allow better debugging of pending command counts.
Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: lpfc: Fix multiple PRLI completion error path | James Smart | 1 | -23/+6
Nodelist entry for SCSI array ends up in UNMAPPED state. This is due to illegal discovery State machine transition because of two PRLIs and the first one failing with LS_RJT. Also, the error path was designed assuming the PRLIs complete in the order they were sent, FCP first, then NVME. In a failing case, the array thinks about the first PRLI (FCP), but issues LS_RJT for the 2nd PRLI immediately. Fix PRLI completion error path for the ordering expectation. Ensure the discovery state machine update is not set until all outstanding PRLIs are complete. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
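The ordering problem in this commit can be sketched as a small Python model: the node state is decided only once every outstanding PRLI has completed, regardless of the order in which the completions arrive. The class, the state names and the rule "mapped if any PRLI was accepted" are simplifying assumptions for illustration, not the lpfc state machine.

```python
class Node:
    """Toy discovery node that defers its state change (illustrative only)."""

    def __init__(self):
        self.outstanding = 0          # PRLIs sent but not yet completed
        self.accepted = []            # FC-4 types whose PRLI was accepted
        self.state = "PRLI_ISSUE"

    def issue_prli(self, fc4_type):
        self.outstanding += 1

    def prli_done(self, fc4_type, accepted):
        self.outstanding -= 1
        if accepted:
            self.accepted.append(fc4_type)
        if self.outstanding:
            return                    # other PRLIs still in flight
        # All PRLIs are complete: only now pick the next state.
        self.state = "MAPPED" if self.accepted else "UNMAPPED"


node = Node()
node.issue_prli("FCP")
node.issue_prli("NVME")
node.prli_done("NVME", accepted=False)   # LS_RJT arrives first
node.prli_done("FCP", accepted=True)     # FCP completes later
print(node.state)
```

In this model the early rejection of the second PRLI leaves the state untouched, so the later FCP acceptance still decides the outcome.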
2018-04-18 | scsi: megaraid_sas: driver version upgrade | Shivasharan S | 1 | -2/+2
Signed-off-by: Shivasharan S <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: megaraid_sas: Increase timeout by 1 sec for non-RAID fastpath IOs | Shivasharan S | 1 | -0/+3
Hardware could time out Fastpath IOs one second earlier than the timeout provided by the host. For non-RAID devices, driver provides timeout value based on OS provided timeout value. Under certain scenarios, if the OS provides a timeout value of 1 second, due to above behavior hardware will timeout immediately. Increase timeout value for non-RAID fastpath IOs by 1 second. Signed-off-by: Shivasharan S <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
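The arithmetic behind this fix is simple enough to check directly. The Python sketch below models only what the commit message states: the hardware may expire an IO one second before the timeout it was given, and the driver pads the non-RAID fastpath timeout by one second. Both function names are hypothetical.

```python
def hw_effective_timeout(programmed_s):
    """Time the hardware may actually wait: one second less than programmed."""
    return max(programmed_s - 1, 0)


def padded_timeout(os_timeout_s):
    """Timeout the driver programs for a non-RAID fastpath IO after the fix."""
    return os_timeout_s + 1


# Before the fix: an OS timeout of 1 second expires immediately.
print(hw_effective_timeout(1))
# After the fix: the hardware honours the full OS timeout.
print(hw_effective_timeout(padded_timeout(1)))
```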
2018-04-18 | scsi: megaraid_sas: Use zeroing memory allocator than allocator/memset | Himanshu Jha | 2 | -18/+12
Use pci_zalloc_consistent for allocating zeroed memory and remove unnecessary memset function. Done using Coccinelle. Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci Suggested-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Himanshu Jha <[email protected]> Signed-off-by: Shivasharan S <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: libsas: add transport class for ATA devices | Jason Yan | 2 | -0/+6
Currently, ATA devices attached to a SAS controller do not have a transport class, so no information about these devices is visible in /sys/class/ata_port (or ata_link or ata_device). Add a transport class for ATA devices attached to a SAS controller. The /sys/class directory will then show the information of the ATA devices as follows:
localhost:/sys/class # ls ata*
ata_device: dev1.0 dev2.0
ata_link: link1 link2
ata_port: ata1 ata2
No functional change to the device scanning and io path. The ata transport class is deleted when destroying the sas devices. Signed-off-by: Jason Yan <[email protected]> CC: Dan Williams <[email protected]> CC: Tejun Heo <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: hisi_sas: remove some unneeded structure members | John Garry | 3 | -21/+1
This patch removes unneeded structure elements:
- hisi_sas_phy.dev_sas_addr: only ever written
  - Also remove associated function which writes it, hisi_sas_init_add().
- hisi_sas_device.attached_phy: only ever written
  - Also remove code to set it in hisi_sas_dev_found()
Signed-off-by: John Garry <[email protected]> Reviewed-by: Xiang Chen <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: hisi_sas: print device id for errors | John Garry | 2 | -4/+4
When we find an erroneous slot completion, to help aid debugging add the device index to the current debug log. Signed-off-by: John Garry <[email protected]> Reviewed-by: Xiang Chen <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: hisi_sas: check IPTT is valid before using it for v3 hw | Xiaofei Tan | 1 | -4/+8
There is a bug in the development version of v3 hw. When an AXI error happens, the hw may return an abnormal CQ whose IPTT value is 0xffff. This causes an out-of-bounds IPTT reference. This patch adds a check of IPTT in cq_tasklet_v3_hw() and discards the invalid slot. This workaround is just to enhance the fault tolerance of the driver, so we apply it to all versions of v3 hw, although the release version has fixed this SoC bug. Signed-off-by: Xiaofei Tan <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
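The kind of bounds check described here can be sketched in Python as follows. This is a model of the idea only: the function name, the list-based slot table and its size are assumptions, and the real check lives in cq_tasklet_v3_hw() in C.

```python
def handle_completions(iptts, slots):
    """Return the slots for valid completions and skip out-of-range IPTTs.

    iptts: IPTT values read from completion queue entries.
    slots: slot table indexed by IPTT (hypothetical layout).
    """
    completed = []
    for iptt in iptts:
        if iptt >= len(slots):
            # e.g. 0xffff from faulty hardware: indexing the slot table
            # with it would run past the end, so discard the entry.
            continue
        completed.append(slots[iptt])
    return completed


slots = [f"slot{i}" for i in range(8)]          # hypothetical table size
print(handle_completions([3, 0xFFFF, 7], slots))
```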
2018-04-18 | scsi: hisi_sas: consolidate command check in hisi_sas_get_ata_protocol() | Xiaofei Tan | 1 | -14/+15
Currently we check the fis->command value in 2 locations in hisi_sas_get_ata_protocol() switch statement. Fix this by consolidating the check for fis->command value to 1 location only. Signed-off-by: Xiaofei Tan <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: hisi_sas: use dma_zalloc_coherent() | Xiang Chen | 1 | -3/+1
This is a warning coming from Coccinelle, and need to use new interface dma_zalloc_coherent() instead of dma_alloc_coherent()/memset(). Signed-off-by: Xiang Chen <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: hisi_sas: delete timer when removing hisi_sas driver | Xiang Chen | 3 | -3/+6
Delete timer for v1 and v3 hw when removing hisi_sas driver. Signed-off-by: Xiang chen <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: hisi_sas: update RAS feature for later revision of v3 HW | Xiaofei Tan | 1 | -2/+58
There is a modification for later revisions of v3 hw: more HW errors are reported through the RAS interrupt. These errors were originally reported only through MSI. When reporting to RAS, port AXI errors and FIFO OMIT errors are combined. For example, each port has 4 AXI errors, and they are combined into one when reported to RAS. This patch does two things:
1. Enable RAS interrupt of these errors and handle them in PCI error handlers.
2. Disable MSI interrupts of these errors for this later revision hw.
Signed-off-by: Xiaofei Tan <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: hisi_sas: make SAS address of SATA disks unique | Xiang Chen | 1 | -0/+1
When directly connected with SATA disks in different SAS cores, fill SAS address with scsi_host's id to make its fake SAS address unique. Signed-off-by: Xiang Chen <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: cxlflash: Handle spurious interrupts | Uma Krishnan | 2 | -0/+12
The following Oops can occur when there is heavy I/O traffic and the host is reset by a tool such as sg_reset.
[c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500 [cxlflash] (unreliable)
[c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash]
[c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310
[c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90
[c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0
[c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230
[c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70
[c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0
[c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24
[c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130
[c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120
When a context is reset, the pending commands are flushed and the AFU is notified. Before the AFU handles this request there could be command completion interrupts queued to PHB which are yet to be delivered to the context. In this scenario, a context could receive an interrupt for a command that has been flushed, leading to a possible crash when the memory for the flushed command is accessed. To resolve this problem, a boolean will indicate if the hardware queue is ready to process interrupts or not. This can be evaluated in the interrupt handler before processing an interrupt. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
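The guard described in this commit can be modelled in Python as a queue that ignores completions while it is not ready. The class, the `ready` attribute name and the tag-to-command dictionary are assumptions made for illustration; the commit message only says that a boolean is evaluated in the interrupt handler.

```python
class HwQueue:
    """Toy hardware queue that drops interrupts while not ready."""

    def __init__(self):
        self.ready = False      # hypothetical name for the boolean
        self.pending = {}       # tag -> command

    def submit(self, tag, cmd):
        self.pending[tag] = cmd

    def reset(self):
        # Mark the queue not ready before flushing, so a late
        # completion interrupt cannot touch a flushed command.
        self.ready = False
        self.pending.clear()

    def irq(self, tag):
        if not self.ready:
            return None         # spurious interrupt: ignore it
        return self.pending.pop(tag, None)


q = HwQueue()
q.ready = True
q.submit(1, "read")
q.reset()
print(q.irq(1))                 # late completion after reset is ignored
```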
2018-04-18 | scsi: cxlflash: Remove commmands from pending list on timeout | Uma Krishnan | 1 | -0/+14
The following Oops can occur if an internal command sent to the AFU does not complete within the timeout:
[c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
[c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
[c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230 [cxlflash]
[c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150 [cxl]
[c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
[c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
[c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
[c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4
When an internal command times out, the command buffer is freed while it is still in the pending commands list of the context. This corrupts the list and when the context is cleaned up, a crash is encountered. To resolve this issue, when an AFU command or TMF command times out, the command should be deleted from the hardware queue pending command list before freeing the buffer. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
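The fix amounts to "unlink, then free". A Python model of that ordering is sketched below; Python has no manual free, so a `freed` flag stands in for releasing the buffer, and all class and method names are hypothetical.

```python
class Command:
    def __init__(self, name):
        self.name = name
        self.freed = False      # stands in for a released buffer


class HwQueue:
    def __init__(self):
        self.pending = []

    def send(self, cmd):
        self.pending.append(cmd)

    def on_timeout(self, cmd):
        # Unlink the command from the pending list first ...
        if cmd in self.pending:
            self.pending.remove(cmd)
        # ... and only then release it.
        cmd.freed = True

    def cleanup(self):
        """Walk the pending list; no entry may already be freed."""
        assert all(not c.freed for c in self.pending)
        names = [c.name for c in self.pending]
        self.pending.clear()
        return names


q = HwQueue()
slow, ok = Command("afu_sync"), Command("tmf")
q.send(slow)
q.send(ok)
q.on_timeout(slow)
print(q.cleanup())
```

Without the unlink step, `cleanup()` would walk over a freed entry, which is the model's equivalent of the list corruption described in the commit message.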
2018-04-18 | scsi: cxlflash: Synchronize reset and remove ops | Uma Krishnan | 1 | -3/+3
The following Oops can be encountered if a device removal or system shutdown is initiated while an EEH recovery is in process:
[c000000ff2f479c0] c008000015256f18 cxlflash_pci_slot_reset+0xa0/0x100 [cxlflash]
[c000000ff2f47a30] c00800000dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl]
[c000000ff2f47ae0] c00000000003ef1c eeh_report_reset+0xec/0x170
[c000000ff2f47b20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff2f47bb0] c00000000003f80c eeh_handle_normal_event+0x56c/0x580
[c000000ff2f47c60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff2f47d10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff2f47dc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff2f47e30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4
The remove handler frees AFU memory while the EEH recovery is in progress, leading to a race condition. This can result in a crash if the recovery thread tries to access this memory. To resolve this issue, the cxlflash remove handler will evaluate the device state and yield to any active reset or probing threads. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: cxlflash: Enable OCXL operations | Uma Krishnan | 2 | -2/+8
This commit enables the OCXL operations for the OCXL devices. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: cxlflash: Support AFU reset | Uma Krishnan | 1 | -0/+17
The cxlflash core driver resets the AFU when the master contexts are created in the initialization or recovery paths. Today, the OCXL provider service to perform this operation is pending implementation. To avoid a crash due to a missing fop, log an error once and return success to continue with execution. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: cxlflash: Register for translation errors | Uma Krishnan | 2 | -2/+33
While enabling a context on the link, a predefined callback can be registered with the OCXL provider services to be notified on translation errors. These errors can in turn be passed back to the user on a read operation. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: cxlflash: Introduce OCXL context state machine | Uma Krishnan | 2 | -3/+64
In order to protect the OCXL hardware contexts from getting clobbered, a simple state machine is added to indicate when a context is in open, close or start state. The expected states are validated throughout the code to prevent illegal operations on a context. A mutex is added to protect writes to the context state field. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
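A state machine of this shape can be sketched in Python as follows. The three states come from the commit message (open, close, start); the allowed transitions, the class layout and the exception type are assumptions chosen for illustration and may differ from the driver.

```python
import threading
from enum import Enum


class CtxState(Enum):
    CLOSED = 0
    OPENED = 1
    STARTED = 2


# Assumed transition table; the commit message does not spell it out.
ALLOWED = {
    CtxState.CLOSED: {CtxState.OPENED},
    CtxState.OPENED: {CtxState.STARTED, CtxState.CLOSED},
    CtxState.STARTED: {CtxState.CLOSED},
}


class Context:
    def __init__(self):
        self._lock = threading.Lock()   # protects writes to self.state
        self.state = CtxState.CLOSED

    def transition(self, new_state):
        with self._lock:
            if new_state not in ALLOWED[self.state]:
                raise RuntimeError(
                    f"illegal transition {self.state.name} -> {new_state.name}"
                )
            self.state = new_state


ctx = Context()
ctx.transition(CtxState.OPENED)
ctx.transition(CtxState.STARTED)
print(ctx.state.name)
```

Validating the current state before every operation is what lets the code reject, for example, starting a context that was never opened.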
2018-04-18 | scsi: cxlflash: Update synchronous interrupt status bits | Uma Krishnan | 1 | -14/+21
The SISLite specification has been updated to define new synchronous interrupt status bits. These bits are set by the AFU when a given PASID or EA is bad and a synchronous interrupt is triggered. The SISLite header file is updated to support these new bits. Note that there are also some formatting updates to some of the existing bits to allow all of the definitions to line up uniformly. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-04-18 | scsi: cxlflash: Setup LISNs for master contexts | Uma Krishnan | 1 | -0/+21
Similar to user contexts, master contexts also require that the per-context LISN registers be programmed for certain AFUs. The mapped trigger page is obtained from underlying transport and registered with AFU for each master context. Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>