aboutsummaryrefslogtreecommitdiff
path: root/drivers/scsi/libsas/sas_scsi_host.c
AgeCommit message (Collapse)AuthorFilesLines
2018-03-20Merge tag 'scsi-fixes' of ↵Linus Torvalds1-20/+13
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: - one driver patch (qla2xxx) which fixes a problem caused by an existing regression fix (FCP discovery is failing) - one generic fix to a longstanding bug in libsas that causes I/O eventually to hang to the device in the face of ATA error recovery. * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Remove FC_NO_LOOP_ID for FCP and FC-NVMe Discovery scsi: libsas: defer ata device eh commands to libata
2018-03-12scsi: libsas: defer ata device eh commands to libataJason Yan1-20/+13
When ata device doing EH, some commands still attached with tasks are not passed to libata when abort failed or recover failed, so libata did not handle these commands. After these commands done, sas task is freed, but ata qc is not freed. This will cause ata qc leak and trigger a warning like below: WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037 ata_eh_finish+0xb4/0xcc CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G W OE 4.14.0#1 ...... Call trace: [<ffff0000088b7bd0>] ata_eh_finish+0xb4/0xcc [<ffff0000088b8420>] ata_do_eh+0xc4/0xd8 [<ffff0000088b8478>] ata_std_error_handler+0x44/0x8c [<ffff0000088b8068>] ata_scsi_port_error_handler+0x480/0x694 [<ffff000008875fc4>] async_sas_ata_eh+0x4c/0x80 [<ffff0000080f6be8>] async_run_entry_fn+0x4c/0x170 [<ffff0000080ebd70>] process_one_work+0x144/0x390 [<ffff0000080ec100>] worker_thread+0x144/0x418 [<ffff0000080f2c98>] kthread+0x10c/0x138 [<ffff0000080855dc>] ret_from_fork+0x10/0x18 If ata qc leaked too many, ata tag allocation will fail and io blocked for ever. As suggested by Dan Williams, defer ata device commands to libata and merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle ata qcs correctly after this. Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: Xiaofei Tan <tanxiaofei@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-31Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-16/+4
Pull SCSI updates from James Bottomley: "This is mostly updates of the usual driver suspects: arcmsr, scsi_debug, mpt3sas, lpfc, cxlflash, qla2xxx, aacraid, megaraid_sas, hisi_sas. We also have a rework of the libsas hotplug handling to make it more robust, a slew of 32 bit time conversions and fixes, and a host of the usual minor updates and style changes. The biggest potential for regressions is the libsas hotplug changes, but so far they seem stable under testing" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (313 commits) scsi: qla2xxx: Fix logo flag for qlt_free_session_done() scsi: arcmsr: avoid do_gettimeofday scsi: core: Add VENDOR_SPECIFIC sense code definitions scsi: qedi: Drop cqe response during connection recovery scsi: fas216: fix sense buffer initialization scsi: ibmvfc: Remove unneeded semicolons scsi: hisi_sas: fix a bug in hisi_sas_dev_gone() scsi: hisi_sas: directly attached disk LED feature for v2 hw scsi: hisi_sas: devicetree: bindings: add LED feature for v2 hw scsi: megaraid_sas: NVMe passthrough command support scsi: megaraid: use ktime_get_real for firmware time scsi: fnic: use 64-bit timestamps scsi: qedf: Fix error return code in __qedf_probe() scsi: devinfo: fix format of the device list scsi: qla2xxx: Update driver version to 10.00.00.05-k scsi: qla2xxx: Add XCB counters to debugfs scsi: qla2xxx: Fix queue ID for async abort with Multiqueue scsi: qla2xxx: Fix warning for code intentation in __qla24xx_handle_gpdb_event() scsi: qla2xxx: Fix warning during port_name debug print scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout() ...
2018-01-19Merge tag 'scsi-fixes' of ↵Linus Torvalds1-2/+15
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "One fix for SAS attached SATA CD-ROMs. It turns out that the libata handling of CD devices relies on the SCSI error handler, so disable async aborts (which don't start the error handler) for these devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: libsas: Disable asynchronous aborts for SATA devices
2018-01-10scsi: libsas: Disable asynchronous aborts for SATA devicesHannes Reinecke1-2/+15
Handling CD-ROM devices from libsas is decidedly odd, as libata relies on SCSI EH to be started to figure out that no medium is present. So we cannot do asynchronous aborts for SATA devices. Fixes: 909657615d9 ("scsi: libsas: allow async aborts") Cc: <stable@vger.kernel.org> # 4.12+ Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Yves-Alexis Perez <corsac@debian.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-01-03scsi: libsas: remove private hex2bin() implementationAndy Shevchenko1-16/+4
The function sas_parse_addr() could be easily substituted by hex2bin() which is in kernel library code. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-21treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE castsKees Cook1-1/+1
With all callbacks converted, and the timer callback prototype switched over, the TIMER_FUNC_TYPE cast is no longer needed, so remove it. Conversion was done with the following scripts: perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \ $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u) perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \ $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u) The now unused macros are also dropped from include/linux/timer.h. Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-01scsi: sas: Convert timers to use timer_setup()Kees Cook1-1/+1
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. This requires adding a pointer to hold the timer's target task, as there isn't a link back from slow_task. Cc: John Garry <john.garry@huawei.com> Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Jack Wang <jinpu.wang@profitbricks.com> Cc: lindar_liu@usish.com Cc: Jens Axboe <axboe@fb.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Cc: Baoyou Xie <baoyou.xie@linaro.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: John Garry <john.garry@huawei.com> # for hisi_sas part Tested-by: John Garry <john.garry@huawei.com> # basic sanity test for hisi_sas Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
2017-08-25scsi: libsas: move bus_reset_handler() to target_reset_handler()Hannes Reinecke1-6/+6
The bus reset handler is calling I_T Nexus reset, which logically is a target reset as it need to specify both the initiator and the target. So move it to target reset. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25scsi: libsas: Remove a set-but-not-used variableBart Van Assche1-3/+0
This was detected by building with W=1. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: make eh_eflags persistentHannes Reinecke1-2/+0
If a failed command is retried and fails again we need to enter SCSI EH, otherwise we will never be able to recover the command. To detect this situation we must not clear scmd->eh_eflags when EH finishes but rather make it persistent throughout the lifetime of the command. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: libsas: allow async abortsChristoph Hellwig1-3/+0
We now first try to call ->eh_abort_handler from a work queue, but libsas was always failing that for no good reason. Allow async aborts. Reviewed-by: Johannes Thumshirn <jth@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-06scsi: libsas: remove sas_scsi_timed_outChristoph Hellwig1-7/+0
EH_NOT_HANDLED is the default case if no eh_timed_out method is provided, so there is no need to supply it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2014-12-04scsi: remove ->change_queue_type methodChristoph Hellwig1-8/+0
Since we got rid of ordered tag support in 2010 the prime use case of switching on and off ordered tags has been obsolete. The other function of enabling/disabling tagging entirely has only been correctly implemented by the 53c700 driver and isn't generally useful. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-27libsas: remove task_collector modeChristoph Hellwig1-175/+1
The task_collector mode (or "latency_injector", (C) Dan Willians) is an optional I/O path in libsas that queues up scsi commands instead of directly sending it to the hardware. It generall increases latencies to in the optiomal case slightly reduce mmio traffic to the hardware. Only the obsolete aic94xx driver and the mvsas driver allowed to use it without recompiling the kernel, and most drivers didn't support it at all. Remove the giant blob of code to allow better optimizations for scsi-mq in the future. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Dan Williams <dan.j.williams@intel.com>
2014-11-24scsi: drop reason argument from ->change_queue_depthChristoph Hellwig1-7/+5
Drop the now unused reason argument from the ->change_queue_depth method. Also add a return value to scsi_adjust_queue_depth, and rename it to scsi_change_queue_depth now that it can be used as the default ->change_queue_depth implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-24scsi: avoid ->change_queue_depth indirection for queue full trackingChristoph Hellwig1-14/+3
All drivers use the implementation for ramping the queue up and down, so instead of overloading the change_queue_depth method call the implementation diretly if the driver opts into it by setting the track_queue_depth flag in the host template. Note that a few drivers validated the new queue depth in their change_queue_depth method, but as we never go over the queue depth set during slave_configure or the sysfs file this isn't nessecary and can safely be removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
2014-11-12scsi: don't force tagged_supported in driversChristoph Hellwig1-1/+0
Now that we also get proper values in cmd->request->tag for untagged commands, there is no need to force tagged_supported to on in drivers that need host-wide tags. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12scsi: don't set tagging state from scsi_adjust_queue_depthChristoph Hellwig1-14/+6
Remove the tagged argument from scsi_adjust_queue_depth, and just let it handle the queue depth. For most drivers those two are fairly separate, given that most modern drivers don't care about the SCSI "tagged" status of a command at all, and many old drivers allow queuing of multiple untagged commands in the driver. Instead we start out with the ->simple_tags flag set before calling ->slave_configure, which is how all drivers actually looking at ->simple_tags except for one worke anyway. The one other case looks broken, but I've kept the behavior as-is for now. Except for that we only change ->simple_tags from the ->change_queue_type, and when rejecting a tag message in a single driver, so keeping this churn out of scsi_adjust_queue_depth is a clear win. Now that the usage of scsi_adjust_queue_depth is more obvious we can also remove all the trivial instances in ->slave_alloc or ->slave_configure that just set it to the cmd_per_lun default. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2014-11-12scsi: always assign block layer tags if enabledChristoph Hellwig1-8/+3
Allow a driver to ask for block layer tags by setting .use_blk_tags in the host template, in which case it will always see a valid value in request->tag, similar to the behavior when using blk-mq. This means even SCSI "untagged" commands will now have a tag, which is especially useful when using a host-wide tag map. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-07-25scsi: convert host_busy to atomic_tChristoph Hellwig1-2/+3
Avoid taking the host-wide host_lock to check the per-host queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>
2014-07-17scsi: use 64-bit LUNsHannes Reinecke1-5/+6
The SCSI standard defines 64-bit values for LUNs, and large arrays employing large or hierarchical LUN numbers become more and more common. So update the linux SCSI stack to use 64-bit LUN numbers. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-03-15[SCSI] libsas: introduce scmd_dbg() to quiet false positive "timeout" messagesDan Williams1-1/+1
libsas sometimes short circuits timeouts to force commands into error recovery. It is misleading to log that the command timed-out in sas_scsi_timed_out() when in fact it was just queued for error handling. It's also redundant in the case of a true timeout as libata eh will detect and report timeouts via it's AC_ERR_TIMEOUT facility. Given that some environments consider "timeout" errors to be indicative of impending device failure demote the sas_scsi_timed_out() timeout message to be disabled by default. This parallels ata_scsi_timed_out(). [jejb: checkpatch fix] Reported-by: Xun Ni <xun.ni@intel.com> Tested-by: Nelson Cheng <nelson.cheng@intel.com> Acked-by: Lukasz Dorau <lukasz.dorau@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-06-04[SCSI] libsas: implement > 16 byte CDB supportJames Bottomley1-1/+1
Remove the arbitrary expectation in libsas that all SCSI commands are 16 bytes or less. Instead do all copies via cmd->cmd_len (and use a pointer to this in the libsas task instead of a copy). Note that this still doesn't enable > 16 byte CDB support in the underlying drivers because their internal format has to be fixed and the wire format of > 16 byte CDBs according to the SAS spec is different. the libsas drivers (isci, aic94xx, mvsas and pm8xxx are all updated for this change. Cc: Lukasz Dorau <lukasz.dorau@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jack Wang <xjtuwjp@gmail.com> Cc: Lindar Liu <lindar_liu@usish.com> Cc: Xiangliang Yu <yuxiangl@marvell.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] libsas: trim sas_task of slow path infrastructureDan Williams1-2/+6
The timer and the completion are only used for slow path tasks (smp, and lldd tmfs), yet we incur the allocation space and cpu setup time for every fast path task. Cc: Xiangliang Yu <yuxiangl@marvell.com> Acked-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handlerDan Williams1-11/+8
sas_eh_bus_reset_handler() amounts to sas_phy_reset() without notification of the reset to the lldd. If this is triggered from eh-cmnd recovery there may be sas_tasks for the lldd to terminate, so ->lldd_I_T_nexus_reset is warranted. Cc: Xiangliang Yu <yuxiangl@marvell.com> Cc: Luben Tuikov <ltuikov@yahoo.com> Cc: Jack Wang <jack_wang@usish.com> Reviewed-by: Jacek Danecki <jacek.danecki@intel.com> [jacek: modify pm8001_I_T_nexus_reset to return -ENODEV] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] libsas: add sas_eh_abort_handlerDan Williams1-0/+21
When recovering failed eh-cmnds let the lldd attempt an abort via scsi_abort_eh_cmnd before escalating. Reviewed-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] libsas: enforce eh strategy handlers only in eh contextDan Williams1-4/+117
The strategy handlers may be called in places that are problematic for libsas (i.e. sata resets outside of domain revalidation filtering / libata link recovery), or problematic for userspace (non-blocking ioctl to sleeping reset functions). However, these routines are also called for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them as long as we are running in the host's error handler, otherwise arrange for them to be triggered in eh_context. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] libsas: cleanup spurious calls to scsi_schedule_ehMaciej Trela1-1/+0
eh is woken up automatically by the presence of failed commands, scsi_schedule_eh is reserved for cases where there are no failed commands. This guarantees that host_eh_sceduled is only incremented when an explicit eh request is made. Reviewed-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Maciej Trela <maciej.trela@intel.com> [fixed spurious delete of sas_ata_task_abort] Signed-off-by: Artur Wojcik <artur.wojcik@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20[SCSI] libata, libsas: introduce sched_eh and end_eh port opsDan Williams1-6/+21
When managing shost->host_eh_scheduled libata assumes that there is a 1:1 shost-to-ata_port relationship. libsas creates a 1:N relationship so it needs to manage host_eh_scheduled cumulatively at the host level. The sched_eh and end_eh port port ops allow libsas to track when domain devices enter/leave the "eh-pending" state under ha->lock (previously named ha->state_lock, but it is no longer just a lock for ha->state changes). Since host_eh_scheduled indicates eh without backing commands pinning the device it can be deallocated at any time. Move the taking of the domain_device reference under the port_lock to guarantee that the ata_port stays around for the duration of eh. Reviewed-by: Jacek Danecki <jacek.danecki@intel.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: don't recover end devices attached to disabled physDan Williams1-1/+2
If userspace has decided to disable a phy the kernel should honor that and not inadvertantly re-enable the phy via error recovery. This is more straightforward in the sata case where link recovery (via libata-eh) is separate from sas_task cancelling in libsas-eh. Teach libsas to accept -ENODEV as a successful response from I_T_nexus_reset ('successful' in terms of not escalating further). This is a more comprehensive fix then "libsas: don't recover 'gone' devices in sas_ata_hard_reset()", as it is no longer sata-specific. aic94xx does check the return value from sas_phy_reset() so if the phy is disabled we proceed with clearing the I_T_nexus. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: fix lifetime of SAS_HA_FROZENDan Williams1-6/+7
Until all sas_tasks are known to no longer be in-flight this flag gates late completions from colliding with error handling. However, it must be cleared prior to the submission of scsi_send_eh_cmnd() requests, otherwise those commands will never be completed correctly. This was spotted by slub debug: ============================================================================= BUG sas_task: Objects remaining on kmem_cache_close() ----------------------------------------------------------------------------- INFO: Slab 0xffffea001f0eba00 objects=34 used=1 fp=0xffff8807c3aecb00 flags=0x8000000000004080 Pid: 22919, comm: modprobe Not tainted 3.2.0-isci+ #2 Call Trace: [<ffffffff810fcdcd>] slab_err+0xb0/0xd2 [<ffffffff810e1c50>] ? free_percpu+0x31/0x117 [<ffffffff81100122>] ? kzalloc+0x14/0x16 [<ffffffff81100122>] ? kzalloc+0x14/0x16 [<ffffffff81100486>] kmem_cache_destroy+0x11d/0x270 [<ffffffffa0112bdc>] sas_class_exit+0x10/0x12 [libsas] [<ffffffff81078fba>] sys_delete_module+0x1c4/0x23c [<ffffffff814797ba>] ? sysret_check+0x2e/0x69 [<ffffffff8126479e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81479782>] system_call_fastpath+0x16/0x1b INFO: Object 0xffff8807c3aed280 @offset=21120 INFO: Allocated in sas_alloc_task+0x22/0x90 [libsas] age=4615311 cpu=2 pid=12966 __slab_alloc.clone.3+0x1d1/0x234 kmem_cache_alloc+0x52/0x10d sas_alloc_task+0x22/0x90 [libsas] sas_queuecommand+0x20e/0x230 [libsas] scsi_send_eh_cmnd+0xd1/0x30c scsi_eh_try_stu+0x4f/0x6b scsi_eh_ready_devs+0xba/0x6ef sas_scsi_recover_host+0xa35/0xab1 [libsas] scsi_error_handler+0x14b/0x5fa kthread+0x9d/0xa5 kernel_thread_helper+0x4/0x10 Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: async ata scanningDan Williams1-18/+0
libsas ata error handling is already async but this does not help the scan case. Move initial link recovery out from under host->scan_mutex, and delay synchronization with eh until after all port probe/recovery work has been queued. Device ordering is maintained with scan order by still calling sas_rphy_add() in order of domain discovery. Since we now scan the domain list when invoking libata-eh we need to be careful to check for fully initialized ata ports. Acked-by: Jack Wang <jack_wang@usish.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: fix mixed topology recoveryDan Williams1-6/+7
If we have a domain with sas and sata devices there may still be sas recovery actions to take after peeling off the commands to send to libata. Reported-by: Andrzej Jakowski <andrzej.jakowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: close scsi_remove_target() vs libata-eh raceDan Williams1-3/+0
ata_port lifetime in libata follows the host. In libsas it follows the scsi_target. Once scsi_remove_device() has caused all commands to be completed it allows scsi_remove_target() to immediately proceed to freeing the ata_port causing bug reports like: [ 848.393333] BUG: spinlock bad magic on CPU#4, kworker/u:2/5107 [ 848.400262] general protection fault: 0000 [#1] SMP [ 848.406244] CPU 4 [ 848.408310] Modules linked in: nls_utf8 ipv6 uinput i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca sg sd_mod sr_mod cdrom ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan] [ 848.432060] [ 848.434137] Pid: 5107, comm: kworker/u:2 Not tainted 3.2.0-isci+ #8 Intel Corporation S2600CP/S2600CP [ 848.445310] RIP: 0010:[<ffffffff8126a68c>] [<ffffffff8126a68c>] spin_dump+0x5e/0x8c [ 848.454787] RSP: 0018:ffff8807f868dca0 EFLAGS: 00010002 [ 848.461137] RAX: 0000000000000048 RBX: ffff8807fe86a630 RCX: ffffffff817d0be0 [ 848.469520] RDX: 0000000000000000 RSI: ffffffff814af1cf RDI: 0000000000000002 [ 848.477959] RBP: ffff8807f868dcb0 R08: 00000000ffffffff R09: 000000006b6b6b6b [ 848.486327] R10: 000000000003fb8c R11: ffffffff81a19448 R12: 6b6b6b6b6b6b6b6b [ 848.494699] R13: ffff8808027dc520 R14: 0000000000000000 R15: 000000000000001e [ 848.503067] FS: 0000000000000000(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000 [ 848.512899] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 848.519710] CR2: 00007ff77d001000 CR3: 00000007f7a5d000 CR4: 00000000000406e0 [ 848.528072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 848.536446] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 848.544831] Process kworker/u:2 (pid: 5107, threadinfo ffff8807f868c000, task ffff8807ff348000) [ 848.555327] Stack: [ 848.557959] ffff8807fe86a630 ffff8807fe86a630 ffff8807f868dcd0 ffffffff8126a6e0 [ 848.567072] ffffffff817c142f ffff8807fe86a630 ffff8807f868dcf0 ffffffff8126a703 [ 848.576190] ffff8808027dc520 0000000000000286 ffff8807f868dd10 ffffffff814af1bb [ 848.585281] Call Trace: [ 848.588409] [<ffffffff8126a6e0>] spin_bug+0x26/0x28 [ 848.594357] [<ffffffff8126a703>] do_raw_spin_unlock+0x21/0x88 [ 848.601283] [<ffffffff814af1bb>] _raw_spin_unlock_irqrestore+0x2c/0x65 [ 848.609089] [<ffffffffa001c103>] ata_scsi_port_error_handler+0x548/0x557 [libata] [ 848.618331] [<ffffffff81061813>] ? async_schedule+0x17/0x17 [ 848.625060] [<ffffffffa004f30f>] async_sas_ata_eh+0x45/0x69 [libsas] [ 848.632655] [<ffffffff810618aa>] async_run_entry_fn+0x97/0x125 [ 848.639670] [<ffffffff81057439>] process_one_work+0x207/0x38d [ 848.646577] [<ffffffff8105738c>] ? process_one_work+0x15a/0x38d [ 848.653681] [<ffffffff810576f7>] worker_thread+0x138/0x21c [ 848.660305] [<ffffffff810575bf>] ? process_one_work+0x38d/0x38d [ 848.667493] [<ffffffff8105b098>] kthread+0x9d/0xa5 [ 848.673382] [<ffffffff8106e1bd>] ? trace_hardirqs_on_caller+0x12f/0x166 [ 848.681304] [<ffffffff814b7704>] kernel_thread_helper+0x4/0x10 [ 848.688324] [<ffffffff814af534>] ? retint_restore_args+0x13/0x13 [ 848.695530] [<ffffffff8105affb>] ? __init_kthread_worker+0x5b/0x5b [ 848.702929] [<ffffffff814b7700>] ? gs_change+0x13/0x13 [ 848.709155] Code: 00 00 48 8d 88 38 04 00 00 44 8b 80 84 02 00 00 31 c0 e8 cf 1b 24 00 41 83 c8 ff 44 8b 4b 08 48 c7 c1 e0 0b 7d 81 4d 85 e4 74 10 <45> 8b 84 24 84 02 00 00 49 8d 8c 24 38 04 00 00 8b 53 04 48 89 [ 848.732467] RIP [<ffffffff8126a68c>] spin_dump+0x5e/0x8c [ 848.738905] RSP <ffff8807f868dca0> [ 848.743743] ---[ end trace 143161646eee8caa ]--- ...so arrange for the ata_port to have the same end of life as the domain device. Reported-by: Marcin Tomczak <marcin.tomczak@intel.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: pre-clean commands that won the eh vs completion raceDan Williams1-9/+16
When scrolling forward through the eh list (in a clear_q scenario) it is possible to encounter commands that won the completion vs eh race. Rather than sprinkle more "if (!task)" throughout the handler just make a pass through the list and delete the race winners before handling the rest. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-29[SCSI] libsas: fix sas_find_local_phy(), take phy referencesDan Williams1-20/+18
In the direct-attached case this routine returns the phy on which this device was first discovered. Which is broken if we want to support wide-targets, as this phy reference can become stale even though the port is still active. In the expander-attached case this routine tries to lookup the phy by scanning the attached sas addresses of the parent expander, and BUG_ONs if it can't find it. However since eh and the libsas workqueue run independently we can still be attempting device recovery via eh after libsas has recorded the device as detached. This is even easier to hit now that eh is blocked while device domain rediscovery takes place, and that libata is fed more timed out commands increasing the chances that it will try to recover the ata device. Arrange for dev->phy to always point to a last known good phy, it may be stale after the port is torn down, but it will catch up for wide port reconfigurations, and never be NULL. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: sas_phy_enable via transport_sas_phy_resetDan Williams1-1/+0
Execute the link-reset triggered by sas_phy_enable via transport_sas_phy_reset so that it can be managed by libata. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: defer SAS_TASK_NEED_DEV_RESET commands to libataDan Williams1-2/+2
lldds use the SAS_TASK_NEED_DEV_RESET interface to request that eh perform a reset. In the sata device case defer the commands that triggered the reset to libata-eh context so it can perform its pre and post reset management. In the sas_ata_post_internal() case the reset request is falling on deaf ears as the sas_task is immediately destroyed without any reset action. Since it is currently a nop, and likely superfluous given the conversion to new-style libata-eh, just drop the request. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: let libata handle command timeoutsDan Williams1-2/+20
libsas-eh if it successfully aborts an ata command will hide the timeout condition (AC_ERR_TIMEOUT) from libata. The command likely completes with the all-zero task->task_status it started with. Instead, interpret a TMF_RESP_FUNC_COMPLETE as the end of the sas_task but keep the scmd around for libata-eh to handle. Tested-by: Andrzej Jakowski <andrzej.jakowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: fix timeout vs completion raceDan Williams1-53/+51
Until we have told the lldd to forget a task a timed out operation can return from the hardware at any time. Since completion frees the task we need to make sure that no tasks run their normal completion handler once eh has decided to manage the task. Similar to ata_scsi_cmd_error_handler() freeze completions to let eh judge the outcome of the race. Task collector mode is problematic because it presents a situation where a task can be timed out and aborted before the lldd has even seen it. For this case we need to guarantee that a task that an lldd has been told to forget does not get queued after the lldd says "never seen it". With sas_scsi_timed_out we achieve this with the ->task_queue_flush mutex, rather than adding more time. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: prevent double completion of scmds from ehDan Williams1-28/+33
We invoke task->task_done() to free the task in the eh case, but at this point we are prepared for scsi_eh_flush_done_q() to finish off the scmd. Introduce sas_end_task() to capture the final response status from the lldd and free the task. Also take the opportunity to kill this warning. drivers/scsi/libsas/sas_scsi_host.c: In function ‘sas_end_task’: drivers/scsi/libsas/sas_scsi_host.c:102:3: warning: case value ‘2’ not in enumerated type ‘enum exec_status’ [-Wswitch] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: close error handling vs sas_ata_task_done() raceDan Williams1-44/+0
Since sas_ata does not implement ->freeze(), completions for scmds and internal commands can still arrive concurrent with ata_scsi_cmd_error_handler() and sas_ata_post_internal() respectively. By the time either of those is called libata has committed to completing the qc, and the ATA_PFLAG_FROZEN flag tells sas_ata_task_done() it has lost the race. In the sas_ata_post_internal() case we take on the additional responsibility of freeing the sas_task to close the race with sas_ata_task_done() freeing the the task while sas_ata_post_internal() is in the process of invoking ->lldd_abort_task(). Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: convert dev->gone to flagsDan Williams1-1/+1
In preparation for adding tracking of another device state "destroy". Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: remove ata_port.lock management duties from llddsDan Williams1-4/+2
Each libsas driver (mvsas, pm8001, and isci) has invented a different method for managing the ap->lock. The lock is held by the ata ->queuecommand() path. mvsas drops it prior to acquiring any internal locks which allows it to hold its internal lock across calls to task->task_done(). This capability is important as it is the only way the driver can flush task->task_done() instances to guarantee that it no longer has any in-flight references to a domain_device at ->lldd_dev_gone() time. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: fix domain_device leakDan Williams1-10/+6
Arrange for the deallocation of a struct domain_device object when it no longer has: 1/ any children 2/ references by any scsi_targets 3/ references by a lldd The comment about domain_device lifetime in Documentation/scsi/libsas.txt is stale as it appears mainline never had a version of a struct domain_device that was registered as a kobject. We now manage domain_device reference counts on behalf of external agents. Reviewed-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19[SCSI] libsas: kill sas_slave_destroyDan Williams1-9/+0
Per commit 3e4ec344 "libata: kill ATA_FLAG_DISABLED" needing to set ATA_DEV_NONE is a holdover from before libsas converted to the "new-style" ata-eh. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31scsi: Add export.h for EXPORT_SYMBOL/THIS_MODULE as requiredPaul Gortmaker1-0/+1
For the basic SCSI infrastructure files that are exporting symbols but not modules themselves, add in the basic export.h header file to allow the exports. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-02[SCSI] isci: atapi supportDan Williams1-1/+1
Based on original implementation from Jiangbi Liu and Maciej Trela. ATAPI transfers happen in two-to-three stages. The two stage atapi commands are those that include a dma data transfer. The data transfer portion of these operations is handled by the hardware packet-dma acceleration. The three-stage commands do not have a data transfer and are handled without hardware assistance in raw frame mode. stage1: transmit host-to-device fis to notify the device of an incoming atapi cdb. Upon reception of the pio-setup-fis repost the task_context to perform the dma transfer of the cdb+data (go to stage3), or repost the task_context to transmit the cdb as a raw frame (go to stage 2). stage2: wait for hardware notification of the cdb transmission and then go to stage 3. stage3: wait for the arrival of the terminating device-to-host fis and terminate the command. To keep the implementation simple we only support ATAPI packet-dma protocol (for commands with data) to avoid needing to handle the data transfer manually (like we do for SATA-PIO). This may affect compatibility for a small number of devices (see ATA_HORKAGE_ATAPI_MOD16_DMA). If the data-transfer underruns, or encounters an error the device-to-host fis is expected to arrive in the unsolicited frame queue to pass to libata for disposition. However, in the DONE_UNEXP_FIS (data underrun) case it appears we need to craft a response. In the DONE_REG_ERR case we do receive the UF and propagate it to libsas. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-02[SCSI] libsas: dynamic queue depthDan Williams1-21/+18
The queue-depth for libsas-attached devices initializes to 32 and can only be increased manually via sysfs to a max of 64, while mpt2sas attached devices initialize to 254 and dynamically float via the midlayer ->change_queue_depth interface. No performance regression was observed with this change on the isci driver. Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>