aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-03-27scsi: qedf: Add additional checks for io_req->sc_cmd validityChad Dupuis1-2/+37
- Check the validity of various pointers before processing. Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: fixup bit operationsHannes Reinecke2-1/+8
test_bit() is atomic, test_bit() || test_bit() is not. So protect consecutive bit tests with a lock to avoid races. Signed-off-by: Hannes Reinecke <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: fixup locking in qedf_restart_rport()Hannes Reinecke1-1/+7
fc_rport_create() needs to be called with disc_mutex held. And we should re-assign the 'rdata' pointer in case it got changed. Signed-off-by: Hannes Reinecke <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: missing kref_put in qedf_xmit()Hannes Reinecke1-1/+3
qedf_xmit() calls fc_rport_lookup(), but discards the returned rdata structure almost immediately without decreasing the refcount. This leads to a refcount leak and the rdata never to be freed. Signed-off-by: Hannes Reinecke <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Check for link state before processing LL2 packets and send ↵Saurav Kashyap2-6/+28
fipvlan retries - Check if link is UP before sending and processing any packets on wire. Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Add missing fc_disc_init call after allocating lportChad Dupuis1-0/+2
When receiving an unsolicited frame we could crash on a list traversal in fc_rport_lookup while searching the rport which is associated with our lport. Initialize the lport's discovery node after allocating the lport in __qedf_probe(). Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Correct the memory barriers in qedf_ring_doorbellAndrew Vasquez1-2/+10
- Correct memory barriers to make sure all cmnds are flushed. Signed-off-by: Andrew Vasquez <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Use a separate completion for cleanup commandsChad Dupuis2-4/+8
- If a TMF and cleanup are issued at the same time they could cause a call trace if issued against the same xid as the io_req->tm_done completion is used for both. - Set and clear cleanup bit in cleanup routine. Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Modify abort and tmf handler to handle edge condition and flushSaurav Kashyap4-67/+276
An I/O can be in any state when flush is called, it can be in abort, waiting for abort, RRQ send and waiting or TMF send. - HZ can be different on different architecture, correctly set abort timeout value. - Flush can complete the I/Os prematurely, handle refcount for aborted I/Os and for which RRQ is pending. - Differentiate LUN/TARGET reset, as cleanup needs to be send to firmware accordingly. - Add flush mutex to sync cleanup call from abort and flush routine. - Clear abort/outstanding bit on timeout. Signed-off-by: Shyam Sundar <[email protected]> Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Modify flush routine to handle all I/Os and TMFShyam Sundar4-26/+277
The purpose of flush routine is to cleanup I/Os to the firmware and complete them to scsi middle layer. This routine is invoked before connection is uploaded because of rport going away. - Don't process any I/Os, aborts, TMFs coming when flush in progress. - Add flags to handle cleanup and release of I/Os because flush can prematurely complete I/Os. - Original command can get completed to driver when cleanup for same is posted to firmware, handle this condition. - Modify flush to handle I/Os in all the states like abort, TMF, RRQ and timeouts. Signed-off-by: Shyam Sundar <[email protected]> Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Simplify s/g list mappingChad Dupuis3-94/+40
When mapping the pages from a scatter/gather list from the SCSI layer we only need to follow these rules: - Max SGEs for each I/O request is 256 - No size limit on each SGE - No need to split OS provided SGEs to 4K before sending to firmware. - Slow SGE is applicable only when: - There are > 8 SGEs and any middle SGE is less than a page size (4K) Make necessary changes so that driver follows these rules. Applicable only for Write requests (not for Read requests). No need to check SGE address alignment requirements (first, middle or last) before declaring slow SGE. Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Add missing return in qedf_post_io_req() in the fcport offload checkChad Dupuis1-0/+1
Fixes the following crash as the return was missing from the check if an fcport is offloaded. If we hit this code we continue to try to post an invalid task which can lead to the crash: [30259.616411] [0000:61:00.3]:[qedf_post_io_req:989]:3: Session not offloaded yet. [30259.616413] [0000:61:00.3]:[qedf_upload_connection:1340]:3: Uploading connection port_id=490020. [30259.623769] BUG: unable to handle kernel NULL pointer dereference at 0000000000000198 [30259.631645] IP: [<ffffffffc035b1ed>] qedf_init_task.isra.16+0x3d/0x450 [qedf] [30259.638816] PGD 0 [30259.640841] Oops: 0000 [#1] SMP [30259.644098] Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter vfat fat ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_umad dm_service_time skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel rpcrdma sunrpc rdma_ucm ib_uverbs lrw gf128mul ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi qedr(OE) glue_helper ablk_helper cryptd ib_core dm_round_robin joydev pcspkr ipmi_ssif ses enclosure ipmi_si ipmi_devintf ipmi_msghandler mei_me [30259.715529] mei sg hpilo hpwdt shpchp wmi lpc_ich acpi_power_meter dm_multipath ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic uas usb_storage mgag200 qedf(OE) i2c_algo_bit libfcoe drm_kms_helper libfc syscopyarea sysfillrect scsi_transport_fc qede(OE) sysimgblt fb_sys_fops ptp ttm pps_core drm qed(OE) smartpqi crct10dif_pclmul crct10dif_common crc32c_intel i2c_core scsi_transport_sas scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [30259.754237] CPU: 9 PID: 977 Comm: kdmwork-253:7 Kdump: loaded Tainted: G W OE ------------ 3.10.0-862.el7.x86_64 #1 [30259.765664] Hardware name: HPE Synergy 480 Gen10/Synergy 480 Gen10 Compute Module, BIOS I42 04/04/2018 [30259.775000] task: ffff8c801efd0000 ti: ffff8c801efd8000 task.ti: ffff8c801efd8000 [30259.782505] RIP: 0010:[<ffffffffc035b1ed>] [<ffffffffc035b1ed>] qedf_init_task.isra.16+0x3d/0x450 [qedf] [30259.792116] RSP: 0018:ffff8c801efdbbb0 EFLAGS: 00010046 [30259.797444] RAX: 0000000000000000 RBX: ffffa7f1450948d8 RCX: ffff8c7fe5bc40c8 [30259.804600] RDX: ffff8c800715b300 RSI: ffffa7f1450948d8 RDI: ffff8c80169c2480 [30259.811755] RBP: ffff8c801efdbc30 R08: 00000000000000ae R09: ffff8c800a314540 [30259.818911] R10: ffff8c7fe5bc40c8 R11: ffff8c801efdb8ae R12: 0000000000000000 [30259.826068] R13: ffff8c800715b300 R14: ffff8c80169c2480 R15: ffff8c8005da28e0 [30259.833223] FS: 0000000000000000(0000) GS:ffff8c803f840000(0000) knlGS:0000000000000000 [30259.841338] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [30259.847100] CR2: 0000000000000198 CR3: 000000081242e000 CR4: 00000000007607e0 [30259.854256] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [30259.861412] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [30259.868568] PKRU: 00000000 [30259.871278] Call Trace: [30259.873737] [<ffffffffc035c948>] qedf_post_io_req+0x148/0x680 [qedf] [30259.880201] [<ffffffffc035d070>] qedf_queuecommand+0x1f0/0x240 [qedf] [30259.886749] [<ffffffffa329b050>] scsi_dispatch_cmd+0xb0/0x240 [30259.892600] [<ffffffffa32a45bc>] scsi_request_fn+0x4cc/0x680 [30259.898364] [<ffffffffa3118ad9>] __blk_run_queue+0x39/0x50 [30259.903954] [<ffffffffa3114393>] __elv_add_request+0xd3/0x260 [30259.909805] [<ffffffffa311baf0>] blk_insert_cloned_request+0xf0/0x1b0 [30259.916358] [<ffffffffc010b622>] map_request+0x142/0x220 [dm_mod] [30259.922560] [<ffffffffc010b716>] map_tio_request+0x16/0x40 [dm_mod] [30259.928932] [<ffffffffa2ebb1f5>] kthread_worker_fn+0x85/0x180 [30259.934782] [<ffffffffa2ebb170>] ? kthread_stop+0xf0/0xf0 [30259.940284] [<ffffffffa2ebae31>] kthread+0xd1/0xe0 [30259.945176] [<ffffffffa2ebad60>] ? insert_kthread_work+0x40/0x40 [30259.951290] [<ffffffffa351f61d>] ret_from_fork_nospec_begin+0x7/0x21 [30259.957750] [<ffffffffa2ebad60>] ? insert_kthread_work+0x40/0x40 [30259.963860] Code: fe 41 55 49 89 d5 41 54 53 48 89 f3 48 83 ec 58 4c 8b 67 28 4c 8b 4e 18 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 4c 8b 7e 58 <49> 8b 84 24 98 01 00 00 48 8b 00 f6 80 31 01 00 00 10 0f 85 0b [30259.983372] RIP [<ffffffffc035b1ed>] qedf_init_task.isra.16+0x3d/0x450 [qedf] [30259.990630] RSP <ffff8c801efdbbb0> [30259.994127] CR2: 0000000000000198 Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Correct xid range overlap between offloaded requests and libfc ↵Chad Dupuis3-17/+7
requests There is currently an overlap where exchange IDs between what is used for offloaded commands and by libfc for ELS commands. Correct this so that exchange ID range is: Offloaded requests: 0 to 0xfff libfc requests: 0x1000 to 0xfffe Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qedf: Do not retry ELS request if qedf_alloc_cmd failsChad Dupuis1-12/+4
If we cannot allocate an ELS middlepath request, simply fail instead of trying to delay and then reallocate. This delay logic is causing soft lockup messages: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:7639] Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun devlink ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_service_time vfat fat rpcrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support qedr(OE) ib_core joydev ipmi_ssif pcspkr hpilo hpwdt sg ipmi_si ipmi_devintf ipmi_msghandler ioatdma shpchp lpc_ich wmi dca acpi_power_meter dm_multipath ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic qedf(OE) libfcoe mgag200 libfc i2c_algo_bit drm_kms_helper scsi_transport_fc qede(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qed(OE) drm crct10dif_pclmul e1000e crct10dif_common crc32c_intel scsi_tgt hpsa i2c_core ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 7639 Comm: kworker/2:1 Kdump: loaded Tainted: G OEL ------------ 3.10.0-861.el7.x86_64 #1 Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 07/21/2016 Workqueue: qedf_2_dpc qedf_handle_rrq [qedf] task: ffff959edd628fd0 ti: ffff959ed6f08000 task.ti: ffff959ed6f08000 RIP: 0010:[<ffffffff8355913a>] [<ffffffff8355913a>] delay_tsc+0x3a/0x60 RSP: 0018:ffff959ed6f0bd30 EFLAGS: 00000246 RAX: 000000008ef5f791 RBX: 5f646d635f666465 RCX: 0000025b8ededa2f RDX: 000000000000025b RSI: 0000000000000002 RDI: 0000000000217d1e RBP: ffff959ed6f0bd30 R08: ffffffffc079aae8 R09: 0000000000000200 R10: ffffffffc07952c6 R11: 0000000000000000 R12: 6c6c615f66646571 R13: ffff959ed6f0bcc8 R14: ffff959ed6f0bd08 R15: ffff959e00000028 FS: 0000000000000000(0000) GS:ffff959eff480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4117fa1eb0 CR3: 0000002039e66000 CR4: 00000000003607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffff8355907d>] __const_udelay+0x2d/0x30 [<ffffffffc079444a>] qedf_initiate_els+0x13a/0x450 [qedf] [<ffffffffc0794210>] ? qedf_srr_compl+0x2a0/0x2a0 [qedf] [<ffffffffc0795337>] qedf_send_rrq+0x127/0x230 [qedf] [<ffffffffc078ed55>] qedf_handle_rrq+0x15/0x20 [qedf] [<ffffffff832b2dff>] process_one_work+0x17f/0x440 [<ffffffff832b3ac6>] worker_thread+0x126/0x3c0 [<ffffffff832b39a0>] ? manage_workers.isra.24+0x2a0/0x2a0 [<ffffffff832bae31>] kthread+0xd1/0xe0 [<ffffffff832bad60>] ? insert_kthread_work+0x40/0x40 [<ffffffff8391f637>] ret_from_fork_nospec_begin+0x21/0x21 [<ffffffff832bad60>] ? insert_kthread_work+0x40/0x40 Signed-off-by: Chad Dupuis <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: MAINTAINERS: Add maintainer for MediaTek UFS driverStanley Chu1-0/+7
Signed-off-by: Stanley Chu <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: ibmvfc: Clean up transport eventsTyrel Datwyler2-4/+11
No change to functionality. Simply make transport event messages a little clearer, and rework CRQ format enums such that we have separate enums for INIT messages and XPORT events. [mkp: typo] Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: ibmvfc: Byte swap status and error codes when loggingTyrel Datwyler1-13/+15
Status and error codes are returned in big endian from the VIOS. The values are translated into a human readable format when logged, but the values are also logged. This patch byte swaps those values so that they are consistent between BE and LE platforms. Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: ibmvfc: Add failed PRLI to cmd_status lookup arrayTyrel Datwyler1-0/+1
The VIOS uses the SCSI_ERROR class to report PRLI failures. These errors are indicated with the combination of a IBMVFC_FC_SCSI_ERROR return status and 0x8000 error code. Add these codes to cmd_status[] with appropriate human readable error message. Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: ibmvfc: Remove "failed" from logged errorsTyrel Datwyler1-1/+1
The text of messages logged with ibmvfc_log_error() always contain the term "failed". In the case of cancelled commands during EH they are reported back by the VIOS using error codes. This can be confusing to somebody looking at these log messages as to whether a command was successfully cancelled. The following real log message for example it is unclear if the transaction was actaully cancelled. <6>sd 0:0:1:1: Cancelling outstanding commands. <3>sd 0:0:1:1: [sde] Command (28) failed: transaction cancelled (2:6) flags: 0 fcp_rsp: 0, resid=0, scsi_status: 0 Remove prefixing of "failed" to all error logged messages. The ibmvfc_log_error() function translates the returned error/status codes to a human readable message already. Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: qla2xxx: Simplify conditional check againNathan Chancellor1-2/+2
Clang warns when it sees a logical not on the left side of a conditional statement because it thinks the logical not should be applied to the whole statement, not just the left side: drivers/scsi/qla2xxx/qla_nx.c:3703:7: warning: logical not is only applied to the left hand side of this comparison [-Wlogical-not-parentheses] This particular instance was already fixed by commit 0bfe7d3cae58 ("scsi: qla2xxx: Simplify conditional check") upstream but it was reintroduced by commit 3695310e37b4 ("scsi: qla2xxx: Update flash read/write routine") in the 5.2/scsi-queue. Fixes: 3695310e37b4 ("scsi: qla2xxx: Update flash read/write routine") Link: https://github.com/ClangBuiltLinux/linux/issues/80 Signed-off-by: Nathan Chancellor <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: zfcp: reduce flood of fcrscn1 trace records on multi-element RSCNSteffen Maier1-4/+17
If an incoming ELS of type RSCN contains more than one element, zfcp suboptimally causes repeated erp trigger NOP trace records for each previously failed port. These could be ports that went away. It loops over each RSCN element, and for each of those in an inner loop over all zfcp_ports. The trigger to recover failed ports should be just the reception of some RSCN, no matter how many elements it has. So we can loop over failed ports separately, and only then loop over each RSCN element to handle the non-failed ports. The call chain was: zfcp_fc_incoming_rscn for (i = 1; i < no_entries; i++) _zfcp_fc_incoming_rscn list_for_each_entry(port, &adapter->port_list, list) if (masked port->d_id match) zfcp_fc_test_link if (!port->d_id) zfcp_erp_port_reopen "fcrscn1" <=== In order the reduce the "flooding" of the REC trace area in such cases, we factor out handling the failed ports to be outside of the entries loop: zfcp_fc_incoming_rscn if (no_entries > 1) <=== list_for_each_entry(port, &adapter->port_list, list) <=== if (!port->d_id) zfcp_erp_port_reopen "fcrscn1" <=== for (i = 1; i < no_entries; i++) _zfcp_fc_incoming_rscn list_for_each_entry(port, &adapter->port_list, list) if (masked port->d_id match) zfcp_fc_test_link Abbreviated example trace records before this code change: Tag : fcrscn1 WWPN : 0x500507630310d327 ERP want : 0x02 ERP need : 0x02 Tag : fcrscn1 WWPN : 0x500507630310d327 ERP want : 0x02 ERP need : 0x00 NOP => superfluous trace record The last trace entry repeats if there are more than 2 RSCN elements. Signed-off-by: Steffen Maier <[email protected]> Reviewed-by: Benjamin Block <[email protected]> Reviewed-by: Jens Remus <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devicesSteffen Maier3-0/+20
Suppose more than one non-NPIV FCP device is active on the same channel. Send I/O to storage and have some of the pending I/O run into a SCSI command timeout, e.g. due to bit errors on the fibre. Now the error situation stops. However, we saw FCP requests continue to timeout in the channel. The abort will be successful, but the subsequent TUR fails. Scsi_eh starts. The LUN reset fails. The target reset fails. The host reset only did an FCP device recovery. However, for non-NPIV FCP devices, this does not close and reopen ports on the SAN-side if other non-NPIV FCP device(s) share the same open ports. In order to resolve the continuing FCP request timeouts, we need to explicitly close and reopen ports on the SAN-side. This was missing since the beginning of zfcp in v2.6.0 history commit ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter."). Note: The FSF requests for forced port reopen could run into FSF request timeouts due to other reasons. This would trigger an internal FCP device recovery. Pending forced port reopen recoveries would get dismissed. So some ports might not get fully reopened during this host reset handler. However, subsequent I/O would trigger the above described escalation and eventually all ports would be forced reopen to resolve any continuing FCP request timeouts due to earlier bit errors. Signed-off-by: Steffen Maier <[email protected]> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: <[email protected]> #3.0+ Reviewed-by: Jens Remus <[email protected]> Reviewed-by: Benjamin Block <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_HostSteffen Maier1-0/+3
An already deleted SCSI device can exist on the Scsi_Host and remain there because something still holds a reference. A new SCSI device with the same H:C:T:L and FCP device, target port WWPN, and FCP LUN can be created. When we try to unblock an rport, we still find the deleted SCSI device and return early because the zfcp_scsi_dev of that SCSI device is not ZFCP_STATUS_COMMON_UNBLOCKED. Hence we miss to unblock the rport, even if the new proper SCSI device would be in good state. Therefore, skip deleted SCSI devices when iterating the sdevs of the shost. [cf. __scsi_device_lookup{_by_target}() or scsi_device_get()] The following abbreviated trace sequence can indicate such problem: Area : REC Tag : ersfs_3 LUN : 0x4045400300000000 WWPN : 0x50050763031bd327 LUN status : 0x40000000 not ZFCP_STATUS_COMMON_UNBLOCKED Ready count : n not incremented yet Running count : 0x00000000 ERP want : 0x01 ERP need : 0xc1 ZFCP_ERP_ACTION_NONE Area : REC Tag : ersfs_3 LUN : 0x4045400300000000 WWPN : 0x50050763031bd327 LUN status : 0x41000000 Ready count : n+1 Running count : 0x00000000 ERP want : 0x01 ERP need : 0x01 ... Area : REC Level : 4 only with increased trace level Tag : ertru_l LUN : 0x4045400300000000 WWPN : 0x50050763031bd327 LUN status : 0x40000000 Request ID : 0x0000000000000000 ERP status : 0x01800000 ERP step : 0x1000 ERP action : 0x01 ERP count : 0x00 NOT followed by a trace record with tag "scpaddy" for WWPN 0x50050763031bd327. Signed-off-by: Steffen Maier <[email protected]> Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery") Cc: <[email protected]> #2.6.32+ Reviewed-by: Jens Remus <[email protected]> Reviewed-by: Benjamin Block <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: sd: Quiesce warning if device does not report optimal I/O sizeMartin K. Petersen1-0/+3
Commit a83da8a4509d ("scsi: sd: Optimal I/O size should be a multiple of physical block size") split one conditional into several separate statements in an effort to provide more accurate warning messages when a device reports a nonsensical value. However, this reorganization accidentally dropped the precondition of the reported value being larger than zero. This lead to a warning getting emitted on devices that do not report an optimal I/O size at all. Remain silent if a device does not report an optimal I/O size. Fixes: a83da8a4509d ("scsi: sd: Optimal I/O size should be a multiple of physical block size") Cc: Randy Dunlap <[email protected]> Cc: <[email protected]> Reported-by: Hussam Al-Tayeb <[email protected]> Tested-by: Hussam Al-Tayeb <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: sd: Fix a race between closing an sd device and sd I/OBart Van Assche1-6/+13
The scsi_end_request() function calls scsi_cmd_to_driver() indirectly and hence needs the disk->private_data pointer. Avoid that that pointer is cleared before all affected I/O requests have finished. This patch avoids that the following crash occurs: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Call trace: scsi_mq_uninit_cmd+0x1c/0x30 scsi_end_request+0x7c/0x1b8 scsi_io_completion+0x464/0x668 scsi_finish_command+0xbc/0x160 scsi_eh_flush_done_q+0x10c/0x170 sas_scsi_recover_host+0x84c/0xa98 [libsas] scsi_error_handler+0x140/0x5b0 kthread+0x100/0x12c ret_from_fork+0x10/0x18 Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: Jason Yan <[email protected]> Cc: <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Reported-by: Jason Yan <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: core: Run queue when state is set to running after being blockedzhengbin1-0/+6
Use dd to test a SCSI device: 1. echo "blocked" >/sys/block/sda/device/state 2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10 3. echo "running" >/sys/block/sda/device/state dd should finish this work after step 3, but it hangs. After step2, the call chain is this: blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq prep_to_mq will return BLK_STS_RESOURCE, and scsi_queue_rq will transition it to BLK_STS_DEV_RESOURCE which means that driver can guarantee that IO dispatch will be triggered in future when the resource is available. Need to follow the rule if we set the device state to running. [mkp: tweaked commit description and code comment as suggested by Bart] Signed-off-by: zhengbin <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: sd: Inline sd_probe_part2()Bart Van Assche1-58/+43
Make sd_probe() easier to read by inlining sd_probe_part2(). This patch does not change any functionality. Cc: Lee Duncan <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: sd: Rely on the driver core for asynchronous probingBart Van Assche4-47/+5
As explained during the 2018 LSF/MM session about increasing SCSI disk probing concurrency, the problems with the current probing approach are as follows: - The driver core is unaware of asynchronous SCSI LUN probing. wait_for_device_probe() waits for all asynchronous probes except asynchronous SCSI disk probes. - There is unnecessary serialization between sd_probe() and sd_remove(). This can lead to a deadlock. Hence this patch that modifies the sd driver such that it uses the driver core framework for asynchronous probing. The async domains and get_device()/put_device() pairs that became superfluous due to this change are removed. This patch does not affect the time needed for loading the scsi_debug kernel module with parameters delay=0 and max_luns=256. This patch depends on commit ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver") that went upstream in kernel version v5.1-rc1. Cc: Lee Duncan <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Dan Williams <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: mpt3sas: fix indentation issueColin Ian King1-2/+2
There are a couple of statements that are incorrectly indented, fix these. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: libcxgbi: remove uninitialized variable lenColin Ian King1-3/+2
The variable len is not being inintialized and the uninitialized value is being returned. However, this return path is never reached because the default case in the switch statement returns -ENOSYS. Clean up the code by replacing the return -ENOSYS with a break for the default case and returning -ENOSYS at the end of the function. This allows len to be removed. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-27scsi: target: alua: fix the tg_pt_gps_counttangwenji1-2/+4
Reducing the count should be alua_tg_pt_gps_count instead of alua_tg_pt_gps_counter when free alua group. Signed-off-by: tangwenji <[email protected]> Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: qla4xxx: fix a potential NULL pointer dereferenceKangjie Lu1-0/+2
In case iscsi_lookup_endpoint fails, the fix returns -EINVAL to avoid NULL pointer dereference. Signed-off-by: Kangjie Lu <[email protected]> Acked-by: Manish Rangankar <[email protected]> Reviewed-by: Mukesh Ojha <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: gdth: Only call dma_free_coherent when buf is not NULL in ioc_generalNathan Chancellor1-2/+3
When building with -Wsometimes-uninitialized, Clang warns: drivers/scsi/gdth.c:3662:6: warning: variable 'paddr' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] Don't attempt to call dma_free_coherent when buf is NULL (meaning that we never called dma_alloc_coherent and initialized paddr), which avoids this warning. Link: https://github.com/ClangBuiltLinux/linux/issues/402 Signed-off-by: Nathan Chancellor <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: pm8001: remove set but not used variables 'param, sas_ha'YueHaibing1-4/+0
Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/pm8001/pm8001_hwi.c: In function 'mpi_smp_completion': drivers/scsi/pm8001/pm8001_hwi.c:2901:6: warning: variable 'param' set but not used [-Wunused-but-set-variable] drivers/scsi/pm8001/pm8001_hwi.c: In function 'pm8001_bytes_dmaed': drivers/scsi/pm8001/pm8001_hwi.c:3247:24: warning: variable 'sas_ha' set but not used [-Wunused-but-set-variable] They're never used since introduction, so can be removed. Signed-off-by: YueHaibing <[email protected]> Acked-by: Jack Wang <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: aacraid: Insure we don't access PCIe space during AER/EEHDave Carroll2-3/+8
There are a few windows during AER/EEH when we can access PCIe I/O mapped registers. This will harden the access to insure we do not allow PCIe access during errors Signed-off-by: Dave Carroll <[email protected]> Reviewed-by: Sagar Biradar <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: qla4xxx: avoid freeing unallocated dma memoryArnd Bergmann1-1/+1
Clang -Wuninitialized notices that on is_qla40XX we never allocate any DMA memory in get_fw_boot_info() but attempt to free it anyway: drivers/scsi/qla4xxx/ql4_os.c:5915:7: error: variable 'buf_dma' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (!(val & 0x07)) { ^~~~~~~~~~~~~ drivers/scsi/qla4xxx/ql4_os.c:5985:47: note: uninitialized use occurs here dma_free_coherent(&ha->pdev->dev, size, buf, buf_dma); ^~~~~~~ drivers/scsi/qla4xxx/ql4_os.c:5915:3: note: remove the 'if' if its condition is always true if (!(val & 0x07)) { ^~~~~~~~~~~~~~~~~~~ drivers/scsi/qla4xxx/ql4_os.c:5885:20: note: initialize the variable 'buf_dma' to silence this warning dma_addr_t buf_dma; ^ = 0 Skip the call to dma_free_coherent() here. Fixes: 2a991c215978 ("[SCSI] qla4xxx: Boot from SAN support for open-iscsi") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: lpfc: avoid uninitialized variable warningArnd Bergmann1-4/+4
clang -Wuninitialized incorrectly sees a variable being used without initialization: drivers/scsi/lpfc/lpfc_nvme.c:2102:37: error: variable 'localport' is uninitialized when used here [-Werror,-Wuninitialized] lport = (struct lpfc_nvme_lport *)localport->private; ^~~~~~~~~ drivers/scsi/lpfc/lpfc_nvme.c:2059:38: note: initialize the variable 'localport' to silence this warning struct nvme_fc_local_port *localport; ^ = NULL 1 error generated. This is clearly in dead code, as the condition leading up to it is always false when CONFIG_NVME_FC is disabled, and the variable is always initialized when nvme_fc_register_localport() got called successfully. Change the preprocessor conditional to the equivalent C construct, which makes the code more readable and gets rid of the warning. Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: James Smart <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: lpfc: change snprintf to scnprintf for possible overflowSilvio Cesare4-339/+349
Change snprintf to scnprintf. There are generally two cases where using snprintf causes problems. 1) Uses of size += snprintf(buf, SIZE - size, fmt, ...) In this case, if snprintf would have written more characters than what the buffer size (SIZE) is, then size will end up larger than SIZE. In later uses of snprintf, SIZE - size will result in a negative number, leading to problems. Note that size might already be too large by using size = snprintf before the code reaches a case of size += snprintf. 2) If size is ultimately used as a length parameter for a copy back to user space, then it will potentially allow for a buffer overflow and information disclosure when size is greater than SIZE. When the size is used to index the buffer directly, we can have memory corruption. This also means when size = snprintf... is used, it may also cause problems since size may become large. Copying to userspace is mitigated by the HARDENED_USERCOPY kernel configuration. The solution to these issues is to use scnprintf which returns the number of characters actually written to the buffer, so the size variable will never exceed SIZE. Signed-off-by: Silvio Cesare <[email protected]> Signed-off-by: Willy Tarreau <[email protected]> Signed-off-by: James Smart <[email protected]> Cc: Dick Kennedy <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Kees Cook <[email protected]> Cc: Will Deacon <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: ufs-mediatek: Add missing MODULE_* informationAnders Roxell1-0/+5
When building the ufs-mediatek module the following warning shows up: WARNING: modpost: missing MODULE_LICENSE() in drivers/scsi/ufs/ufs-mediatek.o Rework to add MODULE_LICENSE,MODULE_AUTHOR and MODULE_DESCRIPTION. Fixes: ddd90623ce26 ("scsi: ufs-mediatek: Add UFS support for Mediatek SoC chips") Signed-off-by: Anders Roxell <[email protected]> Reviewed-by: Stanley Chu <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: ufs-mediatek: Fix platform_no_drv_owner.cocci warningsYueHaibing1-1/+0
Remove .owner field if calls are used which set it automatically Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci Signed-off-by: YueHaibing <[email protected]> Reviewed-by: Stanley Chu <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-25scsi: mpt3sas: Fix kernel panic during expander resetSreekanth Reddy2-0/+18
During expander reset handling, the driver invokes kernel function scsi_host_find_tag() to obtain outstanding requests associated with the scsi host managed by the driver. Driver loops from tag value zero to hba queue depth to obtain the outstanding scmds. But when blk-mq is enabled, the block layer may return stale entry for one or more requests. This may lead to kernel panic if the returned value is inaccessible or the memory pointed by the returned value is reused. Reference of upstream discussion: https://patchwork.kernel.org/patch/10734933/ Instead of calling scsi_host_find_tag() API for each and every smid (smid is tag +1) from one to shost->can_queue, now driver will call this API (to obtain the outstanding scmd) only for those smid's which are outstanding at the driver level. Driver will determine whether this smid is outstanding at driver level by looking into it's corresponding MPI request frame, if its MPI request frame is empty, then it means that this smid is free and does not need to call scsi_host_find_tag() for it. By doing this, driver will invoke scsi_host_find_tag() for only those tags which are outstanding at the driver level. Driver will check whether particular MPI request frame is empty or not by looking into the "DevHandle" field. If this field is zero then it means that this MPI request is empty. For active MPI request DevHandle must be non-zero. Also driver will memset the MPI request frame once the corresponding scmd is processed (i.e. just before calling scmd->done function). Signed-off-by: Sreekanth Reddy <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: target: iscsi: Free conn_ops when zalloc_cpumask_var failedtangwenji1-3/+3
It should not free cpumask but free conn->conn_ops when zalloc_cpumask_var failed. Signed-off-by: tangwenji <[email protected]> Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: target: iscsi: Fix np_ip_proto and np_sock_type in iscsit_setup_nptangwenji1-3/+0
In the switch, np_ip_proto and np_sock_type set different values according to np_network_transport, and then uniformly assign values, so the previous values are overwritten. Signed-off-by: tangwenji <[email protected]> Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: target: fix unsigned comparision with less than zeroColin Ian King1-3/+6
Currently an error return is being assigned to an unsigned size_t varianle and then checked if the result is less than zero which will always be false. Fix this by making ret ssize_t rather than a size_t. Fixes: 0322913cab79 ("scsi: target: Add device product id and revision configfs attributes") Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: ibmvscsi: Fix empty event pool access during host removalTyrel Datwyler1-6/+16
The event pool used for queueing commands is destroyed fairly early in the ibmvscsi_remove() code path. Since, this happens prior to the call so scsi_remove_host() it is possible for further calls to queuecommand to be processed which manifest as a panic due to a NULL pointer dereference as seen here: PANIC: "Unable to handle kernel paging request for data at address 0x00000000" Context process backtrace: DSISR: 0000000042000000 ????Syscall Result: 0000000000000000 4 [c000000002cb3820] memcpy_power7 at c000000000064204 [Link Register] [c000000002cb3820] ibmvscsi_send_srp_event at d000000003ed14a4 5 [c000000002cb3920] ibmvscsi_send_srp_event at d000000003ed14a4 [ibmvscsi] ?(unreliable) 6 [c000000002cb39c0] ibmvscsi_queuecommand at d000000003ed2388 [ibmvscsi] 7 [c000000002cb3a70] scsi_dispatch_cmd at d00000000395c2d8 [scsi_mod] 8 [c000000002cb3af0] scsi_request_fn at d00000000395ef88 [scsi_mod] 9 [c000000002cb3be0] __blk_run_queue at c000000000429860 10 [c000000002cb3c10] blk_delay_work at c00000000042a0ec 11 [c000000002cb3c40] process_one_work at c0000000000dac30 12 [c000000002cb3cd0] worker_thread at c0000000000db110 13 [c000000002cb3d80] kthread at c0000000000e3378 14 [c000000002cb3e30] ret_from_kernel_thread at c00000000000982c The kernel buffer log is overfilled with this log: [11261.952732] ibmvscsi: found no event struct in pool! This patch reorders the operations during host teardown. Start by calling the SRP transport and Scsi_Host remove functions to flush any outstanding work and set the host offline. LLDD teardown follows including destruction of the event pool, freeing the Command Response Queue (CRQ), and unmapping any persistent buffers. The event pool destruction is protected by the scsi_host lock, and the pool is purged prior of any requests for which we never received a response. Finally, move the removal of the scsi host from our global list to the end so that the host is easily locatable for debugging purposes during teardown. Cc: <[email protected]> # v2.6.12+ Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modificaitonTyrel Datwyler1-0/+5
For each ibmvscsi host created during a probe or destroyed during a remove we either add or remove that host to/from the global ibmvscsi_head list. This runs the risk of concurrent modification. This patch adds a simple spinlock around the list modification calls to prevent concurrent updates as is done similarly in the ibmvfc driver and ipr driver. Fixes: 32d6e4b6e4ea ("scsi: ibmvscsi: add vscsi hosts to global list_head") Cc: <[email protected]> # v4.10+ Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: ufs-mediatek: Avoid using ret uninitialized in ufs_mtk_setup_clocksNathan Chancellor1-4/+1
When building with -Wsometimes-uninitialized, Clang warns: drivers/scsi/ufs/ufs-mediatek.c:112:7: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (on) ^~ drivers/scsi/ufs/ufs-mediatek.c:120:9: note: uninitialized use occurs here return ret; ^~~ drivers/scsi/ufs/ufs-mediatek.c:112:3: note: remove the 'if' if its condition is always true if (on) ^~~~~~~ drivers/scsi/ufs/ufs-mediatek.c:108:7: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (!on) ^~~ drivers/scsi/ufs/ufs-mediatek.c:120:9: note: uninitialized use occurs here return ret; ^~~ drivers/scsi/ufs/ufs-mediatek.c:108:3: note: remove the 'if' if its condition is always true if (!on) ^~~~~~~~ drivers/scsi/ufs/ufs-mediatek.c:96:9: note: initialize the variable 'ret' to silence this warning int ret; ^ = 0 2 warnings generated. Remove the default case and initialize ret to -EINVAL to properly fix this warning. Fixes: ddd90623ce26 ("scsi: ufs-mediatek: Add UFS support for Mediatek SoC chips") Link: https://github.com/ClangBuiltLinux/linux/issues/426 Signed-off-by: Nathan Chancellor <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Reviewed-by: Stanley Chu <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: ufs-mediatek: Make some symbols staticYueHaibing1-3/+3
Fix sparse warnings: drivers/scsi/ufs/ufs-mediatek.c:19:6: warning: symbol 'ufs_mtk_cfg_unipro_cg' was not declared. Should it be static? drivers/scsi/ufs/ufs-mediatek.c:55:5: warning: symbol 'ufs_mtk_bind_mphy' was not declared. Should it be static? drivers/scsi/ufs/ufs-mediatek.c:342:27: warning: symbol 'ufs_mtk_of_match' was not declared. Should it be static? Signed-off-by: YueHaibing <[email protected]> Reviewed-by: Mukesh Ojha <[email protected]> Reviewed-by: Stanley Chu <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: lpfc: Fixup eq_clr_intr referencesJames Smart2-4/+4
Declaring interrupt clear routines as inline is bogus as they are used as an indirect pointer. Remove the inline references. Signed-off-by: Dick Kennedy <[email protected]> Signed-off-by: James Smart <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2019-03-20scsi: lpfc: Fix build errorJames Bottomley2-7/+5
You can't declare a function inline in a header if it doesn't have a body available to the compiler. So realistically you either don't declare it inline or you make it a static inline in the header. I think the latter applies in this case, so this should be the fix Signed-off-by: James Bottomley <[email protected]> Acked-by: James Smart <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>