Age Commit message Author Files Lines
2021-02-02nvme-rdma: add clean action for failed reconnectionChao Leng1-2/+16
A crash happens when a failed reconnection is injected. If the reconnect fails after the I/O queues have been started, the queues are unquiesced and new requests continue to be delivered. The reconnection error handling path frees the queues directly without cancelling the suspended requests. A suspended request then times out and crashes because it uses the queue after it was freed. Sync the queues and cancel the suspended requests in the reconnection error handling path. Signed-off-by: Chao Leng <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme-core: add cancel tagset helpersChao Leng2-0/+22
Add nvme_cancel_tagset and nvme_cancel_admin_tagset for tear down and reconnection error handling. Signed-off-by: Chao Leng <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme-core: get rid of the extra spaceChaitanya Kulkarni1-1/+1
Remove the extra space in the nvme_free_cels() when calling xa_for_each loop which is not a common practice (except drivers/infiniband/core/ not sure why). Signed-off-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme: add tracing of zns commandsJohannes Thumshirn2-1/+39
When support for the NVMe ZNS commands was merged, tracing of these commands was omitted. Add nvme_cmd_zone_mgmt_send, nvme_cmd_zone_mgmt_recv as well as nvme_cmd_zone_append to the nvme driver's tracing facility. Signed-off-by: Johannes Thumshirn <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme: parse format nvm command details when tracingMichal Krakowiak1-0/+19
Add detailed parsing of format nvm admin command to make the trace log more consistent and human-readable. Signed-off-by: Michal Krakowiak <[email protected]> Acked-by: Dan Williams <[email protected]> Reviewed-by: Minwoo Im <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme: update enumerations for status codesMax Gurtovoy1-4/+20
All the updates are mentioned in the ratified NVMe 1.4 spec. Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Max Gurtovoy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvmet: add lba to sect conversion helpersChaitanya Kulkarni2-5/+13
In this preparation patch, we add helpers to convert lbas to sectors & sectors to lba. This is needed to eliminate code duplication in the ZBD backend. Use these helpers in the block device backend. Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
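The conversion described above amounts to a shift by the difference between the logical block size and the 512-byte sector size. A minimal user-space sketch follows, assuming a power-of-two block size of at least 512 bytes; `struct ns_sketch`, `lba_to_sect()` and `sect_to_lba()` are hypothetical names for illustration, not the helpers added by the patch.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, as used by the Linux block layer */

/* Hypothetical stand-in for a namespace; only the field needed here. */
struct ns_sketch {
	uint32_t blksize_shift; /* log2 of the logical block size, e.g. 12 for 4 KiB */
};

/* Convert a logical block address to a 512-byte sector number. */
static inline uint64_t lba_to_sect(const struct ns_sketch *ns, uint64_t lba)
{
	return lba << (ns->blksize_shift - SECTOR_SHIFT);
}

/* Convert a 512-byte sector number back to a logical block address. */
static inline uint64_t sect_to_lba(const struct ns_sketch *ns, uint64_t sect)
{
	return sect >> (ns->blksize_shift - SECTOR_SHIFT);
}
```

With 4 KiB blocks (`blksize_shift = 12`) each LBA spans eight sectors, so LBA 3 maps to sector 24 and back.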
2021-02-02nvmet: remove extra variable in identify nsChaitanya Kulkarni1-16/+15
We remove the extra local variable struct nvmet_ns in nvmet_execute_identify_ns() since req already has ns member that can be reused, this also eliminates the explicit call to nvmet_put_namespace() which is already present in the request completion path. Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvmet: remove extra variable in id-desclistChaitanya Kulkarni1-11/+9
We remove the extra local variable struct nvmet_ns in nvmet_execute_identify_desclist() since req already has ns member that can be reused, this also eliminates the explicit call to nvmet_put_namespace() which is already present in the request completion path. Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvmet: remove extra variable in smart log nsidChaitanya Kulkarni1-11/+9
We remove the extra local variable struct nvmet_ns in nvmet_get_smart_log_nsid() since req already has ns member that can be reused, this also eliminates the explicit call to nvmet_put_namespace() which is already present in the request completion path. Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme: refactor ns->ctrl by requestMinwoo Im1-3/+3
The current code in nvme_cleanup_cmd() does not need the namespace instance, only the controller instance. The controller instance can be retrieved through the namespace instance, but it is also directly accessible from the nvme_request instance of the request: ctrl = nvme_req(req)->ctrl; There is no need to reach the namespace instance from the request through the gendisk. Signed-off-by: Minwoo Im <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme-tcp: pass multipage bvec to request iov_iterSagi Grimberg1-4/+9
iov_iter uses the right helpers so we should be able to pass in a multipage bvec. Right now the iov_iter is initialized with more segments than it needs, which doesn't fail because the iov_iter is capped by byte count, but it is better to use a full multipage bvec iter. Signed-off-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme-tcp: get rid of unused helper functionSagi Grimberg1-5/+0
Signed-off-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme-tcp: fix wrong setting of request iov_iterSagi Grimberg1-5/+2
We might set the iov_iter direction wrong, which is harmless for this use-case, but get it right. Also this makes the code slightly cleaner. Signed-off-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvme: support command retry delay for admin commandMinwoo Im1-3/+2
The controller can request a delay before retrying a failed command by setting the Command Retry Delay (CRD) field in the Completion Queue Entry. Currently this feature is only applied to commands on the I/O queue, but not to commands on the admin queue. Retrieve the nvme_ctrl from the request so that no namespace is required and apply the feature to all commands. Signed-off-by: Minwoo Im <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
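As an illustration of how a CRD value can be turned into a delay, here is a user-space sketch. It assumes the status value has the phase bit already stripped, so that CRD occupies bits 12:11, and that the controller reports three Command Retry Delay Time values in units of 100 ms. `struct ctrl_sketch` and `retry_delay_ms()` are hypothetical names, not the driver's own.

```c
#include <stdint.h>

#define SC_CRD_MASK  0x1800u /* CRD bits in the status value (phase bit stripped) */
#define SC_CRD_SHIFT 11

/* Hypothetical stand-in for the controller; only the field needed here. */
struct ctrl_sketch {
	uint16_t crdt[3]; /* Command Retry Delay Times, in units of 100 ms */
};

/* Return the requested retry delay in milliseconds, or 0 if none was requested. */
static unsigned long retry_delay_ms(const struct ctrl_sketch *ctrl, uint16_t status)
{
	unsigned int crd = (status & SC_CRD_MASK) >> SC_CRD_SHIFT;

	if (crd == 0)
		return 0; /* CRD of zero means "no delay requested" */
	return (unsigned long)ctrl->crdt[crd - 1] * 100;
}
```

Because the lookup only needs the controller, nothing in it depends on a namespace, which is why the same calculation can serve admin commands as well.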
2021-02-02nvme: constify static attribute_group structsRikard Falkeborn3-4/+4
The only usage of these is to put their addresses in arrays of pointers to const attribute_groups. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvmet-fc: use RCU proctection for assoc_listLeonid Ravich1-43/+38
Searches of assoc_list are protected by rcu_read_lock when the list is not changed inline, following the RCU list rules. The queue array embedded in nvmet_fc_tgt_assoc is protected by rcu_read_lock according to the RCU dereference/assign rules. Queue and assoc objects are freed after a grace period by call_rcu. The tgtport lock is taken for changing assoc_list. Reviewed-by: Eldad Zinger <[email protected]> Reviewed-by: Elad Grupi <[email protected]> Reviewed-by: James Smart <[email protected]> Signed-off-by: Leonid Ravich <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvmet: Fix nvmet_is_port_enabled indentationIsrael Rukshin1-1/+1
Remove extra tab. Signed-off-by: Israel Rukshin <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-02-02nvmet: Use nvmet_is_port_enabled helper for pi_enableIsrael Rukshin1-3/+1
Remove code duplication. Signed-off-by: Israel Rukshin <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2021-01-31drbd: Avoid comma separated statementsJoe Perches1-2/+4
Use semicolons and braces. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26rsxx: remove redundant NULL checkYang Li1-2/+1
Fix below warnings reported by coccicheck: ./drivers/block/rsxx/dma.c:948:3-8: WARNING: NULL check before some freeing functions is not needed. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Yang Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26zram: fix NULL check before some freeing functions is not neededTian Tao1-2/+1
fixed the below warning: /drivers/block/zram/zram_drv.c:534:2-8: WARNING: NULL check before some freeing functions is not needed. Signed-off-by: Tian Tao <[email protected]> Acked-by: Minchan Kim <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26drbd: remove unused argument from drbd_request_prepare and __drbd_make_requestGuoqing Jiang3-10/+6
We can remove start_jif since it is not used by drbd_request_prepare, then remove it from __drbd_make_request further. Cc: Philipp Reisner <[email protected]> Cc: Lars Ellenberg <[email protected]> Cc: [email protected] Signed-off-by: Guoqing Jiang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26mtip32xx: prefer pcie_capability_read_word()Bjorn Helgaas1-8/+3
Replace pci_read_config_word() with pcie_capability_read_word(). pcie_capability_read_word() takes care of a few special cases when reading the PCIe capability. See 8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability"). Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26mtip32xx: use PCI #defines instead of numbersBjorn Helgaas1-2/+2
Use PCI #defines for PCIe Device Control register values instead of hard-coding bit positions. No functional change intended. Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26loop: scale loop device by introducing per device lockPavel Tatashin2-40/+54
Currently, loop device has only one global lock: loop_ctl_mutex. This becomes hot in scenarios where many loop devices are used. Scale it by introducing per-device lock: lo_mutex that protects modifications of all fields in struct loop_device. Keep loop_ctl_mutex to protect global data: loop_index_idr, loop_lookup, loop_add. The new lock ordering requirement is that loop_ctl_mutex must be taken before lo_mutex. Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: Tyler Hicks <[email protected]> Reviewed-by: Petr Vorel <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26bdev: Do not return EBUSY if bdev discard races with writeJan Kara1-6/+4
blkdev_fallocate() tries to detect whether a discard raced with an overlapping write by calling invalidate_inode_pages2_range(). However this check can give both false negatives (when writing using direct IO or when writeback already writes out the written pagecache range) and false positives (when write is not actually overlapping but ends in the same page when blocksize < pagesize). This actually causes issues for qemu which is getting confused by EBUSY errors. Fix the problem by removing this conflicting write detection since it is inherently racy and thus of little use anyway. Reported-by: Maxim Levitsky <[email protected]> CC: "Darrick J. Wong" <[email protected]> Link: https://lore.kernel.org/qemu-devel/[email protected] Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26block: inherit BIO_REMAPPED when cloning biosChristoph Hellwig3-0/+6
Cloned bios can be used on the same device, in which case we need to inherit the BIO_REMAPPED flag to avoid a double partition remap. When the cloned bios are used on another device, bio_set_dev will clear the flag. Fixes: 309dca309fc3 ("block: store a block_device pointer in struct bio") Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26bcache: use bio_set_dev to assign ->bi_bdevChristoph Hellwig1-1/+1
Always use the bio_set_dev helper to assign ->bi_bdev to make sure other state related to the device is uptodate. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26nvme: use bio_set_dev to assign ->bi_bdevChristoph Hellwig3-4/+4
Always use the bio_set_dev helper to assign ->bi_bdev to make sure other state related to the device is uptodate. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25bfq: bfq_check_waker() should be staticJens Axboe1-1/+2
It's only used in the same file, mark it appropriately static. Fixes: 71217df39dc6 ("block, bfq: make waker-queue detection more robust") Reported-by: kernel test robot <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25block, bfq: make waker-queue detection more robustPaolo Valente2-110/+108
In the presence of many parallel I/O flows, the detection of waker bfq_queues suffers from false positives. This commit addresses this issue by making the filtering of actual wakers more selective. In more detail, a candidate waker must be found to meet waker requirements three times before being promoted to actual waker. Tested-by: Jan Kara <[email protected]> Signed-off-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
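The "three detections before promotion" rule can be sketched as a small counter. This is an illustrative user-space model only: `struct queue_sketch`, `check_waker()` and the integer queue ids are hypothetical, and any further conditions the real scheduler applies (for example time limits on the detection window) are left out.

```c
#define WAKER_DETECTIONS_NEEDED 3

/* Hypothetical per-queue state; ids are plain integers, -1 means "none". */
struct queue_sketch {
	int candidate_id;   /* tentative waker currently being counted */
	int num_detections; /* how many times the candidate has qualified */
	int waker_id;       /* confirmed waker */
};

/* Record that queue 'id' met the waker requirements for queue 'q' once more. */
static void check_waker(struct queue_sketch *q, int id)
{
	if (q->waker_id == id)
		return; /* already the confirmed waker */

	if (q->candidate_id != id) {
		/* A different candidate shows up: restart the count for it. */
		q->candidate_id = id;
		q->num_detections = 1;
		return;
	}

	if (++q->num_detections >= WAKER_DETECTIONS_NEEDED) {
		/* Seen often enough: promote the candidate to actual waker. */
		q->waker_id = id;
		q->candidate_id = -1;
		q->num_detections = 0;
	}
}
```

A single coincidental wake-up therefore never promotes a queue; an interleaved detection of another queue resets the count, which is what makes the filter more selective under many parallel flows.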
2021-01-25block, bfq: save also injection state on queue mergingPaolo Valente2-0/+13
To prevent injection information from being lost on bfq_queue merging, also the amount of service that a bfq_queue receives must be saved and restored when the bfq_queue is merged and split, respectively. Tested-by: Jan Kara <[email protected]> Signed-off-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25block, bfq: save also weight-raised service on queue mergingPaolo Valente2-0/+3
To prevent weight-raising information from being lost on bfq_queue merging, also the amount of service that a bfq_queue receives must be saved and restored when the bfq_queue is merged and split, respectively. Tested-by: Jan Kara <[email protected]> Signed-off-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25block, bfq: fix switch back from soft-rt weitgh-raisingPaolo Valente1-2/+20
A bfq_queue may happen to be deemed as soft real-time while it is still enjoying interactive weight-raising. If this happens because of a false positive, then the bfq_queue is likely to lose its soft real-time status soon. Upon losing such a status, the bfq_queue must get back its interactive weight-raising, if its interactive period is not over yet. But this case is not handled. This commit corrects this error. Tested-by: Jan Kara <[email protected]> Signed-off-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25block, bfq: re-evaluate convenience of I/O plugging on rq arrivalsPaolo Valente1-5/+19
Upon an I/O-dispatch attempt, BFQ may detect that it was better to plug I/O dispatch, and to wait for a new request to arrive for the currently in-service queue. But the arrival of a new request for an empty bfq_queue, and thus the switch from idle to busy of the bfq_queue, may cause the scenario to change, and make plugging no longer needed for service guarantees, or more convenient for throughput. In this case, keeping I/O-dispatch plugged would certainly lower throughput. To address this issue, this commit makes such a check, and stops plugging I/O if it is better to stop plugging I/O. Tested-by: Jan Kara <[email protected]> Signed-off-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25block, bfq: replace mechanism for evaluating I/O intensityPaolo Valente2-27/+52
Some BFQ mechanisms make their decisions on a bfq_queue based also on whether the bfq_queue is I/O bound. In this respect, the current logic for evaluating whether a bfq_queue is I/O bound is rather rough. This commit replaces this logic with a more effective one. The new logic measures the percentage of time during which a bfq_queue is active, and marks the bfq_queue as I/O bound if this percentage is above a fixed threshold. Tested-by: Jan Kara <[email protected]> Signed-off-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25block: skip bio_check_eod for partition-remapped biosChristoph Hellwig1-5/+6
When an already remapped bio is resubmitted (e.g. by blk_queue_split), bio_check_eod will compare the remapped bi_sector against the size of the partition, leading to spurious I/O failures. Skip the EOD check in this case. Fixes: 309dca309fc3 ("block: store a block_device pointer in struct bio") Reported-by: Jens Axboe <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
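The failure mode can be modelled in a few lines. The sketch below is a user-space illustration under stated assumptions: `struct bio_sketch`, `beyond_eod()` and `submit_checks()` are hypothetical names, and the `remapped` field stands in for the BIO_REMAPPED flag. The end-of-device test compares a sector against the partition size, which is only meaningful while the sector is still partition-relative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified bio: start sector, length, and a "remapped" marker. */
struct bio_sketch {
	uint64_t sector;
	uint32_t nr_sectors;
	bool remapped; /* stands in for the BIO_REMAPPED flag */
};

/* True if [sector, sector + nr) does not fit in a device of maxsector sectors. */
static bool beyond_eod(uint64_t maxsector, uint64_t sector, uint32_t nr)
{
	return nr && maxsector && (nr > maxsector || sector > maxsector - nr);
}

/*
 * Check and remap a bio against a partition. Returns 0 on success, -1 on a
 * genuine end-of-device error. An already remapped bio skips both steps,
 * since its sector is now relative to the whole disk.
 */
static int submit_checks(struct bio_sketch *bio, uint64_t part_start, uint64_t part_size)
{
	if (bio->remapped)
		return 0;
	if (beyond_eod(part_size, bio->sector, bio->nr_sectors))
		return -1;
	bio->sector += part_start;
	bio->remapped = true;
	return 0;
}
```

For a partition starting at sector 1000 with 100 sectors, a bio at partition-relative sector 50 becomes sector 1050 after the remap. Running the end-of-device test on 1050 against a size of 100 would fail even though the I/O is valid, which is the spurious error the commit avoids.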
2021-01-25bio: don't copy bvec for direct IOPavel Begunkov3-39/+42
The block layer spends quite a while in blkdev_direct_IO() to copy and initialise bio's bvec. However, if we've already got a bvec in the input iterator it might be reused in some cases, i.e. when new ITER_BVEC_FLAG_FIXED flag is set. Simple tests show considerable performance boost, and it also reduces memory footprint. Suggested-by: Matthew Wilcox <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25bio: add a helper calculating nr segments to allocPavel Begunkov3-8/+18
Add a helper function calculating the number of bvec segments we need to allocate to construct a bio. It doesn't change anything functionally, but will be used to not duplicate special cases in the future. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25iov_iter: optimise bvec iov_iter_advance()Pavel Begunkov1-0/+19
iov_iter_advance() is heavily used, but implemented through generic means. For bvecs there is a specifically crafted function for that, so use bvec_iter_advance() instead, it's faster and slimmer. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25target/file: allocate the bvec array as part of struct target_core_file_cmdChristoph Hellwig1-14/+6
This saves one memory allocation, and ensures the bvecs aren't freed before the AIO completion. This will allow the lower level code to be optimized so that it can avoid allocating another bvec array. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25block/psi: remove PSI annotations from direct IOPavel Begunkov2-0/+8
Direct IO does not operate on the current working set of pages managed by the kernel, so it should not be accounted as memory stall to PSI infrastructure. The block layer and iomap direct IO use bio_iov_iter_get_pages() to build bios, and they are the only users of it, so to avoid PSI tracking for them clear out BIO_WORKINGSET flag. Do same for dio_bio_submit() because fs/direct_io constructs bios by hand directly calling bio_add_page(). Reported-by: Christoph Hellwig <[email protected]> Suggested-by: Christoph Hellwig <[email protected]> Suggested-by: Johannes Weiner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25bvec/iter: disallow zero-length segment bvecsPavel Begunkov3-2/+9
Zero-length bvec segments are allowed in general, but they are not handled by bio and further down the block layer, so they are filtered out. This inconsistency may be confusing and may prevent optimisations. As zero-length segments are useless and the places that were generating them are patched, declare them not allowed. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-25splice: don't generate zero-len segement bvecsPavel Begunkov1-3/+6
iter_file_splice_write() may spawn bvec segments with zero-length. In preparation for prohibiting them, filter out by hand at splice level. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
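Filtering out empty segments is a simple compaction. The following user-space sketch shows the idea on a plain array; `struct seg_sketch` and `drop_empty_segments()` are hypothetical, and the actual patch may well skip empty segments while the array is being built rather than compacting it afterwards.

```c
#include <stddef.h>

/* Hypothetical, simplified segment descriptor modelled on a bvec. */
struct seg_sketch {
	const void *page;
	unsigned int len;
	unsigned int offset;
};

/*
 * Compact 'segs' in place so that only segments with a non-zero length
 * remain, preserving their order. Returns the new segment count.
 */
static size_t drop_empty_segments(struct seg_sketch *segs, size_t n)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		if (segs[i].len)
			segs[out++] = segs[i];
	}
	return out;
}
```

Once every producer guarantees this property, consumers further down no longer need their own special case for empty segments.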
2021-01-24block: remove unnecessary argument from blk_execute_rqGuoqing Jiang24-35/+33
We can remove 'q' from blk_execute_rq as well after the previous change in blk_execute_rq_nowait. And more importantly it never really was needed to start with given that we can trivially derive it from struct request. Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Acked-by: Ulf Hansson <[email protected]> # for mmc Signed-off-by: Guoqing Jiang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-24block: remove unnecessary argument from blk_execute_rq_nowaitGuoqing Jiang11-21/+17
The 'q' is not used since commit a1ce35fa4985 ("block: remove dead elevator code"), also update the comment of the function. And more importantly it never really was needed to start with given that we can trivially derive it from struct request. Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Guoqing Jiang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-24bsg: free the request before return error codePan Bian1-1/+3
Free the request rq before returning error code. Fixes: 972248e9111e ("scsi: bsg-lib: handle bidi requests without block layer help") Signed-off-by: Pan Bian <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-24bcache: don't pass BIOSET_NEED_BVECS for the 'bio_set' embedded in 'cache_set'Ming Lei1-1/+1
This bioset is just for allocating bio only from bio_next_split, and it needn't bvecs, so remove the flag. Cc: [email protected] Cc: Coly Li <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ming Lei <[email protected]> Acked-by: Coly Li <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-24block: move three bvec helpers declaration into private helperMing Lei2-3/+4
bvec_alloc(), bvec_free() and bvec_nr_vecs() are only used inside block layer core functions, no need to declare them in public header. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ming Lei <[email protected]> Reviewed-by: Pavel Begunkov <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>