path: root/include/linux/blk-mq.h
Age | Commit message | Author | Files | Lines
2022-07-14 | block: Use the new blk_opf_t type | Bart Van Assche | 1 | -3/+3
Use the new blk_opf_t type for arguments and variables that represent request flags or a bitwise combination of a request operation and request flags. Rename the function arguments and also a structure member that hold a request operation and flags from 'rw' into 'opf'. This patch does not change any functionality. Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Johannes Thumshirn <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
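To illustrate the shape of this change, a sketch of a driver-side prototype before and after (my_dev_submit and struct my_dev are made-up names for illustration; blk_opf_t is the real bitwise type from blk_types.h):

  /* Before: operation and flags travel as a bare unsigned int named 'rw'. */
  void my_dev_submit(struct my_dev *dev, sector_t sector, unsigned int rw);

  /* After: the __bitwise-annotated blk_opf_t type lets sparse flag misuse,
   * and the parameter is renamed from 'rw' to 'opf'.
   */
  void my_dev_submit(struct my_dev *dev, sector_t sector, blk_opf_t opf);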
2022-07-14 | block: Change the type of req_op() and bio_op() into enum req_op | Bart Van Assche | 1 | -2/+4
Improve static type checking by changing the type of the value returned by req_op() and bio_op() from unsigned int into enum req_op. Insert 'default: break;' in switch statements on the enum req_op type to keep the compiler from warning about these switch statements. Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: Tim Waugh <[email protected]> Cc: Alasdair Kergon <[email protected]> Cc: Mike Snitzer <[email protected]> Cc: Mikulas Patocka <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
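To illustrate the switch pattern this enables, a sketch (the helper name is made up; req_op() and the REQ_OP_* values are the real interfaces):

  /* With req_op() returning enum req_op, the compiler can warn about
   * unhandled operations; 'default: break;' marks the intentional cases.
   */
  static bool my_op_touches_media(struct request *rq)
  {
          switch (req_op(rq)) {
          case REQ_OP_READ:
          case REQ_OP_WRITE:
          case REQ_OP_FLUSH:
                  return true;
          default:
                  break;
          }
          return false;
  }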
2022-07-06 | block: move zone related fields to struct gendisk | Christoph Hellwig | 1 | -4/+4
Move the zone related fields that are currently stored in struct request_queue to struct gendisk, as these are part of the high-level block layer API and are only used for non-passthrough I/O that requires the gendisk. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-07-06 | blk-mq: Drop 'reserved' arg of busy_tag_iter_fn | John Garry | 1 | -1/+1
We no longer use the 'reserved' arg in busy_tag_iter_fn for any iter function so it may be dropped. Signed-off-by: John Garry <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> #nvme Reviewed-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
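For reference, a sketch of how the callback prototype changes (reconstructed from the description, not copied from the patch):

  /* Before: every iter callback had to accept a 'reserved' flag. */
  typedef bool (busy_tag_iter_fn)(struct request *rq, void *priv, bool reserved);

  /* After: callbacks that care can test the request itself instead. */
  typedef bool (busy_tag_iter_fn)(struct request *rq, void *priv);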
2022-07-06 | blk-mq: Drop blk_mq_ops.timeout 'reserved' arg | John Garry | 1 | -1/+1
With new API blk_mq_is_reserved_rq() we can tell if a request is from the reserved pool, so stop passing 'reserved' arg. There is actually only a single user of that arg for all the callback implementations, which can use blk_mq_is_reserved_rq() instead. This will also allow us to stop passing the same 'reserved' around the blk-mq iter functions next. Signed-off-by: John Garry <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Acked-by: Ulf Hansson <[email protected]> # For MMC Reviewed-by: Martin K. Petersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-07-06 | blk-mq: Add a flag for reserved requests | John Garry | 1 | -0/+6
Add a flag for reserved requests so that drivers may know this for any special handling. Signed-off-by: John Garry <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
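A reconstruction of what such a helper looks like; treat the RQF_RESV flag name as an assumption here:

  /* Set at allocation time for requests taken from the reserved tag pool. */
  static inline bool blk_mq_is_reserved_rq(struct request *rq)
  {
          return rq->rq_flags & RQF_RESV;
  }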
2022-06-28 | blk-mq: cleanup disk sysfs registration | Christoph Hellwig | 1 | -1/+0
Pass a gendisk to the sysfs register/unregister functions and give them descriptive names. Also move the unregistration helper next to the one doing the registration. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-06-28 | block: simplify disk shutdown | Christoph Hellwig | 1 | -0/+3
Set the queue dying flag and call blk_mq_exit_queue from del_gendisk for all disks that do not have separately allocated queues, and thus remove the need to call blk_cleanup_queue for them. Rename blk_cleanup_queue to blk_mq_destroy_queue to make it clear that this function is intended only for separately allocated blk-mq queues. This saves an extra queue freeze for devices without a separately allocated queue. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-05-28 | blk-mq: remove the done argument to blk_execute_rq_nowait | Christoph Hellwig | 1 | -2/+1
Let the caller set it together with the end_io_data instead of passing a pointless argument. Note that the target code did in fact already set it and then just overrode it again by calling blk_execute_rq_nowait. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
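A sketch of the resulting call pattern (my_end_io and my_data are illustrative; the two-argument blk_execute_rq_nowait() form follows from the description above):

  /* The caller wires up the completion itself before submitting. */
  rq->end_io_data = my_data;
  rq->end_io = my_end_io;
  blk_execute_rq_nowait(rq, false /* at_head */);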
2022-05-08 | blk-mq: remove the error_count from struct request | Willy Tarreau | 1 | -1/+0
The last two users were floppy.c and ataflop.c respectively; it was verified that no other driver makes use of this, so let's remove it. Suggested-by: Linus Torvalds <[email protected]> Cc: Minh Yuan <[email protected]> Cc: Denis Efremov <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Willy Tarreau <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-08 | blk-mq: manage hctx map via xarray | Ming Lei | 1 | -2/+1
First, the code becomes cleaner by switching from a plain array to an xarray. Second, a use-after-free on q->queue_hw_ctx can be fixed, because queue_for_each_hw_ctx() may be run while an update of nr_hw_queues is in progress. With this patch, q->hctx_table is defined as an xarray, and this structure shares the same lifetime as the request queue, so queue_for_each_hw_ctx() can use q->hctx_table to look up hctx reliably. Reported-by: Yu Kuai <[email protected]> Signed-off-by: Ming Lei <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] [axboe: fix blk_mq_hw_ctx forward declaration] Signed-off-by: Jens Axboe <[email protected]>
2022-02-16 | blk-mq: remove the request_queue argument to blk_insert_cloned_request | Christoph Hellwig | 1 | -2/+1
The request must be submitted to the queue it was allocated for, so remove the extra request_queue argument. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Mike Snitzer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-01-09 | block: fix old-style declaration | Yang Li | 1 | -1/+1
Move the 'inline' keyword to the front of 'void'. This removes a warning found by clang (make W=1 LLVM=1): ./include/linux/blk-mq.h:259:1: warning: ‘inline’ is not at beginning of declaration Reported-by: Abaci Robot <[email protected]> Signed-off-by: Yang Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-01-05 | block: introduce rq_list_move | Keith Busch | 1 | -0/+17
When iterating a list, a particular request may need to be moved for special handling. Provide a helper function to achieve that so drivers don't need to reimplement rqlist manipulation. Signed-off-by: Keith Busch <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
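A reconstruction of the helper, assuming the singly linked rq_next list that the rq_list macros (see the entries below) operate on:

  /* Unlink @rq from @src (where @prev precedes it, or NULL if @rq is the
   * head) and push it onto @dst.
   */
  static inline void rq_list_move(struct request **src, struct request **dst,
                                  struct request *rq, struct request *prev)
  {
          if (prev)
                  prev->rq_next = rq->rq_next;
          else
                  *src = rq->rq_next;
          rq_list_add(dst, rq);
  }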
2022-01-05 | block: introduce rq_list_for_each_safe macro | Keith Busch | 1 | -0/+4
While iterating a list, a particular request may need to be removed for special handling. Provide an iterator that can safely handle that. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-01-05 | block: move rq_list macros to blk-mq.h | Keith Busch | 1 | -0/+29
Move the request list macros to the header file that defines the struct they operate on. Signed-off-by: Keith Busch <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
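For context, the request list is a singly linked list threaded through rq->rq_next; a sketch of two of the moved macros (treat the exact definitions as an approximation):

  /* Push a request onto the front of a list. */
  #define rq_list_add(listptr, rq)        do {    \
          (rq)->rq_next = *(listptr);             \
          *(listptr) = rq;                        \
  } while (0)

  /* Walk the list; 'pos' visits each request in turn. */
  #define rq_list_for_each(listptr, pos)          \
          for (pos = rq_list_peek((listptr)); pos; pos = rq_list_next(pos))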
2021-12-16 | block: add mq_ops->queue_rqs hook | Jens Axboe | 1 | -0/+8
If we have a list of requests in our plug list, send it to the driver in one go, if possible. The driver must set mq_ops->queue_rqs() to support this; if not, the usual one-by-one path is used. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
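A sketch of how a driver opts in; my_queue_rq, my_queue_rqs and my_issue_one are illustrative names, and the ->queue_rqs prototype taking a struct request ** list follows the description above:

  static blk_status_t my_queue_rq(struct blk_mq_hw_ctx *hctx,
                                  const struct blk_mq_queue_data *bd);
  static void my_issue_one(struct request *rq);

  /* Issue every request queued on the plug list in one call. */
  static void my_queue_rqs(struct request **rqlist)
  {
          struct request *rq;

          while ((rq = rq_list_pop(rqlist)))
                  my_issue_one(rq);
  }

  static const struct blk_mq_ops my_mq_ops = {
          .queue_rq       = my_queue_rq,
          .queue_rqs      = my_queue_rqs,
  };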
2021-12-06 | blk-mq: Delete busy_iter_fn | John Garry | 1 | -1/+0
Typedefs busy_iter_fn and busy_tag_iter_fn are now identical, so delete busy_iter_fn to reduce duplication. It would be nicer to delete busy_tag_iter_fn, as the name busy_iter_fn is less specific. However busy_tag_iter_fn is used in many different parts of the tree, unlike busy_iter_fn which is just used in block/, so just take the straightforward path now, so that we can rename it treewide later. Signed-off-by: John Garry <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Tested-by: Kashyap Desai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-12-06 | blk-mq: Drop busy_iter_fn blk_mq_hw_ctx argument | John Garry | 1 | -2/+1
The only user of the busy_iter_fn blk_mq_hw_ctx argument is blk_mq_rq_inflight(). Function blk_mq_rq_inflight() uses the hctx to find the associated request queue to match against the request. However this same check is already done in the caller, bt_iter(), so drop this check. With that change there are no more users of the busy_iter_fn blk_mq_hw_ctx argument, so drop the argument. Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: John Garry <[email protected]> Reviewed-by: Ming Lei <[email protected]> Tested-by: Kashyap Desai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-12-03 | blk-mq: move srcu from blk_mq_hw_ctx to request_queue | Ming Lei | 1 | -8/+0
In case of BLK_MQ_F_BLOCKING, per-hctx srcu is used to protect the dispatch critical area. However, this srcu instance stays at the end of the hctx, and it often takes a standalone cacheline, which is often cold. Inside srcu_read_lock() and srcu_read_unlock(), WRITE is always done on the indirect percpu variable, which is allocated from the heap instead of being embedded, and srcu->srcu_idx is read only in srcu_read_lock(). It doesn't matter whether the srcu structure stays in the hctx or in the request queue. So switch to per-request-queue srcu for protecting dispatch; this simplifies quiesce a lot, not to mention that quiesce is always done request-queue wide. Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-12-03 | block: switch to atomic_t for request references | Jens Axboe | 1 | -1/+1
refcount_t is not as expensive as it used to be, but it's still more expensive than the io_uring method of using atomic_t and just checking for potential over/underflow. This borrows that same implementation, which in turn is based on the mm implementation from Linus. Reviewed-by: Keith Busch <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-11-29 | block: remove the gendisk argument to blk_execute_rq | Christoph Hellwig | 1 | -4/+3
Remove the gendisk argument to blk_execute_rq and blk_execute_rq_nowait given that it is unused now. Also convert the boolean at_head parameter to actually use the bool type while touching the prototype. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-11-29 | block: remove the ->rq_disk field in struct request | Christoph Hellwig | 1 | -4/+0
Just use the disk attached to the request_queue instead. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-11-29 | blk-mq: Add blk_mq_complete_request_direct() | Sebastian Andrzej Siewior | 1 | -0/+11
Add blk_mq_complete_request_direct(), which completes the block request directly instead of deferring it to softirq for single queue devices. This is useful for devices which complete the requests in preemptible context, where raising a softirq means scheduling ksoftirqd. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
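A reconstruction of the added helper; the MQ_RQ_COMPLETE state constant is assumed from the surrounding blk-mq code:

  static inline void blk_mq_complete_request_direct(struct request *rq,
                  void (*complete)(struct request *rq))
  {
          /* Mark the request complete and invoke the handler in the caller's
           * (preemptible) context instead of bouncing through softirq.
           */
          WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
          complete(rq);
  }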
2021-11-29 | block: remove rq_flush_dcache_pages | Christoph Hellwig | 1 | -10/+0
This function is trivial, and flush_dcache_page is always defined, so just open code it in the 2.5 callers. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
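The open-coded replacement in a caller looks like this sketch, using the existing rq_for_each_segment() and flush_dcache_page() interfaces:

  struct req_iterator iter;
  struct bio_vec bvec;

  /* Flush the data cache for every page the request touches. */
  rq_for_each_segment(bvec, rq, iter)
          flush_dcache_page(bvec.bv_page);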
2021-11-29 | block: move blk_rq_err_bytes to scsi | Christoph Hellwig | 1 | -3/+0
blk_rq_err_bytes is only used by the scsi midlayer, so move it there. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-11-09 | Merge tag 'for-5.16/block-2021-11-09' of git://git.kernel.dk/linux-block | Linus Torvalds | 1 | -0/+1
Pull block fixes from Jens Axboe:
 - Set of fixes for the batched tag allocation (Ming, me)
 - add_disk() error handling fix (Luis)
 - Nested queue quiesce fixes (Ming)
 - Shared tags init error handling fix (Ye)
 - Misc cleanups (Jean, Ming, me)
* tag 'for-5.16/block-2021-11-09' of git://git.kernel.dk/linux-block:
  nvme: wait until quiesce is done
  scsi: make sure that request queue queiesce and unquiesce balanced
  scsi: avoid to quiesce sdev->request_queue two times
  blk-mq: add one API for waiting until quiesce is done
  blk-mq: don't free tags if the tag_set is used by other device in queue initialztion
  block: fix device_add_disk() kobject_create_and_add() error handling
  block: ensure cached plug request matches the current queue
  block: move queue enter logic into blk_mq_submit_bio()
  block: make bio_queue_enter() fast-path available inline
  block: split request allocation components into helpers
  block: have plug stored requests hold references to the queue
  blk-mq: update hctx->nr_active in blk_mq_end_request_batch()
  blk-mq: add RQF_ELV debug entry
  blk-mq: only try to run plug merge if request has same queue with incoming bio
  block: move RQF_ELV setting into allocators
  dm: don't stop request queue after the dm device is suspended
  block: replace always false argument with 'false'
  block: assign correct tag before doing prefetch of request
  blk-mq: fix redundant check of !e expression
2021-11-09 | blk-mq: add one API for waiting until quiesce is done | Ming Lei | 1 | -0/+1
Some drivers (NVMe, SCSI) need to call quiesce and unquiesce in pairs, but it is hard to switch to this style, so these drivers need one atomic flag for helping to balance quiesce and unquiesce. When quiesce is in progress, the driver still needs to wait until the quiesce is done, so add the blk_mq_wait_quiesce_done() API for these drivers. Signed-off-by: Ming Lei <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
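A sketch of the intended pairing (the request_queue pointer q is illustrative):

  /* Start quiesce without waiting, then explicitly wait for it to finish. */
  blk_mq_quiesce_queue_nowait(q);
  blk_mq_wait_quiesce_done(q);

  /* ... queue is quiesced here ... */

  blk_mq_unquiesce_queue(q);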
2021-10-29 | block: remove blk_{get,put}_request | Christoph Hellwig | 1 | -3/+0
These are now pointless wrappers around blk_mq_{alloc,free}_request, so remove them. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-22 | block: remove the initialize_rq_fn blk_mq_ops method | Christoph Hellwig | 1 | -5/+0
Entirely unused now. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-21 | blk-crypto: rename blk_keyslot_manager to blk_crypto_profile | Eric Biggers | 1 | -1/+1
blk_keyslot_manager is misnamed because it doesn't necessarily manage keyslots. It actually does several different things: - Contains the crypto capabilities of the device. - Provides functions to control the inline encryption hardware. Originally these were just for programming/evicting keyslots; however, new functionality (hardware-wrapped keys) will require new functions here which are unrelated to keyslots. Moreover, device-mapper devices already (ab)use "keyslot_evict" to pass key eviction requests to their underlying devices even though device-mapper devices don't have any keyslots themselves (so it really should be "evict_key", not "keyslot_evict"). - Sometimes (but not always!) it manages keyslots. Originally it always did, but device-mapper devices don't have keyslots themselves, so they use a "passthrough keyslot manager" which doesn't actually manage keyslots. This hack works, but the terminology is unnatural. Also, some hardware doesn't have keyslots and thus also uses a "passthrough keyslot manager" (support for such hardware is yet to be upstreamed, but it will happen eventually). Let's stop having keyslot managers which don't actually manage keyslots. Instead, rename blk_keyslot_manager to blk_crypto_profile. This is a fairly big change, since for consistency it also has to update keyslot manager-related function names, variable names, and comments -- not just the actual struct name. However it's still a fairly straightforward change, as it doesn't change any actual functionality. Acked-by: Ulf Hansson <[email protected]> # For MMC Reviewed-by: Mike Snitzer <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Eric Biggers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-20 | blk-mq: move blk_mq_flush_plug_list to block/blk-mq.h | Christoph Hellwig | 1 | -2/+0
This helper is internal to the block layer. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-19 | block: move blk_mq_tag_to_rq() inline | Jens Axboe | 1 | -1/+35
This is in the fast path of driver issue or completion, and it's a single array index operation. Move it inline to avoid a function call for it. This does mean making struct blk_mq_tags block layer public, but there's not really much in there. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
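A reconstruction of the now-inline helper, a bounds-checked array lookup in struct blk_mq_tags:

  static inline struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags,
                                                 unsigned int tag)
  {
          /* A single array index; out-of-range tags map to NULL. */
          if (tag < tags->nr_tags)
                  return tags->rqs[tag];
          return NULL;
  }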
2021-10-18 | block: add support for blk_mq_end_request_batch() | Jens Axboe | 1 | -0/+29
Instead of calling blk_mq_end_request() on a single request, add a helper that takes the new struct io_comp_batch and completes any request stored in there. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | block: add a struct io_comp_batch argument to fops->iopoll() | Jens Axboe | 1 | -1/+1
struct io_comp_batch contains a list head and a completion handler, which will allow batches of IO to be completed more efficiently. For now there are no functional changes in this patch; we just define the io_comp_batch structure and add the argument to the file_operations iopoll handler. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
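A sketch of the structure and the updated hook, reconstructed from the description (treat the exact field names as assumptions):

  struct io_comp_batch {
          struct request *req_list;       /* completed requests, linked via rq_next */
          bool need_ts;                   /* whether completion timestamps are needed */
          void (*complete)(struct io_comp_batch *);
  };

  /* fops->iopoll() now receives the batch so completions can be deferred
   * into it:
   *      int (*iopoll)(struct kiocb *kiocb, struct io_comp_batch *iob,
   *                    unsigned int flags);
   */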
2021-10-18 | block: remove some blk_mq_hw_ctx debugfs entries | Jens Axboe | 1 | -10/+0
Just like the blk_mq_ctx counterparts, we've got a bunch of counters in here that are only for debugfs and are of questionable value. They are: - dispatched, index of how many requests were dispatched in one go - poll_{considered,invoked,success}, which track poll success rates. We're confident in the iopoll implementation at this point, so don't bother tracking these. As a bonus, this shrinks each hardware queue from 576 bytes to 512 bytes, dropping a whole cacheline. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | block: store elevator state in request | Jens Axboe | 1 | -0/+2
Add an rq private RQF_ELV flag, which tells the block layer that this request was initialized on a queue that has an IO scheduler attached. This allows for faster checking in the fast path, rather than having to dereference rq->q later on. Elevator switching does a full quiesce of the queue before detaching an IO scheduler, so it's safe to cache this in the request itself. Signed-off-by: Jens Axboe <[email protected]>
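A sketch of the fast-path test this enables (the wrapper name is made up; RQF_ELV is the flag described above):

  static inline bool my_rq_uses_sched(struct request *rq)
  {
          /* One flag test on the request itself ... */
          return rq->rq_flags & RQF_ELV;
          /* ... instead of dereferencing rq->q->elevator in the hot path. */
  }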
2021-10-18 | block: improve layout of struct request | Jens Axboe | 1 | -44/+46
It's been a while since this was analyzed, move some members around to better flow with the use case. Initial state up top, and queued state after that. This improves my peak case by about 1.5%, from 7750K to 7900K IOPS. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | block: switch polling to be bio based | Christoph Hellwig | 1 | -13/+2
Replace the blk_poll interface that requires the caller to keep a queue and cookie from the submissions with polling based on the bio. Polling for the bio itself leads to a few advantages: - the cookie construction can be made entirely private in blk-mq.c - the caller does not need to remember the request_queue and cookie separately and thus sidesteps their lifetime issues - keeping the device and the cookie inside the bio allows to trivially support polling BIOs remapping by stacking drivers - a lot of code to propagate the cookie back up the submission path can be removed entirely. Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Mark Wunderlich <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | block: fold bio_cur_bytes into blk_rq_cur_bytes | Christoph Hellwig | 1 | -1/+5
Fold bio_cur_bytes into the only caller. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | block: pre-allocate requests if plug is started and is a batch | Jens Axboe | 1 | -1/+4
The caller typically has a good (or even exact) idea of how many requests it needs to submit. We can make the request/tag allocation a lot more efficient if we just allocate N requests/tags upfront when we queue the first bio from the batch. Provide a new plug start helper that allows the caller to specify how many IOs are expected. This sets plug->nr_ios, and we can use that for smarter request allocation. The plug provides a holding spot for requests, and request allocation will check it before calling into the normal request allocation path. When blk_finish_plug() is called, check if there are unused requests and free them. This should not happen in normal operation. The exception is if we get merging; then we may be left with requests that need freeing when done. This raises the per-core performance on my setup from ~5.8M to ~6.1M IOPS. Signed-off-by: Jens Axboe <[email protected]>
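A sketch of the batched submission pattern; blk_start_plug_nr_ios() is the new helper described above, and nr_ios is illustrative:

  struct blk_plug plug;

  /* Tell the plug how many requests this batch expects to allocate. */
  blk_start_plug_nr_ios(&plug, nr_ios);

  /* ... submit nr_ios bios here via submit_bio() ... */

  /* Any pre-allocated but unused requests are freed on finish. */
  blk_finish_plug(&plug);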
2021-10-18 | blk-mq: Change shared sbitmap naming to shared tags | John Garry | 1 | -4/+4
Now that shared sbitmap support really means shared tags, rename symbols to match that. Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | blk-mq: Use shared tags for shared sbitmap support | John Garry | 1 | -8/+7
Currently we use separate sbitmap pairs and an active_queues atomic_t for shared sbitmap support. However a full set of static requests is used per HW queue, which is quite wasteful, considering that the total number of requests usable at any given time across all HW queues is limited by the shared sbitmap depth. As such, it is considerably more memory efficient in the case of shared sbitmap to allocate a set of static rqs per tag set or request queue, and not per HW queue. So replace the sbitmap pairs and active_queues atomic_t with shared tags per tagset and request queue, which will hold a set of shared static rqs. Since there is now no valid HW queue index to be passed to the blk_mq_ops .init and .exit_request callbacks, pass an invalid index token. This changes the semantics of the APIs, such that the callback would need to validate the HW queue index before using it. Currently no user of shared sbitmap actually uses the HW queue index (as would be expected). Signed-off-by: John Garry <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ | John Garry | 1 | -1/+1
It is a bit confusing that there is BLKDEV_MAX_RQ and MAX_SCHED_RQ, as the name BLKDEV_MAX_RQ would imply the max requests always, which it is not. Rename BLKDEV_MAX_RQ to BLKDEV_DEFAULT_RQ, matching its usage - that being the default number of requests assigned when allocating a request queue. Signed-off-by: John Garry <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-10-18 | block: move struct request to blk-mq.h | Christoph Hellwig | 1 | -0/+465
struct request is only used by blk-mq drivers, so move it and all related declarations to blk-mq.h. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-08-23 | block: cleanup the lockdep handling in *alloc_disk | Christoph Hellwig | 1 | -7/+3
Pass the lockdep name to the low-level __blk_alloc_disk helper and hardcode the name for it, given that the number of minors or node_id are not very useful information. While this passes a pointless argument for non-lockdep builds, that is not really an issue, as disk allocation is a probe-time-only slow path. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-08-05 | blk-mq: Introduce the BLK_MQ_F_NO_SCHED_BY_DEFAULT flag | Bart Van Assche | 1 | -0/+6
elevator_get_default() uses the following algorithm to select an I/O scheduler from inside add_disk(): - In case of a single hardware queue or if sharing hardware queues across multiple request queues (BLK_MQ_F_TAG_HCTX_SHARED), use mq-deadline. - Otherwise, use 'none'. This is a good choice for most but not for all block drivers. Make it possible to override the selection of mq-deadline with a new flag, namely BLK_MQ_F_NO_SCHED_BY_DEFAULT. Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Martijn Coenen <[email protected]> Cc: Jaegeuk Kim <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
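A sketch of how a driver would ask for 'none' by default (the tag_set setup around it is illustrative):

  struct blk_mq_tag_set *set = &my_dev->tag_set;

  set->nr_hw_queues = 1;
  /* A single hw queue would normally get mq-deadline; opt out of that. */
  set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_NO_SCHED_BY_DEFAULT;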
2021-06-30 | block: mark blk_mq_init_queue_data static | Christoph Hellwig | 1 | -2/+0
All driver uses are gone now. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-06-18 | blk-mq: fix an IS_ERR() vs NULL bug | Dan Carpenter | 1 | -1/+1
The __blk_mq_alloc_disk() function doesn't return NULL, it returns error pointers. Fixes: b461dfc49eb6 ("blk-mq: add the blk_mq_alloc_disk APIs") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/YMyjci35WBqrtqG+@mwanda Signed-off-by: Jens Axboe <[email protected]>
2021-06-11 | blk-mq: remove blk_mq_init_sq_queue | Christoph Hellwig | 1 | -4/+0
All users are gone now. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>