aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-02-10block: introduce blk_queue_clear_zone_settings()Damien Le Moal3-0/+21
Introduce the internal function blk_queue_clear_zone_settings() to cleanup all limits and resources related to zoned block devices. This new function is called from blk_queue_set_zoned() when a disk zoned model is set to BLK_ZONED_NONE. This particular case can happens when a partition is created on a host-aware scsi disk. Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-10zonefs: use zone write granularity as block sizeDamien Le Moal1-5/+4
Zoned block devices have different granularity constraints for write operations into sequential zones. E.g. ZBC and ZAC devices require that writes be aligned to the device physical block size while NVMe ZNS devices allow logical block size aligned write operations. To correctly handle such difference, use the device zone write granularity limit to set the block size of a zonefs volume, thus allowing the smallest possible write unit for all zoned device types. Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-10block: introduce zone_write_granularity limitDamien Le Moal5-1/+74
Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that all writes into sequential write required zones be aligned to the device physical block size. However, NVMe ZNS does not have this constraint and allows write operations into sequential zones to be aligned to the device logical block size. This inconsistency does not help with software portability across device types. To solve this, introduce the zone_write_granularity queue limit to indicate the alignment constraint, in bytes, of write operations into zones of a zoned block device. This new limit is exported as a read-only sysfs queue attribute and the helper blk_queue_zone_write_granularity() introduced for drivers to set this limit. The function blk_queue_set_zoned() is modified to set this new limit to the device logical block size by default. NVMe ZNS devices as well as zoned nullb devices use this default value as is. The scsi disk driver is modified to execute the blk_queue_zone_write_granularity() helper to set the zone write granularity of host-managed SMR disks to the disk physical block size. The accessor functions queue_zone_write_granularity() and bdev_zone_write_granularity() are also introduced. Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-10block: use blk_queue_set_zoned in add_partition()Damien Le Moal1-1/+1
When changing the zoned model of host-aware zoned block devices, use blk_queue_set_zoned() instead of directly assigning the gendisk queue zoned limit. Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-10nullb: use blk_queue_set_zoned() to setup zoned devicesDamien Le Moal1-4/+4
Use blk_queue_set_zoned() to set a nullb device zone model instead of directly assigning the device queue zoned limit. This initialization of the devicve zoned model as well as the setup of the queue flag QUEUE_FLAG_ZONE_RESETALL and of the device queue elevator feature are moved from null_init_zoned_dev() to null_register_zoned_dev() so that the initialization of the queue limits is done when the gendisk of the nullb device is available. Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-10nvme: cleanup zone information initializationDamien Le Moal2-13/+9
For a zoned namespace, in nvme_update_ns_info(), call nvme_update_zone_info() after executing nvme_update_disk_info() so that the namespace queue logical and physical block size limits are set. This allows setting the namespace queue max_zone_append_sectors limit in nvme_update_zone_info() instead of nvme_revalidate_zones(), simplifying this function. Also use blk_queue_set_zoned() to set the namespace zoned model. Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-10block: document zone_append_max_bytes attributeDamien Le Moal1-0/+6
The description of the zone_append_max_bytes sysfs queue attribute is missing from Documentation/block/queue-sysfs.rst. Add it. Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: use bi_max_vecs to find the bvec poolChristoph Hellwig5-100/+51
Instead of encoding of the bvec pool using magic bio flags, just use a helper to find the pool based on the max_vecs value. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08md/raid10: remove dead code in reshape_requestChristoph Hellwig1-4/+0
A bio allocated by bio_alloc_bioset comes pre-zeroed, no need to clear random fields. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: mark the bio as cloned in bio_iov_bvec_setChristoph Hellwig1-1/+1
bio_iov_bvec_set clones the bio_vecs from the iter, and thus should be treated like a cloned bio in every respect. That also includes not touching bi_max_vecs as that is a property of the bio allocation and not its current payload. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: set BIO_NO_PAGE_REF in bio_iov_bvec_setChristoph Hellwig1-3/+2
bio_iov_bvec_set assigns the foreign bvec, so setting the NO_PAGE_REF directly there seems like the best fit. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: remove a layer of indentation in bio_iov_iter_get_pagesChristoph Hellwig1-7/+7
Remove a pointless layer of indentation after a return statement. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: turn the nr_iovecs argument to bio_alloc* into an unsigned shortChristoph Hellwig2-5/+6
The bi_max_vecs and bi_vcnt fields are defined as unsigned short, so don't allow passing larger values in. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: remove the 1 and 4 vec bvec_slabs entriesChristoph Hellwig1-37/+16
All bios with up to 4 bvecs use the inline bvecs in the bio itself, so don't bother to define bvec_slabs entries for them. Also decruftify the bvec_slabs definition and initialization while we're at it. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: streamline bvec_allocChristoph Hellwig1-16/+10
Avoid the pointless goto by trying the slab allocation first and falling through to the mempool. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: factor out a bvec_alloc_gfp helperChristoph Hellwig1-9/+11
Clean up bvec_alloc a little by factoring out a helper for the gfp_t manipulations. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: move struct biovec_slab to bio.cChristoph Hellwig2-6/+6
struct biovec_slab is only used inside of bio.c, so move it there. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-08block: reuse BIO_INLINE_VECS for integrity bvecsChristoph Hellwig3-10/+3
bvec_alloc always uses biovec_slabs, and thus always needs to use the same number of inline vecs. Share a single definition for the data and integrity bvecs. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-02block: fix memory leak of bvecMing Lei1-1/+1
bio_init() clears bio instance, so the bvec index has to be set after bio_init(), otherwise bio->bi_io_vec may be leaked. Fixes: 3175199ab0ac ("block: split bio_kmalloc from bio_alloc_bioset") Cc: Johannes Thumshirn <[email protected]> Cc: Chaitanya Kulkarni <[email protected]> Cc: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-01md: use rdev_read_only in restart_arrayChristoph Hellwig1-1/+1
Make the read-only check in restart_array identical to the other two read-only checks. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-02-01md: check for NULL ->meta_bdev before calling bdev_read_onlyChristoph Hellwig1-5/+8
->meta_bdev is optional and not set for most arrays. Add a rdev_read_only helper that calls bdev_read_only for both devices in a safe way. Fixes: 6f0d9689b670 ("block: remove the NULL bdev check in bdev_read_only") Reported-by: Guoqing Jiang <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-29block: drop removed argument from kernel-doc of blk_execute_rq()Lukas Bulwahn1-1/+0
Commit 684da7628d93 ("block: remove unnecessary argument from blk_execute_rq") changes the signature of blk_execute_rq(), but misses to adjust its kernel-doc. Hence, make htmldocs warns on ./block/blk-exec.c:78: warning: Excess function parameter 'q' description in 'blk_execute_rq' Drop removed argument from kernel-doc of blk_execute_rq() as well. Signed-off-by: Lukas Bulwahn <[email protected]> Acked-by: Guoqing Jiang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-29block: remove typo in kernel-doc of set_disk_ro()Lukas Bulwahn1-1/+1
Commit 52f019d43c22 ("block: add a hard-readonly flag to struct gendisk") provides some kernel-doc for set_disk_ro(), but introduces a small typo. Hence, make htmldocs warns on ./block/genhd.c:1441: warning: Function parameter or member 'read_only' not described in 'set_disk_ro' warning: Excess function parameter 'ready_only' description in 'set_disk_ro' Remove that typo in the kernel-doc for set_disk_ro(). Signed-off-by: Lukas Bulwahn <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-28blk-cgroup: Remove obsolete macroBaolin Wang1-2/+0
Remove the obsolete 'MAX_KEY_LEN' macro. Signed-off-by: Baolin Wang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27nvme-core: check bdev value for NULLChaitanya Kulkarni1-1/+2
The nvme-core sets the bdev to NULL when admin comamnd is issued from IOCTL in the following path e.g. nvme list :- block_ioctl() blkdev_ioctl() nvme_ioctl() nvme_user_cmd() nvme_submit_user_cmd() The commit 309dca309fc3 ("block: store a block_device pointer in struct bio") now uses bdev unconditionally in the macro bio_set_dev() and assumes that bdev value is not NULL which results in the following crash in since thats where bdev is actually accessed :- void bio_associate_blkg_from_css(struct bio *bio, struct cgroup_subsys_state *css) { if (bio->bi_blkg) blkg_put(bio->bi_blkg); if (css && css->parent) { bio->bi_blkg = blkg_tryget_closest(bio, css); } else { --------------> blkg_get(bio->bi_bdev->bd_disk->queue->root_blkg); bio->bi_blkg = bio->bi_bdev->bd_disk->queue->root_blkg; } } EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css); [ 345.385947] BUG: kernel NULL pointer dereference, address: 0000000000000690 [ 345.387103] #PF: supervisor read access in kernel mode [ 345.387894] #PF: error_code(0x0000) - not-present page [ 345.388756] PGD 162a2b067 P4D 162a2b067 PUD 1633eb067 PMD 0 [ 345.389625] Oops: 0000 [#1] SMP NOPTI [ 345.390206] CPU: 15 PID: 4100 Comm: nvme Tainted: G OE 5.11.0-rc5blk+ #141 [ 345.391377] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba52764 [ 345.393074] RIP: 0010:bio_associate_blkg_from_css.cold.47+0x58/0x21f [ 345.396362] RSP: 0018:ffffc90000dbbce8 EFLAGS: 00010246 [ 345.397078] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027 [ 345.398114] RDX: 0000000000000000 RSI: ffff888813be91f0 RDI: ffff888813be91f8 [ 345.399039] RBP: ffffc90000dbbd30 R08: 0000000000000001 R09: 0000000000000001 [ 345.399950] R10: 0000000064c66670 R11: 00000000ef955201 R12: ffff888812d32800 [ 345.401031] R13: 0000000000000000 R14: ffff888113e51540 R15: ffff888113e51540 [ 345.401976] FS: 00007f3747f1d780(0000) GS:ffff888813a00000(0000) knlGS:0000000000000000 [ 345.402997] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 345.403737] CR2: 0000000000000690 CR3: 000000081a4bc000 CR4: 00000000003506e0 [ 345.404685] Call Trace: [ 345.405031] bio_associate_blkg+0x71/0x1c0 [ 345.405649] nvme_submit_user_cmd+0x1aa/0x38e [nvme_core] [ 345.406348] nvme_user_cmd.isra.73.cold.98+0x54/0x92 [nvme_core] [ 345.407117] nvme_ioctl+0x226/0x260 [nvme_core] [ 345.407707] blkdev_ioctl+0x1c8/0x2b0 [ 345.408183] block_ioctl+0x3f/0x50 [ 345.408627] __x64_sys_ioctl+0x84/0xc0 [ 345.409117] do_syscall_64+0x33/0x40 [ 345.409592] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 345.410233] RIP: 0033:0x7f3747632107 [ 345.413125] RSP: 002b:00007ffe461b6648 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 [ 345.414086] RAX: ffffffffffffffda RBX: 00000000007b7fd0 RCX: 00007f3747632107 [ 345.414998] RDX: 00007ffe461b6650 RSI: 00000000c0484e41 RDI: 0000000000000004 [ 345.415966] RBP: 0000000000000004 R08: 00000000007b7fe8 R09: 00000000007b9080 [ 345.416883] R10: 00007ffe461b62c0 R11: 0000000000000206 R12: 00000000007b7fd0 [ 345.417808] R13: 0000000000000000 R14: 0000000000000003 R15: 0000000000000000 Add a NULL check before we set the bdev for bio. This issue is found on block/for-next tree. Fixes: 309dca309fc3 ("block: store a block_device pointer in struct bio") Signed-off-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27mm: only make map_swap_entry available for CONFIG_HIBERNATIONJens Axboe1-1/+5
Current tree spews this on compile: mm/swapfile.c:2290:17: warning: ‘map_swap_entry’ defined but not used [-Wunused-function] 2290 | static sector_t map_swap_entry(swp_entry_t entry, struct block_device **bdev) | ^~~~~~~~~~~~~~ if !CONFIG_HIBERNATION, as we don't use the function unless we have that config option set. Fixes: 48d15436fde6 ("mm: remove get_swap_bio") Signed-off-by: Jens Axboe <[email protected]>
2021-01-27mm: remove get_swap_bioChristoph Hellwig3-43/+13
Just reuse the block_device and sector from the swap_info structure, just as used by the SWP_SYNCHRONOUS path. Also remove the checks for NULL returns from bio_alloc as that can't happen for sleeping allocations. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27nilfs2: remove cruft in nilfs_alloc_seg_bioChristoph Hellwig1-4/+0
bio_alloc never returns NULL when it can sleep. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27nfs/blocklayout: remove cruft in bl_alloc_init_bioChristoph Hellwig1-5/+0
bio_alloc never returns NULL when it can sleep. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27md/raid6: refactor raid5_read_one_chunkChristoph Hellwig1-63/+45
Refactor raid5_read_one_chunk so that all simple checks are done before allocating the bio. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Song Liu <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27md: remove md_bio_alloc_syncChristoph Hellwig1-9/+1
md_bio_alloc_sync is never called with a NULL mddev, and ->sync_set is initialized in md_run, so it always must be initialized as well. Just open code the remaining call to bio_alloc_bioset. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Song Liu <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27md: simplify sync_page_ioChristoph Hellwig1-13/+13
Use an on-stack bio and biovec for the single page synchronous I/O. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Song Liu <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27md: remove bio_alloc_mddevChristoph Hellwig4-15/+3
bio_alloc_mddev is never called with a NULL mddev, and ->bio_set is initialized in md_run, so it always must be initialized as well. Just open code the remaining call to bio_alloc_bioset. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Song Liu <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27drbd: remove drbd_req_make_private_bioChristoph Hellwig3-14/+8
Open code drbd_req_make_private_bio in the two callers to prepare for further changes. Also don't bother to initialize bi_next as the bio code already does that that. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27drbd: remove bio_alloc_drbdChristoph Hellwig4-17/+2
Given that drbd_md_io_bio_set is initialized during module initialization and the module fails to load if the initialization fails there is no need to fall back to plain bio_alloc. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27f2fs: remove FAULT_ALLOC_BIOChristoph Hellwig4-28/+4
Sleeping bio allocations do not fail, which means that injecting an error into sleeping bio allocations is a little silly. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27f2fs: use blkdev_issue_flush in __submit_flush_waitChristoph Hellwig3-13/+3
Use the blkdev_issue_flush helper instead of duplicating it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27dm-clone: use blkdev_issue_flush in commit_metadataChristoph Hellwig1-13/+1
Use blkdev_issue_flush instead of open coding it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27block: use an on-stack bio in blkdev_issue_flushChristoph Hellwig23-38/+33
There is no point in allocating memory for a synchronous flush. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27block: split bio_kmalloc from bio_alloc_biosetChristoph Hellwig2-87/+86
bio_kmalloc shares almost no logic with the bio_set based fast path in bio_alloc_bioset. Split it into an entirely separate implementation. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27blk-crypto: use bio_kmalloc in blk_crypto_clone_bioChristoph Hellwig1-1/+1
Use bio_kmalloc instead of open coding it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Eric Biggers <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27btrfs: use bio_kmalloc in __alloc_deviceChristoph Hellwig1-1/+1
Use bio_kmalloc instead of open coding it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27zonefs: use bio_alloc in zonefs_file_dio_appendChristoph Hellwig1-1/+1
Use bio_alloc instead of open coding it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Acked-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27bfq: Use only idle IO periods for think time calculationsJan Kara1-1/+9
Currently whenever bfq queue has a request queued we add now - last_completion_time to the think time statistics. This is however misleading in case the process is able to submit several requests in parallel because e.g. if the queue has request completed at time T0 and then queues new requests at times T1, T2, then we will add T1-T0 and T2-T0 to think time statistics which just doesn't make any sence (the queue's think time is penalized by the queue being able to submit more IO). So add to think time statistics only time intervals when the queue had no IO pending. Signed-off-by: Jan Kara <[email protected]> Acked-by: Paolo Valente <[email protected]> [axboe: fix whitespace on empty line] Signed-off-by: Jens Axboe <[email protected]>
2021-01-27bfq: Use 'ttime' local variableJan Kara1-1/+1
Use local variable 'ttime' instead of dereferencing bfqq. Signed-off-by: Jan Kara <[email protected]> Acked-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-27bfq: Avoid false bfq queue mergingJan Kara1-0/+1
bfq_setup_cooperator() uses bfqd->in_serv_last_pos so detect whether it makes sense to merge current bfq queue with the in-service queue. However if the in-service queue is freshly scheduled and didn't dispatch any requests yet, bfqd->in_serv_last_pos is stale and contains value from the previously scheduled bfq queue which can thus result in a bogus decision that the two queues should be merged. This bug can be observed for example with the following fio jobfile: [global] direct=0 ioengine=sync invalidate=1 size=1g rw=read [reader] numjobs=4 directory=/mnt where the 4 processes will end up in the one shared bfq queue although they do IO to physically very distant files (for some reason I was able to observe this only with slice_idle=1ms setting). Fix the problem by invalidating bfqd->in_serv_last_pos when switching in-service queue. Fixes: 058fdecc6de7 ("block, bfq: fix in-service-queue check for queue merging") CC: [email protected] Signed-off-by: Jan Kara <[email protected]> Acked-by: Paolo Valente <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26blkcg: delete redundant get/put operations for queueChunguang Xu1-5/+8
When calling blkcg_schedule_throttle(), for the same queue, redundant get/put operations can be removed. Signed-off-by: Chunguang Xu <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26block: unexport truncate_bdev_rangeChristoph Hellwig2-8/+2
truncate_bdev_range is only used in always built-in block layer code, so remove the export and the !CONFIG_BLOCK stub. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26blk: wbt: remove unused parameter from wbt_should_throttleLei Chen1-2/+2
The first parameter rwb is not used for this function. So just remove it. Signed-off-by: Lei Chen <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-01-26bdev: Do not return EBUSY if bdev discard races with writeJan Kara1-6/+4
blkdev_fallocate() tries to detect whether a discard raced with an overlapping write by calling invalidate_inode_pages2_range(). However this check can give both false negatives (when writing using direct IO or when writeback already writes out the written pagecache range) and false positives (when write is not actually overlapping but ends in the same page when blocksize < pagesize). This actually causes issues for qemu which is getting confused by EBUSY errors. Fix the problem by removing this conflicting write detection since it is inherently racy and thus of little use anyway. Reported-by: Maxim Levitsky <[email protected]> CC: "Darrick J. Wong" <[email protected]> Link: https://lore.kernel.org/qemu-devel/[email protected] Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>