aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-02-09Merge tag 'nvme-6.2-2023-02-09' of git://git.infradead.org/nvme into block-6.2Jens Axboe1-1/+1
Pull NVMe fix from Christoph: "nvme fixes for Linux 6.2 - fix a static checker warning for a variable introduces in the last pull request (Tom Rix)" * tag 'nvme-6.2-2023-02-09' of git://git.infradead.org/nvme: nvme-auth: mark nvme_auth_wq static
2023-02-08nvme-auth: mark nvme_auth_wq staticTom Rix1-1/+1
Fix a smatch report for the newly added nvme_auth_wq. Signed-off-by: Tom Rix <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-02-02Merge tag 'nvme-6.2-2023-02-02' of git://git.infradead.org/nvme into block-6.2Jens Axboe3-4/+19
Pul NVMe fixes from Christoph: "nvme fixes for Linux 6.2 - fix a missing queue put in nvmet_fc_ls_create_association (Amit Engel) - clear queue pointers on tag_set initialization failure (Maurizio Lombardi) - use workqueue dedicated to authentication (Shin'ichiro Kawasaki)" * tag 'nvme-6.2-2023-02-02' of git://git.infradead.org/nvme: nvme-auth: use workqueue dedicated to authentication nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_set nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_set nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association
2023-02-01blk-cgroup: don't update io stat for root cgroupMing Lei1-0/+4
We source root cgroup stats from the system-wide stats, see blkcg_print_stat and blkcg_rstat_flush, so don't update io state for root cgroup. Fixes blkg leak issue introduced in commit 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()") which starts to grab blkg's reference when adding iostat_cpu into percpu blkcg list, but this state won't be consumed by blkcg_rstat_flush() where the blkg reference is dropped. Tested-by: Bart van Assche <[email protected]> Reported-by: Bart van Assche <[email protected]> Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()") Cc: Tejun Heo <[email protected]> Cc: Waiman Long <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-02-01nvme-auth: use workqueue dedicated to authenticationShin'ichiro Kawasaki1-2/+12
NVMe In-Band authentication uses two kinds of works: chap->auth_work and ctrl->dhchap_auth_work. The latter work flushes or cancels the former work. However, the both works are queued to the same workqueue nvme-wq. It results in the lockdep WARNING as follows: WARNING: possible recursive locking detected 6.2.0-rc4+ #1 Not tainted -------------------------------------------- kworker/u16:7/69 is trying to acquire lock: ffff902d52e65548 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: start_flush_work+0x2c5/0x380 but task is already holding lock: ffff902d52e65548 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: process_one_work+0x210/0x410 To avoid the WARNING, introduce a new workqueue nvme-auth-wq dedicated to chap->auth_work. Reported-by: Daniel Wagner <[email protected]> Link: https://lore.kernel.org/linux-nvme/[email protected]/ Fixes: f50fff73d620 ("nvme: implement In-Band authentication") Signed-off-by: Shin'ichiro Kawasaki <[email protected]> Tested-by: Daniel Wagner <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-02-01nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_setMaurizio Lombardi1-0/+1
In nvme_alloc_io_tag_set(), the connect_q pointer should be set to NULL in case of error to avoid potential invalid pointer dereferences. Signed-off-by: Maurizio Lombardi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-02-01nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_setMaurizio Lombardi1-1/+3
If nvme_alloc_admin_tag_set() fails, the admin_q and fabrics_q pointers are left with an invalid, non-NULL value. Other functions may then check the pointers and dereference them, e.g. in nvme_probe() -> out_disable: -> nvme_dev_remove_admin(). Fix the bug by setting admin_q and fabrics_q to NULL in case of error. Also use the set variable to free the tag_set as ctrl->admin_tagset isn't initialized yet. Signed-off-by: Maurizio Lombardi <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-02-01nvme-fc: fix a missing queue put in nvmet_fc_ls_create_associationAmit Engel1-1/+3
As part of nvmet_fc_ls_create_association there is a case where nvmet_fc_alloc_target_queue fails right after a new association with an admin queue is created. In this case, no one releases the get taken in nvmet_fc_alloc_target_assoc. This fix is adding the missing put. Signed-off-by: Amit Engel <[email protected]> Reviewed-by: James Smart <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-31block: Fix the blk_mq_destroy_queue() documentationBart Van Assche1-2/+3
Commit 2b3f056f72e5 moved a blk_put_queue() call from blk_mq_destroy_queue() into its callers. Reflect this change in the documentation block above blk_mq_destroy_queue(). Cc: Christoph Hellwig <[email protected]> Cc: Sagi Grimberg <[email protected]> Cc: Chaitanya Kulkarni <[email protected]> Cc: Keith Busch <[email protected]> Fixes: 2b3f056f72e5 ("blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queue") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-31block: ublk: extending queue_size to fix overflowLiu Xiaodong1-1/+1
When validating drafted SPDK ublk target, in a case that assigning large queue depth to multiqueue ublk device, ublk target would run into a weird incorrect state. During rounds of review and debug, An overflow bug was found in ublk driver. In ublk_cmd.h, UBLK_MAX_QUEUE_DEPTH is 4096 which means each ublk queue depth can be set as large as 4096. But when setting qd for a ublk device, sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io) will be larger than 65535 if qd is larger than 2728. Then queue_size is overflowed, and ublk_get_queue() references a wrong pointer position. The wrong content of ublk_queue elements will lead to out-of-bounds memory access. Extend queue_size in ublk_device as "unsigned int". Signed-off-by: Liu Xiaodong <[email protected]> Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-29block, bfq: fix uaf for bfqq in bic_set_bfqq()Yu Kuai2-2/+4
After commit 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'"), bic->bfqq will be accessed in bic_set_bfqq(), however, in some context bic->bfqq will be freed, and bic_set_bfqq() is called with the freed bic->bfqq. Fix the problem by always freeing bfqq after bic_set_bfqq(). Fixes: 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'") Reported-and-tested-by: Shinichiro Kawasaki <[email protected]> Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-26Merge tag 'nvme-6.2-2023-01-26' of git://git.infradead.org/nvme into block-6.2Jens Axboe3-11/+10
Pull NVMe fixes from Christoph: "nvme fixes for Linux 6.2 - flush initial scan_work for async probe (Keith Busch) - fix passthrough csi check (Keith Busch) - fix nvme-fc initialization order (Ross Lagerwall)" * tag 'nvme-6.2-2023-01-26' of git://git.infradead.org/nvme: nvme: fix passthrough csi check nvme-pci: flush initial scan_work for async probe nvme-fc: fix initialization order
2023-01-26block: ublk: move ublk_chr_class destroying after devices are removedMing Lei1-4/+3
The 'ublk_chr_class' is needed when deleting ublk char devices in ublk_exit(), so move it after devices(idle) are removed. Fixes the following warning reported by Harris, James R: [ 859.178950] sysfs group 'power' not found for kobject 'ublkc0' [ 859.178962] WARNING: CPU: 3 PID: 1109 at fs/sysfs/group.c:278 sysfs_remove_group+0x9c/0xb0 Reported-by: "Harris, James R" <[email protected]> Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Link: https://lore.kernel.org/linux-block/Y9JlFmSgDl3+zy3N@T590/T/#t Signed-off-by: Ming Lei <[email protected]> Tested-by: Jim Harris <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-25nvme: fix passthrough csi checkKeith Busch1-1/+1
The namespace head saves the Command Set Indicator enum, so use that instead of the Command Set Selected. The two values are not the same. Fixes: 831ed60c2aca2d ("nvme: also return I/O command effects from nvme_command_effects") Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-24nvme-pci: flush initial scan_work for async probeKeith Busch1-0/+1
The nvme device may have a namespace with the root partition, so make sure we've completed scanning before returning from the async probe. Fixes: eac3ef262941 ("nvme-pci: split the initial probe from the rest path") Reported-by: Klaus Jensen <[email protected]> Signed-off-by: Keith Busch <[email protected]> Tested-by: Ville Syrjälä <[email protected]> Tested-by: Klaus Jensen <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-23nvme-fc: fix initialization orderRoss Lagerwall1-10/+8
ctrl->ops is used by nvme_alloc_admin_tag_set() but set by nvme_init_ctrl() so reorder the calls to avoid a NULL pointer dereference. Fixes: 6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers") Signed-off-by: Ross Lagerwall <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-20Merge tag 'nvme-6.2-2023-01-20' of git://git.infradead.org/nvme into block-6.2Jens Axboe2-5/+21
Pull NVMe fixes from Christoph: "nvme fixes for Linux 6.2 - fix controller shutdown regression in nvme-apple (Janne Grunau) - fix a polling on timeout regression in nvme-pci (Keith Busch)" * tag 'nvme-6.2-2023-01-20' of git://git.infradead.org/nvme: nvme-pci: fix timeout request state check nvme-apple: only reset the controller when RTKit is running nvme-apple: reset controller during shutdown
2023-01-19nvme-pci: fix timeout request state checkKeith Busch1-1/+1
Polling the completion can progress the request state to IDLE, either inline with the completion, or through softirq. Either way, the state may not be COMPLETED, so don't check for that. We only care if the state isn't IN_FLIGHT. This is fixing an issue where the driver aborts an IO that we just completed. Seeing the "aborting" message instead of "polled" is very misleading as to where the timeout problem resides. Fixes: bf392a5dc02a9b ("nvme-pci: Remove tag from process cq") Signed-off-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-19nvme-apple: only reset the controller when RTKit is runningJanne Grunau1-3/+3
NVMe controller register access hangs indefinitely when the co-processor is not running. A missed reset is preferable over a hanging thread since it could be recoverable. Signed-off-by: Janne Grunau <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-19nvme-apple: reset controller during shutdownJanne Grunau1-1/+17
This is a functional revert of c76b8308e4c9 ("nvme-apple: fix controller shutdown in apple_nvme_disable"). The commit broke suspend/resume since apple_nvme_reset_work() tries to disable the controller on resume. This does not work for the apple NVMe controller since register access only works while the co-processor firmware is running. Disabling the NVMe controller in the shutdown path is also required for shutting the co-processor down. The original code was appropriate for this hardware. Add a comment to prevent a similar breaking changes in the future. Fixes: c76b8308e4c9 ("nvme-apple: fix controller shutdown in apple_nvme_disable") Reported-by: Janne Grunau <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Janne Grunau <[email protected]> [hch: updated with a more descriptive comment from Hector Martin] Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-17block: fix hctx checks for batch allocationPavel Begunkov1-1/+5
When there are no read queues read requests will be assigned a default queue on allocation. However, blk_mq_get_cached_request() is not prepared for that and will fail all attempts to grab read requests from the cache. Worst case it doubles the number of requests allocated, roughly half of which will be returned by blk_mq_free_plug_rqs(). It only affects batched allocations and so is io_uring specific. For reference, QD8 t/io_uring benchmark improves by 20-35%. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2023-01-17block/rnbd-clt: fix wrong max ID in ida_alloc_maxGuoqing Jiang1-1/+1
We need to pass 'end - 1' to ida_alloc_max after switch from ida_simple_get to ida_alloc_max. Otherwise smatch warns. drivers/block/rnbd/rnbd-clt.c:1460 init_dev() error: Calling ida_alloc_max() with a 'max' argument which is a power of 2. -1 missing? Fixes: 24afc15dbe21 ("block/rnbd: Remove a useless mutex") Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Guoqing Jiang <[email protected]> Acked-by: Jack Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-16blk-cgroup: fix missing pd_online_fn() while activating policyYu Kuai1-0/+4
If the policy defines pd_online_fn(), it should be called after pd_init_fn(), like blkg_create(). Signed-off-by: Yu Kuai <[email protected]> Acked-by: Tejun Heo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-16pktcdvd: check for NULL returna fter calling bio_split_to_limits()Jens Axboe1-0/+2
The revert of the removal of this driver happened after we fixed up the split limits for NOWAIT issue, hence it got missed. Ensure that we check for a NULL bio after splitting, in case it should be retried. Marking this as fixing both commits, so that stable backport will do this correctly. Cc: [email protected] Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio") Fixes: 4b83e99ee709 ("Revert "pktcdvd: remove driver."") Signed-off-by: Jens Axboe <[email protected]>
2023-01-15block, bfq: switch 'bfqg->ref' to use atomic refcount apisYu Kuai2-6/+4
The updating of 'bfqg->ref' should be protected by 'bfqd->lock', however, during code review, we found that bfq_pd_free() update 'bfqg->ref' without holding the lock, which is problematic: 1) bfq_pd_free() triggered by removing cgroup is called asynchronously; 2) bfqq will grab bfqg reference, and exit bfqq will drop the reference, which can concurrent with 1). Unfortunately, 'bfqd->lock' can't be held here because 'bfqd' might already be freed in bfq_pd_free(). Fix the problem by using atomic refcount apis. Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-13Merge branch 'md-fixes' of ↵Jens Axboe1-2/+2
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.2 Pull MD fix from Song: "It fixes an issue introduced by recent code refactor." * 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: fix incorrect declaration about claim_rdev in md_import_device
2023-01-12md: fix incorrect declaration about claim_rdev in md_import_deviceAdrian Huang1-2/+2
Commit fb541ca4c365 ("md: remove lock_bdev / unlock_bdev") removes wrappers for blkdev_get/blkdev_put. However, the uninitialized local static variable of pointer type 'claim_rdev' in md_import_device() is NULL, which leads to the following warning call trace: WARNING: CPU: 22 PID: 1037 at block/bdev.c:577 bd_prepare_to_claim+0x131/0x150 CPU: 22 PID: 1037 Comm: mdadm Not tainted 6.2.0-rc3+ #69 .. RIP: 0010:bd_prepare_to_claim+0x131/0x150 .. Call Trace: <TASK> ? _raw_spin_unlock+0x15/0x30 ? iput+0x6a/0x220 blkdev_get_by_dev.part.0+0x4b/0x300 md_import_device+0x126/0x1d0 new_dev_store+0x184/0x240 md_attr_store+0x80/0xf0 kernfs_fop_write_iter+0x128/0x1c0 vfs_write+0x2be/0x3c0 ksys_write+0x5f/0xe0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc It turns out the md device cannot be used: md: could not open device unknown-block(259,0). md: md127 stopped. Fix the issue by declaring the local static variable of struct type and passing the pointer of the variable to blkdev_get_by_dev(). Fixes: fb541ca4c365 ("md: remove lock_bdev / unlock_bdev") Cc: Christoph Hellwig <[email protected]> Signed-off-by: Adrian Huang <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Song Liu <[email protected]>
2023-01-12Merge tag 'nvme-6.2-2023-01-12' of git://git.infradead.org/nvme into block-6.2Jens Axboe4-52/+75
Pull NVMe fixes from Christoph: "nvme fixes for Linux 6.2 - Identify quirks for Apple controllers (Hector Martin) - fix error handling in nvme_pci_enable (Tong Zhang) - refuse unprivileged passthrough on partitions (Christoph Hellwig) - fix MAINTAINERS to not match nvmem subsystem headers (Russell King)" * tag 'nvme-6.2-2023-01-12' of git://git.infradead.org/nvme: MAINTAINERS: stop nvme matching for nvmem files nvme: don't allow unprivileged passthrough on partitions nvme: replace the "bool vec" arguments with flags in the ioctl path nvme: remove __nvme_ioctl nvme-pci: fix error handling in nvme_pci_enable() nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression
2023-01-10MAINTAINERS: stop nvme matching for nvmem filesRussell King (Oracle)1-1/+2
The nvme patterns detect all include files starting with nvme, which also picks up the nvmem subsystem header files. Fix this by using a more specific pattern. Signed-off-by: Russell King (Oracle) <[email protected]> [hch: switched to a purely inclusive pattern instead of excluding nvmem*] Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-10nvme: don't allow unprivileged passthrough on partitionsChristoph Hellwig1-16/+31
Passthrough commands can always access the entire device, and thus submitting them on partitions is an privelege escalation. In hindsight we should have never allowed any passthrough commands on partitions, but it's probably too late to change that decision now. Fixes: e4fbcf32c860 ("nvme: identify-namespace without CAP_SYS_ADMIN") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]>
2023-01-10nvme: replace the "bool vec" arguments with flags in the ioctl pathChristoph Hellwig1-25/+28
To prepare for passing down more information, replace the boolean vec argument with a more extensible flags one. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]>
2023-01-10nvme: remove __nvme_ioctlChristoph Hellwig1-10/+8
Open code __nvme_ioctl in the two callers to make future changes that pass down additional paramters in the ioctl path easier. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]>
2023-01-10nvme-pci: fix error handling in nvme_pci_enable()Tong Zhang1-2/+7
There are two issues in nvme_pci_enable(): 1) If pci_alloc_irq_vectors() fails, device is left enabled. Fix this by adding a goto disable statement. 2) nvme_pci_configure_admin_queue could return -ENODEV, in this case, we will need to free IRQ properly. Otherwise the following warning could be triggered: [ 5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140 [ 5.290547] Call Trace: [ 5.290626] <TASK> [ 5.290695] msi_remove_device_irq_domain+0xc9/0xf0 [ 5.290843] msi_device_data_release+0x15/0x80 [ 5.290978] release_nodes+0x58/0x90 [ 5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80 [ 5.297573] Call Trace: [ 5.297651] <TASK> [ 5.297719] release_nodes+0x58/0x90 [ 5.297831] devres_release_all+0xef/0x140 [ 5.298339] device_unbind_cleanup+0x11/0xc0 [ 5.298479] really_probe+0x296/0x320 Fixes: a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable") Co-developed-by: Keith Busch <[email protected]> Signed-off-by: Tong Zhang <[email protected]> Reviewed-by: Keith Busch <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-10nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllersHector Martin1-1/+2
This mirrors the quirk added to Apple Silicon controllers in apple.c. These controllers do not support the Active NS ID List command and behave identically to the SoC version judging by existing user reports/syslogs, so will need the same fix. This quirk reverts back to NVMe 1.0 behavior and disables the broken commands. Fixes: 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues") Signed-off-by: Hector Martin <[email protected]> Tested-by: Orlando Chamberlain <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-10nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regressionHector Martin1-1/+1
From the get-go, this driver and the ANS syslog have been complaining about namespace identification. In 6.2-rc1, commit 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues") regressed the driver by no longer allowing fallback to sequential namespace scans, leaving us with no namespaces. It turns out that the real problem is that this controller claiming NVMe 1.1 compat is treating the CNS field as a binary field, as in NVMe 1.0. This already has a quirk, NVME_QUIRK_IDENTIFY_CNS, so set it for the controller to fix all this nonsense (including other errors triggered by other CNS commands). Fixes: 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues") Fixes: 5bd2927aceba ("nvme-apple: Add initial Apple SoC NVMe driver") Signed-off-by: Hector Martin <[email protected]> Reviewed-by: Sven Peter <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-01-08block: Drop spurious might_sleep() from blk_put_queue()Tejun Heo1-3/+0
Dan reports the following smatch detected the following: block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context caused by blkcg_schedule_throttle() calling blk_put_queue() in an non-sleepable context. blk_put_queue() acquired might_sleep() in 63f93fd6fa57 ("block: mark blk_put_queue as potentially blocking") which transferred the might_sleep() from blk_free_queue(). blk_free_queue() acquired might_sleep() in e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal") while turning request_queue removal synchronous. However, this isn't necessary as nothing in the free path actually requires sleeping. It's pretty unusual to require a sleeping context in a put operation and it's not needed in the first place. Let's drop it. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Dan Carpenter <[email protected]> Link: https://lkml.kernel.org/r/Y7g3L6fntnTtOm63@kili Cc: Christoph Hellwig <[email protected]> Cc: Luis Chamberlain <[email protected]> Fixes: e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal") # v5.9+ Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-05block: Remove "select SRCU"Paul E. McKenney1-1/+0
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Jens Axboe <[email protected]> Cc: [email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-04Revert "pktcdvd: remove driver."Jens Axboe8-0/+3419
This reverts commit f40eb99897af665f11858dd7b56edcb62c3f3c67. There are apparently still users out there of this driver. While we'd love to remove it to ease the maintenance burden, let's reinstate it for now until better (userspace) solutions can be developed. Link: https://lore.kernel.org/lkml/20230104190115.ceglfefco475ev6c@pali/ Reported-by: Pali Rohár <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2023-01-04Revert "block: remove devnode callback from struct block_device_operations"Jens Axboe2-0/+12
This reverts commit 85d6ce58e493ac8b7122e2fbe3f41b94d6ebdc11. We're reinstating the pktcdvd driver, which needs this API. Signed-off-by: Jens Axboe <[email protected]>
2023-01-04Revert "block: bio_copy_data_iter"Jens Axboe2-15/+24
This reverts commit db1c7d77976775483a8ef240b4c705f113e13ea1. We're reinstating the pktcdvd driver, which needs this API. Signed-off-by: Jens Axboe <[email protected]>
2023-01-04ublk: honor IO_URING_F_NONBLOCK for handling control commandMing Lei1-0/+3
Most of control command handlers may sleep, so return -EAGAIN in case of IO_URING_F_NONBLOCK to defer the handling into io wq context. Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reported-by: Jens Axboe <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-01-04block: don't allow splitting of a REQ_NOWAIT bioJens Axboe1-0/+10
If we split a bio marked with REQ_NOWAIT, then we can trigger spurious EAGAIN if constituent parts of that split bio end up failing request allocations. Parts will complete just fine, but just a single failure in one of the chained bios will yield an EAGAIN final result for the parent bio. Return EAGAIN early if we end up needing to split such a bio, which allows for saner recovery handling. Cc: [email protected] # 5.15+ Link: https://github.com/axboe/liburing/issues/766 Reported-by: Michael Kelley <[email protected]> Reviewed-by: Keith Busch <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2023-01-04block: handle bio_split_to_limits() NULL returnJens Axboe8-2/+19
This can't happen right now, but in preparation for allowing bio_split_to_limits() returning NULL if it ended the bio, check for it in all the callers. Signed-off-by: Jens Axboe <[email protected]>
2022-12-29Merge tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme into block-6.2Jens Axboe9-34/+159
Pull NVMe fixes from Christoph: "nvme fixes for Linux 6.2 - fix various problems in handling the Command Supported and Effects log (Christoph Hellwig) - don't allow unprivileged passthrough of commands that don't transfer data but modify logical block content (Christoph Hellwig) - add a features and quirks policy document (Christoph Hellwig) - fix some really nasty code that was correct but made smatch complain (Sagi Grimberg)" * tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme: nvme-auth: fix smatch warning complaints nvme: consult the CSE log page for unprivileged passthrough nvme: also return I/O command effects from nvme_command_effects nvmet: don't defer passthrough commands with trivial effects to the workqueue nvmet: set the LBCC bit for commands that modify data nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition docs, nvme: add a feature and quirk policy document
2022-12-28nvme-auth: fix smatch warning complaintsSagi Grimberg1-1/+1
When initializing auth context, there may be no secrets passed by the user. Make return code explicit when returning successfully. smatch warnings: drivers/nvme/host/auth.c:950 nvme_auth_init_ctrl() warn: missing error code? 'ret' Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-12-28nvme: consult the CSE log page for unprivileged passthroughChristoph Hellwig2-4/+25
Commands like Write Zeros can change the contents of a namespaces without actually transferring data. To protect against this, check the Commands Supported and Effects log is supported by the controller for any unprivileg command passthrough and refuse unprivileged passthrough if the command has any effects that can change data or metadata. Note: While the Commands Support and Effects log page has only been mandatory since NVMe 2.0, it is widely supported because Windows requires it for any command passthrough from userspace. Fixes: e4fbcf32c860 ("nvme: identify-namespace without CAP_SYS_ADMIN") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]>
2022-12-28nvme: also return I/O command effects from nvme_command_effectsChristoph Hellwig1-6/+26
To be able to use the Commands Supported and Effects Log for allowing unprivileged passtrough, it needs to be corretly reported for I/O commands as well. Return the I/O command effects from nvme_command_effects, and also add a default list of effects for the NVM command set. For other command sets, the Commands Supported and Effects log is required to be present already. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]>
2022-12-28nvmet: don't defer passthrough commands with trivial effects to the workqueueChristoph Hellwig1-6/+5
Mask out the "Command Supported" and "Logical Block Content Change" bits and only defer execution of commands that have non-trivial effects to the workqueue for synchronous execution. This allows to execute admin commands asynchronously on controllers that provide a Command Supported and Effects log page, and will keep allowing to execute Write commands asynchronously once command effects on I/O commands are taken into account. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]>
2022-12-28nvmet: set the LBCC bit for commands that modify dataChristoph Hellwig1-2/+4
Write, Write Zeroes, Zone append and a Zone Reset through Zone Management Send modify the logical block content of a namespace, so make sure the LBCC bit is reported for them. Fixes: b5d0b38c0475 ("nvmet: add Command Set Identifier support") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]>
2022-12-28nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding itChristoph Hellwig1-16/+19
Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a single value to multiple array entries instead of repeated assignments. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Kanchan Joshi <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]>