Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull NVMe fix from Christoph:
"nvme fixes for Linux 6.2
- fix a static checker warning for a variable introduces in the last
pull request (Tom Rix)"
* tag 'nvme-6.2-2023-02-09' of git://git.infradead.org/nvme:
nvme-auth: mark nvme_auth_wq static
|
|
Fix a smatch report for the newly added nvme_auth_wq.
Signed-off-by: Tom Rix <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Pul NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- fix a missing queue put in nvmet_fc_ls_create_association (Amit Engel)
- clear queue pointers on tag_set initialization failure
(Maurizio Lombardi)
- use workqueue dedicated to authentication (Shin'ichiro Kawasaki)"
* tag 'nvme-6.2-2023-02-02' of git://git.infradead.org/nvme:
nvme-auth: use workqueue dedicated to authentication
nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_set
nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_set
nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association
|
|
We source root cgroup stats from the system-wide stats, see blkcg_print_stat
and blkcg_rstat_flush, so don't update io state for root cgroup.
Fixes blkg leak issue introduced in commit 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
which starts to grab blkg's reference when adding iostat_cpu into percpu
blkcg list, but this state won't be consumed by blkcg_rstat_flush() where
the blkg reference is dropped.
Tested-by: Bart van Assche <[email protected]>
Reported-by: Bart van Assche <[email protected]>
Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
Cc: Tejun Heo <[email protected]>
Cc: Waiman Long <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
NVMe In-Band authentication uses two kinds of works: chap->auth_work and
ctrl->dhchap_auth_work. The latter work flushes or cancels the former
work. However, the both works are queued to the same workqueue nvme-wq.
It results in the lockdep WARNING as follows:
WARNING: possible recursive locking detected
6.2.0-rc4+ #1 Not tainted
--------------------------------------------
kworker/u16:7/69 is trying to acquire lock:
ffff902d52e65548 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: start_flush_work+0x2c5/0x380
but task is already holding lock:
ffff902d52e65548 ((wq_completion)nvme-wq){+.+.}-{0:0}, at: process_one_work+0x210/0x410
To avoid the WARNING, introduce a new workqueue nvme-auth-wq dedicated
to chap->auth_work.
Reported-by: Daniel Wagner <[email protected]>
Link: https://lore.kernel.org/linux-nvme/[email protected]/
Fixes: f50fff73d620 ("nvme: implement In-Band authentication")
Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
Tested-by: Daniel Wagner <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
In nvme_alloc_io_tag_set(), the connect_q pointer should be set to NULL
in case of error to avoid potential invalid pointer dereferences.
Signed-off-by: Maurizio Lombardi <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
If nvme_alloc_admin_tag_set() fails, the admin_q and fabrics_q pointers
are left with an invalid, non-NULL value. Other functions may then check
the pointers and dereference them, e.g. in
nvme_probe() -> out_disable: -> nvme_dev_remove_admin().
Fix the bug by setting admin_q and fabrics_q to NULL in case of error.
Also use the set variable to free the tag_set as ctrl->admin_tagset isn't
initialized yet.
Signed-off-by: Maurizio Lombardi <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
As part of nvmet_fc_ls_create_association there is a case where
nvmet_fc_alloc_target_queue fails right after a new association with an
admin queue is created. In this case, no one releases the get taken in
nvmet_fc_alloc_target_assoc. This fix is adding the missing put.
Signed-off-by: Amit Engel <[email protected]>
Reviewed-by: James Smart <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Commit 2b3f056f72e5 moved a blk_put_queue() call from
blk_mq_destroy_queue() into its callers. Reflect this change in the
documentation block above blk_mq_destroy_queue().
Cc: Christoph Hellwig <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Cc: Chaitanya Kulkarni <[email protected]>
Cc: Keith Busch <[email protected]>
Fixes: 2b3f056f72e5 ("blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queue")
Signed-off-by: Bart Van Assche <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
When validating drafted SPDK ublk target, in a case that
assigning large queue depth to multiqueue ublk device,
ublk target would run into a weird incorrect state. During
rounds of review and debug, An overflow bug was found
in ublk driver.
In ublk_cmd.h, UBLK_MAX_QUEUE_DEPTH is 4096 which means
each ublk queue depth can be set as large as 4096. But
when setting qd for a ublk device,
sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io)
will be larger than 65535 if qd is larger than 2728.
Then queue_size is overflowed, and ublk_get_queue()
references a wrong pointer position. The wrong content of
ublk_queue elements will lead to out-of-bounds memory
access.
Extend queue_size in ublk_device as "unsigned int".
Signed-off-by: Liu Xiaodong <[email protected]>
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reviewed-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
After commit 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'"),
bic->bfqq will be accessed in bic_set_bfqq(), however, in some context
bic->bfqq will be freed, and bic_set_bfqq() is called with the freed
bic->bfqq.
Fix the problem by always freeing bfqq after bic_set_bfqq().
Fixes: 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'")
Reported-and-tested-by: Shinichiro Kawasaki <[email protected]>
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- flush initial scan_work for async probe (Keith Busch)
- fix passthrough csi check (Keith Busch)
- fix nvme-fc initialization order (Ross Lagerwall)"
* tag 'nvme-6.2-2023-01-26' of git://git.infradead.org/nvme:
nvme: fix passthrough csi check
nvme-pci: flush initial scan_work for async probe
nvme-fc: fix initialization order
|
|
The 'ublk_chr_class' is needed when deleting ublk char devices in
ublk_exit(), so move it after devices(idle) are removed.
Fixes the following warning reported by Harris, James R:
[ 859.178950] sysfs group 'power' not found for kobject 'ublkc0'
[ 859.178962] WARNING: CPU: 3 PID: 1109 at fs/sysfs/group.c:278 sysfs_remove_group+0x9c/0xb0
Reported-by: "Harris, James R" <[email protected]>
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Link: https://lore.kernel.org/linux-block/Y9JlFmSgDl3+zy3N@T590/T/#t
Signed-off-by: Ming Lei <[email protected]>
Tested-by: Jim Harris <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
The namespace head saves the Command Set Indicator enum, so use that
instead of the Command Set Selected. The two values are not the same.
Fixes: 831ed60c2aca2d ("nvme: also return I/O command effects from nvme_command_effects")
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
The nvme device may have a namespace with the root partition, so make
sure we've completed scanning before returning from the async probe.
Fixes: eac3ef262941 ("nvme-pci: split the initial probe from the rest path")
Reported-by: Klaus Jensen <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Tested-by: Ville Syrjälä <[email protected]>
Tested-by: Klaus Jensen <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
ctrl->ops is used by nvme_alloc_admin_tag_set() but set by
nvme_init_ctrl() so reorder the calls to avoid a NULL pointer
dereference.
Fixes: 6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers")
Signed-off-by: Ross Lagerwall <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- fix controller shutdown regression in nvme-apple (Janne Grunau)
- fix a polling on timeout regression in nvme-pci (Keith Busch)"
* tag 'nvme-6.2-2023-01-20' of git://git.infradead.org/nvme:
nvme-pci: fix timeout request state check
nvme-apple: only reset the controller when RTKit is running
nvme-apple: reset controller during shutdown
|
|
Polling the completion can progress the request state to IDLE, either
inline with the completion, or through softirq. Either way, the state
may not be COMPLETED, so don't check for that. We only care if the state
isn't IN_FLIGHT.
This is fixing an issue where the driver aborts an IO that we just
completed. Seeing the "aborting" message instead of "polled" is very
misleading as to where the timeout problem resides.
Fixes: bf392a5dc02a9b ("nvme-pci: Remove tag from process cq")
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
NVMe controller register access hangs indefinitely when the co-processor
is not running. A missed reset is preferable over a hanging thread since
it could be recoverable.
Signed-off-by: Janne Grunau <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
This is a functional revert of c76b8308e4c9 ("nvme-apple: fix controller
shutdown in apple_nvme_disable").
The commit broke suspend/resume since apple_nvme_reset_work() tries to
disable the controller on resume. This does not work for the apple NVMe
controller since register access only works while the co-processor
firmware is running.
Disabling the NVMe controller in the shutdown path is also required
for shutting the co-processor down. The original code was appropriate
for this hardware. Add a comment to prevent a similar breaking changes
in the future.
Fixes: c76b8308e4c9 ("nvme-apple: fix controller shutdown in apple_nvme_disable")
Reported-by: Janne Grunau <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Janne Grunau <[email protected]>
[hch: updated with a more descriptive comment from Hector Martin]
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
When there are no read queues read requests will be assigned a
default queue on allocation. However, blk_mq_get_cached_request() is not
prepared for that and will fail all attempts to grab read requests from
the cache. Worst case it doubles the number of requests allocated,
roughly half of which will be returned by blk_mq_free_plug_rqs().
It only affects batched allocations and so is io_uring specific.
For reference, QD8 t/io_uring benchmark improves by 20-35%.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
We need to pass 'end - 1' to ida_alloc_max after switch from
ida_simple_get to ida_alloc_max.
Otherwise smatch warns.
drivers/block/rnbd/rnbd-clt.c:1460 init_dev() error: Calling ida_alloc_max() with a 'max' argument which is a power of 2. -1 missing?
Fixes: 24afc15dbe21 ("block/rnbd: Remove a useless mutex")
Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Guoqing Jiang <[email protected]>
Acked-by: Jack Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
If the policy defines pd_online_fn(), it should be called after
pd_init_fn(), like blkg_create().
Signed-off-by: Yu Kuai <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
The revert of the removal of this driver happened after we fixed up
the split limits for NOWAIT issue, hence it got missed. Ensure that
we check for a NULL bio after splitting, in case it should be retried.
Marking this as fixing both commits, so that stable backport will do
this correctly.
Cc: [email protected]
Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
Fixes: 4b83e99ee709 ("Revert "pktcdvd: remove driver."")
Signed-off-by: Jens Axboe <[email protected]>
|
|
The updating of 'bfqg->ref' should be protected by 'bfqd->lock', however,
during code review, we found that bfq_pd_free() update 'bfqg->ref'
without holding the lock, which is problematic:
1) bfq_pd_free() triggered by removing cgroup is called asynchronously;
2) bfqq will grab bfqg reference, and exit bfqq will drop the reference,
which can concurrent with 1).
Unfortunately, 'bfqd->lock' can't be held here because 'bfqd' might already
be freed in bfq_pd_free(). Fix the problem by using atomic refcount apis.
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.2
Pull MD fix from Song:
"It fixes an issue introduced by recent code refactor."
* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: fix incorrect declaration about claim_rdev in md_import_device
|
|
Commit fb541ca4c365 ("md: remove lock_bdev / unlock_bdev") removes
wrappers for blkdev_get/blkdev_put. However, the uninitialized local
static variable of pointer type 'claim_rdev' in md_import_device()
is NULL, which leads to the following warning call trace:
WARNING: CPU: 22 PID: 1037 at block/bdev.c:577 bd_prepare_to_claim+0x131/0x150
CPU: 22 PID: 1037 Comm: mdadm Not tainted 6.2.0-rc3+ #69
..
RIP: 0010:bd_prepare_to_claim+0x131/0x150
..
Call Trace:
<TASK>
? _raw_spin_unlock+0x15/0x30
? iput+0x6a/0x220
blkdev_get_by_dev.part.0+0x4b/0x300
md_import_device+0x126/0x1d0
new_dev_store+0x184/0x240
md_attr_store+0x80/0xf0
kernfs_fop_write_iter+0x128/0x1c0
vfs_write+0x2be/0x3c0
ksys_write+0x5f/0xe0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
It turns out the md device cannot be used:
md: could not open device unknown-block(259,0).
md: md127 stopped.
Fix the issue by declaring the local static variable of struct type
and passing the pointer of the variable to blkdev_get_by_dev().
Fixes: fb541ca4c365 ("md: remove lock_bdev / unlock_bdev")
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Adrian Huang <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Song Liu <[email protected]>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- Identify quirks for Apple controllers (Hector Martin)
- fix error handling in nvme_pci_enable (Tong Zhang)
- refuse unprivileged passthrough on partitions (Christoph Hellwig)
- fix MAINTAINERS to not match nvmem subsystem headers (Russell King)"
* tag 'nvme-6.2-2023-01-12' of git://git.infradead.org/nvme:
MAINTAINERS: stop nvme matching for nvmem files
nvme: don't allow unprivileged passthrough on partitions
nvme: replace the "bool vec" arguments with flags in the ioctl path
nvme: remove __nvme_ioctl
nvme-pci: fix error handling in nvme_pci_enable()
nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers
nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression
|
|
The nvme patterns detect all include files starting with nvme, which
also picks up the nvmem subsystem header files. Fix this by using
a more specific pattern.
Signed-off-by: Russell King (Oracle) <[email protected]>
[hch: switched to a purely inclusive pattern instead of excluding nvmem*]
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Passthrough commands can always access the entire device, and thus
submitting them on partitions is an privelege escalation.
In hindsight we should have never allowed any passthrough commands on
partitions, but it's probably too late to change that decision now.
Fixes: e4fbcf32c860 ("nvme: identify-namespace without CAP_SYS_ADMIN")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
|
|
To prepare for passing down more information, replace the boolean
vec argument with a more extensible flags one.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
|
|
Open code __nvme_ioctl in the two callers to make future changes that
pass down additional paramters in the ioctl path easier.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
|
|
There are two issues in nvme_pci_enable():
1) If pci_alloc_irq_vectors() fails, device is left enabled. Fix this by
adding a goto disable statement.
2) nvme_pci_configure_admin_queue could return -ENODEV, in this case,
we will need to free IRQ properly. Otherwise the following warning
could be triggered:
[ 5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140
[ 5.290547] Call Trace:
[ 5.290626] <TASK>
[ 5.290695] msi_remove_device_irq_domain+0xc9/0xf0
[ 5.290843] msi_device_data_release+0x15/0x80
[ 5.290978] release_nodes+0x58/0x90
[ 5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80
[ 5.297573] Call Trace:
[ 5.297651] <TASK>
[ 5.297719] release_nodes+0x58/0x90
[ 5.297831] devres_release_all+0xef/0x140
[ 5.298339] device_unbind_cleanup+0x11/0xc0
[ 5.298479] really_probe+0x296/0x320
Fixes: a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
Co-developed-by: Keith Busch <[email protected]>
Signed-off-by: Tong Zhang <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
This mirrors the quirk added to Apple Silicon controllers in apple.c.
These controllers do not support the Active NS ID List command and
behave identically to the SoC version judging by existing user
reports/syslogs, so will need the same fix. This quirk reverts
back to NVMe 1.0 behavior and disables the broken commands.
Fixes: 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues")
Signed-off-by: Hector Martin <[email protected]>
Tested-by: Orlando Chamberlain <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
From the get-go, this driver and the ANS syslog have been complaining
about namespace identification. In 6.2-rc1, commit 811f4de0344d ("nvme:
avoid fallback to sequential scan due to transient issues") regressed
the driver by no longer allowing fallback to sequential namespace scans,
leaving us with no namespaces.
It turns out that the real problem is that this controller claiming
NVMe 1.1 compat is treating the CNS field as a binary field, as in NVMe
1.0. This already has a quirk, NVME_QUIRK_IDENTIFY_CNS, so set it for
the controller to fix all this nonsense (including other errors
triggered by other CNS commands).
Fixes: 811f4de0344d ("nvme: avoid fallback to sequential scan due to transient issues")
Fixes: 5bd2927aceba ("nvme-apple: Add initial Apple SoC NVMe driver")
Signed-off-by: Hector Martin <[email protected]>
Reviewed-by: Sven Peter <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Dan reports the following smatch detected the following:
block/blk-cgroup.c:1863 blkcg_schedule_throttle() warn: sleeping in atomic context
caused by blkcg_schedule_throttle() calling blk_put_queue() in an
non-sleepable context.
blk_put_queue() acquired might_sleep() in 63f93fd6fa57 ("block: mark
blk_put_queue as potentially blocking") which transferred the might_sleep()
from blk_free_queue().
blk_free_queue() acquired might_sleep() in e8c7d14ac6c3 ("block: revert back
to synchronous request_queue removal") while turning request_queue removal
synchronous. However, this isn't necessary as nothing in the free path
actually requires sleeping.
It's pretty unusual to require a sleeping context in a put operation and
it's not needed in the first place. Let's drop it.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Link: https://lkml.kernel.org/r/Y7g3L6fntnTtOm63@kili
Cc: Christoph Hellwig <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Fixes: e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal") # v5.9+
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
This reverts commit f40eb99897af665f11858dd7b56edcb62c3f3c67.
There are apparently still users out there of this driver. While we'd
love to remove it to ease the maintenance burden, let's reinstate it
for now until better (userspace) solutions can be developed.
Link: https://lore.kernel.org/lkml/20230104190115.ceglfefco475ev6c@pali/
Reported-by: Pali Rohár <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This reverts commit 85d6ce58e493ac8b7122e2fbe3f41b94d6ebdc11.
We're reinstating the pktcdvd driver, which needs this API.
Signed-off-by: Jens Axboe <[email protected]>
|
|
This reverts commit db1c7d77976775483a8ef240b4c705f113e13ea1.
We're reinstating the pktcdvd driver, which needs this API.
Signed-off-by: Jens Axboe <[email protected]>
|
|
Most of control command handlers may sleep, so return -EAGAIN in case
of IO_URING_F_NONBLOCK to defer the handling into io wq context.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reported-by: Jens Axboe <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
If we split a bio marked with REQ_NOWAIT, then we can trigger spurious
EAGAIN if constituent parts of that split bio end up failing request
allocations. Parts will complete just fine, but just a single failure
in one of the chained bios will yield an EAGAIN final result for the
parent bio.
Return EAGAIN early if we end up needing to split such a bio, which
allows for saner recovery handling.
Cc: [email protected] # 5.15+
Link: https://github.com/axboe/liburing/issues/766
Reported-by: Michael Kelley <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This can't happen right now, but in preparation for allowing
bio_split_to_limits() returning NULL if it ended the bio, check for it
in all the callers.
Signed-off-by: Jens Axboe <[email protected]>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- fix various problems in handling the Command Supported and Effects log
(Christoph Hellwig)
- don't allow unprivileged passthrough of commands that don't transfer
data but modify logical block content (Christoph Hellwig)
- add a features and quirks policy document (Christoph Hellwig)
- fix some really nasty code that was correct but made smatch complain
(Sagi Grimberg)"
* tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme:
nvme-auth: fix smatch warning complaints
nvme: consult the CSE log page for unprivileged passthrough
nvme: also return I/O command effects from nvme_command_effects
nvmet: don't defer passthrough commands with trivial effects to the workqueue
nvmet: set the LBCC bit for commands that modify data
nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it
nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition
docs, nvme: add a feature and quirk policy document
|
|
When initializing auth context, there may be no secrets passed
by the user. Make return code explicit when returning successfully.
smatch warnings:
drivers/nvme/host/auth.c:950 nvme_auth_init_ctrl() warn: missing error code? 'ret'
Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Commands like Write Zeros can change the contents of a namespaces without
actually transferring data. To protect against this, check the Commands
Supported and Effects log is supported by the controller for any
unprivileg command passthrough and refuse unprivileged passthrough if the
command has any effects that can change data or metadata.
Note: While the Commands Support and Effects log page has only been
mandatory since NVMe 2.0, it is widely supported because Windows requires
it for any command passthrough from userspace.
Fixes: e4fbcf32c860 ("nvme: identify-namespace without CAP_SYS_ADMIN")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
|
|
To be able to use the Commands Supported and Effects Log for allowing
unprivileged passtrough, it needs to be corretly reported for I/O
commands as well. Return the I/O command effects from
nvme_command_effects, and also add a default list of effects for the
NVM command set. For other command sets, the Commands Supported and
Effects log is required to be present already.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
|
|
Mask out the "Command Supported" and "Logical Block Content Change" bits
and only defer execution of commands that have non-trivial effects to
the workqueue for synchronous execution. This allows to execute admin
commands asynchronously on controllers that provide a Command Supported
and Effects log page, and will keep allowing to execute Write commands
asynchronously once command effects on I/O commands are taken into
account.
Fixes: c1fef73f793b ("nvmet: add passthru code to process commands")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
|
|
Write, Write Zeroes, Zone append and a Zone Reset through
Zone Management Send modify the logical block content of a namespace,
so make sure the LBCC bit is reported for them.
Fixes: b5d0b38c0475 ("nvmet: add Command Set Identifier support")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
|
|
Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a
single value to multiple array entries instead of repeated assignments.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
|