Age | Commit message (Collapse) | Author | Files | Lines |
|
A negative status has the "do not retry" bit set, which makes it not
retryable. Use a fake status that can potentially be retried on reset.
An aborted command's status is overridden by the timeout handler so
that it won't be retried, which is necessary to keep initialization from
getting into a reset loop.
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The tag set queue depth needs to be one less than the h/w queue depth
so we don't wrap the circular buffer. This conforms to the specification
defined "Full Queue" condition.
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Translation SCSI commands to NVMe commands is rather pointless in general
as applications must not expext to be able to use SCSI commands on a
generic block device.
Make the huge translation layer optional and hope no one will ever enable
it in the future.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Make sure we synchronize access to the namespaces list and grab a reference
to the namespace before doing I/O. Make sure to reject the ioctl if multiple
namespaces are present as it's entirely unsafe, and warn when using it even
with a single namespace.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Currently traversal and modification of ctrl->namespaces happens completely
unsynchronized, which can be fixed by the addition of a simple mutex.
Note: nvme_dev_ioctl will be handled in the next patch.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Nothing pci specific about them and We'll need them exported
in other transports too.
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We are having build failure with sparc allmodconfig with the error:
drivers/nvme/host/pci.c:15:0:
include/linux/aer.h: In function 'pci_enable_pcie_error_reporting':
include/linux/aer.h:49:10: error: 'EINVAL' undeclared (first use in this function)
The file aer.h is using the error values but they are defined in
errno.h. Include errno.h so that we have the definitions of the error
codes.
Fixes: a0a3408ee614 ("NVMe: Add pci error handlers")
Cc: Keith Busch <[email protected]>
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Exposes the NGUID, EUI-64, and NSID to sysfs entries under the disk's
kobject.
Signed-off-by: Keith Busch <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Requests enabling pcie aer support. Shuts down the controller on error
detected with io frozen state prior to requesting slot reset; resumes
controller after reset completes.
Signed-off-by: Keith Busch <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This was added for the 'magic' AEN requests in the NVMe driver that never
return. We now handle them purely inside the driver and don't need this
core hack any more.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Merge the two per-request structures in the nvme driver into a single
one.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We need to move freeing of resources to the ->complete handler to ensure
they are also freed when we cancel the command.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Now that all commands are executed as block layer requests we can remove the
internal completion in the NVMe driver. Note that we can simply call
blk_mq_complete_request to abort commands as the block layer will protect
against double copletions internally.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
AEN requests are different from other requests in that they don't time out
or can easily be cancelled. Because of that we should not use the blk-mq
infrastructure but just special case them in the completion path.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
And remove the now unused nvme_submit_cmd helper.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We'll need them in other places later.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The number in tag_set->queue depth includes the reserved tags.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We no longer require the two-pass setup for block integrity.
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We don't want to allow new references to open on a device that is
removed. This ties the lifetime of these handles to the physical device's
presence rather than to the open reference count.
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Removes all usage of the global work queue so work can't be
scheduled on two different work queues, and removes nvme's work queue
singlethreadedness so controllers can be driven in parallel.
Signed-off-by: Keith Busch <[email protected]>
[hch: keep the dead controller removal on the system workqueue to avoid
deadlocks]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The NVMe 1.1 specification provides an identify mode to return a
list of active namespaces. This is more efficient to discover which
namespace identifiers are active on a controller, providing potentially
significant improvement in scan time for controllers with sparesly
populated namespaces.
Signed-off-by: Keith Busch <[email protected]>
[hch: add quirk for the broken Qemu Identify implementation. To be relaxed
later]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
There is no lock to sychronize access to the abort_limit field of
struct nvme_ctrl, so switch it to an atomic_t.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Compared to the kthread this gives us multiple call prevention for free.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
If we're using two work queues we're always going to run into races where
one item is tearing down what the other one is initializing. So insted
merge the two work queues, and let the old probe_work also tear the
controller down first if it was alive. Together with the better detection
of the probe path using a flag this gives us a properly serialized
reset/probe path that also doesn't accidentally trigger when two commands
time out and the second one tries to reset the controller while the first
reset is still in progress.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Otherwise we're never going to complete a command when it is restarted just
after we completed all other outstanding commands in nvme_clear_queue.
The controller must be disabled prior to completing a presumed lost
command, do this by directly shutting down the controller before
queueing the reset work, and return EH_HANDLED from the timeout handler
after we shut the controller down.
Signed-off-by: Keith Busch <[email protected]>
[hch: split and rebase]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Don't delete the controller from dev_list before queuing a reset, instead
just check for it being reset in the polling kthread. This allows to remove
the dev_list_lock in various places, and in addition we can simply rely on
checking the queue_work return value to see if we could reset a controller.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
To properly document how we are using a negative Linux error value to
communicate request cancellations inside the driver.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We want to be able to return bettern error values frmo nvme_timeout, which
is significantly easier if the two functions are merged. Also clean up and
reduce the printk spew so that we only get one message per abort.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
There is nothing it protects, but it makes lockdep unhappy in many different
ways.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Keith Busch <[email protected]>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Without this we can easily get bad derferences on nvmeq->d_db when the nvme
kthread tries to poll the CQs for controllers that are in half initialized
state.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Half initialized queues due to kernel error returns or timeout are still a
good reason to give up on initializing a controller.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Timer context is not very useful for drivers to perform any meaningful abort
action from. So instead of calling the driver from this useless context
defer it to a workqueue as soon as possible.
Note that while a delayed_work item would seem the right thing here I didn't
dare to use it due to the magic in blk_add_timer that pokes deep into timer
internals. But maybe this encourages Tejun to add a sensible API for that to
the workqueue API and we'll all be fine in the end :)
Contains a major update from Keith Bush:
"This patch removes synchronizing the timeout work so that the timer can
start a freeze on its own queue. The timer enters the queue, so timer
context can only start a freeze, but not wait for frozen."
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The "|" operator has higher precedence than "?:" so this didn't work as
intended. I had previously fixed this bug, but it we copied the older
unfixed version when we moved the function between files.
Fixes: 1673f1f08c88 ('nvme: move block_device_operations and ns/ctrl freeing to common code')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We recently changed bio_integrity_alloc() to return ERR_PTRs instead of
NULL but these calls were missed.
Fixes: 06c1e3902aa7 ('blk-integrity: empty implementation when disabled')
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The nvme_user_cmd function was recently moved around from one file
to another, which made a warning reappear that I had fixed before
at some point:
drivers/nvme/host/core.c: In function 'nvme_user_cmd':
drivers/nvme/host/core.c:424:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
This applies the same workaround that we have elsewhere in the
driver with an extra type cast to uintptr_t.
Signed-off-by: Arnd Bergmann <[email protected]>
Fixes: 1673f1f08c88 ("nvme: move block_device_operations and ns/ctrl freeing to common code")
Link: https://lkml.org/lkml/2015/10/9/611
Signed-off-by: Jens Axboe <[email protected]>
|
|
Looks like I didn't test with CONFIG_NVM enabled, and neither did
the build bot.
Most of this is really weird crazy shit in the lighnvm support, though.
Struct nvme_ns is a structure for the NVM I/O command set, and it has
no business poking into it. Second this commit:
commit 47b3115ae7b799be8b77b0f024215ad4f68d6460
Author: Wenwei Tao <[email protected]>
Date: Fri Nov 20 13:47:55 2015 +0100
nvme: lightnvm: use admin queues for admin cmds
Does even more crazy stuff. If a function gets a request_queue parameter
passed it'd better use that and not look for another one.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch moves the blk_integrity_payload definition outside the
CONFIG_BLK_DEV_INTERITY dependency and provides empty function
implementations when the kernel configuration disables integrity
extensions. This simplifies drivers that make use of these to map user
data so they don't need to repeat the same configuration checks.
Signed-off-by: Keith Busch <[email protected]>
Updated by Jens to pass an error pointer return from
bio_integrity_alloc(), otherwise if CONFIG_BLK_DEV_INTEGRITY isn't
set, we return a weird ENOMEM from __nvme_submit_user_cmd()
if a meta buffer is set.
Signed-off-by: Jens Axboe <[email protected]>
|
|
Split out a helper that just issues the Set Features and interprets the
result which can go to common code, and document why we are ignoring
non-timeout error returns in the PCIe driver.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
For this we need to add a proper controller init routine and a list of
all controllers that is in addition to the list of PCIe controllers,
which stays in pci.c. Note that we remove the sysfs device when the
last reference to a controller is dropped now - the old code would have
kept it around longer, which doesn't make much sense.
This requires a new ->reset_ctrl operation to implement controleller
resets, and a new ->write_reg32 operation that is required to implement
subsystem resets. We also now store caches copied of the NVMe compliance
version and the flag if a controller is attached to a subsystem or not in
the generic controller structure now.
Signed-off-by: Christoph Hellwig <[email protected]>
[Fixes for pr merge]
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The namespace scanning code has been mostly generic already, we just
need to store a pointer to the tagset in the nvme_ctrl structure, and
add a method to check if a controller is I/O incapable. The latter
will hopefully be replaced by a proper controller state machine soon.
Signed-off-by: Christoph Hellwig <[email protected]>
[Fixed pr conflicts]
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We want to record the identify and CAP values even if no I/O queue
is available.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
And add the 64-bit register read operation for it.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Remove the calculation of all the bits written into the CC register into
nvme_enable_ctrl, so that they can be moved into the core NVMe driver in
the future.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add an enum for all workarounds not in the spec and identify the affected
controllers at probe time.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This moves the block_device_operations over to common code mostly
as-is. The only change is that the ns and ctrl refcounting got some
small refcounting to have wrappers around the kref_put operations.
A new free_ctrl operation is added to allow the PCI driver to free
it's ressources on the final drop.
Signed-off-by: Christoph Hellwig <[email protected]>
[Moved the integrity and pr changes due to merge conflict]
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Use the integrity API to pass through metadata from userspace. For PI
enabled devices this means that we now validate the reftag, which seems
like an unintentional ommission in the old code.
Thanks to Keith Busch for testing and fixes.
Signed-off-by: Christoph Hellwig <[email protected]>
[Skip metadata setup on admin commands]
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|