aboutsummaryrefslogtreecommitdiff
path: root/drivers/lightnvm
AgeCommit message (Collapse)AuthorFilesLines
2018-06-01lightnvm: proper error handling for pblk_bio_add_pagesIgor Konopko1-2/+4
Currently in case of error caused by bio_pc_add_page in pblk_bio_add_pages two issues occur when calling from pblk_rb_read_to_bio(). First one is in pblk_bio_free_pages, since we are trying to free pages not allocated from our mempool. Second one is the warn from dma_pool_free, that we are trying to free NULL pointer dma. This commit fix both issues. Signed-off-by: Igor Konopko <[email protected]> Signed-off-by: Marcin Dziegielewski <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: fix smeta write error pathHans Holmberg1-3/+4
Smeta write errors were previously ignored. Skip these lines instead and throw them back on the free list, so the chunks will go through a reset cycle before we attempt to use the line again. Signed-off-by: Hans Holmberg <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: garbage collect lines with failed writesHans Holmberg7-64/+200
Write failures should not happen under normal circumstances, so in order to bring the chunk back into a known state as soon as possible, evacuate all the valid data out of the line and let the fw judge if the block can be written to in the next reset cycle. Do this by introducing a new gc list for lines with failed writes, and ensure that the rate limiter allocates a small portion of the write bandwidth to get the job done. The lba list is saved in memory for use during gc as we cannot gurantee that the emeta data is readable if a write error occurred. Signed-off-by: Hans Holmberg <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: rework write error recovery pathHans Holmberg5-229/+181
The write error recovery path is incomplete, so rework the write error recovery handling to do resubmits directly from the write buffer. When a write error occurs, the remaining sectors in the chunk are mapped out and invalidated and the request inserted in a resubmit list. The writer thread checks if there are any requests to resubmit, scans and invalidates any lbas that have been overwritten by later writes and resubmits the failed entries. Signed-off-by: Hans Holmberg <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: remove dead functionJavier González2-8/+0
Remove dead function for manual sync. I/O Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pass flag on graceful teardown to targetsJavier González5-17/+34
If the namespace is unregistered before the LightNVM target is removed (e.g., on hot unplug) it is too late for the target to store any metadata on the device - any attempt to write to the device will fail. In this case, pass on a "gracefull teardown" flag to the target to let it know when this happens. In the case of pblk, we pad the open line (close all open chunks) to improve data retention. In the event of an ungraceful shutdown, avoid this part and just clean up. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: check for chunk size before allocating itJavier González1-3/+3
Do the check for the chunk state after making sure that the chunk type is supported. Fixes: 32ef9412c114 ("lightnvm: pblk: implement get log report chunk") Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: remove unnecessary argumentJavier González3-5/+5
Remove unnecessary argument on pblk_line_free() Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: remove unnecessary indirectionJavier González1-12/+2
Call nvm_submit_io directly and remove an unnecessary indirection on the read path. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: return NVM_ error on failed submissionJavier González1-14/+8
Return a meaningful error when the sanity vector I/O check fails. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: warn in case of corrupted write bufferJavier González1-3/+2
When cleaning up buffer entries as we wrap up, their state should be "completed". If any of the entries is in "submitted" state, it means that something bad has happened. Trigger a warning immediately instead of waiting for the state flag to eventually be updated, thus hiding the issue. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: improve error msg on corrupted LBAsJavier González1-10/+32
In the event of a mismatch between the read LBA and the metadata pointer reported by the device, improve the error message to be able to detect the offending physical address (PPA) mapped to the corrupted LBA. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: check read lba on gc pathJavier González1-6/+33
Check that the lba stored in the LBA metadata is correct in the GC path too. This requires a new helper function to check random reads in the vector read. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: recheck for bad lines at runtimeJavier González2-14/+35
Bad blocks can grow at runtime. Check that the number of valid blocks in a line are within the sanity threshold before allocating the line for new writes. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-01lightnvm: pblk: fail gracefully on line alloc. failureJavier González2-9/+29
In the event of a line failing to allocate, fail gracefully and stop the pipeline to avoid more write failing in the same place. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-05-30lightnvm: convert to bioset_init()/mempool_init()Kent Overstreet6-65/+65
Convert lightnvm to embedded bio sets. Reviewed-by: Javier González <[email protected]> Signed-off-by: Kent Overstreet <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: remove some unnecessary NULL checksDan Carpenter1-4/+2
Smatch complains that flush_workqueue() dereferences the work queue pointer but then we check if it's NULL on the next line when it's too late. These NULL checks can be removed because the module won't load if we can't allocate the work queues. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: don't recover unwritten linesHans Holmberg1-0/+18
If the line has not been written to, we should not try to recover any data from it, so check the state of the chunks in the line before attempting to read smeta. Signed-off-by: Hans Holmberg <[email protected]> Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: implement 2.0 supportJavier González3-64/+234
Implement 2.0 support in pblk. This includes the address formatting and mapping paths, as well as the sysfs entries for them. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: implement get log report chunkJavier González3-82/+285
In preparation of pblk supporting 2.0, implement the get log report chunk in pblk. Also, define the chunk states as given in the 2.0 spec. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: rename ppaf* to addrf*Javier González3-14/+14
In preparation for 2.0 support in pblk, rename variables referring to the address format to addrf and reserve ppaf for the 1.2 path. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: check for supported versionJavier González1-2/+8
At this point, only 1.2 spec is supported, thus check for it. Also, since device-side L2P is only supported in the 1.2 spec, make sure to only check its value under 1.2. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: implement get log report chunk helpersJavier González1-0/+11
The 2.0 spec provides a report chunk log page that can be retrieved using the stangard nvme get log page. This replaces the dedicated get/put bad block table in 1.2. This patch implements the helper functions to allow targets retrieve the chunk metadata using get log page. It makes nvme_get_log_ext available outside of nvme core so that we can use it form lightnvm. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: make address conversions depend on generic deviceJavier González1-2/+2
On address conversions, use the generic device, instead of the target device. This allows to use conversions outside of the target's realm. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: add support for 2.0 address formatJavier González5-21/+21
Add support for 2.0 address format. Also, align address bits for 1.2 and 2.0 to be able to operate on channel and luns without requiring a format conversion. Use a generic address format for this purpose. Also, convert the generic operations to the generic format in pblk. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: normalize geometry nomenclatureJavier González5-74/+73
Normalize nomenclature for naming channels, luns, chunks, planes and sectors as well as derivations in order to improve readability. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: add minor version to generic geometryJavier González1-2/+2
Separate the version between major and minor on the generic geometry and represent it through sysfs in the 2.0 path. The 1.2 path only shows the major version to preserve the existing user space interface. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: simplify geometry structureJavier González10-189/+154
Currently, the device geometry is stored redundantly in the nvm_id and nvm_geo structures at a device level. Moreover, when instantiating targets on a specific number of LUNs, these structures are replicated and manually modified to fit the instance channel and LUN partitioning. Instead, create a generic geometry around nvm_geo, which can be used by (i) the underlying device to describe the geometry of the whole device, and (ii) instances to describe their geometry independently. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: refactor init/exit sequencesJavier González1-203/+202
Refactor init and exit sequences to eliminate dependencies among init modules and improve readability. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: Avoid validation of default op valueHeiner Litz1-4/+2
Fixes: 38401d231de65 ("lightnvm: set target over-provision on create ioctl") Signed-off-by: Heiner Litz <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: centralize permission check for lightnvm ioctlJohannes Thumshirn1-18/+3
Currently all functions for handling the lightnvm core ioctl commands do a check for CAP_SYS_ADMIN. Change this to fail early in nvm_ctl_ioctl(), so we don't have to duplicate the permission checks all over. Signed-off-by: Johannes Thumshirn <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: fix bad block initializationHeiner Litz1-1/+2
fix reading bad block device information to correctly setup the per line blk_bitmap during lightnvm initialization Signed-off-by: Heiner Litz <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29nvme: lightnvm: add late setup of block size and metadataMatias Bjørling1-3/+0
The nvme driver sets up the size of the nvme namespace in two steps. First it initializes the device with standard logical block and metadata sizes, and then sets the correct logical block and metadata size. Due to the OCSSD 2.0 specification relies on the namespace to expose these sizes for correct initialization, let it be updated appropriately on the LightNVM side as well. Signed-off-by: Matias Bjørling <[email protected]> Acked-by: Keith Busch <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: remove nvm_dev_ops->max_phys_sectMatias Bjørling3-32/+13
The value of max_phys_sect is always static. Instead of defining it in the nvm_dev_ops structure, declare it as a global value. Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: remove max_rq_sizeMatias Bjørling1-1/+0
The field is no longer used. Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: add 2.0 geometry identificationMatias Bjørling1-5/+3
Implement the geometry data structures for 2.0 and enable a drive to be identified as one, including exposing the appropriate 2.0 sysfs entries. Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: flatten nvm_id_group into nvm_idMatias Bjørling1-13/+12
There are no groups in the 2.0 specification, make sure that the nvm_id structure is flattened before 2.0 data structures are added. Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: refactor bad block identificationJavier González3-109/+109
In preparation for the OCSSD 2.0 spec. bad block identification, refactor the current code to generalize bad block get/set functions and structures. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: prevent race in pblk_rb_flush_point_setHans Holmberg1-3/+4
Make sure that we are not advancing the sync pointer while we're adding bios to the write buffer entry completion list. This race condition results in bios not completing and was identified by a hang when running xfstest generic/113. Signed-off-by: Hans Holmberg <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: allow allocation of new lines during shutdownHans Holmberg1-7/+0
When shutting down pblk the write buffer is flushed and if the current line can't fit the data in the write buffer we need to allocate a new line, so remove the check that prevents this. Signed-off-by: Hans Holmberg <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: delete writer kick timer before stopping threadHans Holmberg1-1/+1
Unless we delete the timer that wakes up the write thread before we stop the thread we risk re-starting the thread, so delete the timer first. Signed-off-by: Hans Holmberg <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: add padding distribution sysfs attributeHans Holmberg4-13/+113
When pblk receives a sync, all data up to that point in the write buffer must be comitted to persistent storage, and as flash memory comes with a minimal write size there is a significant cost involved both in terms of time for completing the sync and in terms of write amplification padded sectors for filling up to the minimal write size. In order to get a better understanding of the costs involved for syncs, Add a sysfs attribute to pblk: padded_dist, showing a normalized distribution of sectors padded. In order to facilitate measurements of specific workloads during the lifetime of the pblk instance, the distribution can be reset by writing 0 to the attribute. Do this by introducing counters for each possible padding: {0..(minimal write size - 1)} and calculate the normalized distribution when showing the attribute. Signed-off-by: Hans Holmberg <[email protected]> Signed-off-by: Javier González <[email protected]> Rearranged total_buckets statement in pblk_sysfs_get_padding_dist Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: export write amplification counters to sysfsHans Holmberg8-10/+168
In a SSD, write amplification, WA, is defined as the average number of page writes per user page write. Write amplification negatively affects write performance and decreases the lifetime of the disk, so it's a useful metric to add to sysfs. In plkb's case, the number of writes per user sector is the sum of: (1) number of user writes (2) number of sectors written by the garbage collector (3) number of sectors padded (i.e. due to syncs) This patch adds persistent counters for 1-3 and two sysfs attributes to export these along with WA calculated with five decimals: write_amp_mileage: the accumulated write amplification stats for the lifetime of the pblk instance write_amp_trip: resetable stats to facilitate delta measurements, values reset at creation and if 0 is written to the attribute. 64-bit counters are used as a 32 bit counter would wrap around already after about 17 TB worth of user data. It will take a long long time before the 64 bit sector counters wrap around. The counters are stored after the bad block bitmap in the first emeta sector of each written line. There is plenty of space in the first emeta sector, so we don't need to bump the major version of the line data format. Signed-off-by: Hans Holmberg <[email protected]> Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: check data lines version on recoveryHans Holmberg3-5/+46
As a preparation for future bumps of data line persistent storage versions, we need to start checking the emeta line version during recovery. Also slit up the current emeta/smeta version into two bytes (major,minor). Recovering lines with the same major number as the current pblk data line version must succeed. This means that any changes in the persistent format must be: (1) Backward compatible: if we switch back to and older kernel, recovery of lines stored with major == current_major and minor > current_minor must succeed. (2) Forward compatible: switching to a newer kernel, recovery of lines stored with major=current_major and minor < minor must handle the data format differences gracefully(i.e. initialize new data structures to default values). If we detect lines that have a different major number than the current we must abort recovery. The user must manually migrate the data in this case. Previously the version stored in the emeta header was copied from smeta, which has version 1, so we need to set the minor version to 1. Signed-off-by: Hans Holmberg <[email protected]> Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm: pblk: handle bad sectors in the emeta area correctlyHans Holmberg1-5/+6
Unless we check if there are bad sectors in the entire emeta-area we risk ending up with valid bitmap / available sector count inconsistency. This results in lines with a bad chunk at the last LUN marked as bad, so go through the whole emeta area and mark up the invalid sectors. Signed-off-by: Hans Holmberg <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-29lightnvm/pblk-gc: Delete an error message for a failed memory allocation in ↵Markus Elfring1-3/+1
pblk_gc_line_prepare_ws() Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <[email protected]> Reviewed-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-03-08block: Use blk_queue_flag_*() in drivers instead of queue_flag_*()Bart Van Assche1-1/+1
This patch has been generated as follows: for verb in set_unlocked clear_unlocked set clear; do replace-in-files queue_flag_${verb} blk_queue_flag_${verb%_unlocked} \ $(git grep -lw queue_flag_${verb} drivers block/bsg*) done Except for protecting all queue flag changes with the queue lock this patch does not change any functionality. Cc: Mike Snitzer <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Acked-by: Martin K. Petersen <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-02-28block: Add 'lock' as third argument to blk_alloc_queue_node()Bart Van Assche1-1/+1
This patch does not change any functionality. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Philipp Reisner <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-01-05lightnvm: pblk: refactor pblk_ppa_comp functionMatias Bjørling1-4/+1
Shorten function to simply return the value of the if statement. Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-01-05lightnvm: pblk: add iostat supportJavier González3-12/+25
Since pblk registers its own block device, the iostat accounting is not automatically done for us. Therefore, add the necessary accounting logic to satisfy the iostat interface. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>