path: root/drivers/md/dm-raid.c
Age  Commit message  Author  Files  Lines
2016-06-14dm raid: add prerequisite functions and definitions for reshapingHeinz Mauelshagen1-22/+202
Add rs_is_reshapable(), rs_data_stripes(), rs_reshape_requested(), rs_set_dev_and_array_sectors() and rs_adjust_data_offsets(). Remove superfluous check for reshape message. Correct runtime bit definitions to be incremental. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: inverse check for flags from invalid to valid flagsHeinz Mauelshagen1-32/+56
It is more intuitive to manage each raid level's features in terms of what is supported rather than what isn't supported. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
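To illustrate the "valid flags" approach from the entry above, here is a minimal user-space sketch. The flag names, bit values and per-level sets are hypothetical and are not the definitions used in dm-raid.c; the point is only that each level lists what it supports and anything outside that set is rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constructor flag bits; the real names and values live in dm-raid.c. */
#define EX_FLAG_SYNC         (1u << 0)
#define EX_FLAG_REGION_SIZE  (1u << 1)
#define EX_FLAG_STRIPE_CACHE (1u << 2)

/* Hypothetical per-level sets of supported flags (illustrative only). */
#define EX_RAID0_VALID_FLAGS  0u
#define EX_RAID1_VALID_FLAGS  (EX_FLAG_SYNC | EX_FLAG_REGION_SIZE)
#define EX_RAID45_VALID_FLAGS (EX_FLAG_SYNC | EX_FLAG_REGION_SIZE | EX_FLAG_STRIPE_CACHE)

/* A request is acceptable when no requested bit falls outside the valid set. */
static bool ex_flags_are_valid(uint32_t requested, uint32_t valid)
{
	return (requested & ~valid) == 0;
}
```

With this shape, supporting a new flag for one level means adding a bit to that level's set, rather than removing it from every other level's list of invalid flags.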
2016-06-14dm raid: various code cleanupsMike Snitzer1-56/+43
Renamed functions and variables with leading single underscore to have a double underscore. Renamed some functions to have better names. Folded functions that were split out without reason. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: rename functions that alloc and free struct raid_setMike Snitzer1-7/+7
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: remove all the bitops wrappersMike Snitzer1-125/+89
Removes obfuscation that is of little value. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: rename _in_range to __within_rangeMike Snitzer1-14/+14
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: add missing "dm-raid0" module aliasMike Snitzer1-1/+2
Also update module description to "raid0/1/10/4/5/6 target" Reported by Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: rename _argname_by_flag to dm_raid_arg_name_by_flagMike Snitzer1-30/+30
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: bump to v1.9.0 and make the extended SB feature flag reflect itMike Snitzer1-17/+20
No idea what Heinz was doing with the versioning but upstream commit 4c9971ca6a ("dm raid: make sure no feature flags are set in metadata") bumped to 1.8.0 already. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: remove ti_error_* wrappersMike Snitzer1-152/+249
The ti_error_* wrappers added very little. No other DM target has ever gone to such lengths to wrap setting ti->error. Also fixes some NULL dereferences via rs->ti->error. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: tabify appropriate whitespaceMike Snitzer1-62/+62
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: enhance status interface and fixup takeover/raid0Heinz Mauelshagen1-181/+253
The target's status interface has to provide the new 'data_offset' value to allow userspace to retrieve the kernel's offset to the data on each raid device of a raid set. This is the base for out-of-place reshaping required to not write over any data during reshaping (e.g. change raid6_zr -> raid6_nc): - add rs_set_cur() to be able to start up existing array in case of no takeover; use in ctr on takeover check - enhance raid_status() - add supporting functions to get resync/reshape progress and raid device status chars - fixup rebuild table line output race, which fails to emit 'rebuild N' on fully synced/rebuilt devices, because it is relying on the transient 'In_sync' raid device flag - add new status line output for 'data_offset', which'll later be used for out-of-place reshaping - fixup takeover not working for all levels - fixup raid0 message interface oops caused by missing checks for the md threads, which don't exist in case of raid0 - remove ALL_FREEZE_FLAGS not needed for takeover - adjust comments Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: add raid level takeover supportHeinz Mauelshagen1-29/+426
Add raid level takeover support allowing arbitrary takeovers between raid levels supported by md personalities (i.e. raid0, raid1/10 and raid4/5/6): - add rs_config_{backup|restore} function to allow for temporary storing ctr requested layout changes and restore them for takeover conversion decision after the superblocks got loaded and analyzed - add members to store layout to 'struct raid_set' (not mandatory for takeover but needed for reshape in later patch) - add rebuild_disks bitfield to 'struct raid_set' and set bits in ctr to use in setting up takeover (base to address a 'rebuild' related raid_status() table line bug and needed as well for reshape in future patch) - add runtime flags and respective manipulation functions to be able to control e.g. writing of superblocks to the preresume function on takeover and (later) reshape - add functions to detect takeover, check that it is valid (mandatory here to avoid failing on md_run()), setup for it and use in the ctr; those will be likely moved out once reshaping gets added to simplify the ctr - start raid set readonly in ctr and switch to readwrite, optionally updating superblocks, in preresume in order to allow suspend to quiesce any active table before (which involves superblock updates); this ensures the proper sequence of writing the current and any new takeover(/reshape) metadata Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: enhance super_sync() to support new superblock membersHeinz Mauelshagen1-10/+65
Add transferring the new takeover/reshape related superblock members introduced to the super_sync() function: - add/move supporting functions - add failed devices bitfield transfer functions to retrieve the bitfield from superblock format or update it in the superblock - add code to transfer all new members Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: add new reshaping/raid10 format table line options to parameter parserHeinz Mauelshagen1-9/+39
Support the following arguments in the ctr parameter parser: - add 'delta_disks', 'data_offset' taking int and sector respectively - 'raid10_use_near_sets' bool argument to optionally select near sets with supporting raid10 mappings Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-14dm raid: introduce extended superblock and new raid types to support takeover/reshapingHeinz Mauelshagen1-133/+471
Add new members to the dm-raid superblock and new raid types to support takeover/reshape. Add all necessary members needed to support takeover and reshape in one go -- aiming to limit the amount of changes to the superblock layout. This is a larger patch due to the new superblock members, their related flags, validation of both and involved API additions/changes: - add additional members to keep track of: - state about forward/backward reshaping - reshape position - new level, layout, stripe size and delta disks - data offset to current and new data for out-of-place reshapes - failed devices bitfield extensions to keep track of max raid devices - adjust super_validate() to cope with new superblock members - adjust super_init_validation() to cope with new superblock members - add definitions for ctr flags supporting delta disks etc. - add new raid types (raid6_n_6 etc.) - add new raid10 supporting function API (_is_raid10_*()) - adjust to changed raid10 supporting function API Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-13dm raid: use rt_is_raid*() in all appropriate checksHeinz Mauelshagen1-12/+18
Make use of raid type rt_is_*() bool functions for simplification and consistency reasons. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-13dm raid: more use of flag testing wrappersHeinz Mauelshagen1-25/+23
- add _test_flags() function - use it to simplify rs_check_for_invalid_flags() - use _test_flag() throughout Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-13dm raid: check constructor arguments for invalid raid level/argument combinationsHeinz Mauelshagen1-1/+130
Reject invalid flag combinations to avoid potential data corruption or failing raid set construction: - add definitions for constructor flag combinations and invalid flags per level - add bool test functions for the various raid types (also will be used by future reshaping enhancements) - introduce rs_check_for_invalid_flags() and _invalid_flags() to perform the validity checks Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-13dm raid: cleanup / provide infrastructureHeinz Mauelshagen1-196/+228
Provide necessary infrastructure to handle ctr flags and their names and cleanup setting ti->error: - comment constructor flags - introduce constructor flag manipulation - introduce ti_error_*() functions to simplify setting the error message (use in other targets?) - introduce array to hold ctr flag <-> flag name mapping - introduce argument name by flag functions for that array - use those functions throughout the ctr call path Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-13dm raid: use dm_arg_set API in constructorHeinz Mauelshagen1-61/+84
- use dm_arg_set API in ctr and its callees parse_raid_params() and dev_parms() - introduce _in_range() function to check a value is in a [ min, max ] range; this is to support more callers in parsing parameters etc. in the future - correct comment on MAX_RAID_DEVICES Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
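The inclusive [ min, max ] range test mentioned in the entry above can be sketched as follows. This is a user-space illustration with a hypothetical name and signature, not the helper from dm-raid.c.

```c
#include <stdbool.h>

/* Inclusive range test: true when min <= v <= max. */
static bool ex_in_range(long v, long min, long max)
{
	return v >= min && v <= max;
}
```

A parameter parser can then express each bound once, for example `ex_in_range(num_devs, 1, max_devs)` with hypothetical variables, instead of repeating two comparisons at every call site.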
2016-06-13dm raid: rename variable 'ret' to 'r' to conform to other dm codeHeinz Mauelshagen1-32/+36
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-06-07md: use bio op accessorsMike Christie1-2/+3
Separate the op from the rq_flag_bits and have md set/get the bio using bio_set_op_attrs/bio_op. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-13dm raid: make sure no feature flags are set in metadataHeinz Mauelshagen1-1/+6
Given we don't yet support any feature flags in the dm-raid ondisk metadata (see: 'features' member of 'struct dm_raid_superblock'), add a check to ensure no flags are actually set, if any features are set reject the activation of the RAID mapping. This is to prevent possible data corruption in case of a kernel downgrade when there'll potentially be feature flags set by a future dm-raid target. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
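A minimal sketch of the downgrade protection described in the entry above, assuming a simplified in-memory superblock with a single 'features' field. The struct, the function name and the error code are illustrative assumptions; endianness conversion of the on-disk value is omitted.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the superblock; only the field under discussion is shown. */
struct ex_superblock {
	uint32_t features;
};

/*
 * Returns 0 when activation may proceed, -EINVAL when the metadata carries
 * feature bits this kernel does not know. Passing supported == 0 models a
 * target that supports no feature flags at all.
 */
static int ex_check_features(const struct ex_superblock *sb, uint32_t supported)
{
	if (sb->features & ~supported)
		return -EINVAL;
	return 0;
}
```

Rejecting unknown bits up front means an older kernel refuses metadata written by a newer one instead of misinterpreting it.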
2015-10-02dm raid: fix round up of default region sizeMikulas Patocka1-2/+1
Commit 3a0f9aaee028 ("dm raid: round region_size to power of two") intended to make sure that the default region size is a power of two. However, the logic in that commit is incorrect and sets the variable region_size to 0 or 1, depending on whether min_region_size is a power of two. Fix this logic, using roundup_pow_of_two(), so that region_size is properly rounded up to the next power of two. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 3a0f9aaee028 ("dm raid: round region_size to power of two") Cc: stable@vger.kernel.org # v3.8+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
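For readers unfamiliar with the rounding referred to in the entry above, this is a user-space stand-in for what roundup_pow_of_two() computes. It is a sketch with a hypothetical name, not the kernel macro, and it expects a value between 1 and 2^63.

```c
#include <stdint.h>

/* Smallest power of two that is >= n. Precondition: 1 <= n <= 2^63. */
static uint64_t ex_roundup_pow_of_two(uint64_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}
```

A value that is already a power of two is returned unchanged, so applying the rounding unconditionally is safe.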
2015-08-13block: kill merge_bvec_fn() completelyKent Overstreet1-19/+0
As generic_make_request() is now able to handle arbitrarily sized bios, it's no longer necessary for each individual block driver to define its own ->merge_bvec_fn() callback. Remove every invocation completely. Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: drbd-user@lists.linbit.com Cc: Jiri Kosina <jkosina@suse.cz> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@kernel.org> Cc: ceph-devel@vger.kernel.org Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Neil Brown <neilb@suse.de> Cc: linux-raid@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: NeilBrown <neilb@suse.de> (for the 'md' bits) Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> [dpark: also remove ->merge_bvec_fn() in dm-thin as well as dm-era-target, and resolve merge conflicts] Signed-off-by: Dongsu Park <dpark@posteo.net> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-29dm raid: add support for the MD RAID0 personalityHeinz Mauelshagen1-48/+84
Add dm-raid access to the MD RAID0 personality to enable single zone striping. The following changes enable that access: - add type definition to raid_types array - make bitmap creation conditional in super_validate(), because bitmaps are not allowed in raid0 - set rdev->sectors to the data image size in super_validate() to allow the raid0 personality to calculate the MD array size properly - use mddev_(un)lock() functions instead of direct mutex_(un)lock() (wrapped in here because it's a trivial change) - enhance raid_status() to always report full sync for raid0 so that userspace checks for 100% sync will succeed and allow for resize (and takeover/reshape once added in future patches) - enhance raid_resume() to not load bitmap in case of raid0 - add merge function to avoid data corruption (seen with readahead) that resulted from bio payloads that grew too large. This problem did not occur with the other raid levels because it either did not apply without striping (raid1) or was avoided via stripe caching. - raise version to 1.7.0 because of the raid0 API change Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-05-29dm raid: a few cleanupsHeinz Mauelshagen1-45/+46
- ensure maximum device limit in superblock - rename DMPF_* (print flags) to CTR_FLAG_* (constructor flags) and their respective struct raid_set member - use strcasecmp() in raid10_format_to_md_layout() as in the constructor Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-05-29dm raid: fixup documentation for discard supportHeinz Mauelshagen1-2/+0
Remove comment above parse_raid_params() that claims "devices_handle_discard_safely" is a table line argument when it actually is a module parameter. Also, backfill dm-raid target version 1.6.0 documentation. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-02-12Merge tag 'dm-3.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dmLinus Torvalds1-9/+7
Pull device mapper changes from Mike Snitzer:
- The most significant change this cycle is request-based DM now supports stacking on top of blk-mq devices. This blk-mq support changes the model request-based DM uses for cloning a request to relying on calling blk_get_request() directly from the underlying blk-mq device. An early consumer of this code is Intel's emerging NVMe hardware; thanks to Keith Busch for working on, and pushing for, these changes.
- A few other small fixes and cleanups across other DM targets.
* tag 'dm-3.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: inherit QUEUE_FLAG_SG_GAPS flags from underlying queues
  dm snapshot: remove unnecessary NULL checks before vfree() calls
  dm mpath: simplify failure path of dm_multipath_init()
  dm thin metadata: remove unused dm_pool_get_data_block_size()
  dm ioctl: fix stale comment above dm_get_inactive_table()
  dm crypt: update url in CONFIG_DM_CRYPT help text
  dm bufio: fix time comparison to use time_after_eq()
  dm: use time_in_range() and time_after()
  dm raid: fix a couple integer overflows
  dm table: train hybrid target type detection to select blk-mq if appropriate
  dm: allocate requests in target when stacking on blk-mq devices
  dm: prepare for allocating blk-mq clone requests in target
  dm: submit stacked requests in irq enabled context
  dm: split request structure out from dm_rq_target_io structure
  dm: remove exports for request-based interfaces without external callers
2015-02-09dm raid: fix a couple integer overflowsDan Carpenter1-9/+7
My static checker complains that if "num_raid_params" is UINT_MAX then the "if (num_raid_params + 1 > argc) {" check doesn't work as intended. The other change is that I moved the "if (argc != (num_raid_devs * 2))" condition forward a few lines so it was before the call to context_alloc(). If we had an integer overflow inside that function then it would lead to an immediate crash. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
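The wrap-around described in the entry above is easy to reproduce in user space. The sketch below uses unsigned int for both operands and hypothetical function names; the variable types in dm-raid.c may differ.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Overflow-prone form: when num_raid_params == UINT_MAX, the sum wraps to 0,
 * so "0 > argc" is false and the bogus value is accepted.
 */
static bool ex_enough_args_wrapping(unsigned int num_raid_params, unsigned int argc)
{
	return !(num_raid_params + 1 > argc);
}

/* Overflow-safe form: the same comparison expressed without the addition. */
static bool ex_enough_args_safe(unsigned int num_raid_params, unsigned int argc)
{
	return num_raid_params < argc;
}
```

Both functions agree for ordinary inputs; they differ only at the wrap-around point, which is exactly the case a static checker flags.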
2015-02-04md: make ->congested robust against personality changes.NeilBrown1-7/+1
There is currently no locking around calls to the 'congested' bdi function. If called at an awkward time while an array is being converted from one level (or personality) to another, there is a tiny chance of running code in an unreferenced module etc. So add a 'congested' function to the md_personality operations structure, and call it with appropriate locking from a central 'mddev_congested'. When the array personality is changing the array will be 'suspended' so no IO is processed. If mddev_congested detects this, it simply reports that the array is congested, which is a safe guess. As mddev_suspend calls synchronize_rcu(), mddev_congested can avoid races by including the whole call inside an rcu_read_lock() region. This requires that the congested functions for all subordinate devices can be run under rcu_lock. Fortunately this is the case. Signed-off-by: NeilBrown <neilb@suse.de>
2014-10-29dm raid: fix inaccessible superblocks causing oops in configure_discard_supportHeinz Mauelshagen1-1/+5
Commit 48cf06bc5f ("dm raid: add discard support for RAID levels 4, 5 and 6") did not properly handle missing metadata device(s). A failing read of the superblock causes the metadata and data devices to be removed from the dev array in struct raid_set, setting references to both devices to NULL. configure_discard_support() nonetheless tries to access the data dev unconditionally causing an oops. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-10-21dm raid: ensure superblock's size matches device's logical block sizeHeinz Mauelshagen1-4/+7
The dm-raid superblock (struct dm_raid_superblock) is padded to 512 bytes and that size is being used to read it in from the metadata device into one preallocated page. Reading or writing this on a 512-byte sector device works fine but on a 4096-byte sector device this fails. Set the dm-raid superblock's size to the logical block size of the metadata device, because IO at that size is guaranteed to work. Also add a size check to avoid silent partial metadata loss in case the superblock should ever grow past the logical block size or PAGE_SIZE. [includes pointer math fix from Dan Carpenter] Reported-by: "Liuhua Wang" <lwang@suse.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
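One way to read the size rule in the entry above is sketched below: superblock I/O uses the device's logical block size, and the setup is rejected if the structure no longer fits in one logical block or if that block does not fit in the single preallocated page. The function name, parameters and error code are assumptions made for this illustration.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Chooses the superblock I/O size. On success stores the logical block size
 * in *io_size and returns 0; returns -EINVAL if the superblock structure is
 * larger than one logical block or the block is larger than one page.
 */
static int ex_sb_io_size(size_t sb_struct_size, size_t logical_block_size,
			 size_t page_size, size_t *io_size)
{
	if (sb_struct_size > logical_block_size || logical_block_size > page_size)
		return -EINVAL;

	*io_size = logical_block_size;
	return 0;
}
```

With a 512-byte structure this yields 512-byte I/O on 512-byte sector devices and 4096-byte I/O on 4096-byte sector devices, which is the case the original fixed-size read could not handle.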
2014-10-05dm raid: add discard support for RAID levels 4, 5 and 6Heinz Mauelshagen1-4/+34
In case of RAID levels 4, 5 and 6 we have to verify each RAID member's ability to zero data on discards to avoid stripe data corruption -- if discard_zeroes_data is not set for each RAID member discard support must be disabled. But given the uncertainty of whether or not a RAID member properly supports zeroing data on discard we require the user to explicitly allow discard support on RAID levels 4, 5, and 6 by setting a dm-raid module parameter, e.g.: dm-raid.devices_handle_discard_safely=Y Otherwise, discards could cause data corruption on RAID4/5/6. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
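The two conditions in the entry above (explicit opt-in by the administrator, and every member zeroing data on discard) combine as in the following user-space sketch. The struct and function are hypothetical; only the name devices_handle_discard_safely comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-member view: does this device report that discards zero data? */
struct ex_member {
	bool discard_zeroes_data;
};

/*
 * RAID4/5/6: discards stay disabled unless the administrator opted in via the
 * module parameter AND every member reports that it zeroes data on discard.
 */
static bool ex_raid456_discard_ok(const struct ex_member *members, size_t n,
				  bool devices_handle_discard_safely)
{
	if (!devices_handle_discard_safely)
		return false;

	for (size_t i = 0; i < n; i++)
		if (!members[i].discard_zeroes_data)
			return false;

	return true;
}
```

A single member without the property disables discards for the whole set, because parity computed over a stripe is only consistent if all members return the same data after a discard.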
2014-10-05dm raid: add discard support for RAID levels 1 and 10Heinz Mauelshagen1-2/+28
Discard support is not enabled for RAID levels 4, 5, and 6 at this time due to concerns about unreliable discard_zeroes_data support on some hardware. Otherwise, discards could cause stripe data corruption (classic example of bad apples spoiling the bunch). Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-06-26MD: Remember the last sync operation that was performedJonathan Brassow1-1/+2
This patch adds a field to the mddev structure to track the last sync operation that was performed. This is especially useful when it comes to what is recorded in mismatch_cnt in sysfs. If the last operation was "data-check", then it reports the number of discrepancies found by the user-initiated check. If it was a "repair" operation, then it is reporting the number of discrepancies repaired. etc. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-14md: replace strict_strto*() with kstrto*()Jingoo Han1-4/+4
The usage of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-14dm-raid: silence compiler warning on rebuilds_per_group.NeilBrown1-1/+1
This doesn't really need to be initialised, but it doesn't hurt, silences the compiler, and as it is a counter it makes sense for it to start at zero. Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-14DM RAID: Fix raid_resume not reviving failed devices in all casesJonathan Brassow1-0/+15
When a device fails in a RAID array, it is marked as Faulty. Later, md_check_recovery is called which (through the call chain) calls 'hot_remove_disk' in order to have the personalities remove the device from use in the array. Sometimes, it is possible for the array to be suspended before the personalities get their chance to perform 'hot_remove_disk'. This is normally not an issue. If the array is deactivated, then the failed device will be noticed when the array is reinstantiated. If the array is resumed and the disk is still missing, md_check_recovery will be called upon resume and 'hot_remove_disk' will be called at that time. However, (for dm-raid) if the device has been restored, a resume on the array would cause it to attempt to revive the device by calling 'hot_add_disk'. If 'hot_remove_disk' had not been called, a situation is then created where the device is thought to concurrently be the replacement and the device to be replaced. Thus, the device is first sync'ed with the rest of the array (because it is the replacement device) and then marked Faulty and removed from the array (because it is also the device being replaced). The solution is to check and see if the device had properly been removed before the array was suspended. This is done by seeing whether the device's 'raid_disk' field is -1 - a condition that implies that 'md_check_recovery -> remove_and_add_spares (where raid_disk is set to -1) -> hot_remove_disk' has been called. If 'raid_disk' is not -1, then 'hot_remove_disk' must be called to complete the removal of the previously faulty device before it can be revived via 'hot_add_disk'. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-14DM RAID: Break-up untidy functionJonathan Brassow1-33/+39
Clean-up excessive indentation by moving some code in raid_resume() into its own function. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-06-14DM RAID: Add ability to restore transiently failed devices on resumeJonathan Brassow1-1/+43
This patch adds code to the resume function to check over the devices in the RAID array. If any are found to be marked as failed and their superblocks can be read, an attempt is made to reintegrate them into the array. This allows the user to refresh the array with a simple suspend and resume of the array - rather than having to load a completely new table, allocate and initialize all the structures and throw away the old instantiation. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-04-24DM RAID: Add message/status support for changing sync actionJonathan Brassow1-2/+109
This patch adds a message interface to dm-raid to allow the user to more finely control the sync actions being performed by the MD driver. This gives the user the ability to initiate "check" and "repair" (i.e. scrubbing). Two additional fields have been appended to the status output to provide more information about the type of sync action occurring and the results of those actions, specifically: <sync_action> and <mismatch_cnt>. These new fields will always be populated. This is essentially the device-mapper way of doing what MD controls through the 'sync_action' sysfs file and shows through the 'mismatch_cnt' sysfs file. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-03-05Merge tag 'md-3.9' of git://neil.brown.name/mdLinus Torvalds1-20/+103
Pull md updates from NeilBrown: "Mostly little bugfixes. Only "feature" is a new RAID10 layout which slightly improves the number of sets of devices that can concurrently fail, without data loss."
* tag 'md-3.9' of git://neil.brown.name/md:
  md: expedite metadata update when switching read-auto -> active
  md: remove CONFIG_MULTICORE_RAID456
  md/raid1,raid10: fix deadlock with freeze_array()
  md/raid0: improve error message when converting RAID4-with-spares to RAID0
  md: raid0: fix error return from create_stripe_zones.
  md: fix two bugs when attempting to resize RAID0 array.
  DM RAID: Add support for MD's RAID10 "far" and "offset" algorithms
  MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 2)
  MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 1)
  MD RAID10: Minor non-functional code changes
  md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
  md: protect against crash upon fsync on ro array
2013-03-01dm: rename request variables to biosAlasdair G Kergon1-1/+1
Use 'bio' in the name of variables and functions that deal with bios rather than 'request' to avoid confusion with the normal block layer use of 'request'. No functional changes. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-01dm: fix truncated status stringsMikulas Patocka1-5/+3
Avoid returning a truncated table or status string instead of setting the DM_BUFFER_FULL_FLAG when the last target of a table fills the buffer. When processing a table or status request, the function retrieve_status calls ti->type->status. If ti->type->status returns non-zero, retrieve_status assumes that the buffer overflowed and sets DM_BUFFER_FULL_FLAG. However, targets don't return non-zero values from their status method on overflow. Most targets always return zero. If a buffer overflow happens in a target that is not the last in the table, it gets noticed during the next iteration of the loop in retrieve_status; but if a buffer overflow happens in the last target, it goes unnoticed and erroneously truncated data is returned. In the current code, the targets behave in the following way: * dm-crypt returns -ENOMEM if there is not enough space to store the key, but it returns 0 on all other overflows. * dm-thin returns errors from the status method if a disk error happened. This is incorrect because retrieve_status doesn't check the error code, it assumes that all non-zero values mean buffer overflow. * all the other targets always return 0. This patch changes the ti->type->status function to return void (because most targets don't use the return code). Overflow is detected in retrieve_status: if the status method fills up the remaining space completely, it is assumed that buffer overflow happened. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
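The caller-side overflow detection described in the entry above can be sketched in user space as follows. The callback type and all function names are hypothetical; the sketch only shows the idea of a void status callback plus a "no spare room left" test in the caller.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical status callback: writes into buf (capacity maxlen), returns nothing. */
typedef void (*ex_status_fn)(char *buf, size_t maxlen);

/*
 * Caller-side detection: if the callback used every available byte
 * (maxlen - 1 characters plus the terminator), assume the output was cut
 * short. Precondition: maxlen >= 1.
 */
static bool ex_status_overflowed(ex_status_fn fn, char *buf, size_t maxlen)
{
	buf[0] = '\0';
	fn(buf, maxlen);
	return strlen(buf) >= maxlen - 1;
}

/* Example callback whose output fits easily. */
static void ex_short_status(char *buf, size_t maxlen)
{
	snprintf(buf, maxlen, "ok");
}

/* Example callback whose output is truncated by snprintf() in a small buffer. */
static void ex_long_status(char *buf, size_t maxlen)
{
	snprintf(buf, maxlen, "a status line that is far too long for a tiny buffer");
}
```

The test is deliberately conservative: output that fits exactly is also treated as an overflow, which matches the "it is assumed" wording and costs at most one retry with a larger buffer.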
2013-02-26DM RAID: Add support for MD's RAID10 "far" and "offset" algorithmsJonathan Brassow1-20/+103
Until now, dm-raid.c only supported the "near" algorithm of MD's RAID10 implementation. This patch adds support for the "far" and "offset" algorithms, but only with the improved redundancy that is brought with the introduction of the 'use_far_sets' bit, which shifts copied stripes according to smaller sets vs the entire array. That is, the 17th bit of the 'layout' variable that defines the RAID10 implementation will always be set. (More information on how the 'layout' variable selects the RAID10 algorithm can be found in the opening comments of drivers/md/raid10.c.) Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
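As a rough illustration of how a RAID10 'layout' word could be assembled for the three algorithms named in the entry above, consider the sketch below. The bit positions are assumptions and should be checked against the opening comment of drivers/md/raid10.c: near copies in bits 0-7, far copies in bits 8-15, far_offset at bit index 16, and use_far_sets at bit index 17 (reading the commit's "17th bit" as index 17 counting from zero). The enum and function names are hypothetical.

```c
#include <stdint.h>

#define EX_FAR_OFFSET_BIT   (1u << 16) /* assumed position of far_offset */
#define EX_USE_FAR_SETS_BIT (1u << 17) /* assumed position of use_far_sets */

enum ex_raid10_format { EX_NEAR, EX_FAR, EX_OFFSET };

/* Builds a layout word from the format and the number of data copies. */
static uint32_t ex_raid10_layout(enum ex_raid10_format fmt, unsigned int copies)
{
	uint32_t near_copies = 1, far_copies = 1, layout;

	if (fmt == EX_NEAR)
		near_copies = copies;
	else
		far_copies = copies;

	layout = (far_copies << 8) | near_copies;

	if (fmt == EX_OFFSET)
		layout |= EX_FAR_OFFSET_BIT;
	if (fmt != EX_NEAR)
		layout |= EX_USE_FAR_SETS_BIT; /* always set for "far" and "offset" */

	return layout;
}
```

Under these assumptions, two copies give 0x102 for "near", 0x20201 for "far" and 0x30201 for "offset".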
2013-01-24DM-RAID: Fix RAID10's check for sufficient redundancyJonathan Brassow1-64/+37
Before attempting to activate a RAID array, it is checked for sufficient redundancy. That is, we make sure that there are not too many failed devices - or devices specified for rebuild - to undermine our ability to activate the array. The current code performs this check twice - once to ensure there were not too many devices specified for rebuild by the user ('validate_rebuild_devices') and again after possibly experiencing a failure to read the superblock ('analyse_superblocks'). Neither of these checks is sufficient. The first check is done properly but with insufficient information about the possible failure state of the devices to make a good determination if the array can be activated. The second check is simply done wrong in the case of RAID10 because it doesn't account for the independence of the stripes (i.e. mirror sets). The solution is to use the properly written check ('validate_rebuild_devices'), but perform the check after the superblocks have been read and we know which devices have failed. This gives us one check instead of two and performs it in a location where it can be done right. Only RAID10 was affected and it was affected in the following ways: - the code did not properly catch the condition where a user specified a device for rebuild that already had a failed device in the same mirror set. (This condition would, however, be caught at a deeper level in MD.) - the code triggers a false positive and denies activation when devices in independent mirror sets have failed - counting the failures as though they were all in the same set. The most likely place this error was introduced (or this patch should have been included) is in commit 4ec1e369 - first introduced in v3.7-rc1. Consequently this fix should also go in v3.7.y, however there is a small conflict on the .version in raid_target, so I'll submit a separate patch to -stable.
Cc: stable@vger.kernel.org Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
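The difference between a set-aware redundancy check and the false positive described in the entry above can be sketched as follows. This is a simplified model, not the dm-raid code: it assumes the "near" layout with the device count a multiple of the copy count, so that devices [k*copies, (k+1)*copies) form one mirror set, and it requires copies >= 1. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Set-aware check: failed[i] is true when device i has failed or is marked
 * for rebuild. The array can be activated as long as every mirror set keeps
 * at least one good copy. Precondition: copies >= 1.
 */
static bool ex_raid10_has_redundancy(const bool *failed, size_t ndevs, size_t copies)
{
	for (size_t start = 0; start + copies <= ndevs; start += copies) {
		size_t bad = 0;

		for (size_t i = 0; i < copies; i++)
			if (failed[start + i])
				bad++;

		if (bad == copies)
			return false; /* every copy in this set is gone */
	}
	return true;
}

/*
 * Set-unaware count, modelling the false positive: failures are summed as
 * though they were all in the same mirror set.
 */
static bool ex_naive_has_redundancy(const bool *failed, size_t ndevs, size_t copies)
{
	size_t bad = 0;

	for (size_t i = 0; i < ndevs; i++)
		if (failed[i])
			bad++;

	return bad < copies;
}
```

With four devices and two copies, losing devices 0 and 2 leaves one good copy in each set, so the set-aware check allows activation while the naive count refuses it; losing devices 0 and 1 is fatal under both.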
2012-12-21dm: remove map_infoMikulas Patocka1-2/+2
This patch removes map_info from bio-based device mapper targets. map_info is still used for request-based targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-12-21dm raid: round region_size to power of twoJonathan Brassow1-1/+3
If the user does not supply a bitmap region_size to the dm raid target, a reasonable size is computed automatically. If this is not a power of 2, the md code will report an error later. This patch catches the problem early and rounds the region_size to the next power of two. Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>