aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2012-10-11MD: raid5 trim supportShaohua Li2-3/+166
Discard for raid4/5/6 has limitation. If discard request size is small, we do discard for one disk, but we need calculate parity and write parity disk. To correctly calculate parity, zero_after_discard must be guaranteed. Even it's true, we need do discard for one disk but write another disks, which makes the parity disks wear out fast. This doesn't make sense. So an efficient discard for raid4/5/6 should discard all data disks and parity disks, which requires the write pattern to be (A, A+chunk_size, A+chunk_size*2...). If A's size is smaller than chunk_size, such pattern is almost impossible in practice. So in this patch, I only handle the case that A's size equals to chunk_size. That is discard request should be aligned to stripe size and its size is multiple of stripe size. Since we can only handle request with specific alignment and size (or part of the request fitting stripes), we can't guarantee zero_after_discard even zero_after_discard is true in low level drives. The block layer doesn't send down correctly aligned requests even correct discard alignment is set, so I must filter out. For raid4/5/6 parity calculation, if data is 0, parity is 0. So if zero_after_discard is true for all disks, data is consistent after discard. Otherwise, data might be lost. Let's consider a scenario: discard a stripe, write data to one disk and write parity disk. The stripe could be still inconsistent till then depending on using data from other data disks or parity disks to calculate new parity. If the disk is broken, we can't restore it. So in this patch, we only enable discard support if all disks have zero_after_discard. If discard fails in one disk, we face the similar inconsistent issue above. The patch will make discard follow the same path as normal write request. If discard fails, a resync will be scheduled to make the data consistent. This isn't good to have extra writes, but data consistency is important. If a subsequent read/write request hits raid5 cache of a discarded stripe, the discarded dev page should have zero filled, so the data is consistent. This patch will always zero dev page for discarded request stripe. This isn't optimal because discard request doesn't need such payload. Next patch will avoid it. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md/bitmap:Don't use IS_ERR to judge alloc_page().Jianpeng Ma1-6/+2
Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md/raid1: Don't release reference to device while handling read error.NeilBrown1-4/+5
When we get a read error, we arrange for raid1d to handle it. Currently we release the reference on the device. This can result in conf->mirrors[read_disk].rdev being NULL in fix_read_error, if the device happens to get removed before the read error is handled. So instead keep the reference until the read error has been fully handled. Reported-by: hank <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11raid: replace list_for_each_continue_rcu with new interfaceMichael Wang1-6/+3
This patch replaces list_for_each_continue_rcu() with list_for_each_entry_continue_rcu() to save a few lines of code and allow removing list_for_each_continue_rcu(). Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Michael Wang <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11add further __init annotations to crypto/xor.cJan Beulich1-2/+2
Allow particularly do_xor_speed() to be discarded post-init. Signed-off-by: Jan Beulich <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11DM RAID: Fix for "sync" directive ineffectivenessJonathan Brassow1-0/+13
There are two table arguments that can be given to a DM RAID target that control whether the array is forced to (re)synchronize or skip initialization: "sync" and "nosync". When "sync" is given, we set mddev->recovery_cp to 0 in order to cause the device to resynchronize. This is insufficient if there is a bitmap in use, because the array will simply look at the bitmap and see that there is no recovery necessary. The fix is to skip over the loading of the superblocks when "sync" is given, causing new superblocks to be written that will force the array to go through initialization (i.e. synchronization). Signed-off-by: Jonathan Brassow <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11DM RAID: Fix comparison of index and quantity for "rebuild" parameterJonathan Brassow1-1/+1
DM RAID: Fix comparison of index and quantity for "rebuild" parameter The "rebuild" parameter takes an index argument that starts counting from zero. The conditional used to validate the index was using '>' rather than '>=', leaving the door open for an index value that would be 1 too large. Reported-by: Neil Brown <[email protected]> Signed-off-by: Jonathan Brassow <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11DM RAID: Add rebuild capability for RAID10Jonathan Brassow2-1/+42
DM RAID: Add code to validate replacement slots for RAID10 arrays RAID10 can handle 'copies - 1' failures for each mirror group. This code ensures the user has provided a valid array - one whose devices specified for rebuild do not exceed the amount of redundancy available. Signed-off-by: Jonathan Brassow <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11DM RAID: Move 'rebuild' checking code to its own functionJonathan Brassow1-25/+50
DM RAID: Move chunk of code to it's own function The code that checks whether device replacements/rebuilds are possible given a specific RAID type is moved to it's own function. It will further expand when the code to check RAID10 is added. A separate function makes it easier to read. Signed-off-by: Jonathan Brassow <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11MD RAID10: Prep for DM RAID10 device replacement capabilityJonathan Brassow2-3/+9
MD RAID10: Fix a couple potential kernel panics if RAID10 is used by dm-raid When device-mapper uses the RAID10 personality through dm-raid.c, there is no 'gendisk' structure in mddev and some sysfs information is also not populated. This patch avoids touching those non-existent structures. Signed-off-by: Jonathan Brassow <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md: avoid taking the mutex on some ioctls.NeilBrown1-23/+62
Some ioctls don't need to take the mutex and doing so can cause a delay as it is held during super-block update. So move those ioctls out of the mutex and rely on rcu locking to ensure we don't access stale data. Signed-off-by: NeilBrown <[email protected]>
2012-10-11MD: change the parameter of md threadShaohua Li6-11/+17
Change the thread parameter, so the thread can carry extra info. Next patch will use it. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md/raid10: submit IO from originating thread instead of md thread.NeilBrown1-3/+54
queuing writes to the md thread means that all requests go through the one processor which may not be able to keep up with very high request rates. So use the plugging infrastructure to submit all requests on unplug. If a 'schedule' is needed, we fall back on the old approach of handing the requests to the thread for it to handle. This is nearly identical to a recent patch which provided similar functionality to RAID1. Signed-off-by: NeilBrown <[email protected]>
2012-10-11md: raid 10 supports TRIMShaohua Li1-4/+25
This makes md raid 10 support TRIM. If one disk supports discard and another not, or one has discard_zero_data and another not, there could be inconsistent between data from such disks. But this should not matter, discarded data is useless. This will add extra copy in rebuild though. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md: raid 1 supports TRIMShaohua Li1-2/+21
This makes md raid 1 support TRIM. If one disk supports discard and another not, or one has discard_zero_data and another not, there could be inconsistent between data from such disks. But this should not matter, discarded data is useless. This will add extra copy in rebuild though. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md: raid 0 supports TRIMShaohua Li1-1/+18
This makes md raid 0 support TRIM. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md: linear supports TRIMShaohua Li1-0/+16
This makes md linear support TRIM. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-10-11md/linear: rcu_dereference outside read-lock sectionDenis Efremov1-2/+7
According to the comment in linear_stop function rcu_dereference in linear_start and linear_stop functions occurs under reconfig_mutex. The patch represents this agreement in code and prevents lockdep complaint. Found by Linux Driver Verification project (linuxtesting.org) Signed-off-by: Denis Efremov <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2012-09-28block: makes bio_split support bio without dataShaohua Li1-12/+14
discard bio hasn't data attached. We hit a BUG_ON with such bio. This makes bio_split works for such bio. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-28scatterlist: refactor the sg_nentsMaxim Levitsky1-5/+2
Replace 'while' with 'for' as suggested by Tejun Heo Signed-off-by: Maxim Levitsky <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-27scatterlist: add sg_nentsMaxim Levitsky2-0/+23
Useful helper to know the number of entries in scatterlist. Signed-off-by: Maxim Levitsky <[email protected]> Cc: Alex Dubov <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-27fs: fix include/percpu-rwsem.h export errorJens Axboe1-1/+1
We get the following export error on the include file: usr/include/linux/fs.h:13: included file 'linux/percpu-rwsem.h' is not exported Move the include inside the __KERNEL__ section. Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-26percpu-rw-semaphore: fix documentation typosMikulas Patocka1-2/+2
One more patch for this thing, fixing some typos in the documentation. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-26fs/block_dev.c:1644:5: sparse: symbol 'blkdev_mmap' was not declaredFengguang Wu1-1/+1
blkdev_mmap() isn't used outside of fs/block_dev.c, mark it as static. Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-26blockdev: turn a rw semaphore into a percpu rw semaphoreMikulas Patocka4-11/+135
This avoids cache line bouncing when many processes lock the semaphore for read. New percpu lock implementation The lock consists of an array of percpu unsigned integers, a boolean variable and a mutex. When we take the lock for read, we enter rcu read section, check for a "locked" variable. If it is false, we increase a percpu counter on the current cpu and exit the rcu section. If "locked" is true, we exit the rcu section, take the mutex and drop it (this waits until a writer finished) and retry. Unlocking for read just decreases percpu variable. Note that we can unlock on a difference cpu than where we locked, in this case the counter underflows. The sum of all percpu counters represents the number of processes that hold the lock for read. When we need to lock for write, we take the mutex, set "locked" variable to true and synchronize rcu. Since RCU has been synchronized, no processes can create new read locks. We wait until the sum of percpu counters is zero - when it is, there are no readers in the critical section. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-26Fix a crash when block device is read and block size is changed at the same timeMikulas Patocka3-3/+65
The kernel may crash when block size is changed and I/O is issued simultaneously. Because some subsystems (udev or lvm) may read any block device anytime, the bug actually puts any code that changes a block device size in jeopardy. The crash can be reproduced if you place "msleep(1000)" to blkdev_get_blocks just before "bh->b_size = max_blocks << inode->i_blkbits;". Then, run "dd if=/dev/ram0 of=/dev/null bs=4k count=1 iflag=direct" While it is waiting in msleep, run "blockdev --setbsz 2048 /dev/ram0" You get a BUG. The direct and non-direct I/O is written with the assumption that block size does not change. It doesn't seem practical to fix these crashes one-by-one there may be many crash possibilities when block size changes at a certain place and it is impossible to find them all and verify the code. This patch introduces a new rw-lock bd_block_size_semaphore. The lock is taken for read during I/O. It is taken for write when changing block size. Consequently, block size can't be changed while I/O is being submitted. For asynchronous I/O, the patch only prevents block size change while the I/O is being submitted. The block size can change when the I/O is in progress or when the I/O is being finished. This is acceptable because there are no accesses to block size when asynchronous I/O is being finished. The patch prevents block size changing while the device is mapped with mmap. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-21block: fix request_queue->flags initializationTejun Heo1-1/+1
A queue newly allocated with blk_alloc_queue_node() has only QUEUE_FLAG_BYPASS set. For request-based drivers, blk_init_allocated_queue() is called and q->queue_flags is overwritten with QUEUE_FLAG_DEFAULT which doesn't include BYPASS even though the initial bypass is still in effect. In blk_init_allocated_queue(), or QUEUE_FLAG_DEFAULT to q->queue_flags instead of overwriting. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Acked-by: Vivek Goyal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-21block: lift the initial queue bypass mode on blk_register_queue() instead of ↵Tejun Heo2-5/+8
blk_init_allocated_queue() b82d4b197c ("blkcg: make request_queue bypassing on allocation") made request_queues bypassed on allocation to avoid switching on and off bypass mode on a queue being initialized. Some drivers allocate and then destroy a lot of queues without fully initializing them and incurring bypass latency overhead on each of them could add upto significant overhead. Unfortunately, blk_init_allocated_queue() is never used by queues of bio-based drivers, which means that all bio-based driver queues are in bypass mode even after initialization and registration complete successfully. Due to the limited way request_queues are used by bio drivers, this problem is hidden pretty well but it shows up when blk-throttle is used in combination with a bio-based driver. Trying to configure (echoing to cgroupfs file) blk-throttle for a bio-based driver hangs indefinitely in blkg_conf_prep() waiting for bypass mode to end. This patch moves the initial blk_queue_bypass_end() call from blk_init_allocated_queue() to blk_register_queue() which is called for any userland-visible queues regardless of its type. I believe this is correct because I don't think there is any block driver which needs or wants working elevator and blk-cgroup on a queue which isn't visible to userland. If there are such users, we need a different solution. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Joseph Glanville <[email protected]> Cc: [email protected] Acked-by: Vivek Goyal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-20block: ioctl to zero block rangesMartin K. Petersen2-0/+28
Introduce a BLKZEROOUT ioctl which can be used to clear block ranges by way of blkdev_issue_zeroout(). Signed-off-by: Martin K. Petersen <[email protected]> Acked-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-20block: Make blkdev_issue_zeroout use WRITE SAMEMartin K. Petersen1-1/+29
If the device supports WRITE SAME, use that to optimize zeroing of blocks. If the device does not support WRITE SAME or if the operation fails, fall back to writing zeroes the old-fashioned way. Signed-off-by: Martin K. Petersen <[email protected]> Acked-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-20block: Implement support for WRITE SAMEMartin K. Petersen11-6/+181
The WRITE SAME command supported on some SCSI devices allows the same block to be efficiently replicated throughout a block range. Only a single logical block is transferred from the host and the storage device writes the same data to all blocks described by the I/O. This patch implements support for WRITE SAME in the block layer. The blkdev_issue_write_same() function can be used by filesystems and block drivers to replicate a buffer across a block range. This can be used to efficiently initialize software RAID devices, etc. Signed-off-by: Martin K. Petersen <[email protected]> Acked-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-20block: Consolidate command flag and queue limit checks for mergesMartin K. Petersen3-20/+44
- blk_check_merge_flags() verifies that cmd_flags / bi_rw are compatible. This function is called for both req-req and req-bio merging. - blk_rq_get_max_sectors() and blk_queue_get_max_sectors() can be used to query the maximum sector count for a given request or queue. The calls will return the right value from the queue limits given the type of command (RW, discard, write same, etc.) Signed-off-by: Martin K. Petersen <[email protected]> Acked-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-20block: Clean up special command handling logicMartin K. Petersen7-49/+46
Remove special-casing of non-rw fs style requests (discard). The nomerge flags are consolidated in blk_types.h, and rq_mergeable() and bio_mergeable() have been modified to use them. bio_is_rw() is used in place of bio_has_data() a few places. This is done to to distinguish true reads and writes from other fs type requests that carry a payload (e.g. write same). Signed-off-by: Martin K. Petersen <[email protected]> Acked-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-12block/blk-tag.c: Remove useless kfreePeter Senna Tschudin1-4/+2
Remove useless kfree() and clean up code related to the removal. The semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ position p1,p2; expression x; @@ if (x@p1 == NULL) { ... kfree@p2(x); ... return ...; } @unchanged exists@ position r.p1,r.p2; expression e <= r.x,x,e1; iterator I; statement S; @@ if (x@p1 == NULL) { ... when != I(x,...) S when != e = e1 when != e += e1 when != e -= e1 when != ++e when != --e when != e++ when != e-- when != &e kfree@p2(x); ... return ...; } @ok depends on unchanged exists@ position any r.p1; position r.p2; expression x; @@ ... when != true x@p1 == NULL kfree@p2(x); @depends on !ok && unchanged@ position r.p2; expression x; @@ *kfree@p2(x); // </smpl> Signed-off-by: Peter Senna Tschudin <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: remove the duplicated setting for congestion_thresholdJaehoon Chung1-2/+0
Before call the blk_queue_congestion_threshold(), the blk_queue_congestion_threshold() is already called at blk_queue_make_rquest(). Because this code is the duplicated, it has removed. Signed-off-by: Jaehoon Chung <[email protected]> Signed-off-by: Kyungmin Park <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: reject invalid queue attribute valuesDave Reisner1-2/+23
Instead of using simple_strtoul which "converts" invalid numbers to 0, use strict_strtoul and perform error checking to ensure that userspace passes us a valid unsigned long. This addresses problems with functions such as writev, which might want to write a trailing newline -- the newline should rightfully be rejected, but the value preceeding it should be preserved. Fixes BZ#46981. Signed-off-by: Dave Reisner <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: Add bio_clone_bioset(), bio_clone_kmalloc()Kent Overstreet8-46/+29
Previously, there was bio_clone() but it only allocated from the fs bio set; as a result various users were open coding it and using __bio_clone(). This changes bio_clone() to become bio_clone_bioset(), and then we add bio_clone() and bio_clone_kmalloc() as wrappers around it, making use of the functionality the last patch adedd. This will also help in a later patch changing how bio cloning works. Signed-off-by: Kent Overstreet <[email protected]> CC: Jens Axboe <[email protected]> CC: NeilBrown <[email protected]> CC: Alasdair Kergon <[email protected]> CC: Boaz Harrosh <[email protected]> CC: Jeff Garzik <[email protected]> Acked-by: Jeff Garzik <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: Consolidate bio_alloc_bioset(), bio_kmalloc()Kent Overstreet2-77/+49
Previously, bio_kmalloc() and bio_alloc_bioset() behaved slightly different because there was some almost-duplicated code - this fixes some of that. The important change is that previously bio_kmalloc() always set bi_io_vec = bi_inline_vecs, even if nr_iovecs == 0 - unlike bio_alloc_bioset(). This would cause bio_has_data() to return true; I don't know if this resulted in any actual bugs but it was certainly wrong. bio_kmalloc() and bio_alloc_bioset() also have different arbitrary limits on nr_iovecs - 1024 (UIO_MAXIOV) for bio_kmalloc(), 256 (BIO_MAX_PAGES) for bio_alloc_bioset(). This patch doesn't fix that, but at least they're enforced closer together and hopefully they will be fixed in a later patch. This'll also help with some future cleanups - there are a fair number of functions that allocate bios (e.g. bio_clone()), and now they don't have to be duplicated for bio_alloc(), bio_alloc_bioset(), and bio_kmalloc(). Signed-off-by: Kent Overstreet <[email protected]> CC: Jens Axboe <[email protected]> v7: Re-add dropped comments, improv patch description Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: Kill bi_destructorKent Overstreet5-48/+27
Now that we've got generic code for freeing bios allocated from bio pools, this isn't needed anymore. This patch also makes bio_free() static, since without bi_destructor there should be no need for it to be called anywhere else. bio_free() is now only called from bio_put, so we can refactor those a bit - move some code from bio_put() to bio_free() and kill the redundant bio->bi_next = NULL. v5: Switch to BIO_KMALLOC_POOL ((void *)~0), per Boaz v6: BIO_KMALLOC_POOL now NULL, drop bio_free's EXPORT_SYMBOL v7: No #define BIO_KMALLOC_POOL anymore Signed-off-by: Kent Overstreet <[email protected]> CC: Jens Axboe <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09pktcdvd: Switch to bio_kmalloc()Kent Overstreet1-45/+7
This is prep work for killing bi_destructor - previously, pktcdvd had its own pkt_bio_alloc which was basically duplication bio_kmalloc(), necessitating its own bi_destructor implementation. v5: Un-reorder some functions, to make the patch easier to review Signed-off-by: Kent Overstreet <[email protected]> Acked-by: Jiri Kosina <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: Add bio_reset()Kent Overstreet3-6/+44
Reusing bios is something that's been highly frowned upon in the past, but driver code keeps doing it anyways. If it's going to happen anyways, we should provide a generic method. This'll help with getting rid of bi_destructor - drivers/block/pktcdvd.c was open coding it, by doing a bio_init() and resetting bi_destructor. This required reordering struct bio, but the block layer is not yet nearly fast enough for any cacheline effects to matter here. v5: Add a define BIO_RESET_BITS, to be very explicit about what parts of bio->bi_flags are saved. v6: Further commenting verbosity, per Tejun v9: Add a function comment Signed-off-by: Kent Overstreet <[email protected]> CC: Jens Axboe <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09dm: Use bioset's front_pad for dm_rq_clone_bio_infoKent Overstreet1-28/+18
Previously, dm_rq_clone_bio_info needed to be freed by the bio's destructor to avoid a memory leak in the blk_rq_prep_clone() error path. This gets rid of a memory allocation and means we can kill dm_rq_bio_destructor. The _rq_bio_info_cache kmem cache is unused now and needs to be deleted, but due to the way io_pool is used and overloaded this looks not quite trivial so I'm leaving it for a later patch. v6: Fix comment on struct dm_rq_clone_bio_info, per Tejun Signed-off-by: Kent Overstreet <[email protected]> CC: Alasdair Kergon <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: Ues bi_pool for bio_integrity_alloc()Kent Overstreet6-41/+26
Now that bios keep track of where they were allocated from, bio_integrity_alloc_bioset() becomes redundant. Remove bio_integrity_alloc_bioset() and drop bio_set argument from the related functions and make them use bio->bi_pool. Signed-off-by: Kent Overstreet <[email protected]> CC: Jens Axboe <[email protected]> CC: Martin K. Petersen <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-09block: Generalized bio pool freeingKent Overstreet8-103/+21
With the old code, when you allocate a bio from a bio pool you have to implement your own destructor that knows how to find the bio pool the bio was originally allocated from. This adds a new field to struct bio (bi_pool) and changes bio_alloc_bioset() to use it. This makes various bio destructors unnecessary, so they're then deleted. v6: Explain the temporary if statement in bio_put Signed-off-by: Kent Overstreet <[email protected]> CC: Jens Axboe <[email protected]> CC: NeilBrown <[email protected]> CC: Alasdair Kergon <[email protected]> CC: Nicholas Bellinger <[email protected]> CC: Lars Ellenberg <[email protected]> Acked-by: Tejun Heo <[email protected]> Acked-by: Nicholas Bellinger <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-09-06Merge tag 'fixes-for-linus' of ↵Linus Torvalds17-27/+73
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC bug fixes from Olof Johansson: "Mostly Renesas and Atmel bugfixes this time, targeting boot and build problems. A couple of patches for gemini and kirkwood as well. On a whole nothing very controversial." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: gemini: fix the gemini build ARM: shmobile: armadillo800eva: enable rw rootfs mount ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c ARM: shmobile: mackerel: fixup usb module order ARM: shmobile: armadillo800eva: fixup: sound card detection order ARM: shmobile: marzen: fixup smsc911x id for regulator ARM: at91/feature-removal-schedule: delay at91_mci removal ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts ARM: at91/clock: fix PLLA overclock warning ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support ARM: at91: fix system timer irq issue due to sparse irq support ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
2012-09-06Merge tag 'hwmon-for-linus' of ↵Linus Torvalds1-2/+10
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull a hwmon fix from Guenter Roeck: "One patch, fixing DIV_ROUND_CLOSEST to support negative dividends. While the changes are not in the drivers/hwmon directory, the problem primarily affects hwmon drivers, and it makes sense to push the patch through the hwmon tree." * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
2012-09-06Merge branch 'rc-fixes' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild fixes from Michal Marek: "These are two fixes that should go into 3.6. The link-vmlinux.sh one is obvious. The other one fixes make firmware_install with certain configurations, where a file in the toplevel firmware tree gets installed first, and $(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which confuses make 3.82 for some reason." * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: firmware: fix directory creation rule matching with make 3.82 link-vmlinux.sh: Fix stray "echo" in error message
2012-09-06Remove user-triggerable BUG from mpol_to_strDave Jones1-1/+1
Trivially triggerable, found by trinity: kernel BUG at mm/mempolicy.c:2546! Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670) Call Trace: show_numa_map+0xd5/0x450 show_pid_numa_map+0x13/0x20 traverse+0xf2/0x230 seq_read+0x34b/0x3e0 vfs_read+0xac/0x180 sys_pread64+0xa2/0xc0 system_call_fastpath+0x1a/0x1f RIP: mpol_to_str+0x156/0x360 Cc: [email protected] Signed-off-by: Dave Jones <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-09-05Merge tag 'mmc-fixes-for-3.6-rc5' of ↵Linus Torvalds8-61/+98
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc Pull MMC fixes from Chris Ball: - a firmware bug on several Samsung MoviNAND eMMC models causes permanent corruption on the device when secure erase and secure trim requests are made, so we disable those requests on these eMMC devices. - atmel-mci: fix a hang with some SD cards by waiting for not-busy flag. - dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling; fix handling of error interrupts. - mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change. - omap: fix broken PIO mode causing memory corruption. - sdhci-esdhc: fix card detection. * tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: mmc: omap: fix broken PIO mode mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption. mmc: dw_mmc: Disable low power mode if SDIO interrupts are used mmc: dw_mmc: fix error handling in PIO mode mmc: dw_mmc: correct mishandling error interrupt mmc: dw_mmc: amend using error interrupt status mmc: atmel-mci: not busy flag has also to be used for read operations mmc: sdhci-esdhc: break out early if clock is 0 mmc: mxs-mmc: fix deadlock caused by recursion loop mmc: mxs-mmc: fix deadlock in SDIO IRQ case mmc: bfin_sdh: fix dma_desc_array build error
2012-09-05uml: fix compile error in deliver_alarm()Miklos Szeredi1-1/+1
Fix the following compile error on UML. arch/um/os-Linux/time.c: In function 'deliver_alarm': arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler' arch/um/os-Linux/internal.h:1:6: note: declared here The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest process") in 3.6-rc1. Signed-off-by: Miklos Szeredi <[email protected]> CC: Martin Pärtel <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>