path: root/drivers/md/bcache/writeback.c
2015-08-13  bcache: remove driver private bio splitting code  (Kent Overstreet, 1 file, -2/+2)
The bcache driver has always accepted arbitrarily large bios and split them internally. Now that every driver must accept arbitrarily large bios, this code isn't necessary anymore.

Cc: [email protected]
Signed-off-by: Kent Overstreet <[email protected]>
[dpark: add more description in commit message]
Signed-off-by: Dongsu Park <[email protected]>
Signed-off-by: Ming Lin <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
2015-07-29  block: add a bi_error field to struct bio  (Christoph Hellwig, 1 file, -5/+5)
Currently we have two different ways to signal an I/O error on a BIO:

(1) by clearing the BIO_UPTODATE flag
(2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not being persistent when bios are queued up, and of not being passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns.

So add a new bi_error field to store an errno value directly in struct bio, and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: NeilBrown <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
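For reference, in kernels where this change applies (bi_error was later replaced by bi_status), the completion callback reads the error straight off the bio. A minimal sketch of the pattern - my_endio is a hypothetical driver callback, not code from this commit:

    /* Hypothetical bi_end_io handler under the bi_error scheme. */
    static void my_endio(struct bio *bio)
    {
            if (bio->bi_error)
                    pr_err("I/O error: %d\n", bio->bi_error);
            bio_put(bio);
    }

    /* wired up before submission: bio->bi_end_io = my_endio; */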
2014-08-04  bcache: fix uninterruptible sleep in writeback thread  (Slava Pestov, 1 file, -4/+10)
There were two issues here:

- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running

Without this patch I see kernel warnings printed and a load average of 1.52 after booting my test VM. With this patch the warnings are gone and the load average is near 0.00 as expected.

Signed-off-by: Kent Overstreet <[email protected]>
2013-12-31  Merge tag 'v3.13-rc6' into for-3.14/core  (Jens Axboe, 1 file, -28/+25)
Needed to bring blk-mq up to date, since changes have been going in since for-3.14/core was established. Fixup merge issues related to the immutable biovec changes.

Signed-off-by: Jens Axboe <[email protected]>

Conflicts:
	block/blk-flush.c
	fs/btrfs/check-integrity.c
	fs/btrfs/extent_io.c
	fs/btrfs/scrub.c
	fs/logfs/dev_bdev.c
2013-12-16  bcache: New writeback PD controller  (Kent Overstreet, 1 file, -24/+23)
The old writeback PD controller could get into states where it had throttled all the way down and then took way too long to recover - it was too complicated to really understand what it was doing. This rewrites a good chunk of it to hopefully be simpler and make more sense, and it also pays more attention to units, which should make the behaviour a bit easier to understand.

Signed-off-by: Kent Overstreet <[email protected]>
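To make the units point concrete: a proportional-derivative controller of the general shape described here derives a writeback rate (sectors per second) from how far the dirty-data count sits from its target. This is an illustrative sketch only - the gains, names, and scaling are made up, not bcache's actual controller:

    #include <stdint.h>

    struct pd_state {
            int64_t last_error;
    };

    /* dirty and target are in sectors; returns a rate in sectors/sec. */
    static int64_t pd_update(struct pd_state *s, int64_t dirty, int64_t target)
    {
            int64_t error = dirty - target;          /* proportional input */
            int64_t deriv = error - s->last_error;   /* derivative input  */
            int64_t rate  = error / 64 + deriv / 16; /* arbitrary gains   */

            s->last_error = error;
            return rate > 0 ? rate : 0;              /* never throttle below 0 */
    }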
2013-12-16  bcache: Use uninterruptible sleep in writeback  (Kent Overstreet, 1 file, -2/+2)
We're just waiting on kthread_should_stop(), nothing else, so interruptible sleep was wrong here.

Signed-off-by: Kent Overstreet <[email protected]>
2013-12-16  bcache: kthread don't set writeback task to INTERRUPTIBLE  (Stefan Priebe, 1 file, -2/+0)
Don't set the writeback task to TASK_INTERRUPTIBLE at the beginning of the loop - schedule_timeout_interruptible() and the other sleep helpers set the task state on their own. This prevents wrong load average calculation (a load of 1 per thread).

Signed-off-by: Kent Overstreet <[email protected]>
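Since the helpers manage the state transition themselves, the thread body only needs the sleep call. A kernel-style sketch of the resulting shape - do_writeback_work() is a hypothetical stand-in, not the actual bcache loop:

    /* Sketch: let the sleep helper set and clear the task state. */
    static int writeback_thread_sketch(void *arg)
    {
            while (!kthread_should_stop()) {
                    do_writeback_work(arg);   /* hypothetical work function */
                    /* sets TASK_INTERRUPTIBLE internally, then sleeps: */
                    schedule_timeout_interruptible(HZ);
            }
            return 0;
    }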
2013-11-23  block: Abstract out bvec iterator  (Kent Overstreet, 1 file, -3/+3)
Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for now, this patch effectively just renames things.

Signed-off-by: Kent Overstreet <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: "Ed L. Cashin" <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Lars Ellenberg <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Yehuda Sadeh <[email protected]>
Cc: Sage Weil <[email protected]>
Cc: Alex Elder <[email protected]>
Cc: [email protected]
Cc: Joshua Morris <[email protected]>
Cc: Philip Kelleher <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Neil Brown <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: [email protected]
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: [email protected]
Cc: Boaz Harrosh <[email protected]>
Cc: Benny Halevy <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Nicholas A. Bellinger" <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Steven Whitehouse <[email protected]>
Cc: Dave Kleikamp <[email protected]>
Cc: Joern Engel <[email protected]>
Cc: Prasad Joshi <[email protected]>
Cc: Trond Myklebust <[email protected]>
Cc: KONISHI Ryusuke <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Ben Myers <[email protected]>
Cc: [email protected]
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Herton Ronaldo Krzesinski <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Guo Chao <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Asai Thambi S P <[email protected]>
Cc: Selvan Mani <[email protected]>
Cc: Sam Bradshaw <[email protected]>
Cc: Wei Yongjun <[email protected]>
Cc: "Roger Pau Monné" <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Sebastian Ott <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Peng Tao <[email protected]>
Cc: Andy Adamson <[email protected]>
Cc: fanchaoting <[email protected]>
Cc: Jie Liu <[email protected]>
Cc: Sunil Mushran <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Namjae Jeon <[email protected]>
Cc: Pankaj Kumar <[email protected]>
Cc: Dan Magenheimer <[email protected]>
Cc: Mel Gorman <[email protected]>
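The iterator this series builds toward bundles the bio's cursor state into one small struct. For orientation, this is the shape it eventually took as struct bvec_iter in the mainline tree (field comments paraphrased; the kernel headers are authoritative):

    struct bvec_iter {
            sector_t        bi_sector;     /* device address in 512-byte sectors */
            unsigned int    bi_size;       /* residual I/O count */
            unsigned int    bi_idx;        /* current index into bi_io_vec */
            unsigned int    bi_bvec_done;  /* bytes completed in current bvec */
    };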
2013-11-10  bcache: Fix sysfs splat on shutdown with flash only devs  (Kent Overstreet, 1 file, -3/+3)
Whoops.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Better full stripe scanning  (Kent Overstreet, 1 file, -38/+56)
The old scanning-by-stripe code burned too much CPU; this should be better.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Convert bch_btree_insert() to bch_btree_map_leaf_nodes()  (Kent Overstreet, 1 file, -4/+2)
Last of the btree_map() conversions. The main visible effect is that bch_btree_insert() no longer takes a struct btree_op argument - there's no fancy state machine stuff going on, it's just a normal function.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Don't use op->insert_collision  (Kent Overstreet, 1 file, -3/+4)
When we convert bch_btree_insert() to bch_btree_map_leaf_nodes(), we won't be passing struct btree_op to bch_btree_insert() anymore - so we need a different way of returning whether there was a collision (really, a replace collision).

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Kill op->replace  (Kent Overstreet, 1 file, -6/+4)
This is prep work for converting bch_btree_insert() to bch_btree_map_leaf_nodes() - we have to convert all its arguments to actual arguments. Bunch of churn, but it should be straightforward.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Kill op->cl  (Kent Overstreet, 1 file, -3/+2)
This isn't used for waiting asynchronously anymore - so this is a fairly trivial refactoring.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Prune struct btree_op  (Kent Overstreet, 1 file, -5/+12)
Eventual goal is for struct btree_op to contain only what is necessary for traversing the btree.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Add btree_map() functions  (Kent Overstreet, 1 file, -24/+13)
Lots of stuff has been open coding its own btree traversal - which is generally pretty simple code, but there are a few subtleties. This adds two new functions, bch_btree_map_nodes() and bch_btree_map_keys(), which do the traversal for you. Everything that's open coding btree traversal now (with the exception of garbage collection) is slowly going to be converted to these two functions; being able to write other code at a higher level of abstraction is a big improvement w.r.t. overall code quality.

Signed-off-by: Kent Overstreet <[email protected]>
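The pattern being introduced is a callback-driven traversal. As a shape reference only - bcache's real bch_btree_map_nodes()/bch_btree_map_keys() take different types and also handle locking - a toy version over a plain binary tree looks like this:

    /* Toy map-keys traversal: visit every key in order, stopping early
     * on a nonzero return from the callback. Types are illustrative. */
    struct node {
            struct node *child[2];
            int key;
    };

    typedef int (*key_fn)(void *ctx, int key);

    static int map_keys(struct node *b, void *ctx, key_fn fn)
    {
            int ret;

            if (!b)
                    return 0;
            if ((ret = map_keys(b->child[0], ctx, fn)))
                    return ret;
            if ((ret = fn(ctx, b->key)))
                    return ret;
            return map_keys(b->child[1], ctx, fn);
    }

Callers pass their state and a per-key function instead of re-implementing the recursion (and, in bcache's case, the locking) at every call site.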
2013-11-10  bcache: Convert writeback to a kthread  (Kent Overstreet, 1 file, -197/+174)
This simplifies the writeback flow control quite a bit - previously, it was conceptually two coroutines, refill_dirty() and read_dirty(). This makes the code quite a bit more straightforward.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Move keylist out of btree_op  (Kent Overstreet, 1 file, -2/+5)
Slowly working on pruning struct btree_op - the aim is for it to only contain things that are actually necessary for traversing the btree.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Add explicit keylist arg to btree_insert()  (Kent Overstreet, 1 file, -1/+1)
Some refactoring - better to explicitly pass stuff around instead of having it all in the "big bag of state", struct btree_op. Going to prune struct btree_op quite a bit over time.

Signed-off-by: Kent Overstreet <[email protected]>
2013-11-10  bcache: Stripe size isn't necessarily a power of two  (Kent Overstreet, 1 file, -16/+17)
Originally I got this right... except that the divides didn't use do_div(), which broke 32 bit kernels. When I went to fix that, I forgot that the raid stripe size usually isn't a power of two... doh

Signed-off-by: Kent Overstreet <[email protected]>
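Two constraints collide here: on 32-bit kernels a plain 64-by-32 division pulls in libgcc helper routines the kernel doesn't provide, so 64-bit dividends must go through do_div(); and because the stripe size need not be a power of two, bitmask shortcuts don't work either. A kernel-style sketch with illustrative names (not the actual bcache helper):

    #include <linux/types.h>
    #include <asm/div64.h>   /* do_div() */

    /* Split a sector into (stripe number, offset within stripe). */
    static u64 sector_to_stripe(u64 sector, u32 stripe_size, u32 *offset)
    {
            /* do_div() divides in place and returns the remainder;
             * a plain "sector / stripe_size" would fail to link on 32-bit. */
            *offset = do_div(sector, stripe_size);
            return sector;   /* now the stripe number */
    }

A power-of-two shortcut like sector & (stripe_size - 1) would silently compute garbage for, say, a 384 KiB raid stripe.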
2013-09-24  bcache: Fix a dumb CPU spinning bug in writeback  (Kent Overstreet, 1 file, -2/+1)
schedule_timeout() != schedule_timeout_uninterruptible()

Signed-off-by: Kent Overstreet <[email protected]>
Cc: linux-stable <[email protected]> # >= v3.10
Signed-off-by: Linus Torvalds <[email protected]>
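The difference: schedule_timeout() only sleeps if the task state was set to something other than TASK_RUNNING beforehand, while schedule_timeout_uninterruptible() sets the state itself. A sketch of the bug class (not the exact bcache code):

    /* Buggy: the task is still TASK_RUNNING, so schedule_timeout()
     * returns immediately and this loop spins, burning a CPU. */
    while (!kthread_should_stop())
            schedule_timeout(HZ);

    /* Fixed: the helper sets TASK_UNINTERRUPTIBLE before sleeping. */
    while (!kthread_should_stop())
            schedule_timeout_uninterruptible(HZ);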
2013-09-24  bcache: Fix a writeback performance regression  (Kent Overstreet, 1 file, -22/+21)
Background writeback works by scanning the btree for dirty data and adding those keys to a fixed size buffer, then, for each dirty key in the keybuf, writing it to the backing device.

When read_dirty() finishes and it's time to scan for more dirty data, we need to wait for the outstanding writeback IO to finish - those writes still take up slots in the keybuf (so that foreground writes can check for them to avoid races). Without that wait, we'll continually rescan when we'll be able to add at most a key or two to the keybuf, and that takes locks that starve foreground IO. Doh.

Signed-off-by: Kent Overstreet <[email protected]>
Cc: linux-stable <[email protected]> # >= v3.10
Signed-off-by: Linus Torvalds <[email protected]>
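Reduced to a sketch, the fix amounts to counting in-flight writeback IOs and blocking the refill pass until they drain. Kernel-style and illustrative only - the names and structure are not the actual bcache code:

    /* Illustrative: don't rescan the btree while writes from the last
     * pass still occupy keybuf slots. */
    static atomic_t in_flight = ATOMIC_INIT(0);
    static DECLARE_WAIT_QUEUE_HEAD(writeback_wait);

    /* in the completion path of each writeback IO: */
    if (atomic_dec_and_test(&in_flight))
            wake_up(&writeback_wait);

    /* before refilling the keybuf: */
    wait_event(writeback_wait, atomic_read(&in_flight) == 0);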
2013-07-01  bcache: Use standard utility code  (Kent Overstreet, 1 file, -3/+4)
Some of bcache's utility code has made it into the rest of the kernel, so drop the bcache versions.

Bcache used to have a workaround for allocating from a bio set under generic_make_request(): if you allocated more than once, the bios you had already allocated would get stuck on current->bio_list when you submitted, and you'd risk deadlock. So bcache would mask out __GFP_WAIT when allocating bios under generic_make_request(), letting the allocation fail so it could retry from a workqueue. But bio_alloc_bioset() has a workaround now, so we can drop this hack and the associated error handling.

Signed-off-by: Kent Overstreet <[email protected]>
2013-06-26  bcache: Write out full stripes  (Kent Overstreet, 1 file, -2/+42)
Now that we're tracking dirty data per stripe, we can add two optimizations for raid5/6:

* If a stripe is already dirty, force writes to that stripe to writeback mode - to help build up full stripes of dirty data (see the sketch below)
* When flushing dirty data, preferentially write out full stripes first if there are any.

Signed-off-by: Kent Overstreet <[email protected]>
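The first optimization comes down to a cheap predicate at write-submission time, built on the per-stripe dirty counters introduced by the "Track dirty data by stripe" change below. Illustrative only, not the bcache function:

    /* Steer a write into writeback mode when its stripe already holds
     * dirty data, so full stripes accumulate. */
    static bool stripe_wants_writeback(atomic_t *stripe_dirty, u64 stripe)
    {
            return atomic_read(&stripe_dirty[stripe]) > 0;
    }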
2013-06-26  bcache: Track dirty data by stripe  (Kent Overstreet, 1 file, -6/+34)
To make background writeback aware of raid5/6 stripes, we first need to track the amount of dirty data within each stripe - we do this by breaking up the existing sectors_dirty into per-stripe atomic_ts.

Signed-off-by: Kent Overstreet <[email protected]>
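Mechanically that's an array of counters indexed by stripe number. A kernel-style sketch of the bookkeeping - illustrative, and ignoring keys that span a stripe boundary, which the real code has to split:

    #include <linux/atomic.h>
    #include <asm/div64.h>

    static atomic_t *stripe_sectors_dirty;   /* one counter per stripe */

    static void mark_sectors_dirty(u64 sector, int nr_sectors, u32 stripe_size)
    {
            u64 stripe = sector;

            do_div(stripe, stripe_size);     /* 32-bit-safe divide */
            atomic_add(nr_sectors, &stripe_sectors_dirty[stripe]);
    }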
2013-06-26  bcache: Initialize sectors_dirty when attaching  (Kent Overstreet, 1 file, -0/+36)
Previously, dirty_data wouldn't get initialized until the first garbage collection... which was a bit of a problem for background writeback (as the PD controller keys off of it) and also confusing for users.

This is also prep work for making background writeback aware of raid5/6 stripes.

Signed-off-by: Kent Overstreet <[email protected]>
2013-06-26  bcache: Fix/revamp tracepoints  (Kent Overstreet, 1 file, -4/+6)
The tracepoints were reworked to be more sensible, and a null pointer deref in one of them was fixed.

Converted some of the pr_debug()s to tracepoints - this is partly a performance optimization. It used to be that, without DEBUG or CONFIG_DYNAMIC_DEBUG, pr_debug() was an empty macro; but at some point it was changed to an empty inline function. Some of the pr_debug() statements had rather expensive function calls as part of the arguments, so this code was getting run unnecessarily even on non-debug kernels - in some fast paths, too.

Signed-off-by: Kent Overstreet <[email protected]>
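The macro-versus-inline-function distinction is the whole performance story: a no-op macro discards its arguments unevaluated, while a no-op inline function must still evaluate them for their side effects. A standalone illustration in plain C (these are not the kernel's actual pr_debug() definitions):

    #include <stdio.h>

    static int expensive(void) { puts("computed!"); return 42; }

    /* Variant 1: no-op macro - expensive() is never called. */
    #define dbg_macro(fmt, ...) do { } while (0)

    /* Variant 2: no-op inline function - expensive() still runs. */
    static inline void dbg_func(const char *fmt, int v) { (void)fmt; (void)v; }

    int main(void)
    {
            dbg_macro("%d", expensive());   /* prints nothing */
            dbg_func("%d", expensive());    /* prints "computed!" */
            return 0;
    }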
2013-05-15  bcache: Fix error handling in init code  (Kent Overstreet, 1 file, -1/+1)
This code appears to have rotted... fix various bugs and do some refactoring.

Signed-off-by: Kent Overstreet <[email protected]>
2013-03-28  bcache: Don't export utility code, prefix with bch_  (Kent Overstreet, 1 file, -3/+3)
Signed-off-by: Kent Overstreet <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
2013-03-23  bcache: A block layer cache  (Kent Overstreet, 1 file, -0/+414)
Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more.

Signed-off-by: Kent Overstreet <[email protected]>