Age | Commit message (Collapse) | Author | Files | Lines |
|
Not needed after the previous patch named
"Btrfs: fix page reading in extent_same ioctl leading to csum errors".
Signed-off-by: Filipe Manana <[email protected]>
|
|
Merge second patch-bomb from Andrew Morton:
- most of the rest of MM
- procfs
- lib/ updates
- printk updates
- bitops infrastructure tweaks
- checkpatch updates
- nilfs2 update
- signals
- various other misc bits: coredump, seqfile, kexec, pidns, zlib, ipc,
dma-debug, dma-mapping, ...
* emailed patches from Andrew Morton <[email protected]>: (102 commits)
ipc,msg: drop dst nil validation in copy_msg
include/linux/zutil.h: fix usage example of zlib_adler32()
panic: release stale console lock to always get the logbuf printed out
dma-debug: check nents in dma_sync_sg*
dma-mapping: tidy up dma_parms default handling
pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode
kexec: use file name as the output message prefix
fs, seqfile: always allow oom killer
seq_file: reuse string_escape_str()
fs/seq_file: use seq_* helpers in seq_hex_dump()
coredump: change zap_threads() and zap_process() to use for_each_thread()
coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP
signal: remove jffs2_garbage_collect_thread()->allow_signal(SIGCONT)
signal: introduce kernel_signal_stop() to fix jffs2_garbage_collect_thread()
signal: turn dequeue_signal_lock() into kernel_dequeue_signal()
signals: kill block_all_signals() and unblock_all_signals()
nilfs2: fix gcc uninitialized-variable warnings in powerpc build
nilfs2: fix gcc unused-but-set-variable warnings
MAINTAINERS: nilfs2: add header file for tracing
nilfs2: add tracepoints for analyzing reading and writing metadata files
...
|
|
There are many places which use mapping_gfp_mask to restrict a more
generic gfp mask which would be used for allocations which are not
directly related to the page cache but they are performed in the same
context.
Let's introduce a helper function which makes the restriction explicit and
easier to track. This patch doesn't introduce any functional changes.
[[email protected]: coding-style fixes]
Signed-off-by: Michal Hocko <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4
|
|
cache friendly
Below variables are defined per compress type.
- struct list_head comp_idle_workspace[BTRFS_COMPRESS_TYPES]
- spinlock_t comp_workspace_lock[BTRFS_COMPRESS_TYPES]
- int comp_num_workspace[BTRFS_COMPRESS_TYPES]
- atomic_t comp_alloc_workspace[BTRFS_COMPRESS_TYPES]
- wait_queue_head_t comp_workspace_wait[BTRFS_COMPRESS_TYPES]
BTW, while accessing one compress type of these variables, the next or
before address is other compress types of it.
So this patch puts these variables in a struct to make cache friendly.
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: Byongho Lee <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Reduce number of undocumented barriers out there.
Signed-off-by: David Sterba <[email protected]>
|
|
We can always fill up the bio now, no need to estimate the possible
size based on queue parameters.
Acked-by: Steven Whitehouse <[email protected]>
Signed-off-by: Kent Overstreet <[email protected]>
[hch: rebased and wrote a changelog]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Ming Lin <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Currently we have two different ways to signal an I/O error on a BIO:
(1) by clearing the BIO_UPTODATE flag
(2) by returning a Linux errno value to the bi_end_io callback
The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario. Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.
So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: NeilBrown <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.1
Signed-off-by: Chris Mason <[email protected]>
Conflicts:
fs/btrfs/disk-io.c
|
|
Convert kmalloc(nr * size, ..) to kmalloc_array that does additional
overflow checks, the zeroing variant is kcalloc.
Signed-off-by: David Sterba <[email protected]>
|
|
There are some op tables that can be easily made const, similarly the
sysfs feature and raid tables. This is motivated by PaX CONSTIFY plugin.
Signed-off-by: David Sterba <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
"From a feature point of view, most of the code here comes from Miao
Xie and others at Fujitsu to implement scrubbing and replacing devices
on raid56. This has been in development for a while, and it's a big
improvement.
Filipe and Josef have a great assortment of fixes, many of which solve
problems corruptions either after a crash or in error conditions. I
still have a round two from Filipe for next week that solves
corruptions with discard and block group removal"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (62 commits)
Btrfs: make get_caching_control unconditionally return the ctl
Btrfs: fix unprotected deletion from pending_chunks list
Btrfs: fix fs mapping extent map leak
Btrfs: fix memory leak after block remove + trimming
Btrfs: make btrfs_abort_transaction consider existence of new block groups
Btrfs: fix race between writing free space cache and trimming
Btrfs: fix race between fs trimming and block group remove/allocation
Btrfs, replace: enable dev-replace for raid56
Btrfs: fix freeing used extents after removing empty block group
Btrfs: fix crash caused by block group removal
Btrfs: fix invalid block group rbtree access after bg is removed
Btrfs, raid56: fix use-after-free problem in the final device replace procedure on raid56
Btrfs, replace: write raid56 parity into the replace target device
Btrfs, replace: write dirty pages into the replace target device
Btrfs, raid56: support parity scrub on raid56
Btrfs, raid56: use a variant to record the operation type
Btrfs, scrub: repair the common data on RAID5/6 if it is corrupted
Btrfs, raid56: don't change bbio and raid_map
Btrfs: remove unnecessary code of stripe_index assignment in __btrfs_map_block
Btrfs: remove noused bbio_ret in __btrfs_map_block in condition
...
|
|
Don Bailey noticed that our page zeroing for compression at end-io time
isn't complete. This reworks a patch from Linus to push the zeroing
into the zlib and lzo specific functions instead of trying to handle the
corners inside btrfs_decompress_buf2page
Signed-off-by: Chris Mason <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Reported-by: Don A. Bailey <[email protected]>
cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Our compressed bio write end callback was essentially ignoring the error
parameter. When a write error happens, it must pass a value of 0 to the
inode's write_page_end_io_hook callback, SetPageError on the respective
pages and set AS_EIO in the inode's mapping flags, so that a call to
filemap_fdatawait_range() / filemap_fdatawait() can find out that errors
happened (we surely don't want silent failures on fsync for example).
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
The enum exists but is not consistently used.
Signed-off-by: David Sterba <[email protected]>
|
|
The form
(value + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT
is equivalent to
(value + PAGE_CACHE_SIZE - 1) / PAGE_CACHE_SIZE
The rest is a simple subsitution, no difference in the generated
assembly code.
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
Add compression `workspace' in free_workspace() to
`idle_workspace' list head, instead of tail. So we have
better chances to reuse most recently used `workspace'.
Signed-off-by: Sergey Senozhatsky <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
The btrfs compression wrappers translated errors from workspace
allocation to either -ENOMEM or -1. The compression type workspace
allocators are already returning a ERR_PTR(-ENOMEM). Just return that
and get rid of the magical -1.
This helps a future patch return errors from the compression wrappers.
Signed-off-by: Zach Brown <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
shmem mappings already contain exceptional entries where swap slot
information is remembered.
To be able to store eviction information for regular page cache, prepare
every site dealing with the radix trees directly to handle entries other
than pages.
The common lookup functions will filter out non-page entries and return
NULL for page cache holes, just as before. But provide a raw version of
the API which returns non-page entries as well, and switch shmem over to
use it.
Signed-off-by: Johannes Weiner <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Reviewed-by: Minchan Kim <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Bob Liu <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Luigi Semenzato <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Metin Doslu <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Ozgun Erdogan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Ryan Mallon <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This is a small collection of fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix data corruption when reading/updating compressed extents
Btrfs: don't loop forever if we can't run because of the tree mod log
btrfs: reserve no transaction units in btrfs_ioctl_set_features
btrfs: commit transaction after setting label and features
Btrfs: fix assert screwup for the pending move stuff
|
|
When using a mix of compressed file extents and prealloc extents, it
is possible to fill a page of a file with random, garbage data from
some unrelated previous use of the page, instead of a sequence of zeroes.
A simple sequence of steps to get into such case, taken from the test
case I made for xfstests, is:
_scratch_mkfs
_scratch_mount "-o compress-force=lzo"
$XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
$XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
$XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
This results in the following file items in the fs tree:
item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
inode generation 6 transid 6 size 542872 block group 0 mode 100600
item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
inode ref index 2 namelen 6 name: foobar
item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
extent data disk byte 0 nr 0 gen 6
extent data offset 0 nr 24576 ram 266240
extent compression 0
item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
prealloc data disk byte 12849152 nr 241664 gen 6
prealloc data offset 0 nr 241664
item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
extent data disk byte 12845056 nr 4096 gen 6
extent data offset 0 nr 20480 ram 20480
extent compression 2
item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
prealloc data disk byte 13090816 nr 405504 gen 6
prealloc data offset 0 nr 258048
The on disk extent at offset 266240 (which corresponds to 1 single disk block),
contains 5 compressed chunks of file data. Each of the first 4 compress 4096
bytes of file data, while the last one only compresses 3024 bytes of file data.
Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
1072 bytes) should always return zeroes (our next extent is a prealloc one).
The solution here is the compression code path to zero the remaining (untouched)
bytes of the last page it uncompressed data into, as the information about how
much space the file data consumes in the last page is not known in the upper layer
fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
the remainder of the page but only if it corresponds to the last page of the inode
and if the inode's size is not a multiple of the page size.
This would cause not only returning random data on reads, but also permanently
storing random data when updating parts of the region that should be zeroed.
For the example above, it means updating a single byte in the region [285648 ; 286720[
would store that byte correctly but also store random data on disk.
A test case for xfstests follows soon.
Signed-off-by: Filipe David Borba Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
"This is a pretty big pull, and most of these changes have been
floating in btrfs-next for a long time. Filipe's properties work is a
cool building block for inheriting attributes like compression down on
a per inode basis.
Jeff Mahoney kicked in code to export filesystem info into sysfs.
Otherwise, lots of performance improvements, cleanups and bug fixes.
Looks like there are still a few other small pending incrementals, but
I wanted to get the bulk of this in first"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits)
Btrfs: fix spin_unlock in check_ref_cleanup
Btrfs: setup inode location during btrfs_init_inode_locked
Btrfs: don't use ram_bytes for uncompressed inline items
Btrfs: fix btrfs_search_slot_for_read backwards iteration
Btrfs: do not export ulist functions
Btrfs: rework ulist with list+rb_tree
Btrfs: fix memory leaks on walking backrefs failure
Btrfs: fix send file hole detection leading to data corruption
Btrfs: add a reschedule point in btrfs_find_all_roots()
Btrfs: make send's file extent item search more efficient
Btrfs: fix to catch all errors when resolving indirect ref
Btrfs: fix protection between walking backrefs and root deletion
btrfs: fix warning while merging two adjacent extents
Btrfs: fix infinite path build loops in incremental send
btrfs: undo sysfs when open_ctree() fails
Btrfs: fix snprintf usage by send's gen_unique_name
btrfs: fix defrag 32-bit integer overflow
btrfs: sysfs: list the NO_HOLES feature
btrfs: sysfs: don't show reserved incompat feature
btrfs: call permission checks earlier in ioctls and return EPERM
...
|
|
Convert all applicable cases of printk and pr_* to the btrfs_* macros.
Fix all uses of the BTRFS prefix.
Signed-off-by: Frank Holton <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.
Signed-off-by: Kent Overstreet <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: "Ed L. Cashin" <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Lars Ellenberg <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Yehuda Sadeh <[email protected]>
Cc: Sage Weil <[email protected]>
Cc: Alex Elder <[email protected]>
Cc: [email protected]
Cc: Joshua Morris <[email protected]>
Cc: Philip Kelleher <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Neil Brown <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: [email protected]
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: [email protected]
Cc: Boaz Harrosh <[email protected]>
Cc: Benny Halevy <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Nicholas A. Bellinger" <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Steven Whitehouse <[email protected]>
Cc: Dave Kleikamp <[email protected]>
Cc: Joern Engel <[email protected]>
Cc: Prasad Joshi <[email protected]>
Cc: Trond Myklebust <[email protected]>
Cc: KONISHI Ryusuke <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Ben Myers <[email protected]>
Cc: [email protected]
Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Herton Ronaldo Krzesinski <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Guo Chao <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Asai Thambi S P <[email protected]>
Cc: Selvan Mani <[email protected]>
Cc: Sam Bradshaw <[email protected]>
Cc: Wei Yongjun <[email protected]>
Cc: "Roger Pau Monné" <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Sebastian Ott <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Jiang Liu <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Peng Tao <[email protected]>
Cc: Andy Adamson <[email protected]>
Cc: fanchaoting <[email protected]>
Cc: Jie Liu <[email protected]>
Cc: Sunil Mushran <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Namjae Jeon <[email protected]>
Cc: Pankaj Kumar <[email protected]>
Cc: Dan Magenheimer <[email protected]>
Cc: Mel Gorman <[email protected]>6
|
|
With immutable biovecs we don't want code accessing bi_io_vec directly -
the uses this patch changes weren't incorrect since they all own the
bio, but it makes the code harder to audit for no good reason - also,
this will help with multipage bvecs later.
Signed-off-by: Kent Overstreet <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Joern Engel <[email protected]>
Cc: Prasad Joshi <[email protected]>
Cc: Trond Myklebust <[email protected]>
|
|
Fix spacing issues detected via checkpatch.pl in accordance with the
kernel style guidelines.
Signed-off-by: Dulshani Gunawardhana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
fs/btrfs/compat.h only contained trivial macro wrappers of drop_nlink()
and inc_nlink(). This doesn't belong in mainline.
Signed-off-by: Zach Brown <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
u64 is "unsigned long long" on all architectures now, so there's no need to
cast it when formatting it using the "ll" length modifier.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
We want this for btrfs_extent_same. Basically readpage and friends do their
own extent locking but for the purposes of dedupe, we want to have both
files locked down across a set of readpage operations (so that we can
compare data). Introduce this variant and a flag which can be set for
extent_read_full_page() to indicate that we are already locked.
Partial credit for this patch goes to Gabriel de Perthuis <[email protected]>
as I have included a fix from him to the original patch which avoids a
deadlock on compressed extents.
Signed-off-by: Mark Fasheh <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
Big patch, but all it does is add statics to functions which
are in fact static, then remove the associated dead-code fallout.
removed functions:
btrfs_iref_to_path()
__btrfs_lookup_delayed_deletion_item()
__btrfs_search_delayed_insertion_item()
__btrfs_search_delayed_deletion_item()
find_eb_for_page()
btrfs_find_block_group()
range_straddles_pages()
extent_range_uptodate()
btrfs_file_extent_length()
btrfs_scrub_cancel_devid()
btrfs_start_transaction_lflush()
btrfs_print_tree() is left because it is used for debugging.
btrfs_start_transaction_lflush() and btrfs_reada_detach() are
left for symmetry.
ulist.c functions are left, another patch will take care of those.
Signed-off-by: Eric Sandeen <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
|
|
Argument 'root' is no more used in btrfs_csum_data().
Signed-off-by: Liu Bo <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
|
|
We'll want to merge writes so they can fill a full RAID[56] stripe, but
not necessarily reads.
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
With the addition of the device replace procedure, it is possible
for btrfs_map_bio(READ) to report an error. This happens when the
specific mirror is requested which is located on the target disk,
and the copy operation has not yet copied this block. Hence the
block cannot be read and this error state is indicated by
returning EIO.
Some background information follows now. A new mirror is added
while the device replace procedure is running.
btrfs_get_num_copies() returns one more, and
btrfs_map_bio(GET_READ_MIRROR) adds one more mirror if a disk
location is involved that was already handled by the device
replace copy operation. The assigned mirror num is the highest
mirror number, e.g. the value 3 in case of RAID1.
If btrfs_map_bio() is invoked with mirror_num == 0 (i.e., select
any mirror), the copy on the target drive is never selected
because that disk shall be able to perform the write requests as
quickly as possible. The parallel execution of read requests would
only slow down the disk copy procedure. Second case is that
btrfs_map_bio() is called with mirror_num > 0. This is done from
the repair code only. In this case, the highest mirror num is
assigned to the target disk, since it is used last. And when this
mirror is not available because the copy procedure has not yet
handled this area, an error is returned. Everywhere in the code
the handling of such errors is added now.
Signed-off-by: Stefan Behrens <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
We were freeing non-existent pages which was causing a panic for a user who
was suffering from ENOMEM. This patch fixes the problem. Thanks,
Reported-by: Jérôme Poulin <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
|
|
We need a barrir before calling waitqueue_active otherwise we will miss
wakeups. So in places that do atomic_dec(); then atomic_read() use
atomic_dec_return() which imply a memory barrier (see memory-barriers.txt)
and then add an explicit memory barrier everywhere else that need them.
Thanks,
Signed-off-by: Josef Bacik <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull the minimal btrfs branch from Chris Mason:
"We have a use-after-free in there, along with errors when mount -o
discard is enabled, and a BUG_ON(we should compile with UP more
often)."
* 'for-linus-min' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: use commit root when loading free space cache
Btrfs: fix use-after-free in __btrfs_end_transaction
Btrfs: check return value of bio_alloc() properly
Btrfs: remove lock assert from get_restripe_target()
Btrfs: fix eof while discarding extents
Btrfs: fix uninit variable in repair_eb_io_failure
Revert "Btrfs: increase the global block reserve estimates"
|
|
bio_alloc() has the possibility of returning NULL.
So, it is necessary to check the return value.
Signed-off-by: Tsutomu Itoh <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes and features from Chris Mason:
"We've merged in the error handling patches from SuSE. These are
already shipping in the sles kernel, and they give btrfs the ability
to abort transactions and go readonly on errors. It involves a lot of
churn as they clarify BUG_ONs, and remove the ones we now properly
deal with.
Josef reworked the way our metadata interacts with the page cache.
page->private now points to the btrfs extent_buffer object, which
makes everything faster. He changed it so we write an whole extent
buffer at a time instead of allowing individual pages to go down,,
which will be important for the raid5/6 code (for the 3.5 merge
window ;)
Josef also made us more aggressive about dropping pages for metadata
blocks that were freed due to COW. Overall, our metadata caching is
much faster now.
We've integrated my patch for metadata bigger than the page size.
This allows metadata blocks up to 64KB in size. In practice 16K and
32K seem to work best. For workloads with lots of metadata, this cuts
down the size of the extent allocation tree dramatically and fragments
much less.
Scrub was updated to support the larger block sizes, which ended up
being a fairly large change (thanks Stefan Behrens).
We also have an assortment of fixes and updates, especially to the
balancing code (Ilya Dryomov), the back ref walker (Jan Schmidt) and
the defragging code (Liu Bo)."
Fixed up trivial conflicts in fs/btrfs/scrub.c that were just due to
removal of the second argument to k[un]map_atomic() in commit
7ac687d9e047.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (75 commits)
Btrfs: update the checks for mixed block groups with big metadata blocks
Btrfs: update to the right index of defragment
Btrfs: do not bother to defrag an extent if it is a big real extent
Btrfs: add a check to decide if we should defrag the range
Btrfs: fix recursive defragment with autodefrag option
Btrfs: fix the mismatch of page->mapping
Btrfs: fix race between direct io and autodefrag
Btrfs: fix deadlock during allocating chunks
Btrfs: show useful info in space reservation tracepoint
Btrfs: don't use crc items bigger than 4KB
Btrfs: flush out and clean up any block device pages during mount
btrfs: disallow unequal data/metadata blocksize for mixed block groups
Btrfs: enhance superblock sanity checks
Btrfs: change scrub to support big blocks
Btrfs: minor cleanup in scrub
Btrfs: introduce common define for max number of mirrors
Btrfs: fix infinite loop in btrfs_shrink_device()
Btrfs: fix memory leak in resolver code
Btrfs: allow dup for data chunks in mixed mode
Btrfs: validate target profiles only if we are going to use them
...
|
|
btrfs currently handles most errors with BUG_ON. This patch is a work-in-
progress but aims to handle most errors other than internal logic
errors and ENOMEM more gracefully.
This iteration prevents most crashes but can run into lockups with
the page lock on occasion when the timing "works out."
Signed-off-by: Jeff Mahoney <[email protected]>
|
|
lock_extent and unlock_extent are always called with GFP_NOFS, drop the
argument and use GFP_NOFS consistently.
Signed-off-by: Jeff Mahoney <[email protected]>
|
|
Signed-off-by: Jeff Mahoney <[email protected]>
|
|
Signed-off-by: Cong Wang <[email protected]>
|
|
This patch corrects error checking of lookup_extent_mapping().
Signed-off-by: Tsutomu Itoh <[email protected]>
|
|
fs_info has now ~9kb, more than fits into one page. This will cause
mount failure when memory is too fragmented. Top space consumers are
super block structures super_copy and super_for_commit, ~2.8kb each.
Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)
Add a wrapper for freeing fs_info and all of it's dynamically allocated
members.
Signed-off-by: David Sterba <[email protected]>
|
|
If mounting with nodatasum option, we won't csum file data for
general write or direct-io write, and this rule should also be
applied when writing compressed files.
Signed-off-by: Li Zefan <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
inode_numbers
Conflicts:
fs/btrfs/extent-tree.c
fs/btrfs/free-space-cache.c
fs/btrfs/inode.c
fs/btrfs/tree-log.c
Signed-off-by: Chris Mason <[email protected]>
|
|
reported by gcc -Wshadow:
page_index, page_offset, new_inode, dev_name
Signed-off-by: David Sterba <[email protected]>
|
|
There's a potential problem in 32bit system when we exhaust 32bit inode
numbers and start to allocate big inode numbers, because btrfs uses
inode->i_ino in many places.
So here we always use BTRFS_I(inode)->location.objectid, which is an
u64 variable.
There are 2 exceptions that BTRFS_I(inode)->location.objectid !=
inode->i_ino: the btree inode (0 vs 1) and empty subvol dirs (256 vs 2),
and inode->i_ino will be used in those cases.
Another reason to make this change is I'm going to use a special inode
to save free ino cache, and the inode number must be > (u64)-256.
Signed-off-by: Li Zefan <[email protected]>
|
|
Adding the check on the return value of btrfs_alloc_path() to several places.
And, some of callers are modified by this change.
Signed-off-by: Tsutomu Itoh <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|
|
To make Btrfs code more robust, several return value checks where memory
allocation can fail are introduced. I use BUG_ON where I don't know how
to handle the error properly, which increases the number of using the
notorious BUG_ON, though.
Signed-off-by: Yoshinori Sano <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
|