Age | Commit message | Author | Files | Lines
2017-04-11 | btrfs: drop the nossd flag when remounting with -o ssd | Adam Borowski | 1 | -0/+3
The opposite case was already handled right in the very next switch entry. And also when turning on nossd, drop ssd_spread. Reported-by: Hans van Kranenburg <[email protected]> Signed-off-by: Adam Borowski <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
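The remount behaviour this commit describes can be sketched as plain bit-flag handling. The flag names and helper functions below are hypothetical stand-ins, not the kernel's mount-option macros; the sketch only shows that enabling one option clears the options that contradict it.

```c
#include <assert.h>

/* Hypothetical option bits; the kernel uses its own mount-option macros. */
enum {
    DEMO_OPT_SSD        = 1 << 0,
    DEMO_OPT_SSD_SPREAD = 1 << 1,
    DEMO_OPT_NOSSD      = 1 << 2,
};

/* "-o ssd": turn ssd on and drop a previously set nossd. */
static unsigned int demo_set_ssd(unsigned int opts)
{
    opts |= DEMO_OPT_SSD;
    opts &= ~(unsigned int)DEMO_OPT_NOSSD;
    assert(!(opts & DEMO_OPT_NOSSD));  /* invariant: never both at once */
    return opts;
}

/* "-o nossd": turn nossd on and drop both ssd and ssd_spread. */
static unsigned int demo_set_nossd(unsigned int opts)
{
    opts |= DEMO_OPT_NOSSD;
    opts &= ~(unsigned int)(DEMO_OPT_SSD | DEMO_OPT_SSD_SPREAD);
    assert(!(opts & (DEMO_OPT_SSD | DEMO_OPT_SSD_SPREAD)));
    return opts;
}
```

Calling `demo_set_ssd()` on a value that only has the nossd bit leaves only the ssd bit set, and `demo_set_nossd()` on ssd plus ssd_spread leaves only nossd.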
2017-03-29 | Merge branch 'for-chris-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.11 | Chris Mason | 6 | -29/+44
2017-03-29 | Btrfs: fix an integer overflow check | Dan Carpenter | 1 | -1/+6
This isn't super serious because you need CAP_SYS_ADMIN to run this code. I added this integer overflow check last year but apparently I am rubbish at writing integer overflow checks... There are two issues. First, access_ok() works on unsigned long type and not u64 so on 32 bit systems the access_ok() could be checking a truncated size. The other issue is that we should be using a stricter limit so we don't overflow the kzalloc() setting sctx->clone_roots later in the function after the access_ok():

    alloc_size = sizeof(struct clone_root) * (arg->clone_sources_count + 1);
    sctx->clone_roots = kzalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN);

Fixes: f5ecec3ce21f ("btrfs: send: silence an integer overflow warning") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: David Sterba <[email protected]> [ added comment ] Signed-off-by: David Sterba <[email protected]>
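A minimal userspace sketch of the kind of check the message above argues for: validate the 64-bit count against a limit derived from the element size before multiplying. The struct layout and the function names are assumptions made for illustration, and `calloc()` stands in for `kzalloc()`; this is not the kernel's code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct clone_root. */
struct clone_root_demo {
    uint64_t root_id;
    uint64_t ino;
    uint64_t offset;
};

/*
 * True when (count + 1) elements can be sized in an unsigned long without
 * wrapping. The count is kept as a 64-bit value until it has been checked,
 * so a 32-bit unsigned long cannot silently truncate it first.
 */
static bool clone_sources_count_ok(uint64_t count)
{
    return count < ULONG_MAX / sizeof(struct clone_root_demo);
}

/* Size of the array in bytes; only valid for counts that passed the check. */
static size_t clone_roots_size(uint64_t count)
{
    assert(clone_sources_count_ok(count));
    return ((size_t)count + 1) * sizeof(struct clone_root_demo);
}

/* Allocate the zeroed array only after the count has been validated. */
static struct clone_root_demo *alloc_clone_roots(uint64_t count)
{
    if (!clone_sources_count_ok(count))
        return NULL;
    return calloc(1, clone_roots_size(count));
}
```

The point of the stricter limit is that the check covers the `count + 1` multiplication used for the allocation, not only the size that was handed to the earlier access check.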
2017-03-29 | btrfs: Change qgroup_meta_rsv to 64bit | Goldwyn Rodrigues | 3 | -7/+7
Using an int value is causing qg->reserved to become negative and exclusive -EDQUOT to be reached prematurely. This affects exclusive qgroups only.

TEST CASE:

    DEVICE=/dev/vdb
    MOUNTPOINT=/mnt
    SUBVOL=$MOUNTPOINT/tmp

    umount $SUBVOL
    umount $MOUNTPOINT

    mkfs.btrfs -f $DEVICE
    mount /dev/vdb $MOUNTPOINT
    btrfs quota enable $MOUNTPOINT
    btrfs subvol create $SUBVOL
    umount $MOUNTPOINT
    mount /dev/vdb $MOUNTPOINT
    mount -o subvol=tmp $DEVICE $SUBVOL
    btrfs qgroup limit -e 3G $SUBVOL

    btrfs quota rescan /mnt -w

    for i in `seq 1 44000`; do
        dd if=/dev/zero of=/mnt/tmp/test_$i bs=10k count=1
        # Stop at the first failed write and show the qgroup state.
        if [[ $? -ne 0 ]]; then
            btrfs qgroup show -pcref $SUBVOL
            exit 1
        fi
    done

Signed-off-by: Goldwyn Rodrigues <[email protected]> [ add reproducer to changelog ] Signed-off-by: David Sterba <[email protected]>
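The arithmetic behind the type change can be shown in a few lines: a signed 32-bit int tops out just below 2 GiB, so a byte counter held in one cannot represent values on the scale of the 3G limit used in the test case. The helper names are hypothetical, and the sketch says nothing about how the kernel accumulates the reservation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* n GiB expressed in bytes. */
static uint64_t demo_gib(uint64_t n)
{
    assert(n <= (UINT64_MAX >> 30));  /* keep the shift from losing bits */
    return n << 30;
}

/* True when a byte count still fits in a signed 32-bit counter. */
static bool demo_fits_in_s32(uint64_t bytes)
{
    return bytes <= (uint64_t)INT32_MAX;
}

/* True when a byte count fits in a signed 64-bit counter. */
static bool demo_fits_in_s64(uint64_t bytes)
{
    return bytes <= (uint64_t)INT64_MAX;
}
```

With these helpers, 1 GiB fits in 32 bits, while 2 GiB and 3 GiB only fit in the 64-bit counter.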
2017-03-29 | Btrfs: bring back repair during read | Liu Bo | 2 | -21/+31
Commit 20a7db8ab3f2 ("btrfs: add dummy callback for readpage_io_failed and drop checks") made a cleanup around readpage_io_failed_hook, and it was supposed to keep the original semantics, but it also unexpectedly disabled repair during read for dup, raid1 and raid10. This fixes the problem by letting data's inode call the generic readpage_io_failed callback by returning -EAGAIN from its readpage_io_failed_hook in order to notify end_bio_extent_readpage to do the rest. We don't call it directly because the generic one takes an offset from end_bio_extent_readpage() to calculate the index in the checksum array and inode's readpage_io_failed_hook doesn't offer that offset. Cc: David Sterba <[email protected]> Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> [ keep the const function attribute ] Signed-off-by: David Sterba <[email protected]>
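The control flow described above, where a hook declines with -EAGAIN so that its caller runs the generic repair, can be sketched as follows. All types, names and return conventions here are hypothetical simplifications; only the use of -EAGAIN as the "caller, please do the rest" signal comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook type: returns 0 when the hook handled the failure. */
typedef int (*demo_io_failed_hook_t)(int mirror);

/* Data inodes: decline, asking the caller to run the generic repair. */
static int demo_data_io_failed_hook(int mirror)
{
    (void)mirror;
    return -EAGAIN;
}

/* A hook that deals with the failure entirely on its own. */
static int demo_handled_io_failed_hook(int mirror)
{
    (void)mirror;
    return 0;
}

/* Stand-in for the generic repair; only the caller knows the offset. */
static int demo_generic_repair(unsigned int csum_offset, int mirror)
{
    (void)csum_offset;
    (void)mirror;
    return 1;  /* 1 = generic repair was attempted (sketch convention) */
}

/* Caller: try the hook first, fall back to generic repair on -EAGAIN. */
static int demo_handle_read_failure(demo_io_failed_hook_t hook,
                                    unsigned int csum_offset, int mirror)
{
    int ret;

    assert(hook != NULL);  /* the callback is mandatory */
    ret = hook(mirror);
    if (ret == -EAGAIN)
        return demo_generic_repair(csum_offset, mirror);
    return ret;
}
```

Keeping the fallback in the caller matches the reasoning in the message: the offset needed for the checksum index is available there and is not part of the hook's arguments.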
2017-03-17 | btrfs: add missing memset while reading compressed inline extents | Zygo Blaxell | 1 | -0/+14
This is a story about 4 distinct (and very old) btrfs bugs.

Commit c8b978188c ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3).

Commit 93c82d5750 ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read. The fix was to add a memset in btrfs_get_extent to zero out the hole.

Commit 166ae5a418 ("btrfs: fix inline compressed read err corruption") fixed bug #2: compressed inline extents which contained non-zero bytes might be replaced with zero bytes in some cases. This patch removed an unhelpful memset from uncompress_inline, but the case where memset is required was missed.

There is also a memset in the decompression code, but this only covers decompressed data that is shorter than the ram_bytes from the extent ref record. This memset doesn't cover the region between the end of the decompressed data and the end of the page. It has also moved around a few times over the years, so there's no single patch to refer to.

This patch fixes bug #3: compressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read (i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/). The fix is the same: zero out the hole in the compressed case too, by putting a memset back in uncompress_inline, but this time with correct parameters.

The last and oldest bug, bug #0, is the cause of the offending inline extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk of behavior somewhere in the btrfs write code. In a few special cases, an inline extent and hole are allowed to persist where they normally would be combined with later extents in the file.

A fast reproducer for bug #0 is presented below. A few offending extents are also created in the wild during large rsync transfers with the -S flag. A Linux kernel build (git checkout; make allyesconfig; make -j8) will produce a handful of offending files as well. Once an offending file is created, it can present different content to userspace each time it is read.

Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y kernel back to v3.5 has this behavior. There are fossil records of this bug's effects in commits all the way back to v2.6.32. I have no reason to believe bug #0 wasn't present at the beginning of btrfs compression support in v2.6.29, but I can't easily test kernels that old to be sure.

It is not clear whether bug #0 is worth fixing. A fix would likely require injecting extra reads into currently write-only paths, and most of the exceptional cases caused by bug #0 are already handled now.

Whether we like them or not, bug #0's inline extents followed by holes are part of the btrfs de-facto disk format now, and we need to be able to read them without data corruption or an infoleak. So enough about bug #0, let's get back to bug #3 (this patch).

An example of on-disk structure leading to data corruption found in the wild:

    item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160
        inode generation 50 transid 50 size 47424 nbytes 49141 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0 flags 0x0(none)
    item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20
        inode ref index 3 namelen 10 name: DB_File.so
    item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362
        inline extent data size 1341 ram 4085 compress(zlib)
    item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53
        extent data disk byte 5367308288 nr 20480
        extent data offset 0 nr 45056 ram 45056
        extent compression(zlib)

Different data appears in userspace during each read of the 11 bytes between 4085 and 4096. The extent in item 63 is not long enough to fill the first page of the file, so a memset is required to fill the space between item 63 (ending at 4085) and item 64 (beginning at 4096) with zero.

Here is a reproducer from Liu Bo, which demonstrates another method of creating the same inline extent and hole pattern:

Using 'page_poison=on' kernel command line (or enable CONFIG_PAGE_POISONING) run the following:

    # touch foo
    # chattr +c foo
    # xfs_io -f -c "pwrite -W 0 1000" foo
    # xfs_io -f -c "falloc 4 8188" foo
    # od -x foo
    # echo 3 >/proc/sys/vm/drop_caches
    # od -x foo

This produces the following on my box:

Correct output: file contains 1000 data bytes followed by zeros:

    0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
    *
    0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000
    0001760 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0020000

Actual output: the data after the first 1000 bytes will be different each run:

    0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
    *
    0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d
    0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400
    0002000 435f 0056 5f74 6164 7400 645f 0062 5f74
    (...)

Signed-off-by: Zygo Blaxell <[email protected]> Reviewed-by: Liu Bo <[email protected]> Reviewed-by: Chris Mason <[email protected]> Signed-off-by: Chris Mason <[email protected]>
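The idea of the fix in this commit, zeroing the gap between the end of the inline data and the end of the page, can be sketched in userspace C. The page size constant, the function name and the buffer handling are assumptions for illustration; the kernel's uncompress_inline works on mapped pages and takes a page offset into account, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u  /* assumed page size for the sketch */

/*
 * Copy `len` bytes of already-decompressed inline data into a page buffer
 * and zero everything between the end of the data and the end of the page,
 * so that stale memory never reaches the reader.
 */
static void demo_fill_inline_page(unsigned char *page,
                                  const unsigned char *data, size_t len)
{
    assert(len <= DEMO_PAGE_SIZE);
    memcpy(page, data, len);
    if (len < DEMO_PAGE_SIZE)
        memset(page + len, 0, DEMO_PAGE_SIZE - len);
}
```

For the on-disk example in the message, `len` would be 4085 and the memset would clear the 11 bytes from offset 4085 up to 4096.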
2017-03-17 | Btrfs: fix regression in lock_delalloc_pages | Liu Bo | 1 | -1/+2
The bug is a regression introduced by commit da2c7009f6ca ("btrfs: teach __process_pages_contig about PAGE_LOCK operation") and commit 76c0021db8fd ("Btrfs: use helper to simplify lock/unlock pages"). If dirty pages under writeback were partially truncated before we locked them, we could not find all the pages mapping to the delalloc range. The code did not return an error, so it kept going, found that the delalloc range had been truncated, and went on to unlock the dirty pages, at which point the ASSERT caught the error and showed:

    -----------------------------------------------------------------------------
    assertion failed: page_ops & PAGE_LOCK, file: fs/btrfs/extent_io.c, line: 1716
    -----------------------------------------------------------------------------

This fixes the bug by returning the proper -EAGAIN.

Cc: David Sterba <[email protected]> Reported-by: Dave Jones <[email protected]> Signed-off-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-03-07 | btrfs: remove btrfs_err_str function from uapi/linux/btrfs.h | Dmitry V. Levin | 1 | -27/+0
btrfs_err_str function is not called from anywhere and is replicated in the userspace headers for btrfs-progs. Its removal also fixes the following linux/btrfs.h userspace compilation error:

    /usr/include/linux/btrfs.h: In function 'btrfs_err_str':
    /usr/include/linux/btrfs.h:740:11: error: 'NULL' undeclared (first use in this function)
        return NULL;

Suggested-by: Jeff Mahoney <[email protected]> Signed-off-by: Dmitry V. Levin <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | Merge branch 'for-chris-4.11-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.11 | Chris Mason | 31 | -626/+673
2017-02-28 | btrfs: add dummy callback for readpage_io_failed and drop checks | David Sterba | 4 | -3/+10
Make extent_io_ops::readpage_io_failed_hook callback mandatory and define a dummy function for btrfs_extent_io_ops. As the failed IO callback is not performance critical, the branch vs extra trade off does not hurt. Signed-off-by: David Sterba <[email protected]>
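The trade-off mentioned above, a mandatory callback with a dummy implementation instead of a NULL check before every call, might look like this in miniature. The struct, the names and the dummy's return value of 0 are assumptions for the sketch and are not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down ops table with a single mandatory callback. */
struct demo_io_ops {
    int (*readpage_io_failed_hook)(int mirror);
};

/* Dummy implementation: does nothing and reports success. */
static int demo_dummy_io_failed_hook(int mirror)
{
    (void)mirror;
    return 0;
}

/* Ops table that fills the mandatory slot with the dummy. */
static const struct demo_io_ops demo_ops = {
    .readpage_io_failed_hook = demo_dummy_io_failed_hook,
};

/* The caller no longer branches on NULL; it always makes the call. */
static int demo_call_io_failed(const struct demo_io_ops *ops, int mirror)
{
    assert(ops->readpage_io_failed_hook != NULL);  /* documents the contract */
    return ops->readpage_io_failed_hook(mirror);
}
```

The cost is one extra indirect call on a failure path, which is why the message notes that it is not performance critical.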
2017-02-28 | btrfs: drop checks for mandatory extent_io_ops callbacks | David Sterba | 1 | -4/+3
We know that readpage_end_io_hook, submit_bio_hook and merge_bio_hook are always defined so we can drop the checks before we call them. Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: document existence of extent_io ops callbacks | David Sterba | 3 | -11/+26
Some of the callbacks defined in btree_extent_io_ops and btrfs_extent_io_ops do always exist so we don't need to check the existence before each call. This patch just reorders the definition and documents which are mandatory/optional. Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: let writepage_end_io_hook return void | David Sterba | 3 | -11/+6
None of the instances has an error path and they always return 0, so the return value can be dropped. Reviewed-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: do proper error handling in btrfs_insert_xattr_item | David Sterba | 1 | -1/+2
The space check in btrfs_insert_xattr_item is duplicated in its caller (do_setxattr) so we won't hit the BUG_ON. Continuing without any check could be disastrous so turn it into proper error handling. Reviewed-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
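Replacing a crash-on-failure check with an error return, as this commit describes, can be sketched like so. The size limit, the function name and the choice of -ENOSPC as the error code are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_XATTR_SIZE 4096u  /* hypothetical limit for the sketch */

/*
 * Validate the combined size of name and data and report an error to the
 * caller instead of aborting. The comparison is written so that the sum
 * name_len + data_len is never computed and therefore cannot wrap.
 */
static int demo_insert_xattr(const char *name, size_t name_len,
                             size_t data_len)
{
    assert(name != NULL);
    if (name_len > DEMO_MAX_XATTR_SIZE ||
        data_len > DEMO_MAX_XATTR_SIZE - name_len)
        return -ENOSPC;
    /* ... the real function would insert the item here ... */
    return 0;
}
```

Even when the caller performs the same check, keeping it here means a future caller that forgets it gets an error code back.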
2017-02-28 | btrfs: handle allocation error in update_dev_stat_item | David Sterba | 1 | -1/+2
Reviewed-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: remove BUG_ON from __tree_mod_log_insert | David Sterba | 1 | -2/+0
All callers dereference the 'tm' parameter before it gets to this function, the NULL check does not make much sense here. Reviewed-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: derive maximum output size in the compression implementation | David Sterba | 5 | -14/+9
The value of max_out can be calculated from the parameters passed to the compressors, which is number of pages and the page size, and we don't have to needlessly pass it around. Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: use predefined limits for calculating maximum number of pages for compression | David Sterba | 1 | -5/+6
Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: export compression buffer limits in a header | David Sterba | 2 | -10/+15
Move the buffer limit definitions out of compress_file_range. Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: merge nr_pages input and output parameter in compress_pages | David Sterba | 5 | -15/+11
The parameter saying how many pages can be allocated at maximum can be merged with the output page counter, to save some stack space. The compression implementation will sink the parameter to a local variable so everything works as before. The nr_pages variables can also be simply merged in compress_file_range into one. Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: merge length input and output parameter in compress_pages | David Sterba | 5 | -20/+18
The length parameter is basically duplicated for input and output in the top level caller of the compress_pages chain. We can simply use one variable for that and reduce stack consumption. The compression implementation will sink the parameter to a local variable so everything works as before. Signed-off-by: David Sterba <[email protected]>
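The parameter merging described in the compress_pages commits above can be illustrated with a toy function that uses one in/out pointer per quantity, sinks each to a local on entry, and derives the maximum output size from the page count. The function name, the page size and the placeholder "compression" (halving the length) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u  /* assumed page size */

/*
 * Toy compressor with merged in/out parameters:
 *   *nr_pages: in = pages the caller allows, out = pages actually used
 *   *len:      in = bytes of input,          out = bytes of output
 * Returns 0 on success, -1 if the output would not fit; on failure the
 * in/out values are left untouched.
 */
static int demo_compress_pages(size_t *nr_pages, size_t *len)
{
    /* Sink the in/out values to locals before anything is overwritten. */
    const size_t max_pages = *nr_pages;
    const size_t in_len = *len;
    size_t max_out;
    size_t out_len;

    assert(max_pages <= (size_t)-1 / DEMO_PAGE_SIZE);
    /* max_out is derived from the page count instead of being passed in. */
    max_out = max_pages * DEMO_PAGE_SIZE;
    out_len = in_len / 2;  /* placeholder for real compression */

    if (out_len > max_out)
        return -1;

    *nr_pages = (out_len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
    *len = out_len;
    return 0;
}
```

A caller passes its limits in and reads the results back from the same two variables, which is where the stack saving in the commit messages comes from.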
2017-02-28 | btrfs: constify name of subvolume in creation helpers | David Sterba | 1 | -3/+3
Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: constify buffers used by compression helpers | David Sterba | 3 | -3/+3
Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: constify input buffer of btrfs_csum_data | David Sterba | 2 | -3/+3
The function does not modify the input buffer, also update a typecast in one caller. Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: constify device path passed to relevant helpers | David Sterba | 4 | -18/+22
Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make btrfs_inode_resume_unlocked_dio take btrfs_inode | Nikolay Borisov | 2 | -4/+3
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make btrfs_inode_block_unlocked_dio take btrfs_inode | Nikolay Borisov | 2 | -3/+3
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_add_nondir take btrfs_inode | Nikolay Borisov | 1 | -9/+13
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_add_link take btrfs_inode | Nikolay Borisov | 3 | -23/+26
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_del_delalloc_inode take btrfs_inode | Nikolay Borisov | 1 | -7/+7
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make get_extent_t take btrfs_inode | Nikolay Borisov | 8 | -54/+59
In addition to changing the signature, this patch also switches all the functions which are used as an argument to also take btrfs_inode. Namely those are: btrfs_get_extent and btrfs_get_extent_filemap. Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
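The shape of this signature change can be sketched with a callback typedef that moves from the generic inode type to the filesystem-specific one. The struct layouts, the `demo_` names and the container-of style conversion are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic inode. */
struct demo_vfs_inode {
    unsigned long i_ino;
};

/* Hypothetical filesystem inode that embeds the generic one. */
struct demo_btrfs_inode {
    unsigned long long generation;
    struct demo_vfs_inode vfs_inode;
};

/* Recover the outer struct from a pointer to its embedded member. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static struct demo_btrfs_inode *DEMO_BTRFS_I(struct demo_vfs_inode *inode)
{
    assert(inode != NULL);
    return demo_container_of(inode, struct demo_btrfs_inode, vfs_inode);
}

/* Before: the callback takes the generic inode and converts internally. */
typedef unsigned long long (*demo_get_old_t)(struct demo_vfs_inode *inode);

/* After: the callback takes the filesystem inode directly. */
typedef unsigned long long (*demo_get_new_t)(struct demo_btrfs_inode *inode);

static unsigned long long demo_get_old(struct demo_vfs_inode *inode)
{
    return DEMO_BTRFS_I(inode)->generation;
}

static unsigned long long demo_get_new(struct demo_btrfs_inode *inode)
{
    return inode->generation;
}
```

Every function passed as such a callback has to change together with the typedef, which is why the commit also converts the functions used as arguments.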
2017-02-28 | btrfs: Make check_extent_to_block take btrfs_inode | Nikolay Borisov | 1 | -5/+6
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make clone_update_extent_map take btrfs_inode | Nikolay Borisov | 1 | -14/+13
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_clear_bit_hook take btrfs_inode | Nikolay Borisov | 3 | -21/+25
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_extent_item_to_extent_map take btrfs_inode | Nikolay Borisov | 4 | -8/+10
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make btrfs_log_inode_parent take btrfs_inode | Nikolay Borisov | 1 | -26/+24
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make check_parent_dirs_for_sync take btrfs_inode | Nikolay Borisov | 1 | -14/+14
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_orphan_add take btrfs_inode | Nikolay Borisov | 4 | -22/+24
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make btrfs_orphan_del take btrfs_inode | Nikolay Borisov | 1 | -20/+20
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make btrfs_free_io_failure_record take btrfs_inode | Nikolay Borisov | 3 | -7/+9
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make clean_io_failure take btrfs_inode | Nikolay Borisov | 3 | -14/+15
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make repair_io_failure take btrfs_inode | Nikolay Borisov | 3 | -11/+12
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make check_compressed_csum take btrfs_inode | Nikolay Borisov | 1 | -5/+4
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make btrfs_print_data_csum_error take btrfs_inode | Nikolay Borisov | 3 | -7/+8
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: make free_io_failure take btrfs_inode | Nikolay Borisov | 3 | -11/+13
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make lock_and_cleanup_extent_if_need take btrfs_inode | Nikolay Borisov | 1 | -14/+14
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make check_can_nocow take btrfs_inode | Nikolay Borisov | 1 | -10/+12
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_lookup_ordered_range take btrfs_inode | Nikolay Borisov | 6 | -18/+19
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make btrfs_mark_extent_written take btrfs_inode | Nikolay Borisov | 3 | -6/+6
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-02-28 | btrfs: Make fill_holes take btrfs_inode | Nikolay Borisov | 1 | -19/+18
Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: David Sterba <[email protected]>