aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-10-24btrfs: make file clone aware of fatal signalsWang Xiaoguang1-0/+5
Indeed this just make the behavior similar to xfs when process has fatal signals pending, and it'll make fstests/generic/298 happy. Signed-off-by: Wang Xiaoguang <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-10-24btrfs: qgroup: Prevent qgroup->reserved from going subzeroGoldwyn Rodrigues1-2/+7
While free'ing qgroup->reserved resources, we much check if the page has not been invalidated by a truncate operation by checking if the page is still dirty before reducing the qgroup resources. Resources in such a case are free'd when the entire extent is released by delayed_ref. This fixes a double accounting while releasing resources in case of truncating a file, reproduced by the following testcase. SCRATCH_DEV=/dev/vdb SCRATCH_MNT=/mnt mkfs.btrfs -f $SCRATCH_DEV mount -t btrfs $SCRATCH_DEV $SCRATCH_MNT cd $SCRATCH_MNT btrfs quota enable $SCRATCH_MNT btrfs subvolume create a btrfs qgroup limit 500m a $SCRATCH_MNT sync for c in {1..15}; do dd if=/dev/zero bs=1M count=40 of=$SCRATCH_MNT/a/file; done sleep 10 sync sleep 5 touch $SCRATCH_MNT/a/newfile echo "Removing file" rm $SCRATCH_MNT/a/file Fixes: b9d0b38928 ("btrfs: Add handler for invalidate page") Signed-off-by: Goldwyn Rodrigues <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-10-17Btrfs: kill BUG_ON in do_relocationLiu Bo1-1/+8
While updating btree, we try to push items between sibling nodes/leaves in order to keep height as low as possible. But we don't memset the original places with zero when pushing items so that we could end up leaving stale content in nodes/leaves. One may read the above stale content by increasing btree blocks' @nritems. One case I've come across is that in fs tree, a leaf has two parent nodes, hence running balance ends up with processing this leaf with two parent nodes, but it can only reach the valid parent node through btrfs_search_slot, so it'd be like, do_relocation for P in all parent nodes of block A: if !P->eb: btrfs_search_slot(key); --> get path from P to A. if lowest: BUG_ON(A->bytenr != bytenr of A recorded in P); btrfs_cow_block(P, A); --> change A's bytenr in P. After btrfs_cow_block, P has the new bytenr of A, but with the same @key, we get the same path again, and get panic by BUG_ON. Note that this is only happening in a corrupted fs, for a regular fs in which we have correct @nritems so that we won't read stale content in any case. Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-10-12Merge branch 'fst-fixes' of ↵Chris Mason8-157/+272
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.9 Signed-off-by: Chris Mason <[email protected]>
2016-10-10Revert "btrfs: let btrfs_delete_unused_bgs() to clean relocated bgs"Chris Mason2-11/+15
This reverts commit 5d8eb6fe517583f9c6d5b94faf2254a0207a45c9. When we remove devices, we free the device structures. Delaying btfs_remove_chunk() ends up hitting a use-after-free on them. Signed-off-by: Chris Mason <[email protected]>
2016-10-03btrfs: tests: uninline member definitions in free_space_extentDavid Sterba1-1/+2
The recommended way is to put all members on separate lines. Signed-off-by: David Sterba <[email protected]>
2016-10-03btrfs: tests: constify free space extent specsDavid Sterba1-11/+11
We don't change the given extent ranges, mark them const to catch accidental changes. Signed-off-by: David Sterba <[email protected]>
2016-10-03Btrfs: expand free space tree sanity tests to catch endianness bugOmar Sandoval1-68/+96
The free space tree format conversion functions were broken on big-endian systems, but the sanity tests didn't catch it because all of the operations were aligned to multiple words. This was meant to catch any bugs in the extent buffer code's handling of high memory, but it ended up hiding the endianness bug. Expand the tests to do both sector-aligned and page-aligned operations. Tested-by: Chandan Rajendra <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-10-03Btrfs: fix extent buffer bitmap tests on big-endian systemsOmar Sandoval1-36/+51
The in-memory bitmap code manipulates words and is therefore sensitive to endianness, while the extent buffer bitmap code addresses bytes and is byte-order agnostic. Because the byte addressing of the extent buffer bitmaps is equivalent to a little-endian in-memory bitmap, the extent buffer bitmap tests fail on big-endian systems. 34b3e6c92af1 ("Btrfs: self-tests: Fix extent buffer bitmap test fail on BE system") worked around another endianness bug in the tests but missed this one because ed9e4afdb055 ("Btrfs: self-tests: Execute page straddling test only when nodesize < PAGE_SIZE") disables this part of the test on ppc64. That change lost the original meaning of the test, however. We really want to test that an equivalent series of operations using the in-memory bitmap API and the extent buffer bitmap API produces equivalent results. To fix this, don't use memcmp_extent_buffer() or write_extent_buffer(); do everything bit-by-bit. Reported-by: Anatoly Pugachev <[email protected]> Tested-by: Anatoly Pugachev <[email protected]> Tested-by: Feifei Xu <[email protected]> Tested-by: Chandan Rajendra <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-10-03Btrfs: catch invalid free space treesOmar Sandoval4-2/+24
There are two separate issues that can lead to corrupted free space trees. 1. The free space tree bitmaps had an endianness issue on big-endian systems which is fixed by an earlier patch in this series. 2. btrfs-progs before v4.7.3 modified filesystems without updating the free space tree. To catch both of these issues at once, we need to force the free space tree to be rebuilt. To do so, add a FREE_SPACE_TREE_VALID compat_ro bit. If the bit isn't set, we know that it was either produced by a broken big-endian kernel or may have been corrupted by btrfs-progs. This also provides us with a way to add rudimentary read-write support for the free space tree to btrfs-progs: it can just clear this bit and have the kernel rebuild the free space tree. Cc: [email protected] # 4.5+ Tested-by: Holger Hoffstätte <[email protected]> Tested-by: Chandan Rajendra <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-10-03Btrfs: fix mount -o clear_cache,space_cache=v2Omar Sandoval1-12/+12
We moved the code for creating the free space tree the first time that it's enabled, but didn't move the clearing code along with it. This breaks my (undocumented) intention that `mount -o clear_cache,space_cache=v2` would clear the free space tree and then recreate it. Fixes: 511711af91f2 ("btrfs: don't run delayed references while we are creating the free space tree") Cc: [email protected] # 4.5+ Tested-by: Holger Hoffstätte <[email protected]> Tested-by: Chandan Rajendra <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-10-03Btrfs: fix free space tree bitmaps on big-endian systemsOmar Sandoval3-27/+76
In convert_free_space_to_{bitmaps,extents}(), we buffer the free space bitmaps in memory and copy them directly to/from the extent buffers with {read,write}_extent_buffer(). The extent buffer bitmap helpers use byte granularity, which is equivalent to a little-endian bitmap. This means that on big-endian systems, the in-memory bitmaps will be written to disk byte-swapped. To fix this, use byte-granularity for the bitmaps in memory. Fixes: a5ed91828518 ("Btrfs: implement the free space B-tree") Cc: [email protected] # 4.5+ Tested-by: Holger Hoffstätte <[email protected]> Tested-by: Chandan Rajendra <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: remove unnecessary btrfs_mark_buffer_dirty in split_leafLiu Bo1-1/+5
When we're not able to get enough space through splitting leaf, we'd create a new sibling leaf instead, and it's possible that we return a zero-nritem sibling leaf and mark it dirty before it's in a consistent state. With CONFIG_BTRFS_FS_CHECK_INTEGRITY=y, the integrity check of check_leaf will report panic due to this zero-nritem non-root leaf. This removes the unnecessary btrfs_mark_buffer_dirty. Reported-by: Filipe Manana <[email protected]> Signed-off-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: don't BUG() during drop snapshotJosef Bacik1-11/+27
Really there's lots of things that can go wrong here, kill all the BUG_ON()'s and replace the logic ones with ASSERT()'s and return EIO instead. Signed-off-by: Josef Bacik <[email protected]> [ switched to btrfs_err, errors go to common label ] Reviewed-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: fix btrfs_no_printk stub helperArnd Bergmann1-10/+7
The addition of btrfs_no_printk() caused a build failure when CONFIG_PRINTK is disabled: fs/btrfs/send.c: In function 'send_rename': fs/btrfs/ctree.h:3367:2: error: implicit declaration of function 'btrfs_no_printk' [-Werror=implicit-function-declaration] This moves the helper outside of that #ifdef so it is always defined, and changes the existing #ifdef to refer to that helper as well for consistency. Fixes: 47c57058ff2c ("btrfs: btrfs_debug should consume fs_info when DEBUG is not defined") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: memset to avoid stale content in btree leafLiu Bo3-19/+28
This is an additional patch to "Btrfs: memset to avoid stale content in btree node block". This uses memset to initialize the unused space in a leaf to avoid potential stale content, which may be incurred by pushing items between sibling leaves. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: parent_start initialization cleanupGoldwyn Rodrigues1-15/+3
Code cleanup. parent_start is initialized multiple times when it is not necessary to do so. Signed-off-by: Goldwyn Rodrigues <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: Remove already completed TODO commentGoldwyn Rodrigues1-2/+0
Fixes: 7cf5b97650f2 ("btrfs: qgroup: Cleanup old inaccurate facilities") Signed-off-by: Goldwyn Rodrigues <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: Do not reassign count in btrfs_run_delayed_refsGoldwyn Rodrigues1-1/+0
Code cleanup. count is already (unsgined long)-1. That is the reason run_all was set. Do not reassign it (unsigned long)-1. Signed-off-by: Goldwyn Rodrigues <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: fix a possible umount deadlockAnand Jain1-6/+20
btrfs_show_devname() is using the device_list_mutex, sometimes a call to blkdev_put() leads vfs calling into this func. So call blkdev_put() outside of device_list_mutex, as of now. [ 983.284212] ====================================================== [ 983.290401] [ INFO: possible circular locking dependency detected ] [ 983.296677] 4.8.0-rc5-ceph-00023-g1b39cec2 #1 Not tainted [ 983.302081] ------------------------------------------------------- [ 983.308357] umount/21720 is trying to acquire lock: [ 983.313243] (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff9128ec51>] blkdev_put+0x31/0x150 [ 983.321264] [ 983.321264] but task is already holding lock: [ 983.327101] (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs] [ 983.337839] [ 983.337839] which lock already depends on the new lock. [ 983.337839] [ 983.346024] [ 983.346024] the existing dependency chain (in reverse order) is: [ 983.353512] -> #4 (&fs_devs->device_list_mutex){+.+...}: [ 983.359096] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.365143] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.371521] [<ffffffffc02d8116>] btrfs_show_devname+0x36/0x1f0 [btrfs] [ 983.378710] [<ffffffff9129523e>] show_vfsmnt+0x4e/0x150 [ 983.384593] [<ffffffff9126ffc7>] m_show+0x17/0x20 [ 983.389957] [<ffffffff91276405>] seq_read+0x2b5/0x3b0 [ 983.395669] [<ffffffff9124c808>] __vfs_read+0x28/0x100 [ 983.401464] [<ffffffff9124eb3b>] vfs_read+0xab/0x150 [ 983.407080] [<ffffffff9124ec32>] SyS_read+0x52/0xb0 [ 983.412609] [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 983.419617] -> #3 (namespace_sem){++++++}: [ 983.424024] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.430074] [<ffffffff918239e9>] down_write+0x49/0x80 [ 983.435785] [<ffffffff91272457>] lock_mount+0x67/0x1c0 [ 983.441582] [<ffffffff91272ab2>] do_add_mount+0x32/0xf0 [ 983.447458] [<ffffffff9127363a>] finish_automount+0x5a/0xc0 [ 983.453682] [<ffffffff91259513>] follow_managed+0x1b3/0x2a0 [ 983.459912] [<ffffffff9125b750>] lookup_fast+0x300/0x350 [ 983.465875] [<ffffffff9125d6e7>] path_openat+0x3a7/0xaa0 [ 983.471846] [<ffffffff9125ef75>] do_filp_open+0x85/0xe0 [ 983.477731] [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0 [ 983.483702] [<ffffffff9124c4de>] SyS_open+0x1e/0x20 [ 983.489240] [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 983.496254] -> #2 (&sb->s_type->i_mutex_key#3){+.+.+.}: [ 983.501798] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.507855] [<ffffffff918239e9>] down_write+0x49/0x80 [ 983.513558] [<ffffffff91366237>] start_creating+0x87/0x100 [ 983.519703] [<ffffffff91366647>] debugfs_create_dir+0x17/0x100 [ 983.526195] [<ffffffff911df153>] bdi_register+0x93/0x210 [ 983.532165] [<ffffffff911df313>] bdi_register_owner+0x43/0x70 [ 983.538570] [<ffffffff914080fb>] device_add_disk+0x1fb/0x450 [ 983.544888] [<ffffffff91580226>] loop_add+0x1e6/0x290 [ 983.550596] [<ffffffff91fec358>] loop_init+0x10b/0x14f [ 983.556394] [<ffffffff91002207>] do_one_initcall+0xa7/0x180 [ 983.562618] [<ffffffff91f932e0>] kernel_init_freeable+0x1cc/0x266 [ 983.569370] [<ffffffff918174be>] kernel_init+0xe/0x100 [ 983.575166] [<ffffffff9182620f>] ret_from_fork+0x1f/0x40 [ 983.581131] -> #1 (loop_index_mutex){+.+.+.}: [ 983.585801] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.591858] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.598256] [<ffffffff9157ed3f>] lo_open+0x1f/0x60 [ 983.603704] [<ffffffff9128eec3>] __blkdev_get+0x123/0x400 [ 983.609757] [<ffffffff9128f4ea>] blkdev_get+0x34a/0x350 [ 983.615639] [<ffffffff9128f554>] blkdev_open+0x64/0x80 [ 983.621428] [<ffffffff9124aff6>] do_dentry_open+0x1c6/0x2d0 [ 983.627651] [<ffffffff9124c029>] vfs_open+0x69/0x80 [ 983.633181] [<ffffffff9125db74>] path_openat+0x834/0xaa0 [ 983.639152] [<ffffffff9125ef75>] do_filp_open+0x85/0xe0 [ 983.645035] [<ffffffff9124c41c>] do_sys_open+0x14c/0x1f0 [ 983.650999] [<ffffffff9124c4de>] SyS_open+0x1e/0x20 [ 983.656535] [<ffffffff91825fc0>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 983.663541] -> #0 (&bdev->bd_mutex){+.+.+.}: [ 983.668107] [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0 [ 983.674510] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.680561] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.686967] [<ffffffff9128ec51>] blkdev_put+0x31/0x150 [ 983.692761] [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs] [ 983.699699] [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs] [ 983.707178] [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs] [ 983.714380] [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs] [ 983.721061] [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs] [ 983.727908] [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100 [ 983.734744] [<ffffffff91250f56>] kill_anon_super+0x16/0x30 [ 983.740888] [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs] [ 983.747909] [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80 [ 983.754745] [<ffffffff912515fd>] deactivate_super+0x5d/0x70 [ 983.760977] [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80 [ 983.766773] [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20 [ 983.772738] [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0 [ 983.778708] [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4 [ 983.785373] [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0 [ 983.792212] [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1 [ 983.799225] [ 983.799225] other info that might help us debug this: [ 983.799225] [ 983.807291] Chain exists of: &bdev->bd_mutex --> namespace_sem --> &fs_devs->device_list_mutex [ 983.816521] Possible unsafe locking scenario: [ 983.816521] [ 983.822489] CPU0 CPU1 [ 983.827043] ---- ---- [ 983.831599] lock(&fs_devs->device_list_mutex); [ 983.836289] lock(namespace_sem); [ 983.842268] lock(&fs_devs->device_list_mutex); [ 983.849478] lock(&bdev->bd_mutex); [ 983.853127] [ 983.853127] *** DEADLOCK *** [ 983.853127] [ 983.859113] 3 locks held by umount/21720: [ 983.863145] #0: (&type->s_umount_key#35){++++..}, at: [<ffffffff912515f5>] deactivate_super+0x55/0x70 [ 983.872713] #1: (uuid_mutex){+.+.+.}, at: [<ffffffffc033d8d3>] btrfs_close_devices+0x23/0xa0 [btrfs] [ 983.882206] #2: (&fs_devs->device_list_mutex){+.+...}, at: [<ffffffffc033d6f6>] __btrfs_close_devices+0x46/0x200 [btrfs] [ 983.893422] [ 983.893422] stack backtrace: [ 983.897824] CPU: 6 PID: 21720 Comm: umount Not tainted 4.8.0-rc5-ceph-00023-g1b39cec2 #1 [ 983.905958] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015 [ 983.913492] 0000000000000000 ffff8c8a53c17a38 ffffffff91429521 ffffffff9260f4f0 [ 983.921018] ffffffff92642760 ffff8c8a53c17a88 ffffffff911b2b04 0000000000000050 [ 983.928542] ffffffff9237d620 ffff8c8a5294aee0 ffff8c8a5294aeb8 ffff8c8a5294aee0 [ 983.936072] Call Trace: [ 983.938545] [<ffffffff91429521>] dump_stack+0x85/0xc4 [ 983.943715] [<ffffffff911b2b04>] print_circular_bug+0x1fb/0x20c [ 983.949748] [<ffffffff910def43>] __lock_acquire+0x1003/0x17b0 [ 983.955613] [<ffffffff910dfd0c>] lock_acquire+0x1bc/0x1f0 [ 983.961123] [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150 [ 983.966550] [<ffffffff91823125>] mutex_lock_nested+0x65/0x350 [ 983.972407] [<ffffffff9128ec51>] ? blkdev_put+0x31/0x150 [ 983.977832] [<ffffffff9128ec51>] blkdev_put+0x31/0x150 [ 983.983101] [<ffffffffc033481f>] btrfs_close_bdev+0x4f/0x60 [btrfs] [ 983.989500] [<ffffffffc033d77b>] __btrfs_close_devices+0xcb/0x200 [btrfs] [ 983.996415] [<ffffffffc033d8db>] btrfs_close_devices+0x2b/0xa0 [btrfs] [ 984.003068] [<ffffffffc03081c5>] close_ctree+0x265/0x340 [btrfs] [ 984.009189] [<ffffffff9126cc5e>] ? evict_inodes+0x15e/0x170 [ 984.014881] [<ffffffffc02d7959>] btrfs_put_super+0x19/0x20 [btrfs] [ 984.021176] [<ffffffff91250e2f>] generic_shutdown_super+0x6f/0x100 [ 984.027476] [<ffffffff91250f56>] kill_anon_super+0x16/0x30 [ 984.033082] [<ffffffffc02da97e>] btrfs_kill_super+0x1e/0x130 [btrfs] [ 984.039548] [<ffffffff91250fe9>] deactivate_locked_super+0x49/0x80 [ 984.045839] [<ffffffff912515fd>] deactivate_super+0x5d/0x70 [ 984.051525] [<ffffffff91270a1c>] cleanup_mnt+0x5c/0x80 [ 984.056774] [<ffffffff91270a92>] __cleanup_mnt+0x12/0x20 [ 984.062201] [<ffffffff910aa2fe>] task_work_run+0x7e/0xc0 [ 984.067625] [<ffffffff91081b5a>] exit_to_usermode_loop+0x7e/0xb4 [ 984.073747] [<ffffffff910039eb>] syscall_return_slowpath+0xbb/0xd0 [ 984.080038] [<ffffffff9182605c>] entry_SYSCALL_64_fastpath+0xbf/0xc1 Reported-by: Ilya Dryomov <[email protected]> Signed-off-by: Anand Jain <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: fix memory leak in do_walk_downLiu Bo1-0/+1
The extent buffer 'next' needs to be free'd conditionally. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: btrfs_debug should consume fs_info when DEBUG is not definedJeff Mahoney1-4/+10
We can hit unused variable warnings when btrfs_debug and friends are just aliases for no_printk. This is due to the fs_info not getting consumed by the function call, which can happen if convenenience variables are used. This patch adds a new btrfs_no_printk static inline that consumes the convenience variable and does nothing else. It silences the unused variable warning and has no impact on the generated code: $ size fs/btrfs/extent_io.o* text data bss dec hex filename 44072 152 32 44256 ace0 fs/btrfs/extent_io.o.btrfs_no_printk 44072 152 32 44256 ace0 fs/btrfs/extent_io.o.no_printk Fixes: 27a0dd61a5 (Btrfs: make btrfs_debug match pr_debug handling related to DEBUG) Signed-off-by: Jeff Mahoney <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: convert send's verbose_printk to btrfs_debugJeff Mahoney1-27/+38
This was basically an open-coded, less flexible dynamic printk. We can just use btrfs_debug instead. Signed-off-by: Jeff Mahoney <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: convert pr_* to btrfs_* where possibleJeff Mahoney20-177/+231
For many printks, we want to know which file system issued the message. This patch converts most pr_* calls to use the btrfs_* versions instead. In some cases, this means adding plumbing to allow call sites access to an fs_info pointer. fs/btrfs/check-integrity.c is left alone for another day. Signed-off-by: Jeff Mahoney <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: convert printk(KERN_* to use pr_* callsJeff Mahoney16-275/+205
This patch converts printk(KERN_* style messages to use the pr_* versions. One side effect is that anything that was KERN_DEBUG is now automatically a dynamic debug message. Signed-off-by: Jeff Mahoney <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: unsplit printed stringsJeff Mahoney27-391/+324
CodingStyle chapter 2: "[...] never break user-visible strings such as printk messages, because that breaks the ability to grep for them." This patch unsplits user-visible strings. Signed-off-by: Jeff Mahoney <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: clean the old superblocks before freeing the deviceJeff Mahoney1-27/+11
btrfs_rm_device frees the block device but then re-opens it using the saved device name. A race exists between the close and the re-open that allows the block size to be changed. The result is getting stuck forever in the reclaim loop in __getblk_slow. This patch moves the superblock cleanup before closing the block device, which is also consistent with other callers. We also don't need a private copy of dev_name as the whole routine operates under the uuid_mutex. Signed-off-by: Jeff Mahoney <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: kill BUG_ON in run_delayed_tree_refLiu Bo1-1/+7
In a corrupted btrfs image, we can come across this BUG_ON and get an unreponsive system, but if we return errors instead, its caller can handle everything gracefully by aborting the current transaction. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: don't leak reloc root nodes on errorJosef Bacik1-0/+4
We don't track the reloc roots in any sort of normal way, so the only way the root/commit_root nodes get free'd is if the relocation finishes successfully and the reloc root is deleted. Fix this by free'ing them in free_reloc_roots. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: squash lines for simple wrapper functionsMasahiro Yamada5-28/+8
Remove unneeded variables and assignments. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: improve check_node to avoid reading corrupted nodesLiu Bo1-4/+28
We need to check items in a node to make sure that we're reading a valid one, otherwise we could get various crashes while processing delayed_refs. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: add error handling for extent buffer in print treeLiu Bo1-0/+7
Somehow we missed btrfs_print_tree when last time we updated error handling for read_extent_block(). This keeps us from getting a NULL pointer panic when btrfs_print_tree's read_extent_block() fails. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: remove BUG_ON in start_transactionLiu Bo1-4/+1
Since we could get errors from the concurrent aborted transaction, the check of this BUG_ON in start_transaction is not true any more. Say, while flushing free space cache inode's dirty pages, btrfs_finish_ordered_io -> btrfs_join_transaction_nolock (the transaction has been aborted.) -> BUG_ON(type == TRANS_JOIN_NOLOCK); Signed-off-by: Liu Bo <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: memset to avoid stale content in btree node blockLiu Bo1-0/+11
During updating btree, we could push items between sibling nodes/leaves, for leaves data sections starts reversely from the end of the block while for nodes we only have key pairs which are stored one by one from the start of the block. So we could do try to push key pairs from one node to the next node right in the tree, and after that, we update the node's nritems to reflect the correct end while leaving the stale content in the node. One may intentionally corrupt the fs image and access the stale content by bumping the nritems and causes various crashes. This takes the in-memory @nritems as the correct one and gets to memset the unused part of a btree node. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: return gracefully from balance if fs tree is corruptedLiu Bo1-6/+17
When relocating tree blocks, we firstly get block information from back references in the extent tree, we then search fs tree to try to find all parents of a block. However, if fs tree is corrupted, eg. if there're some missing items, we could come across these WARN_ONs and BUG_ONs. This makes us print some error messages and return gracefully from balance. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: kill BUG_ON()'s in btrfs_mark_extent_writtenJosef Bacik1-8/+33
No reason to bug on in here, fs corruption could easily cause these things to happen. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: kill the start argument to read_extent_buffer_pagesJosef Bacik3-28/+15
Nobody uses this, it makes no sense to do partial reads of extent buffers. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: add a flags field to btrfs_fs_infoJosef Bacik16-109/+99
We have a lot of random ints in btrfs_fs_info that can be put into flags. This is mostly equivalent with the exception of how we deal with quota going on or off, now instead we set a flag when we are turning it on or off and deal with that appropriately, rather than just having a pending state that the current quota_enabled gets set to. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: extend btrfs_set_extent_delalloc and its friends to support in-band ↵Qu Wenruo7-25/+37
dedupe and subpage size patchset Extend btrfs_set_extent_delalloc() and extent_clear_unlock_delalloc() parameters for both in-band dedupe and subpage sector size patchset. This should reduce conflict of both patchset and the effort to rebase them. Cc: Chandan Rajendra <[email protected]> Cc: David Sterba <[email protected]> Signed-off-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: add dynamic debug supportJeff Mahoney1-1/+30
We can re-use the dynamic debugging descriptor to make use of the dynamic debugging mechanism but still use our own printk interface. Defining the DEBUG macro works as it did before. When it's defined, all of the messages default to print. We can also enable all debug messages at boot or module-load time using the 'dyndbg' and 'btrfs.dyndbg' options. Signed-off-by: Jeff Mahoney <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: Fix warning "variable ‘gen’ set but not used"Luis Henriques1-2/+0
Variable 'gen' in reada_for_search() is not used since commit 58dc4ce43251 ("btrfs: remove unused parameter from readahead_tree_block"). This patch simply removes this variable. Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: Fix warning "variable ‘blocksize’ set but not used"Luis Henriques1-2/+0
Variable 'blocksize' in reada_walk_down() is not used since commit d3e46fea1b1e ("btrfs: sink blocksize parameter to readahead_tree_block"). This patch simply removes this variable. Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: let btrfs_delete_unused_bgs() to clean relocated bgsNaohiro Aota2-15/+11
Currently, btrfs_relocate_chunk() is removing relocated BG by itself. But the work can be done by btrfs_delete_unused_bgs() (and it's better since it trim the BG). Let's dedupe the code. While btrfs_delete_unused_bgs() is already hitting the relocated BG, it skip the BG since the BG has "ro" flag set (to keep balancing BG intact). On the other hand, btrfs cannot drop "ro" flag here to prevent additional writes. So this patch make use of "removed" flag. btrfs_delete_unused_bgs() now detect the flag to distinguish whether a read-only BG is relocating or not. Signed-off-by: Naohiro Aota <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: bail out if block group has different mixed flagLiu Bo1-0/+14
Currently we allow inconsistence about mixed flag (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA). We'd get ENOSPC if block group has mixed flag and btrfs doesn't. If that happens, we have one space_info with mixed flag and another space_info only with BTRFS_BLOCK_GROUP_METADATA, and global_block_rsv.space_info points to the latter one, but all bytes from block_group contributes to the mixed space_info, thus all the allocation will fail with ENOSPC. This adds a check for the above case. Reported-by: Vegard Nossum <[email protected]> Signed-off-by: Liu Bo <[email protected]> [ updated message ] Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: fix memory leak in reading btree blocksLiu Bo1-0/+9
So we can read a btree block via readahead or intentional read, and we can end up with a memory leak when something happens as follows, 1) readahead starts to read block A but does not wait for read completion, 2) btree_readpage_end_io_hook finds that block A is corrupted, and it needs to clear all block A's pages' uptodate bit. 3) meanwhile an intentional read kicks in and checks block A's pages' uptodate to decide which page needs to be read. 4) when some pages have the uptodate bit during 3)'s check so 3) doesn't count them for eb->io_pages, but they are later cleared by 2) so we has to readpage on the page, we get the wrong eb->io_pages which results in a memory leak of this block. This fixes the problem by firstly getting all pages's locking and then checking pages' uptodate bit. t1(readahead) t2(readahead endio) t3(the following read) read_extent_buffer_pages end_bio_extent_readpage for pg in eb: for page 0,1,2 in eb: if pg is uptodate: btree_readpage_end_io_hook(pg) num_reads++ if uptodate: eb->io_pages = num_reads SetPageUptodate(pg) _______________ for pg in eb: for page 3 in eb: read_extent_buffer_pages if pg is NOT uptodate: btree_readpage_end_io_hook(pg) for pg in eb: __extent_read_full_page(pg) sanity check reports something wrong if pg is uptodate: clear_extent_buffer_uptodate(eb) num_reads++ for pg in eb: eb->io_pages = num_reads ClearPageUptodate(page) _______________ for pg in eb: if pg is NOT uptodate: __extent_read_full_page(pg) So t3's eb->io_pages is not consistent with the number of pages it's reading, and during endio(), atomic_dec_and_test(&eb->io_pages) will get a negative number so that we're not able to free the eb. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: remove BUG() in raid56Liu Bo1-1/+4
This BUG() has been triggered by a fuzz testing image, which contains an invalid chunk type, ie. a single stripe chunk has the raid6 type. Btrfs can handle this gracefully by returning -EIO, so besides using btrfs_warn to give us more debugging information rather than a single BUG(), we can return error properly. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: fix check_shared for fiemap ioctlLu Fengqi2-10/+369
Only in the case of different root_id or different object_id, check_shared identified extent as the shared. However, If a extent was referred by different offset of same file, it should also be identified as shared. In addition, check_shared's loop scale is at least n^3, so if a extent has too many references, even causes soft hang up. First, add all delayed_ref to the ref_tree and calculate the unqiue_refs, if the unique_refs is greater than one, return BACKREF_FOUND_SHARED. Then individually add the on-disk reference(inline/keyed) to the ref_tree and calculate the unique_refs of the ref_tree to check if the unique_refs is greater than one.Because once there are two references to return SHARED, so the time complexity is close to the constant. Reported-by: Tsutomu Itoh <[email protected]> Signed-off-by: Lu Fengqi <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: create example debugfs file only in debugging buildDavid Sterba1-0/+9
Reviewed-by: Eric Sandeen <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26btrfs: fix perms on demonstration debugfs interfaceEric Sandeen1-1/+1
btrfs provides a helpful demonstration of how to export a global variable via debugfs; however, it is unique among other debugfs files in that it is world-writable, which causes some concern to people who are not familiar with its purpose. Fix it so that it is only user-writable. Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2016-09-26Btrfs: fix memory leak of block group cacheLiu Bo3-0/+75
While processing delayed refs, we may update block group's statistics and attach it to cur_trans->dirty_bgs, and later writing dirty block groups will process the list, which happens during btrfs_commit_transaction(). For whatever reason, the transaction is aborted and dirty_bgs is not processed in cleanup_transaction(), we end up with memory leak of these dirty block group cache. Since btrfs_start_dirty_block_groups() doesn't make it go to the commit critical section, this also adds the cleanup work inside it. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>