path: root/fs/btrfs
Age | Commit message | Author | Files | Lines
2012-10-01 | Btrfs: update last trans if we don't update the inode | Josef Bacik | 1 | -0/+2
There is a completely impossible situation to hit where you can preallocate a file, fsync it, write into the preallocated region, have the transaction commit twice and then fsync and then immediately lose power and lose all of the contents of the write. This patch fixes this just so I feel better about the situation and because it is lightweight, we just update the last_trans when we finish an ordered IO and we don't update the inode itself. This way we are completely safe and I feel better. Thanks, Signed-off-by: Josef Bacik <[email protected]>
2012-10-01 | Btrfs: fix gcc warnings for 32bit compiles | Jan Schmidt | 4 | -31/+32
Signed-off-by: Jan Schmidt <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2012-10-01 | Btrfs: fix btrfs send for inline items and compression | Chris Mason | 3 | -15/+37
The btrfs send code was assuming the offset of the file item into the extent translated to bytes on disk. If we're compressed, this isn't true, and so it was off into extents owned by other files. It was also improperly handling inline extents. This solves a crash where we may have gone past the end of the file extent item by not testing early enough for an inline extent. It also solves problems where we have a hole between the end of the inline item and the start of the full extent. Signed-off-by: Chris Mason <[email protected]>
2012-10-01 | Btrfs: don't treat top/root directory inode as deleted/reused | Alexander Block | 1 | -1/+20
We can't do the deleted/reused logic for top/root inodes as it would create a stream that tries to delete and recreate the root dir. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: ignore non-FS inodes for send/receive | Alexander Block | 1 | -0/+5
We have to ignore inode/space cache objects in send/receive. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: pass root instead of parent_root to iterate_inode_ref | Alexander Block | 1 | -2/+2
We need to pass the root that we determined earlier to iterate_inode_ref. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: use <= instead of < in is_extent_unchanged | Alexander Block | 1 | -1/+1
Used the wrong compare operator here. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: fix check for changed extent in is_extent_unchanged | Alexander Block | 1 | -2/+2
The previous check was working fine, but this check should be easier to read. Also, we could theoretically have some exotic bugs with the previous checks. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: free nce and nce_head on error in name_cache_insert | Alexander Block | 1 | -1/+5
Both were leaked in case of error. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: remove unused tmp_path from iterate_dir_item | Alexander Block | 1 | -8/+0
A leftover from older code and unused now. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: code cleanups for send/receive | Alexander Block | 1 | -48/+35
Doing some code cleanups as suggested by Arne. Changes do not change any logic. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: add/fix comments/documentation for send/receive | Alexander Block | 1 | -6/+134
As the subject already said, add/fix comments. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: update send_progress at correct places | Alexander Block | 1 | -6/+20
Updating send_progress in process_recorded_refs was not correct. It got updated too early in the cur_inode_new_gen case. Reported-by: Alex Lyakas <[email protected]> Reported-by: Arne Jansen <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: make aux field of ulist 64 bit | Alexander Block | 4 | -23/+21
Btrfs send/receive uses the aux field to store inode numbers. On 32 bit machines this may become a problem. Also fix all users of ulist_add and ulist_add_merged. Reported-by: Arne Jansen <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: fix use of radix_tree for name_cache in send/receive | Alexander Block | 1 | -39/+37
We can't easily use the index of the radix tree for inums as the radix tree uses 32bit indexes on 32bit kernels. For 32bit kernels, we now use the lower 32bit of the inum as index and an additional list to store multiple entries per radix tree entry. Reported-by: Arne Jansen <[email protected]> Signed-off-by: Alexander Block <[email protected]>
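The indexing scheme this commit describes can be sketched roughly as follows. This is a small Python model rather than the kernel code: `NameCache`, `insert`, and `lookup` are hypothetical names, a dict stands in for the radix tree, and the entry layout is an assumption made for illustration.

```python
INDEX_MASK = 0xFFFFFFFF  # a 32-bit kernel can only index the tree with 32 bits


class NameCache:
    """Toy model: a dict stands in for the radix tree (assumption)."""

    def __init__(self):
        self.tree = {}  # 32-bit index -> list of (inum, name) entries

    def insert(self, inum, name):
        # Only the lower 32 bits of the inode number select the slot.
        self.tree.setdefault(inum & INDEX_MASK, []).append((inum, name))

    def lookup(self, inum):
        # Several inums may share one slot, so compare the full 64-bit inum.
        for entry_inum, name in self.tree.get(inum & INDEX_MASK, []):
            if entry_inum == inum:
                return name
        return None
```

Two inode numbers that differ only above bit 31 land in the same slot, and the per-slot list is what keeps them apart.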
2012-10-01 | Btrfs: fix memory leak for name_cache in send/receive | Alexander Block | 1 | -0/+1
When everything is done, name_cache_free is called which however forgot to call kfree on the cache entries. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: don't break in the final loop of find_extent_clone | Alexander Block | 1 | -1/+0
If we break, we may miss the clone from send_root which we prefer over all other clones. Commit is a result of Arne's review. Reported-by: Arne Jansen <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: use normal return path for root == send_root case | Alexander Block | 1 | -6/+0
Don't have a separate return path for the mentioned case. Now we do the same "take lowest inode/offset" logic for all found clones. Commit is a result of Arne's review. Signed-off-by: Alexander Block <[email protected]>
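The two find_extent_clone changes (one "take lowest inode/offset" path for every root, and no early break so that send_root can still win) might be modelled like this. It is only a sketch under assumptions: the tuple shape of `found` and the names `lowest_per_root` and `choose_clone` are hypothetical and do not appear in the kernel source.

```python
def lowest_per_root(found):
    """Keep the lowest (ino, offset) seen for each root.

    `found` is a list of (root, ino, offset) tuples (hypothetical shape).
    """
    best = {}
    for root, ino, offset in found:
        if root not in best or (ino, offset) < best[root]:
            best[root] = (ino, offset)
    return best


def choose_clone(found, send_root):
    """Pick a clone source, preferring send_root when it has a match."""
    best = lowest_per_root(found)
    chosen = None
    for root in best:
        if chosen is None:
            chosen = root
        if root == send_root:
            chosen = root
        # No early break: stopping at the first hit could skip send_root.
    if chosen is None:
        return None
    return chosen, best[chosen]
```

In this model every root, including send_root, goes through the same lowest-key selection, and the final loop visits all candidates before deciding.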
2012-10-01 | Btrfs: use kmalloc instead of stack for backref_ctx | Alexander Block | 1 | -11/+18
Make sure to never get in trouble due to the backref_ctx which was on the stack before. Commit is a result of Arne's review. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: rename backref_ctx::found_in_send_root to found_itself | Alexander Block | 1 | -4/+4
The new name should be easier to understand/read. Commit is a result of Arne's review. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: remove unused use_list from send/receive code | Alexander Block | 1 | -2/+0
use_list is a leftover and unused. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: add correct parent to check_dirs when dir got moved | Alexander Block | 1 | -0/+11
We only added the parent for the new position of a moved dir. We also need to add the old parent of the moved dir. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: remove unused code with #if 0 | Alexander Block | 1 | -0/+2
fs_path_remove is not used at the moment due to a previous patch. Remove it for now (with #if 0) to avoid compile warnings. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: add missing check for dir != tmp_dir to is_first_ref | Alexander Block | 1 | -1/+1
We missed that check, which resulted in all refs with the same name being reported as first_ref. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: fix cur_ino < parent_ino case for send/receive | Alexander Block | 1 | -244/+146
When the current inode's inum is smaller than the inum of the parent directory, strange things were happening due to wrong path resolution and other bugs. Fix this with a new approach for the problem. Reported-by: Alex Lyakas <[email protected]> Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Btrfs: add rdev to get_inode_info in send/receive | Alexander Block | 1 | -13/+17
We need rdev in the next commit. Signed-off-by: Alexander Block <[email protected]>
2012-10-01 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial | Linus Torvalds | 3 | -4/+4
Pull the trivial tree from Jiri Kosina: "Tiny usual fixes all over the place"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  doc: fix old config name of kprobetrace
  fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc
  btrfs: fix the commment for the action flags in delayed-ref.h
  btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID
  vfs: fix kerneldoc for generic_fh_to_parent()
  treewide: fix comment/printk/variable typos
  ipr: fix small coding style issues
  doc: fix broken utf8 encoding
  nfs: comment fix
  platform/x86: fix asus_laptop.wled_type module parameter
  mfd: printk/comment fixes
  doc: getdelays.c: remember to close() socket on error in create_nl_socket()
  doc: aliasing-test: close fd on write error
  mmc: fix comment typos
  dma: fix comments
  spi: fix comment/printk typos in spi
  Coccinelle: fix typo in memdup_user.cocci
  tmiofb: missing NULL pointer checks
  tools: perf: Fix typo in tools/perf
  tools/testing: fix comment / output typos
  ...
2012-09-26 | switch simple cases of fget_light to fdget | Al Viro | 1 | -14/+12
Signed-off-by: Al Viro <[email protected]>
2012-09-26 | switch btrfs_ioctl_clone() to fget_light() | Al Viro | 1 | -3/+3
Signed-off-by: Al Viro <[email protected]>
2012-09-26 | switch btrfs_ioctl_snap_create_transid() to fget_light() | Al Viro | 1 | -7/+7
Signed-off-by: Al Viro <[email protected]>
2012-09-21 | userns: Convert btrfs to use kuid/kgid where appropriate | Eric W. Biederman | 3 | -11/+11
Cc: Chris Mason <[email protected]> Acked-by: Serge Hallyn <[email protected]> Signed-off-by: Eric W. Biederman <[email protected]>
2012-09-21 | btrfs: fix the commment for the action flags in delayed-ref.h | Wang Sheng-Hui | 1 | -1/+1
The action field has been merged into struct btrfs_delayed_ref_node, and no struct btrfs_delayed_ref is available now. Signed-off-by: Wang Sheng-Hui <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2012-09-18 | userns: Pass a userns parameter into posix_acl_to_xattr and posix_acl_from_xattr | Eric W. Biederman | 1 | -4/+4
- Pass the user namespace the uid and gid values in the xattr are stored in into posix_acl_from_xattr.
- Pass the user namespace kuid and kgid values should be converted into when storing uid and gid values in an xattr in posix_acl_to_xattr.
- Modify all callers of posix_acl_from_xattr and posix_acl_to_xattr to pass in &init_user_ns.
In the short term this change is not strictly needed but it makes the code clearer. In the longer term this change is necessary to be able to mount filesystems outside of the initial user namespace that natively store posix acls in the linux xattr format.
Cc: Theodore Tso <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andreas Dilger <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-09-16 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs | Linus Torvalds | 1 | -6/+2
Pull a btrfs revert from Chris Mason: "My for-linus branch has one revert in the new quota code. We're building up more fixes at etc for the next merge window, but I'm keeping them out unless they are bigger regressions or have a huge impact."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()"
2012-09-14 | Revert "Btrfs: fix some error codes in btrfs_qgroup_inherit()" | Chris Mason | 1 | -6/+2
This reverts commit 5986802c2fcc754040bb7ed95f30bb16c4a843b7. Both paths are not error paths but regular cases where non-qgroup subvols are involved. Signed-off-by: Chris Mason <[email protected]>
2012-09-06 | btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID | Wang Sheng-Hui | 1 | -1/+1
It should be storing, not sotring. Signed-off-by: Wang Sheng-Hui <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2012-09-01 | btrfs: fix comment typo in btrfs_finish_ordered_io | Liu Bo | 1 | -2/+2
Fix typo errors in comments of btrfs_finish_ordered_io. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2012-08-29 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs | Linus Torvalds | 21 | -376/+418
Pull btrfs fixes from Chris Mason:
"I've split out the big send/receive update from my last pull request and now have just the fixes in my for-linus branch. The send/recv branch will wander over to linux-next shortly though.
The largest patches in this pull are Josef's patches to fix DIO locking problems and his patch to fix a crash during balance. They are both well tested.
The rest are smaller fixes that we've had queued. The last rc came out while I was hacking new and exciting ways to recover from a misplaced rm -rf on my dev box, so these missed rc3."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits)
  Btrfs: fix that repair code is spuriously executed for transid failures
  Btrfs: fix ordered extent leak when failing to start a transaction
  Btrfs: fix a dio write regression
  Btrfs: fix deadlock with freeze and sync V2
  Btrfs: revert checksum error statistic which can cause a BUG()
  Btrfs: remove superblock writing after fatal error
  Btrfs: allow delayed refs to be merged
  Btrfs: fix enospc problems when deleting a subvol
  Btrfs: fix wrong mtime and ctime when creating snapshots
  Btrfs: fix race in run_clustered_refs
  Btrfs: don't run __tree_mod_log_free_eb on leaves
  Btrfs: increase the size of the free space cache
  Btrfs: barrier before waitqueue_active
  Btrfs: fix deadlock in wait_for_more_refs
  btrfs: fix second lock in btrfs_delete_delayed_items()
  Btrfs: don't allocate a seperate csums array for direct reads
  Btrfs: do not strdup non existent strings
  Btrfs: do not use missing devices when showing devname
  Btrfs: fix that error value is changed by mistake
  Btrfs: lock extents as we map them in DIO
  ...
2012-08-28 | Btrfs: fix that repair code is spuriously executed for transid failures | Stefan Behrens | 1 | -2/+6
If verify_parent_transid() fails for all mirrors, the current code calls repair_io_failure() anyway which means:
- that the disk block is rewritten without repairing anything and
- that a kernel log message is printed which misleadingly claims that a read error was corrected.
This is an example:
  parent transid verify failed on 615015833600 wanted 110423 found 110424
  parent transid verify failed on 615015833600 wanted 110423 found 110424
  btrfs read error corrected: ino 1 off 615015833600 (dev /dev/...)
It is wrong to ignore the results from verify_parent_transid() and to call repair_eb_io_failure() when the verification of the transids failed. This commit fixes the issue.
Signed-off-by: Stefan Behrens <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
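The rule this commit enforces (repair only when some mirror actually passed verification) can be illustrated with a small model. It is a sketch under assumptions, not the kernel logic: `read_tree_block` here is a hypothetical Python function, and each mirror is reduced to a `(data, transid)` tuple.

```python
def read_tree_block(mirrors, wanted_transid):
    """Return (data, mirrors_to_repair); data is None if every mirror fails.

    `mirrors` is a list of (data, transid) tuples, one per copy (toy model).
    """
    failed = []
    for num, (data, transid) in enumerate(mirrors):
        if transid == wanted_transid:
            # A verified copy exists, so rewriting the bad mirrors
            # from it is a real repair.
            return data, failed
        failed.append(num)
    # No copy passed verification: there is nothing trustworthy to
    # write back, so report failure and repair nothing.
    return None, []
```

With the numbers from the log excerpt, two mirrors both reporting transid 110424 when 110423 is wanted yield `(None, [])`: no rewrite and no "read error corrected" message.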
2012-08-28 | Btrfs: fix ordered extent leak when failing to start a transaction | Liu Bo | 1 | -2/+5
We cannot just return an error before freeing the ordered extent and releasing reserved space when we fail to start a transaction. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2012-08-28 | Btrfs: fix a dio write regression | Liu Bo | 1 | -4/+20
This bug was introduced by commit 3b8bde746f6f9bd36a9f05f5f3b6e334318176a9 (Btrfs: lock extents as we map them in DIO). In a dio write, we should unlock the section on which we did no IO in case we fall back to buffered write. But we need not only to unlock the section but also to clean up the reserved space for it. This bug was found while running xfstests 133; with this patch, 133 no longer complains. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2012-08-28 | Btrfs: fix deadlock with freeze and sync V2 | Josef Bacik | 1 | -4/+9
We can deadlock with freeze right now because we unconditionally start a transaction in our ->sync_fs() call. To fix this just check and see if we have a running transaction to commit. This saves us from the deadlock because at this point we'll have the umount sem for the sb so we're safe from freezes coming in after we've done our check. With this patch the freeze xfstests no longer deadlocks. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
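The difference between the old and new ->sync_fs() behaviour can be shown with a toy model. Everything here is an assumption made for illustration: `ToyFs` and both `sync_fs*` functions are hypothetical, and a `RuntimeError` stands in for a task that would block forever on a frozen filesystem.

```python
class ToyFs:
    def __init__(self, frozen=False, running_transaction=False):
        self.frozen = frozen
        self.running_transaction = running_transaction
        self.commits = 0

    def start_transaction(self):
        if self.frozen:
            # In the kernel this would block until thaw; the model raises.
            raise RuntimeError("would block: filesystem is frozen")
        self.running_transaction = True

    def commit_transaction(self):
        self.running_transaction = False
        self.commits += 1


def sync_fs_unconditional(fs):
    # Old behaviour in this model: always start a transaction first.
    fs.start_transaction()
    fs.commit_transaction()
    return 0


def sync_fs(fs):
    # New behaviour: only commit a transaction that is already running.
    if not fs.running_transaction:
        return 0
    fs.commit_transaction()
    return 0
```

On a frozen `ToyFs` with nothing running, `sync_fs` returns immediately, while `sync_fs_unconditional` hits the blocking path.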
2012-08-28 | Btrfs: revert checksum error statistic which can cause a BUG() | Stefan Behrens | 3 | -39/+2
Commit 442a4f6308e694e0fa6025708bd5e4e424bbf51c added btrfs device statistic counters for detected IO and checksum errors to Linux 3.5. The statistic part that counts checksum errors in end_bio_extent_readpage() can cause a BUG() in a subfunction: "kernel BUG at fs/btrfs/volumes.c:3762!" That part is reverted with the current patch. However, the counting of checksum errors in the scrub context remains active, and the counting of detected IO errors (read, write or flush errors) in all contexts remains active. Cc: stable <[email protected]> # 3.5 Signed-off-by: Stefan Behrens <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2012-08-28 | Btrfs: remove superblock writing after fatal error | Stefan Behrens | 2 | -33/+5
With commit acce952b0, btrfs was changed to flag the filesystem with BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal error such as write I/O errors on all mirrors. In such situations, on unmount, the superblock is written in btrfs_error_commit_super(). This is done with the intention to be able to evaluate the error flag on the next mount. A warning is printed in this case during the next mount and the log tree is ignored.
The issue is that it is possible that the superblock points to a root that was not written (due to write I/O errors). The result is that the filesystem cannot be mounted. btrfsck also does not start and all the other btrfs-progs tools fail to start as well. However, mount -o recovery is working well and does the right things to recover the filesystem (i.e., don't use the log root, clear the free space cache and use the next mountable root that is stored in the root backup array).
This patch removes the writing of the superblock when BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error flag in the mount function.
These lines can be used to reproduce the issue (using /dev/sdm):
  SCRATCH_DEV=/dev/sdm
  SCRATCH_MNT=/mnt
  echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
  ls -alLF /dev/mapper/foo
  mkfs.btrfs /dev/mapper/foo
  mount /dev/mapper/foo $SCRATCH_MNT
  echo bar > $SCRATCH_MNT/foo
  sync
  echo 0 25165824 error | dmsetup reload foo
  dmsetup resume foo
  ls -alF $SCRATCH_MNT
  touch $SCRATCH_MNT/1
  ls -alF $SCRATCH_MNT
  sleep 35
  echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
  dmsetup resume foo
  sleep 1
  umount $SCRATCH_MNT
  btrfsck /dev/mapper/foo
  dmsetup remove foo
Signed-off-by: Stefan Behrens <[email protected]>
Signed-off-by: Jan Schmidt <[email protected]>
2012-08-28 | Btrfs: allow delayed refs to be merged | Josef Bacik | 3 | -27/+142
Daniel Blueman reported a bug with fio+balance on a ramdisk setup. Basically what happens is the balance relocates a tree block which will drop the implicit refs for all of its children and adds a full backref. Once the block is relocated we have to add the implicit refs back, so when we cow the block again we add the implicit refs for its children back. The problem comes when the original drop ref doesn't get run before we add the implicit refs back. The delayed ref stuff will specifically prefer ADD operations over DROP to keep us from freeing up an extent that will have references to it, so we try to add the implicit ref before it is actually removed and we panic. This worked fine before because the add would have just canceled the drop out and we would have been fine. But the backref walking work needs to be able to freeze the delayed ref stuff in time so we have this ever increasing sequence number that gets attached to all new delayed ref updates which makes us not merge refs and we run into this issue.
So to fix this we need to merge delayed refs. So every time we run a clustered ref we need to try and merge all of its delayed refs. The backref walking stuff locks the delayed ref head before processing, so if we have it locked we are safe to merge any refs inside of the sequence number. If there is no sequence number we can merge all refs. Doing this not only fixes our bug but keeps the delayed ref code from adding and removing useless refs and batches together multiple refs into one search instead of one search per delayed ref, which will really help our commit times. I ran this with Daniel's test and 276 and I haven't seen any problems. Thanks,
Reported-by: Daniel J Blueman <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
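The merging idea can be sketched as a net count per reference, with a sequence-number barrier deciding which refs may be touched. This is a simplified Python model, not the kernel implementation: `merge_delayed_refs` is a hypothetical function, the `(key, action, seq)` tuples are an invented shape, and treating refs with `seq >= barrier_seq` as the unmergeable side is an assumption about what "inside of the sequence number" means.

```python
def merge_delayed_refs(refs, barrier_seq=None):
    """Merge ADD (+1) and DROP (-1) refs that share a key.

    refs: list of (key, action, seq) tuples (hypothetical shape).
    barrier_seq: lowest sequence number a backref walker still needs,
        or None when no walker is active (assumption).
    Returns (merged, untouched): net counts per key, and the refs that
    had to be left alone because of the barrier.
    """
    net = {}
    untouched = []
    for key, action, seq in refs:
        if barrier_seq is not None and seq >= barrier_seq:
            untouched.append((key, action, seq))
            continue
        net[key] = net.get(key, 0) + action
    # An ADD and a DROP for the same key cancel out entirely.
    merged = {key: count for key, count in net.items() if count != 0}
    return merged, untouched
```

Without a barrier, a DROP followed by an ADD for the same key disappears from the work list, which is the cancellation the message says used to happen before sequence numbers were attached.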
2012-08-28 | Btrfs: fix enospc problems when deleting a subvol | Josef Bacik | 1 | -1/+1
Subvol delete is a special kind of awful where we use the global reserve to cover the ENOSPC requirements. The problem is once we're done removing everything we do a btrfs_update_inode(), which by default will try to do the delayed update stuff which will use its own reserve. There will be no space in this reserve and we'll return ENOSPC. So instead use btrfs_update_inode_fallback() which will just fall back to updating the inode item in the case of ENOSPC. This is fine because the global reserve covers the space requirements for this. With this patch I can now delete a subvol on a problem image Dave Sterba sent me. Thanks, Reported-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
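The fallback pattern described here (try the delayed update, and on ENOSPC update the inode item directly) looks roughly like this in a Python model. The function below is a hypothetical stand-in that takes the two update paths as callables; it is not the signature of the kernel's btrfs_update_inode_fallback().

```python
import errno


def update_inode_fallback(delayed_update, direct_update):
    """Try the delayed path first; fall back only on ENOSPC.

    Both arguments are callables returning 0 or a negative errno
    (hypothetical stand-ins for the two kernel update paths).
    """
    ret = delayed_update()
    if ret == -errno.ENOSPC:
        # The delayed path's own reserve was empty; update the inode
        # item directly, relying on space reserved elsewhere.
        return direct_update()
    return ret
```

Other errors from the delayed path are returned unchanged, so the fallback only hides the one failure that the global reserve is known to cover.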
2012-08-28 | Btrfs: fix wrong mtime and ctime when creating snapshots | Miao Xie | 1 | -0/+1
When we created a new snapshot, the mtime and ctime of its parent directory were not updated. Fix it. Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2012-08-28 | Btrfs: fix race in run_clustered_refs | Arne Jansen | 1 | -0/+17
With commit
  commit d1270cd91f308c9d22b2804720c36ccd32dbc35e
  Author: Arne Jansen <[email protected]>
  Date: Tue Sep 13 15:16:43 2011 +0200
  Btrfs: put back delayed refs that are too new
I added a window where the delayed_ref's head->ref_mod code can diverge from the sum of the remaining refs, because we release the head->mutex in the middle. This leads to btrfs_lookup_extent_info returning wrong numbers. This patch fixes this by adjusting the head's ref_mod with each delayed ref we run.
Signed-off-by: Arne Jansen <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
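The invariant being restored (the head's cached ref_mod always equals the sum of the refs still queued) can be shown in a few lines. `RefHead` and `run_one` are hypothetical names in a toy model; the real structures carry far more state.

```python
class RefHead:
    """Toy delayed-ref head: a queue of +1 / -1 modifications."""

    def __init__(self, refs):
        self.refs = list(refs)          # pending modifications
        self.ref_mod = sum(self.refs)   # cached total for lookups

    def run_one(self):
        ref = self.refs.pop(0)
        # Adjust the cached total as each ref is run, so it keeps
        # matching the refs that are still queued even if the head's
        # lock is dropped between refs.
        self.ref_mod -= ref
        return ref
```

A reader that inspects `ref_mod` between two `run_one` calls then sees a total consistent with what is left in the queue.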
2012-08-28 | Btrfs: don't run __tree_mod_log_free_eb on leaves | Chris Mason | 1 | -0/+3
When we split a leaf, we may end up inserting a new root on top of that leaf. The reflog code was incorrectly assuming the old root was always a node. This makes sure we skip over leaves. Signed-off-by: Chris Mason <[email protected]>
2012-08-28 | Btrfs: increase the size of the free space cache | Josef Bacik | 1 | -8/+7
Arne was complaining about the space cache having mismatching generation numbers when debugging a deadlock. This is because we can run out of space in our preallocated range for our space cache if you have a pretty fragmented amount of space in your pinned space. So just increase the amount of space we preallocate for space cache so we can be sure to have enough space. This will only really affect data ranges since they're the only chunks that end up larger than 256MB. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>