aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2013-09-01btrfs: add mount option to set commit intervalDavid Sterba3-2/+34
I'ts hardcoded to 30 seconds which is fine for most users. Higher values defer data being synced to permanent storage with obvious consequences when the system crashes. The upper bound is not forced, but a warning is printed if it's more than 300 seconds (5 minutes). Signed-off-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: stop using GFP_ATOMIC when allocating rewind ebsJosef Bacik2-11/+16
There is no reason we can't just set the path to blocking and then do normal GFP_NOFS allocations for these extent buffers. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: deal with enomem in the rewind pathJosef Bacik2-73/+88
We can get ENOMEM trying to allocate dummy bufs for the rewind operation of the tree mod log. Instead of BUG_ON()'ing in this case pass up ENOMEM. I looked back through the callers and I'm pretty sure I got everybody who did BUG_ON(ret) in this path. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: check our parent dir when doing a compare sendJosef Bacik1-0/+19
When doing a send with a parent subvol we will check to see if the file we are acting on is being overwritten and move it if we think it may be needed further down the line during the send. We check this by checking its directory and making sure it existed in the parent and making sure the file existed in the parent. The problem with this check is that if we create a directory and a file in that directory, and then snapshot, and then remove and re-create that same directory and file with different inode numbers and then try to snapshot and send with the original parent we will try and save the original file inside of that directory. This is a problem because during the receive we move the directory out of the way because it is a completely new inode, which makes us unable to find the old file inside of the directory when we try to move that out of the way for the overwrite. We fix this by checking the parent directory of the inode we think we are overwriting. If the parent directory generation in the send root != the parent directory generation in the parent root then we know it is a completely new directory and we need not bother with moving the file out of the way because it would have been completely destroyed. This fixes bz 60673. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: handle errors when doing slow cachingJosef Bacik2-9/+24
Alex Lyakas reported a bug where wait_block_group_cache_progress() would wait forever if a drive failed. This is because we just bail out if there is an error while trying to cache a block group, we don't update anybody who may be waiting. So this introduces a new enum for the cache state in case of error and makes everybody bail out if we have an error. Alex tested and verified this patch fixed his problem. This fixes bz 59431. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: add missing error handling to read_tree_blockFilipe David Borba Manana1-0/+4
Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Fix leak in __btrfs_map_block error pathDave Jones1-0/+1
If we bail out when the stripe alloc fails, we need to undo the earlier allocation of raid_map. Signed-off-by: Dave Jones <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: add missing error check to find_parent_nodesFilipe David Borba Manana1-1/+3
Signed-off-by: Filipe David Borba Manana <[email protected]> Reviewed-by: Jan Schmidt <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: optimize function btrfs_read_chunk_treeFilipe David Borba Manana1-19/+11
After reading all device items from the chunk tree, don't exit the loop and then navigate down the tree again to find the chunk items. Instead just read all device items and chunk items with a single tree search. This is possible because all device items are found before any chunk item in the chunks tree. Signed-off-by: Filipe David Borba Manana <[email protected]> Reviewed-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: don't bug_on when we fail when cleaning up transactionsJosef Bacik1-2/+1
There is no reason for this sort of jackassery. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: change how we queue blocks for backref checkingJosef Bacik1-7/+7
Previously we only added blocks to the list to have their backrefs checked if the level of the block is right above the one we are searching for. This is because we want to make sure we don't add the entire path up to the root to the lists to make sure we process things one at a time. This assumes that if any blocks in the path to the root are going to be not checked (shared in other words) then they will be in the level right above the current block on up. This isn't quite right though since we can have blocks higher up the list that are shared because they are attached to a reloc root. But we won't add this block to be checked and then later on we will BUG_ON(!upper->checked). So instead keep track of wether or not we've queued a block to be checked in this current search, and if we haven't go ahead and queue it to be checked. This patch fixed the panic I was seeing where we BUG_ON(!upper->checked). Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: check to see if we have an inline item properlyJosef Bacik1-0/+5
If our item isn't big enough to have an actual inline item when we have skinny metadata enabled just return 1 in find_inline_backref so we can move on to the next item. This probably wasn't causing a problem since we check the values of ptr and end properly, but just in case this will keep us from doing extra work. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: fix what bits we clear when erroring out from delallocJosef Bacik1-21/+28
First of all we no longer set EXTENT_DIRTY when we dirty an extent so this patch removes the clearing of EXTENT_DIRTY since it confuses me. This patch also adds clearing EXTENT_DEFRAG and also doing EXTENT_DO_ACCOUNTING when we have errors. This is because if we are clearing delalloc without adding an ordered extent then we need to make sure the enospc handling stuff is accounted for. Also if this range was DEFRAG we need to make sure that bit is cleared so we dont leak it. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: cleanup arguments to extent_clear_unlock_delallocJosef Bacik3-130/+85
This patch removes the io_tree argument for extent_clear_unlock_delalloc since we always use &BTRFS_I(inode)->io_tree, and it separates out the extent tree operations from the page operations. This way we just pass in the extent bits we want to clear and then pass in the operations we want done to the pages. This is because I'm going to fix what extent bits we clear in some cases and rather than add a bunch of new flags we'll just use the actual extent bits we want to clear. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01btrfs: use BTRFS_SUPER_INFO_SIZE macro at btrfs_read_dev_super()Anand Jain1-2/+4
Signed-off-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: cache the extent map struct when reading several pagesMiao Xie1-11/+46
When we read several pages at once, we needn't get the extent map object every time we deal with a page, and we can cache the extent map object. So, we can reduce the search time of the extent map, and besides that, we also can reduce the lock contention of the extent map tree. Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: batch the extent state operation when reading pagesMiao Xie1-28/+107
In the past, we cached the checksum value in the extent state object, so we had to split the extent state object by the block size, or we had no space to keep this checksum value. But it increased the lock contention of the extent state tree. Now we removed this limit by caching the checksum into the bio object, so it is unnecessary to do the extent state operations by the block size, we can do it in batches, in this way, we can reduce the lock contention of the extent state tree. Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: batch the extent state operation in the end io handle of the read pageMiao Xie1-31/+42
Before applying this patch, we set the uptodate flag and unlock the extent by the page size, it is unnecessary, we can do it in batches, it can reduce the lock contention of the extent state tree. Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: don't cache the csum value into the extent state treeMiao Xie8-164/+174
Before applying this patch, we cached the csum value into the extent state tree when reading some data from the disk, this operation increased the lock contention of the state tree. Now, we just store the csum value into the bio structure or other unshared structure, so we can reduce the lock contention. Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: add branch prediction hints in the read page end IO functionMiao Xie1-5/+9
This patch add some branch prediction hints into the end IO function of the read page, it reduced the percentage of the branch misses from 5.5% to 4.9%. Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: remove unnecessary argument of bio_readpage_error()Miao Xie1-15/+11
Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: add missing mounting options in btrfs_show_options()Wang Shilong1-0/+14
Some options are missing in btrfs_show_options(), this patch adds them. Signed-off-by: Wang Shilong <[email protected]> Reviewed-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: use u64 for subvolid when parsing mount optionsWang Shilong1-9/+7
Although for most time, int is enough for subvolid, we should ensure safety in theory. Signed-off-by: Wang Shilong <[email protected]> Reviewed-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: add sanity checks regarding to parsing mount optionsWang Shilong1-10/+37
I just notice the following commands succeed: mount <dev> <mnt> -o thread_pool=-1 This is ridiculous, only positive thread_pool makes sense,this patch adds sanity checks for them, and also catches the error of ENOMEM if allocating memory fails. Signed-off-by: Wang Shilong <[email protected]> Reviewed-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs, raid56: fix memory leak when allocating pages for p/q stripes failedMiao Xie1-1/+3
Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01btrfs/raid56: fix and cleanup some error pathsDan Carpenter1-6/+4
The alloc_rbio() frees "raid_map" and "bbio" on error, so there is a potential double free bug in raid56_parity_write(). The raid56_parity_write() and raid56_parity_recover() functions should still free "raid_map" and "bbio" on error if other errors occur though, so I have added some more calls to kfree(). Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Miao Xie <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: don't bother autodefragging if our root is going awayJosef Bacik1-0/+5
We can end up with inodes on the auto defrag list that exist on roots that are going to be deleted. This is extra work we don't need to do, so just bail if our root has 0 root refs. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: cleanup reloc roots properly on errorJosef Bacik2-2/+7
I was hitting the BUG_ON() at the end of merge_reloc_roots() because we were aborting the transaction at some point previously and then getting an error when we tried to drop the reloc root. I fixed btrfs_drop_snapshot to re-add us to the dead roots list if we failed, but this isn't the right thing to do for reloc roots since it uses root->root_list for it's own stuff in order to know what needs to be cleaned up. So fix btrfs_drop_snapshot to only do the re-add if we aren't dropping for reloc, and handle errors from merge_reloc_root() by dropping the reloc root we are processing since it won't be on the list of roots to cleanup. With this patch my reproducer no longer panics. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: reset ret in record_one_backrefJosef Bacik1-2/+1
I was getting warnings when running find ./ -type f -exec btrfs fi defrag -f {} \; from record_one_backref because ret was set. Turns out it was because it was set to 1 because the search slot didn't come out exact and we never reset it. So reset it to 0 right after the search so we don't leak this and get uneccessary warnings. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01btrfs: fix get set label blocking against balanceAnand Jain1-6/+10
btrfs_ioctl_get_fslabel() and btrfs_ioctl_set_fslabel() used root->fs_info->volume_mutex mutex which caused operations like balance to block set/get label operation until its completion and generally balance operation takes a long time to complete, so it will be annoying to the user when cli appears hung also this patch will add a bit of optimization within the btrfs_ioctl_get_falabel() function. v1->v2: use fs_info->super_lock instead of uuid_mutex Signed-off-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: Print key type in decimal everywhereStefan Behrens1-2/+2
This is confusing, sometimes the key type is printed in hex (without a leading "0x" which makes things even more complicated), sometimes in decimal... Change it to be in decimal everywhere. Signed-off-by: Stefan Behrens <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs/tracepoint: update delayed ref tracepointsLiu Bo2-3/+9
This shows exactly how btrfs processes the delayed refs onto disks, which is very helpful on understanding delayed ref mechanism and debugging related bugs. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01btrfs_read_block_groups: Use enums to indexchandan1-2/+6
btrfs_space_info->block_groups. The current code uses integer literals to index btrfs_space_info->block_groups[] array. Instead use corresponding enums from 'enum btrfs_raid_types'. Signed-off-by: chandan <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01btrfs: Cleanup for using BTRFS_SETGET_STACK instead of raw convertQu Wenruo9-113/+150
Some codes still use the cpu_to_lexx instead of the BTRFS_SETGET_STACK_FUNCS declared in ctree.h. Also added some BTRFS_SETGET_STACK_FUNCS for btrfs_header btrfs_timespec and other structures. Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: Miao Xie <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: set qgroup_ulist to be null after calling ulist_free()Wang Shilong1-0/+6
We call ulist_free(qgroup_ulist) in btrfs_free_qgroup_config(), and btrfs_free_qgroup_config() may be called in two cases: (1)umount filesystem (2)disabling quota However, if we firstly disable quota and then umount filesystem, a double free happens. Fix it. Signed-off-by: Wang Shilong <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: add missing error checks to add_data_referencesFilipe David Borba Manana1-1/+6
The function relocation.c:add_data_references() was not checking if all calls to __add_tree_block() and find_data_references() were succeeding or not. Signed-off-by: Filipe David Borba Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01btrfs: make errors in btrfs_num_copies less noisyDavid Sterba1-2/+2
The log message level 'critical' is verbose enough, 'emergency' beeps on all terminals. Signed-off-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: make free space caching faster with many non-inline extent referencesLiu Bo1-0/+11
So to cache free space, we iterate every extent item to gather free space info. When we have say 10,000 non-inline extent refs(such as BTRFS_EXTENT_DATA_REF), it takes quite a long time, and since inline extent refs and non-inline ones have same objectid in their keys, we can just re-search the tree with the next address to skip non-inline references. (This is found by dedup feature because dedup extents can end up with many non-inline extent refs.) Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01btrfs: fall back to global reservation when removing subvolumesJeff Mahoney3-5/+12
I recently did some ENOSPC testing that involved filling the disk while create and removing snapshots in a loop. During the test cycle, I ran into an ENOSPC when trying to remove a snapshot, leaving the fs stuck in ENOSPC even after a umount/mount cycle. This patch allow subvolume removal to fall back onto the global block reservation in order to succeed when it would have failed otherwise. Signed-off-by: Jeff Mahoney <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: optimize btrfs_lookup_extent_info()Filipe David Borba Manana1-4/+17
If we're looking for a metadata item in the tree and the search fails with return value of 1, and the slot doesn't point to the first item in the leaf, check if the previous item in the leaf corresponds to an extent item for the same object id - if it does, then don't do another tree search to get it. This optimization is already done by btrfs-progs. V2: updated commit message. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: Release uuid_mutex for shrink during device deleteCarey Underwood1-0/+2
Device scanning waits on the uuid_mutex, which can result in a very long wait if dev delete is shrinking the device. Signed-off-by: Carey Underwood <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: set lockdep class before locking new extent bufferJosef Bacik1-0/+2
We've been seeing spurious complaints out of lockdep because the lock class name changes. This is happening because when we drop a snapshot we will lock a block before we've read it in, which sets the lockdep class to whatever the default is. Then once we read the thing in we reset the lockdep class to what it is supposed to be, which blows lockdeps' mind. This patch should fix the problem, it appears to be the only place where we do this sort of thing. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: return -1 when lzo compression makes data biggerStefan Agner1-1/+3
With this fix the lzo code behaves like the zlib code by returning an error code when compression does not help reduce the size of the file. This is currently not a bug since the compressed size is checked again in the calling method compress_file_range. Signed-off-by: Stefan Agner <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-09-01Btrfs: stop using GFP_ATOMIC for the tree mod log allocationsJosef Bacik1-105/+56
Previously we held the tree mod lock when adding stuff because we use it to check and see if we truly do want to track tree modifications. This is admirable, but GFP_ATOMIC in a critical area that is going to get hit pretty hard and often is not nice. So instead do our basic checks to see if we don't need to track modifications, and if those pass then do our allocation, and then when we go to insert the new modification check if we still care, and if we don't just free up our mod and return. Otherwise we're good to go and we can carry on. Thanks, Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2013-08-30userns: Kill nsown_capable it makes the wrong thing easyEric W. Biederman2-3/+3
nsown_capable is a special case of ns_capable essentially for just CAP_SETUID and CAP_SETGID. For the existing users it doesn't noticably simplify things and from the suggested patches I have seen it encourages people to do the wrong thing. So remove nsown_capable. Acked-by: Serge Hallyn <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2013-08-30pstore/ram: (really) fix undefined usage of rounddown_pow_of_twoMaxime Bizon1-3/+3
Previous attempt to fix was b042e47491ba5f487601b5141a3f1d8582304170 Suggested use of is_power_of_2() was bogus because is_power_of_2(0) is false (documented behaviour). Signed-off-by: Maxime Bizon <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Tony Luck <[email protected]>
2013-08-30nfsd4: nfsd4_create_clid_dir prints uninitialized dataJ. Bruce Fields1-2/+0
Take the easy way out and just remove the printk. Reported-by: David Howells <[email protected]>
2013-08-30nfsd4: fix leak of inode reference on delegation failureJ. Bruce Fields1-11/+20
This fixes a regression from 68a3396178e6688ad7367202cdf0af8ed03c8727 "nfsd4: shut down more of delegation earlier". After that commit, nfs4_set_delegation() failures result in nfs4_put_delegation being called, but nfs4_put_delegation doesn't free the nfs4_file that has already been set by alloc_init_deleg(). This can result in an oops on later unmounting the exported filesystem. Note also delaying the fi_had_conflict check we're able to return a better error (hence give 4.1 clients a better idea why the delegation failed; though note CONFLICT isn't an exact match here, as that's supposed to indicate a current conflict, but all we know here is that there was one recently). Reported-by: Toralf Förster <[email protected]> Tested-by: Toralf Förster <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2013-08-30Revert "nfsd: nfs4_file_get_access: need to be more careful with O_RDWR"J. Bruce Fields1-4/+9
This reverts commit df66e75395c839c3a373bae897dbb1248f741b45. nfsd4_lock can get a read-only or write-only reference when only a read-write open is available. This is normal. Cc: Harshula Jayasuriya <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2013-08-30Merge tag 'v3.11-rc5' into for-3.12 branchJ. Bruce Fields33-280/+302
For testing purposes I want some nfs and nfsd bugfixes (specifically, 58cd57bfd9db3bc213bf9d6a10920f82095f0114 and previous nfsd patches, and Trond's 4f3cc4809a98a165a9708b72b47de71643797bbd).