aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2014-05-06NFSd: call rpc_destroy_wait_queue() from free_client()Trond Myklebust1-0/+1
Mainly to ensure that we don't leave any hanging timers. Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected] Signed-off-by: J. Bruce Fields <[email protected]>
2014-05-06NFSd: Move default initialisers from create_client() to alloc_client()Trond Myklebust1-12/+12
Aside from making it clearer what is non-trivial in create_client(), it also fixes a bug whereby we can call free_client() before idr_init() has been called. Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected] Signed-off-by: J. Bruce Fields <[email protected]>
2014-05-06Merge branch 'for-linus' of ↵Linus Torvalds5-86/+172
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "This adds ctime update in the new cached writeback mode and also fixes/simplifies the mtime update handling. Support for rename flags (aka renameat2) is also added to the userspace API" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: add renameat2 support fuse: clear MS_I_VERSION fuse: clear FUSE_I_CTIME_DIRTY flag on setattr fuse: trust kernel i_ctime only fuse: remove .update_time fuse: allow ctime flushing to userspace fuse: fuse: add time_gran to INIT_OUT fuse: add .write_inode fuse: clean up fsync fuse: fuse: fallocate: use file_update_time() fuse: update mtime on open(O_TRUNC) in atomic_o_trunc mode fuse: update mtime on truncate(2) fuse: do not use uninitialized i_mode fuse: fix mtime update error in fsync fuse: check fallocate mode fuse: add __exit to fuse_ctl_cleanup
2014-05-05Merge branch 'for-linus' of ↵Linus Torvalds6-72/+39
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "First, there is a critical fix for the new primary-affinity function that went into -rc1. The second batch of patches from Zheng fix a range of problems with directory fragmentation, readdir, and a few odds and ends for cephfs" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: reserve caps for file layout/lock MDS requests ceph: avoid releasing caps that are being used ceph: clear directory's completeness when creating file libceph: fix non-default values check in apply_primary_affinity() ceph: use fpos_cmp() to compare dentry positions ceph: check directory's completeness before emitting directory entry
2014-05-06xfs: remote attribute overwrite causes transaction overrunDave Chinner5-14/+42
Commit e461fcb ("xfs: remote attribute lookups require the value length") passes the remote attribute length in the xfs_da_args structure on lookup so that CRC calculations and validity checking can be performed correctly by related code. This, unfortunately has the side effect of changing the args->valuelen parameter in cases where it shouldn't. That is, when we replace a remote attribute, the incoming replacement stores the value and length in args->value and args->valuelen, but then the lookup which finds the existing remote attribute overwrites args->valuelen with the length of the remote attribute being replaced. Hence when we go to create the new attribute, we create it of the size of the existing remote attribute, not the size it is supposed to be. When the new attribute is much smaller than the old attribute, this results in a transaction overrun and an ASSERT() failure on a debug kernel: XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: fs/xfs/xfs_trans.c, line: 331 Fix this by keeping the remote attribute value length separate to the attribute value length in the xfs_da_args structure. The enables us to pass the length of the remote attribute to be removed without overwriting the new attribute's length. Also, ensure that when we save remote block contexts for a later rename we zero the original state variables so that we don't confuse the state of the attribute to be removes with the state of the new attribute that we just added. [Spotted by Brain Foster.] Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2014-05-06xfs: initialize default acls for ->tmpfile()Brian Foster1-26/+29
The current tmpfile handler does not initialize default ACLs. Doing so within xfs_vn_tmpfile() makes it roughly equivalent to xfs_vn_mknod(), which is already used as a common create handler. xfs_vn_mknod() does not currently have a mechanism to determine whether to link the file into the namespace. Therefore, further abstract xfs_vn_mknod() into a new xfs_generic_create() handler with a tmpfile parameter. This new handler calls xfs_create_tmpfile() and d_tmpfile() on the dentry when called via ->tmpfile(). Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2014-05-05UBIFS: fix remount error pathArtem Bityutskiy1-1/+1
Dan's "smatch" checker found out that there was a bug in the error path of the 'ubifs_remount_rw()' function. Instead of jumping to the "out" label which cleans-things up, we just returned. This patch fixes the problem. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Artem Bityutskiy <[email protected]>
2014-05-05xfs: fully support v5 format filesystemsDave Chinner3-10/+6
We have had this code in the kernel for over a year now and have shaken all the known issues out of the code over the past few releases. It's now time to remove the experimental warnings during mount and fully support the new filesystem format in production systems. Remove the experimental warning, and add a version number to the initial "mounting filesystem" message to tell use what type of filesystem is being mounted. Also, remove the temporary inode cluster size output at mount time now we know that this code works fine. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2014-05-03dcache: don't need rcu in shrink_dentry_list()Miklos Szeredi1-23/+4
Since now the shrink list is private and nobody can free the dentry while it is on the shrink list, we can remove RCU protection from this. Signed-off-by: Miklos Szeredi <[email protected]> Signed-off-by: Al Viro <[email protected]>
2014-05-03more graceful recovery in umount_collect()Al Viro1-76/+25
Start with shrink_dcache_parent(), then scan what remains. First of all, BUG() is very much an overkill here; we are holding ->s_umount, and hitting BUG() means that a lot of interesting stuff will be hanging after that point (sync(2), for example). Moreover, in cases when there had been more than one leak, we'll be better off reporting all of them. And more than just the last component of pathname - %pd is there for just such uses... That was the last user of dentry_lru_del(), so kill it off... Signed-off-by: Al Viro <[email protected]>
2014-05-03don't remove from shrink list in select_collect()Al Viro1-21/+10
If we find something already on a shrink list, just increment data->found and do nothing else. Loops in shrink_dcache_parent() and check_submounts_and_drop() will do the right thing - everything we did put into our list will be evicted and if there had been nothing, but data->found got non-zero, well, we have somebody else shrinking those guys; just try again. Signed-off-by: Al Viro <[email protected]>
2014-05-01Merge git://git.kvack.org/~bcrl/aio-fixesLinus Torvalds1-8/+34
Pull aio fixes from Ben LaHaise: "The first change from Anatol fixes a regression where io_destroy() no longer waits for outstanding aios to complete. The second corrects a memory leak in an error path for vectored aio operations. Both of these bug fixes should be queued up for stable as well" * git://git.kvack.org/~bcrl/aio-fixes: aio: fix potential leak in aio_run_iocb(). aio: block io_destroy() until all context requests are completed
2014-05-01dentry_kill(): don't try to remove from shrink listAl Viro1-8/+19
If the victim in on the shrink list, don't remove it from there. If shrink_dentry_list() manages to remove it from the list before we are done - fine, we'll just free it as usual. If not - mark it with new flag (DCACHE_MAY_FREE) and leave it there. Eventually, shrink_dentry_list() will get to it, remove the sucker from shrink list and call dentry_kill(dentry, 0). Which is where we'll deal with freeing. Since now dentry_kill(dentry, 0) may happen after or during dentry_kill(dentry, 1), we need to recognize that (by seeing DCACHE_DENTRY_KILLED already set), unlock everything and either free the sucker (in case DCACHE_MAY_FREE has been set) or leave it for ongoing dentry_kill(dentry, 1) to deal with. Signed-off-by: Al Viro <[email protected]>
2014-05-01aio: fix potential leak in aio_run_iocb().Leon Yu1-4/+2
iovec should be reclaimed whenever caller of rw_copy_check_uvector() returns, but it doesn't hold when failure happens right after aio_setup_vectored_rw(). Fix that in a such way to avoid hairy goto. Signed-off-by: Leon Yu <[email protected]> Signed-off-by: Benjamin LaHaise <[email protected]> Cc: [email protected]
2014-04-30expand the call of dentry_lru_del() in dentry_kill()Al Viro1-1/+6
Signed-off-by: Al Viro <[email protected]>
2014-04-30new helper: dentry_free()Al Viro1-5/+10
The part of old d_free() that dealt with actual freeing of dentry. Taken out of dentry_kill() into a separate function. Signed-off-by: Al Viro <[email protected]>
2014-04-30fold try_prune_one_dentry()Al Viro1-50/+25
Signed-off-by: Al Viro <[email protected]>
2014-04-30fold d_kill() and d_free()Al Viro1-52/+24
Signed-off-by: Al Viro <[email protected]>
2014-04-28ceph: reserve caps for file layout/lock MDS requestsYan, Zheng2-0/+4
Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-04-28ceph: avoid releasing caps that are being usedYan, Zheng1-1/+1
To avoid releasing caps that are being used, encode_inode_release() should send implemented caps to MDS. Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-04-28ceph: clear directory's completeness when creating fileYan, Zheng3-60/+21
When creating a file, ceph_set_dentry_offset() puts the new dentry at the end of directory's d_subdirs, then set the dentry's offset based on directory's max offset. The offset does not reflect the real postion of the dentry in directory. Later readdir reply from MDS may change the dentry's position/offset. This inconsistency can cause missing/duplicate entries in readdir result if readdir is partly satisfied by dcache_readdir(). The fix is clear directory's completeness after creating/renaming file. It prevents later readdir from using dcache_readdir(). Fixes: http://tracker.ceph.com/issues/8025 Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-04-28ceph: use fpos_cmp() to compare dentry positionsYan, Zheng1-1/+1
Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-04-28ceph: check directory's completeness before emitting directory entryYan, Zheng1-10/+12
Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-04-28fuse: add renameat2 supportMiklos Szeredi2-8/+50
Support RENAME_EXCHANGE and RENAME_NOREPLACE flags on the userspace ABI. Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: clear MS_I_VERSIONMiklos Szeredi1-1/+1
Fuse doesn't support i_version (yet). Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: clear FUSE_I_CTIME_DIRTY flag on setattrMaxim Patlasov1-9/+17
The patch addresses two use-cases when the flag may be safely cleared: 1. fuse_do_setattr() is called with ATTR_CTIME flag set in attr->ia_valid. In this case attr->ia_ctime bears actual value. In-kernel fuse must send it to the userspace server and then assign the value to inode->i_ctime. 2. fuse_do_setattr() is called with ATTR_SIZE flag set in attr->ia_valid, whereas ATTR_CTIME is not set (truncate(2)). In this case in-kernel fuse must sent "now" to the userspace server and then assign the value to inode->i_ctime. In both cases we could clear I_DIRTY_SYNC, but that needs more thought. Signed-off-by: Maxim Patlasov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: trust kernel i_ctime onlyMaxim Patlasov2-4/+24
Let the kernel maintain i_ctime locally: update i_ctime explicitly on truncate, fallocate, open(O_TRUNC), setxattr, removexattr, link, rename, unlink. The inode flag I_DIRTY_SYNC serves as indication that local i_ctime should be flushed to the server eventually. The patch sets the flag and updates i_ctime in course of operations listed above. Signed-off-by: Maxim Patlasov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: remove .update_timeMiklos Szeredi1-12/+0
This implements updating ctime as well as mtime on file_update_time(). Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: allow ctime flushing to userspaceMaxim Patlasov3-4/+9
The patch extends fuse_setattr_in, and extends the flush procedure (fuse_flush_times()) called on ->write_inode() to send the ctime as well as mtime. Signed-off-by: Maxim Patlasov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: fuse: add time_gran to INIT_OUTMiklos Szeredi1-0/+5
Allow userspace fs to specify time granularity. This is needed because with writeback_cache mode the kernel is responsible for generating mtime and ctime, but if the underlying filesystem doesn't support nanosecond granularity then the cache will contain a different value from the one stored on the filesystem resulting in a change of times after a cache flush. Make the default granularity 1s. Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: add .write_inodeMiklos Szeredi4-33/+45
...and flush mtime from this. This allows us to use the kernel infrastructure for writing out dirty metadata (mtime at this point, but ctime in the next patches and also maybe atime). Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: clean up fsyncMiklos Szeredi1-8/+3
Don't need to start I/O twice (once without i_mutex and one within). Also make sure that even if the userspace filesystem doesn't support FSYNC we do all the steps other than sending the message. Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: fuse: fallocate: use file_update_time()Miklos Szeredi1-6/+2
in preparation for getting rid of FUSE_I_MTIME_DIRTY. Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: update mtime on open(O_TRUNC) in atomic_o_trunc modeMaxim Patlasov1-4/+14
In case of fc->atomic_o_trunc is set, fuse does nothing in fuse_do_setattr() while handling open(O_TRUNC). Hence, i_mtime must be updated explicitly in fuse_finish_open(). The patch also adds extra locking encompassing open(O_TRUNC) operation to avoid races between the truncation and updating i_mtime. Signed-off-by: Maxim Patlasov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: update mtime on truncate(2)Maxim Patlasov1-0/+2
Handling truncate(2), VFS doesn't set ATTR_MTIME bit in iattr structure; only ATTR_SIZE bit is set. In-kernel fuse must handle the case by setting mtime fields of struct fuse_setattr_in to "now" and set FATTR_MTIME bit even though ATTR_MTIME was not set. Signed-off-by: Maxim Patlasov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: do not use uninitialized i_modeMaxim Patlasov1-1/+1
When inode is in I_NEW state, inode->i_mode is not initialized yet. Do not use it before fuse_init_inode() is called. Signed-off-by: Maxim Patlasov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: fix mtime update error in fsyncMiklos Szeredi1-1/+1
Bad case of shadowing. Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: check fallocate modeMiklos Szeredi1-0/+3
Don't allow new fallocate modes until we figure out what (if anything) that takes. Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-28fuse: add __exit to fuse_ctl_cleanupFabian Frederick2-2/+2
fuse_ctl_cleanup is only called by __exit fuse_exit Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2014-04-27Merge branch 'for-linus' of ↵Linus Torvalds8-45/+48
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: limit the path size in send to PATH_MAX Btrfs: correctly set profile flags on seqlock retry Btrfs: use correct key when repeating search for extent item Btrfs: fix inode caching vs tree log Btrfs: fix possible memory leaks in open_ctree() Btrfs: avoid triggering bug_on() when we fail to start inode caching task Btrfs: move btrfs_{set,clear}_and_info() to ctree.h btrfs: replace error code from btrfs_drop_extents btrfs: Change the hole range to a more accurate value. btrfs: fix use-after-free in mount_subvol()
2014-04-27Merge tag 'driver-core-3.15-rc3' of ↵Linus Torvalds2-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some kernfs fixes for 3.15-rc3 that resolve some reported problems. Nothing huge, but all needed" * tag 'driver-core-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: s390/ccwgroup: Fix memory corruption kernfs: add back missing error check in kernfs_fop_mmap() kernfs: fix a subdir count leak
2014-04-26Btrfs: limit the path size in send to PATH_MAXChris Mason1-0/+5
fs_path_ensure_buf is used to make sure our path buffers for send are big enough for the path names as we construct them. The buffer size is limited to 32K by the length field in the struct. But bugs in the path construction can end up trying to build a huge buffer, and we'll do invalid memmmoves when the buffer length field wraps. This patch is step one, preventing the overflows. Signed-off-by: Chris Mason <[email protected]>
2014-04-25Merge tag 'locks-v3.15-2' of git://git.samba.org/jlayton/linuxLinus Torvalds3-37/+37
Pull file locking fixes from Jeff Layton: "File locking related bugfixes for v3.15 (pile #2) - fix for a long-standing bug in __break_lease that can cause soft lockups - renaming of file-private locks to "open file description" locks, and the command macros to more visually distinct names The fix for __break_lease is also in the pile of patches for which Bruce sent a pull request, but I assume that your merge procedure will handle that correctly. For the other patches, I don't like the fact that we need to rename this stuff at this late stage, but it should be settled now (hopefully)" * tag 'locks-v3.15-2' of git://git.samba.org/jlayton/linux: locks: rename FL_FILE_PVT and IS_FILE_PVT to use "*_OFDLCK" instead locks: rename file-private locks to "open file description locks" locks: allow __break_lease to sleep even when break_time is 0
2014-04-25Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linuxLinus Torvalds3-13/+6
Pull nfsd bugfixes from Bruce Fields: "Three small nfsd bugfixes (including one locks.c fix for a bug triggered only from nfsd). Jeff's patches are for long-existing problems that became easier to trigger since the addition of vfs delegation support" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: Revert "nfsd4: fix nfs4err_resource in 4.1 case" nfsd: set timeparms.to_maxval in setup_callback_client locks: allow __break_lease to sleep even when break_time is 0
2014-04-25kernfs: add back missing error check in kernfs_fop_mmap()Tejun Heo1-0/+2
While updating how mmap enabled kernfs files are handled by lockdep, 9b2db6e18945 ("sysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warning") inadvertently dropped error return check from kernfs_file_mmap(). The intention was just dropping "if (ops->mmap)" check as the control won't reach the point if the mmap callback isn't implemented, but I mistakenly removed the error return check together with it. This led to Xorg crash on i810 which was reported and bisected to the commit and then to the specific change by Tobias. Signed-off-by: Tejun Heo <[email protected]> Reported-and-bisected-by: Tobias Powalowski <[email protected]> Tested-by: Tobias Powalowski <[email protected]> References: http://lkml.kernel.org/g/[email protected] Cc: stable <[email protected]> # 3.14 Signed-off-by: Greg Kroah-Hartman <[email protected]>
2014-04-25kernfs: fix a subdir count leakJianyu Zhan1-3/+6
Currently kernfs_link_sibling() increates parent->dir.subdirs before adding the node into parent's chidren rb tree. Because it is possible that kernfs_link_sibling() couldn't find a suitable slot and bail out, this leads to a mismatch between elevated subdir count with actual children node numbers. This patches fix this problem, by moving the subdir accouting after the actual addtion happening. Signed-off-by: Jianyu Zhan <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2014-04-24cifs: fix actimeo=0 corner case when cifs_i->time == jiffiesJeff Layton1-0/+3
actimeo=0 is supposed to be a special case that ensures that inode attributes are always refetched from the server instead of trusting the cache. The cifs code however uses time_in_range() to determine whether the attributes have timed out. In the case where cifs_i->time equals jiffies, this leads to the cifs code not refetching the inode attributes when it should. Fix this by explicitly testing for actimeo=0, and handling it as a special case. Reported-and-tested-by: Tetsuo Handa <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2014-04-24Btrfs: correctly set profile flags on seqlock retryFilipe Manana1-1/+3
If we had to retry on the profiles seqlock (due to a concurrent write), we would set bits on the input flags that corresponded both to the current profile and to previous values of the profile. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2014-04-24Btrfs: use correct key when repeating search for extent itemFilipe Manana1-0/+2
If skinny metadata is enabled and our first tree search fails to find a skinny extent item, we may repeat a tree search for a "fat" extent item (if the previous item in the leaf is not the "fat" extent we're looking for). However we were not setting the new key's objectid to the right value, as we previously used the same key variable to peek at the previous item in the leaf, which has a different objectid. So just set the right objectid to avoid modifying/deleting a wrong item if we repeat the tree search. Signed-off-by: Filipe David Borba Manana <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2014-04-24Btrfs: fix inode caching vs tree logMiao Xie1-16/+2
Currently, with inode cache enabled, we will reuse its inode id immediately after unlinking file, we may hit something like following: |->iput inode |->return inode id into inode cache |->create dir,fsync |->power off An easy way to reproduce this problem is: mkfs.btrfs -f /dev/sdb mount /dev/sdb /mnt -o inode_cache,commit=100 dd if=/dev/zero of=/mnt/data bs=1M count=10 oflag=sync inode_id=`ls -i /mnt/data | awk '{print $1}'` rm -f /mnt/data i=1 while [ 1 ] do mkdir /mnt/dir_$i test1=`stat /mnt/dir_$i | grep Inode: | awk '{print $4}'` if [ $test1 -eq $inode_id ] then dd if=/dev/zero of=/mnt/dir_$i/data bs=1M count=1 oflag=sync echo b > /proc/sysrq-trigger fi sleep 1 i=$(($i+1)) done mount /dev/sdb /mnt umount /dev/sdb btrfs check /dev/sdb We fix this problem by adding unlinked inode's id into pinned tree, and we can not reuse them until committing transaction. Cc: [email protected] Signed-off-by: Miao Xie <[email protected]> Signed-off-by: Wang Shilong <[email protected]> Signed-off-by: Chris Mason <[email protected]>