path: root/fs
Age | Commit message | Author | Files | Lines
2011-03-14 | switch non-create side of open() to use of do_last() | Al Viro | 1 | -33/+67
Instead of path_lookupat() doing trailing symlink resolution, use the same scheme as on the O_CREAT side. Walk with LOOKUP_PARENT, then (in do_last()) look the final component up, then either open it or return error or, if it's a symlink, give the symlink back to path_openat() to be resolved there. The really messy complication here is RCU. We don't want to drop out of RCU mode before the final lookup, since we don't want to bounce parent directory ->d_count without a good reason. Result is _not_ pretty; later in the series we'll clean it up. For now we are roughly back where we'd been before the revert done by Nick's series - the top-level logic of path_openat() is cleaned up, do_last() does actual opening, symlink resolution is done uniformly. Signed-off-by: Al Viro <[email protected]>
2011-03-14 | get rid of nd->file | Al Viro | 1 | -8/+7
Don't stash the struct file * used as starting point of walk in nameidata; pass file ** to path_init() instead. Signed-off-by: Al Viro <[email protected]>
2011-03-14 | get rid of the last LOOKUP_RCU dependencies in link_path_walk() | Al Viro | 1 | -8/+13
New helper: terminate_walk(). An error has happened during pathname resolution and we either drop nd->path or terminate RCU, depending on the mode we had been in. After that, nd is essentially empty. Switch link_path_walk() to using that for cleanup. Now the top-level logic in link_path_walk() is back to sanity. RCU dependencies are in the lower-level functions. Signed-off-by: Al Viro <[email protected]>
2011-03-14 | make nameidata_dentry_drop_rcu_maybe() always leave RCU mode | Al Viro | 1 | -5/+11
Now we have do_follow_link() guaranteed to leave without dangling RCU and the next step will get LOOKUP_RCU logic completely out of link_path_walk(). Signed-off-by: Al Viro <[email protected]>
2011-03-14 | make handle_dots() leave RCU mode on error | Al Viro | 1 | -11/+12
Signed-off-by: Al Viro <[email protected]>
2011-03-14 | clear RCU on all failure exits from link_path_walk() | Al Viro | 1 | -14/+16
Signed-off-by: Al Viro <[email protected]>
2011-03-14 | pull handling of . and .. into inlined helper | Al Viro | 1 | -14/+16
getting LOOKUP_RCU checks out of link_path_walk()... Signed-off-by: Al Viro <[email protected]>
2011-03-14 | kill out_dput: in link_path_walk() | Al Viro | 1 | -11/+4
Signed-off-by: Al Viro <[email protected]>
2011-03-14 | separate -ESTALE/-ECHILD retries in do_filp_open() from real work | Al Viro | 1 | -29/+20
new helper: path_openat(). Does what do_filp_open() does, except that it tries only the walk mode (RCU/normal/force revalidation) it had been told to. Both create and non-create branches are using path_lookupat() now. Fixed the double audit_inode() in non-create branch. Signed-off-by: Al Viro <[email protected]>
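The split described above can be modeled in userspace roughly as follows. This is a minimal sketch, not the kernel code: every `toy_` name is hypothetical, and the retry order (RCU walk first, then a normal walk on -ECHILD, then a walk with forced revalidation on -ESTALE) is an assumption drawn from the three walk modes the message lists.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's walk-mode flags. */
enum { TOY_LOOKUP_RCU = 1 << 0, TOY_LOOKUP_REVAL = 1 << 1 };

/* One attempt, in exactly the mode it is told; returns 0 or a negative errno.
 * This plays the role of path_openat(). */
typedef int (*toy_open_fn)(const char *path, int flags, void *ctx);

/* Plays the role of do_filp_open(): it owns nothing but the retry policy. */
static int toy_filp_open(toy_open_fn attempt, const char *path, int flags, void *ctx)
{
    int err;

    assert(attempt != NULL);
    err = attempt(path, flags | TOY_LOOKUP_RCU, ctx);
    if (err == -ECHILD)   /* RCU walk could not complete: retry in normal mode */
        err = attempt(path, flags, ctx);
    if (err == -ESTALE)   /* stale result: retry once with forced revalidation */
        err = attempt(path, flags | TOY_LOOKUP_REVAL, ctx);
    return err;
}

/* Demo attempt function: replays a scripted list of results. */
struct toy_script { const int *results; int calls; int last_flags; };

static int toy_scripted_attempt(const char *path, int flags, void *ctx)
{
    struct toy_script *s = ctx;

    (void)path;
    s->last_flags = flags;
    return s->results[s->calls++];
}
```

Keeping the retry policy in the wrapper means the attempt function never has to know which mode came before it.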
2011-03-14 | switch do_filp_open() to struct open_flags | Al Viro | 4 | -86/+101
take calculation of open_flags by open(2) arguments into new helper in fs/open.c, move filp_open() over there, have it and do_sys_open() use that helper, switch exec.c callers of do_filp_open() to explicit (and constant) struct open_flags. Signed-off-by: Al Viro <[email protected]>
2011-03-14 | Collect "operation mode" arguments of do_last() into a structure | Al Viro | 1 | -22/+35
No point messing with passing shitloads of "operation mode" arguments to do_open() one by one, especially since they are not going to change during do_filp_open(). Collect them into a struct, fill it and pass to do_last() by reference. Make sure that lookup intent flags are correctly set and removed - we want them for do_last(), but they make no sense for __do_follow_link(). Signed-off-by: Al Viro <[email protected]>
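As a rough illustration of the idea, the per-open values can be bundled once and passed by reference, with the intent bits added for the final lookup and masked off again before following a symlink. The struct layout, field names, and flag values below are assumptions made for this sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup flags; the values are arbitrary. */
enum {
    TOY_LOOKUP_FOLLOW = 1 << 0,
    TOY_LOOKUP_OPEN   = 1 << 8,
    TOY_LOOKUP_CREATE = 1 << 9
};

/* Hypothetical bundle of values that stay constant for one open. */
struct toy_open_flags {
    int open_flag;  /* flags as given to open(2) */
    int mode;       /* creation mode */
    int acc_mode;   /* access mask to check */
    int intent;     /* lookup intent bits (open/create) */
};

/* Flags for the final-component lookup: intent bits are wanted here. */
static int toy_flags_for_last(int lookup_flags, const struct toy_open_flags *op)
{
    assert(op != NULL);
    return lookup_flags | op->intent;
}

/* Flags for following a symlink: intent bits make no sense, so strip them. */
static int toy_flags_for_follow_link(int lookup_flags, const struct toy_open_flags *op)
{
    assert(op != NULL);
    return lookup_flags & ~op->intent;
}
```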
2011-03-14 | clean up the failure exits after __do_follow_link() in do_filp_open() | Al Viro | 1 | -8/+5
Signed-off-by: Al Viro <[email protected]>
2011-03-14 | pull security_inode_follow_link() into __do_follow_link() | Al Viro | 1 | -6/+7
Signed-off-by: Al Viro <[email protected]>
2011-03-14 | pull dropping RCU on success of link_path_walk() into path_lookupat() | Al Viro | 1 | -18/+12
Signed-off-by: Al Viro <[email protected]>
2011-03-14 | untangle the "need_reval_dot" mess | Al Viro | 1 | -63/+44
instead of ad-hackery around need_reval_dot(), do the following: set a flag (LOOKUP_JUMPED) in the beginning of path, on absolute symlink traversal, on ".." and on procfs-style symlinks. Clear on normal components, leave unchanged on ".". Non-nested callers of link_path_walk() call handle_reval_path(), which checks that flag is set and that fs does want the final revalidate thing, then does ->d_revalidate(). In link_path_walk() all the return_reval stuff is gone. Signed-off-by: Al Viro <[email protected]>
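The flag rules above amount to a small state machine. The following sketch models it in userspace; the `toy_` types and names are hypothetical and only mirror the rules as stated in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kinds of step a walk can take (hypothetical classification). */
enum toy_component {
    TOY_NORMAL,        /* ordinary name */
    TOY_DOT,           /* "." */
    TOY_DOTDOT,        /* ".." */
    TOY_ABS_SYMLINK,   /* symlink with an absolute target */
    TOY_PROC_SYMLINK   /* procfs-style symlink */
};

struct toy_walk { bool jumped; };  /* models LOOKUP_JUMPED */

/* The flag is set at the beginning of a path. */
static void toy_walk_begin(struct toy_walk *w)
{
    assert(w != NULL);
    w->jumped = true;
}

static void toy_walk_step(struct toy_walk *w, enum toy_component c)
{
    assert(w != NULL);
    switch (c) {
    case TOY_NORMAL:
        w->jumped = false;   /* cleared on normal components */
        break;
    case TOY_DOT:
        break;               /* left unchanged on "." */
    case TOY_DOTDOT:
    case TOY_ABS_SYMLINK:
    case TOY_PROC_SYMLINK:
        w->jumped = true;    /* set on every kind of jump */
        break;
    }
}

/* Final check: revalidate only if we jumped and the fs asks for it. */
static bool toy_needs_final_reval(const struct toy_walk *w, bool fs_wants_reval)
{
    assert(w != NULL);
    return w->jumped && fs_wants_reval;
}
```

For a path such as `a/..`, the flag ends up set, so a filesystem that wants the final revalidation gets it; for `a/b/.` it ends up clear.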
2011-03-14 | merge component type recognition | Al Viro | 1 | -26/+22
no need to do it in three places... Signed-off-by: Al Viro <[email protected]>
2011-03-14 | merge path_init and path_init_rcu | Al Viro | 1 | -83/+35
Actual dependency on whether we want RCU or not is in 3 small areas (as it ought to be) and everything around those is the same in both versions. Since each function has only one caller and those callers are on two sides of if (flags & LOOKUP_RCU), it's easier and cleaner to merge them and pull the checks inside. Signed-off-by: Al Viro <[email protected]>
2011-03-14 | sanitize path_walk() mess | Al Viro | 1 | -92/+56
New helper: path_lookupat(). Basically, what do_path_lookup() boils down to modulo -ECHILD/-ESTALE handler. path_walk* family is gone; vfs_path_lookup() is using link_path_walk() directly, do_path_lookup() and do_filp_open() are using path_lookupat(). Signed-off-by: Al Viro <[email protected]>
2011-03-14 | take RCU-dependent stuff around exec_permission() into a new helper | Al Viro | 1 | -11/+14
Signed-off-by: Al Viro <[email protected]>
2011-03-14 | kill path_lookup() | Al Viro | 2 | -5/+4
all remaining callers pass LOOKUP_PARENT to it, so flags argument can die; renamed to kern_path_parent() Signed-off-by: Al Viro <[email protected]>
2011-03-14 | GFS2: Update to AIL list locking | Steven Whitehouse | 3 | -1/+5
The previous patch missed a couple of places where the AIL list needed locking, so this fixes up those places, plus a comment is corrected too. Signed-off-by: Steven Whitehouse <[email protected]> Cc: Dave Chinner <[email protected]>
2011-03-13 | compat breakage in preadv() and pwritev() | Al Viro | 1 | -2/+6
Fix for a dumb preadv()/pwritev() compat bug - unlike the native variants, the compat_... ones forget to check FMODE_P{READ,WRITE}, so e.g. on pipe the native preadv() will fail with -ESPIPE and compat one will act as readv() and succeed. Not critical, but it's a clear bug with trivial fix, so IMO it's OK for -final. Signed-off-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
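A toy model of the missing check may make the bug concrete. The names and flag values here are hypothetical stand-ins; only the behaviour (refuse positional reads with -ESPIPE on a file that does not allow them) follows the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mode bits standing in for FMODE_READ / FMODE_PREAD. */
enum { TOY_FMODE_READ = 1 << 0, TOY_FMODE_PREAD = 1 << 1 };

struct toy_file { unsigned int mode; };

/* Gate that both the native and the compat preadv() path should pass through
 * before doing any I/O. Returns 0 if a positional read may proceed. */
static int toy_preadv_check(const struct toy_file *f, long long pos)
{
    assert(f != NULL);
    if (pos < 0)
        return -EINVAL;
    if (!(f->mode & TOY_FMODE_PREAD))
        return -ESPIPE;   /* e.g. a pipe: no file position to read at */
    return 0;
}
```

Without the second test, a pipe would fall through to the plain readv() logic and succeed, which is the compat behaviour the commit describes.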
2011-03-13 | compat breakage in preadv() and pwritev() | Al Viro | 1 | -2/+6
Fix for a dumb preadv()/pwritev() compat bug - unlike the native variants, compat_... ones forget to check FMODE_P{READ,WRITE}, so e.g. on pipe the native preadv() will fail with -ESPIPE and compat one will act as readv() and succeed. Not critical, but it's a clear bug with trivial fix. Signed-off-by: Al Viro <[email protected]>
2011-03-13 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable | Linus Torvalds | 5 | -62/+135
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: break out of shrink_delalloc earlier
  btrfs: fix not enough reserved space
  btrfs: fix dip leak
  Btrfs: make sure not to return overlapping extents to fiemap
  Btrfs: deal with short returns from copy_from_user
  Btrfs: fix regressions in copy_from_user handling
2011-03-12 | Btrfs: break out of shrink_delalloc earlier | Chris Mason | 2 | -12/+32
Josef had changed shrink_delalloc to exit after three shrink attempts, which wasn't quite enough because new writers could race in and steal free space. But it also fixed deadlocks and stalls as we tried to recover delalloc reservations. The code was tweaked to loop 1024 times, and would reset the counter any time a small amount of progress was made. This was too drastic, and with a lot of writers we can end up stuck in shrink_delalloc forever. The shrink_delalloc loop is fairly complex because the caller is looping too, and the caller will go ahead and force a transaction commit to make sure we reclaim space. This reworks things to exit shrink_delalloc when we've forced some writeback and the delalloc reservations have gone down. This means the writeback has not just started but has also finished at least some of the metadata changes required to reclaim delalloc space. If we've got this wrong, we're returning ENOSPC too early, which is a big improvement over the current behavior of hanging the machine. Test 224 in xfstests hammers on this nicely, and with 1000 writers trying to fill a 1GB drive we get our first ENOSPC at 93% full. The other writers are able to continue until we get 100%. This is a worst case test for btrfs because the 1000 writers are doing small IO, and the small FS size means we don't have a lot of room for metadata chunks. Signed-off-by: Chris Mason <[email protected]>
2011-03-11 | xfs: don't name variables "panic" | Alex Elder | 1 | -4/+4
The new xfs_alert_tag() used a variable named "panic", and that is to be avoided. Rename it. Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2011-03-11 | Cleanup: Factor out some cut-and-paste code. | Rob Landley | 1 | -111/+44
Factor out some cut-and-paste code in options parsing. Saves about 800 bytes on x86-64. Signed-off-by: Rob Landley <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions. | Rob Landley | 1 | -96/+33
Eliminate two mostly duplicate functions (nfs_parse_simple_hostname() and nfs_parse_protected_hostname()) and instead just make the calling function (nfs_parse_devname()) do everything. Signed-off-by: Rob Landley <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFS: account direct-io into task io accounting | Konstantin Khlebnikov | 1 | -0/+5
Account NFS direct-io reads and writes into Task I/O Accounting. Do it before completion to handle aio. NFS has an unusual direct-io implementation, so accounting in generic code does not work. Signed-off-by: Konstantin Khlebnikov <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4: Send unmapped uid/gids to the server when using auth_sys | Trond Myklebust | 3 | -9/+46
The new behaviour is enabled using the new module parameter 'nfs4_disable_idmapping'. Note that if the server rejects an unmapped uid or gid, then the client will automatically switch back to using the idmapper. Signed-off-by: Trond Myklebust <[email protected]>
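The fallback can be pictured with a small userspace model. Everything named `toy_` is hypothetical, including the error constant and the callback shape; the sketch only shows the "try numeric ids, switch back to the idmapper on rejection, then retry" flow the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; TOY_ERR_BADOWNER stands in for a server
 * rejecting an unmapped uid or gid. */
enum { TOY_OK = 0, TOY_ERR_BADOWNER = -1 };

/* Per-server switch, modeling the effect of the module parameter. */
struct toy_server { bool send_numeric_ids; };

/* Sends one owner-changing request, either with numeric ids or mapped names. */
typedef int (*toy_send_fn)(bool numeric, void *ctx);

static int toy_set_owner(struct toy_server *srv, toy_send_fn send, void *ctx)
{
    int err;

    assert(srv != NULL && send != NULL);
    err = send(srv->send_numeric_ids, ctx);
    if (err == TOY_ERR_BADOWNER && srv->send_numeric_ids) {
        srv->send_numeric_ids = false;  /* fall back to the idmapper for good */
        err = send(false, ctx);         /* and retry the same request */
    }
    return err;
}

/* Demo server stub: rejects numeric ids, accepts mapped names, counts calls. */
static int toy_strict_server(bool numeric, void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return numeric ? TOY_ERR_BADOWNER : TOY_OK;
}
```

After the first rejection the switch stays off, so later requests go straight to the mapped form.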
2011-03-11 | NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr | Trond Myklebust | 2 | -2/+3
This will be required in order to switch uid/gid mapping back on if the admin has tried to disable it. Note that we also propagate NFS4ERR_BADNAME at the same time, in order to work around a Linux server bug. Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4: cleanup idmapper functions to take an nfs_server argument | Trond Myklebust | 2 | -22/+20
...instead of the nfs_client. Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4: Send unmapped uid/gids to the server if the idmapper fails | Trond Myklebust | 1 | -4/+26
Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4: If the server sends us a numeric uid/gid then accept it | Trond Myklebust | 1 | -2/+26
Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: reject zero layout with zeroed stripe unit | Benny Halevy | 1 | -2/+2
Allowing stripe_unit==0 causes the client to crash later on when dividing by zero. Reported-by: Marc Eshel <[email protected]> Signed-off-by: Benny Halevy <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
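To see why the check matters, consider a simplified striping calculation. The struct, the `num_ds` field, and the index formula are assumptions made for this sketch rather than the filelayout driver's actual code; the point is only that the layout is rejected before any division by `stripe_unit` can happen.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified file layout. */
struct toy_layout {
    uint32_t stripe_unit;  /* bytes per stripe on one data server */
    uint32_t num_ds;       /* number of data servers (assumed field) */
};

/* Reject layouts that would make the arithmetic below divide by zero. */
static int toy_check_layout(const struct toy_layout *l)
{
    assert(l != NULL);
    if (l->stripe_unit == 0 || l->num_ds == 0)
        return -EINVAL;
    return 0;
}

/* Which data server holds the byte at 'offset' (round-robin striping). */
static int toy_ds_index(const struct toy_layout *l, uint64_t offset, uint32_t *idx)
{
    int err;

    assert(idx != NULL);
    err = toy_check_layout(l);
    if (err)
        return err;
    *idx = (uint32_t)((offset / l->stripe_unit) % l->num_ds);
    return 0;
}
```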
2011-03-11 | NFSv4.1: Clear lseg pointer in ->doio function | Fred Isaman | 3 | -1/+4
Now that we have access to the pointer, clear it immediately after the put, instead of in caller. Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: rearrange ->doio args | Fred Isaman | 3 | -37/+43
This will make it possible to clear the lseg pointer in the same function as it is put, instead of in the caller nfs_pageio_doio(). Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: pnfs filelayout driver write | Fred Isaman | 4 | -2/+126
Allows the pnfs filelayout driver to write to the data servers. Note that COMMIT to data servers will be implemented in a future patch. To avoid improper behavior, for the moment any WRITE to a data server that would also require a COMMIT to the data server is sent NFS_FILE_SYNC. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Dean Hildebrand <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Mingyang Guo <[email protected]> Signed-off-by: Oleg Drokin <[email protected]> Signed-off-by: Ricardo Labiaga <[email protected]> Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Benny Halevy <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: remove GETATTR from ds writes | Fred Isaman | 2 | -4/+10
Any WRITE compound directed to a data server needs to have the GETATTR calls suppressed. Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: implement generic pnfs layer write switch | Andy Adamson | 4 | -0/+45
Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Boaz Harrosh <[email protected]> Signed-off-by: Dean Hildebrand <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]> Signed-off-by: Mike Sager <[email protected]> Signed-off-by: Ricardo Labiaga <[email protected]> Signed-off-by: Tao Guo <[email protected]> Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Benny Halevy <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: trigger LAYOUTGET for writes | Fred Isaman | 3 | -12/+49
Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: Send lseg down into nfs_write_rpcsetup | Fred Isaman | 1 | -2/+5
We grab the lseg sent in from the doio function and attach it to each struct nfs_write_data created. This is how the lseg will be sent to the layout driver. Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: add callback to nfs4_write_done | Fred Isaman | 1 | -4/+10
Add callback that pnfs layout driver can use to do its own handling of data server WRITE response. Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: rearrange nfs_write_rpcsetup | Andy Adamson | 1 | -36/+46
Reorder nfs_write_rpcsetup, preparing for a pnfs entry point. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: turn off pNFS on ds connection failure | Andy Adamson | 4 | -9/+36
If a data server is unavailable, go through MDS. Mark the deviceid containing the data server as a negative cache entry. Do not try to connect to any data server on a deviceid marked as a negative cache entry. Mark any layout that tries to use the marked deviceid as failed. Inodes with a layout marked as failed will not use the layout for I/O, and will not perform any more layoutgets. Inodes without a layout will still do layoutget, but the layout will get marked immediately. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
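The marking rules above can be sketched as a small routing decision. This is a userspace model under stated assumptions: the `toy_` structs, the connect callback, and the route enum are all hypothetical, and real deviceids hold more than one data server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical deviceid with a negative-cache marker. */
struct toy_deviceid { bool unavailable; };

/* Hypothetical layout referring to one deviceid. */
struct toy_pnfs_layout { struct toy_deviceid *dev; bool failed; };

enum toy_route { TOY_VIA_DS, TOY_VIA_MDS };

typedef bool (*toy_connect_fn)(struct toy_deviceid *dev);

/* Decide where I/O for this layout goes, updating the markers on failure. */
static enum toy_route toy_pick_route(struct toy_pnfs_layout *lo, toy_connect_fn try_connect)
{
    assert(lo != NULL && lo->dev != NULL && try_connect != NULL);
    if (lo->failed)
        return TOY_VIA_MDS;              /* failed layouts are never used again */
    /* A marked deviceid is never connected to: || short-circuits the call. */
    if (lo->dev->unavailable || !try_connect(lo->dev)) {
        lo->dev->unavailable = true;     /* negative cache entry */
        lo->failed = true;               /* layout that touched it is failed */
        return TOY_VIA_MDS;
    }
    return TOY_VIA_DS;
}

/* Demo connect stub: always fails and counts how often it was tried. */
static int toy_connect_attempts;

static bool toy_connect_always_fails(struct toy_deviceid *dev)
{
    (void)dev;
    toy_connect_attempts++;
    return false;
}
```

A second layout that points at the same marked deviceid is failed immediately, without another connection attempt.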
2011-03-11 | NFSv4.1 move deviceid cache to filelayout driver | Christoph Hellwig | 5 | -263/+92
No need for generic cache with only one user. Keep a simple hash of deviceids in the filelayout driver. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Andy Adamson <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: filelayout async error handler | Andy Adamson | 5 | -6/+113
Use our own async error handler. Mark the layout as failed and retry i/o through the MDS on specified errors. Update the mds_offset in nfs_readpage_retry so that a failed short-read retry to a DS gets correctly resent through the MDS. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: filelayout read | Andy Adamson | 5 | -2/+91
Attempt a pNFS file layout read by setting up the nfs_read_data struct and calling nfs_initiate_read with the data server rpc client and the filelayout rpc call ops. Error handling is implemented in a subsequent patch. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Dean Hildebrand <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Mingyang Guo <[email protected]> Signed-off-by: Oleg Drokin <[email protected]> Signed-off-by: Ricardo Labiaga <[email protected]> Tested-by: Guo Mingyang <[email protected]> Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Benny Halevy <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: filelayout i/o helpers | Fred Isaman | 3 | -0/+108
Prepare for filelayout_read_pagelist with helper functions that find the correct data server, filehandle, and offset. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Dean Hildebrand <[email protected]> Signed-off-by: Fred Isaman <[email protected]> Signed-off-by: Marc Eshel <[email protected]> Signed-off-by: Mike Sager <[email protected]> Signed-off-by: Oleg Drokin <[email protected]> Signed-off-by: Tao Guo <[email protected]> Signed-off-by: Tigran Mkrtchyan <[email protected]> Signed-off-by: Tigran Mkrtchyan <[email protected]> Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Benny Halevy <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2011-03-11 | NFSv4.1: data server connection | Andy Adamson | 5 | -2/+146
Introduce a data server set_client and init session following the nfs4_set_client and nfs4_init_session convention. Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state serializes access to creating an nfs_client struct with matching properties. Use the new nfs_get_client() that initializes new clients. Signed-off-by: Andy Adamson <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>