Age | Commit message (Collapse) | Author | Files | Lines |
|
Instead of path_lookupat() doing trailing symlink resolution,
use the same scheme as on the O_CREAT side. Walk with
LOOKUP_PARENT, then (in do_last()) look the final component
up, then either open it or return error or, if it's a symlink,
give the symlink back to path_openat() to be resolved there.
The really messy complication here is RCU. We don't want to drop
out of RCU mode before the final lookup, since we don't want to
bounce parent directory ->d_count without a good reason.
Result is _not_ pretty; later in the series we'll clean it up.
For now we are roughly back where we'd been before the revert
done by Nick's series - top-level logics of path_openat() is
cleaned up, do_last() does actual opening, symlink resolution is
done uniformly.
Signed-off-by: Al Viro <[email protected]>
|
|
Don't stash the struct file * used as starting point of walk in nameidata;
pass file ** to path_init() instead.
Signed-off-by: Al Viro <[email protected]>
|
|
New helper: terminate_walk(). An error has happened during pathname
resolution and we either drop nd->path or terminate RCU, depending
the mode we had been in. After that, nd is essentially empty.
Switch link_path_walk() to using that for cleanup.
Now the top-level logics in link_path_walk() is back to sanity. RCU
dependencies are in the lower-level functions.
Signed-off-by: Al Viro <[email protected]>
|
|
Now we have do_follow_link() guaranteed to leave without dangling RCU
and the next step will get LOOKUP_RCU logics completely out of
link_path_walk().
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
getting LOOKUP_RCU checks out of link_path_walk()...
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
new helper: path_openat(). Does what do_filp_open() does, except
that it tries only the walk mode (RCU/normal/force revalidation)
it had been told to.
Both create and non-create branches are using path_lookupat() now.
Fixed the double audit_inode() in non-create branch.
Signed-off-by: Al Viro <[email protected]>
|
|
take calculation of open_flags by open(2) arguments into new helper
in fs/open.c, move filp_open() over there, have it and do_sys_open()
use that helper, switch exec.c callers of do_filp_open() to explicit
(and constant) struct open_flags.
Signed-off-by: Al Viro <[email protected]>
|
|
No point messing with passing shitloads of "operation mode" arguments
to do_open() one by one, especially since they are not going to change
during do_filp_open(). Collect them into a struct, fill it and pass
to do_last() by reference.
Make sure that lookup intent flags are correctly set and removed - we
want them for do_last(), but they make no sense for __do_follow_link().
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
instead of ad-hackery around need_reval_dot(), do the following:
set a flag (LOOKUP_JUMPED) in the beginning of path, on absolute
symlink traversal, on ".." and on procfs-style symlinks. Clear on
normal components, leave unchanged on ".". Non-nested callers of
link_path_walk() call handle_reval_path(), which checks that flag
is set and that fs does want the final revalidate thing, then does
->d_revalidate(). In link_path_walk() all the return_reval stuff
is gone.
Signed-off-by: Al Viro <[email protected]>
|
|
no need to do it in three places...
Signed-off-by: Al Viro <[email protected]>
|
|
Actual dependency on whether we want RCU or not is in 3 small areas
(as it ought to be) and everything around those is the same in both
versions. Since each function has only one caller and those callers
are on two sides of if (flags & LOOKUP_RCU), it's easier and cleaner
to merge them and pull the checks inside.
Signed-off-by: Al Viro <[email protected]>
|
|
New helper: path_lookupat(). Basically, what do_path_lookup() boils to
modulo -ECHILD/-ESTALE handler. path_walk* family is gone; vfs_path_lookup()
is using link_path_walk() directly, do_path_lookup() and do_filp_open()
are using path_lookupat().
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
all remaining callers pass LOOKUP_PARENT to it, so
flags argument can die; renamed to kern_path_parent()
Signed-off-by: Al Viro <[email protected]>
|
|
The previous patch missed a couple of places where the AIL list
needed locking, so this fixes up those places, plus a comment
is corrected too.
Signed-off-by: Steven Whitehouse <[email protected]>
Cc: Dave Chinner <[email protected]>
|
|
Fix for a dumb preadv()/pwritev() compat bug - unlike the native
variants, the compat_... ones forget to check FMODE_P{READ,WRITE}, so
e.g. on pipe the native preadv() will fail with -ESPIPE and compat one
will act as readv() and succeed.
Not critical, but it's a clear bug with trivial fix, so IMO it's OK for
-final.
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix for a dumb preadv()/pwritev() compat bug - unlike the native
variants, compat_... ones forget to check FMODE_P{READ,WRITE}, so e.g.
on pipe the native preadv() will fail with -ESPIPE and compat one will
act as readv() and succeed. Not critical, but it's a clear bug with trivial
fix.
Signed-off-by: Al Viro <[email protected]>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: break out of shrink_delalloc earlier
btrfs: fix not enough reserved space
btrfs: fix dip leak
Btrfs: make sure not to return overlapping extents to fiemap
Btrfs: deal with short returns from copy_from_user
Btrfs: fix regressions in copy_from_user handling
|
|
Josef had changed shrink_delalloc to exit after three shrink
attempts, which wasn't quite enough because new writers could
race in and steal free space.
But it also fixed deadlocks and stalls as we tried to recover
delalloc reservations. The code was tweaked to loop 1024
times, and would reset the counter any time a small amount
of progress was made. This was too drastic, and with a
lot of writers we can end up stuck in shrink_delalloc forever.
The shrink_delalloc loop is fairly complex because the caller is looping
too, and the caller will go ahead and force a transaction commit to make
sure we reclaim space.
This reworks things to exit shrink_delalloc when we've forced some
writeback and the delalloc reservations have gone down. This means
the writeback has not just started but has also finished at
least some of the metadata changes required to reclaim delalloc
space.
If we've got this wrong, we're returning ENOSPC too early, which
is a big improvement over the current behavior of hanging the machine.
Test 224 in xfstests hammers on this nicely, and with 1000 writers
trying to fill a 1GB drive we get our first ENOSPC at 93% full. The
other writers are able to continue until we get 100%.
This is a worst case test for btrfs because the 1000 writers are doing
small IO, and the small FS size means we don't have a lot of room
for metadata chunks.
Signed-off-by: Chris Mason <[email protected]>
|
|
The new xfs_alert_tag() used a variable named "panic",
and that is to be avoided. Rename it.
Signed-off-by: Alex Elder <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
|
|
Factor out some cut-and-paste code in options parsing.
Saves about 800 bytes on x86-64.
Signed-off-by: Rob Landley <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Eliminate two mostly duplicate functions (nfs_parse_simple_hostname()
and nfs_parse_protected_hostname()) and instead just make the calling
function (nfs_parse_devname()) do everything.
Signed-off-by: Rob Landley <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Account NFS direct-io reads and writes into Task I/O Accounting.
Do it before complition to handle aio.
NFS have unusual direct-io implementation,
thus accounting in generic code does not work.
Signed-off-by: Konstantin Khlebnikov <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
The new behaviour is enabled using the new module parameter
'nfs4_disable_idmapping'.
Note that if the server rejects an unmapped uid or gid, then
the client will automatically switch back to using the idmapper.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
This will be required in order to switch uid/gid mapping back on if the
admin has tried to disable it.
Note that we also propagate NFS4ERR_BADNAME at the same time, in order to
work around a Linux server bug.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
...instead of the nfs_client.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Allowing stripe_unit==0 causes the client to crash later on
when dividing by zero.
Reported-by: Marc Eshel <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Now that we have access to the pointer, clear it immediately after
the put, instead of in caller.
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
This will make it possible to clear the lseg pointer in the same
function as it is put, instead of in the caller nfs_pageio_doio().
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Allows the pnfs filelayout driver to write to the data servers.
Note that COMMIT to data servers will be implemented in a future
patch. To avoid improper behavior, for the moment any WRITE to a data
server that would also require a COMMIT to the data server is sent
NFS_FILE_SYNC.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Dean Hildebrand <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Mingyang Guo <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Any WRITE compound directed to a data server needs to have the
GETATTR calls suppressed.
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Boaz Harrosh <[email protected]>
Signed-off-by: Dean Hildebrand <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Mike Sager <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Signed-off-by: Tao Guo <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
We grab the lseg sent in from the doio function and attach it to
each struct nfs_write_data created. This is how the lseg will be
sent to the layout driver.
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Add callback that pnfs layout driver can use to do its own handling
of data server WRITE response.
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Reorder nfs_write_rpcsetup, preparing for a pnfs entry point.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If a data server is unavailable, go through MDS.
Mark the deviceid containing the data server as a negative cache entry.
Do not try to connect to any data server on a deviceid marked as a negative
cache entry. Mark any layout that tries to use the marked deviceid as failed.
Inodes with a layout marked as fails will not use the layout for I/O, and will
not perform any more layoutgets.
Inodes without a layout will still do layoutget, but the layout will get
marked immediately.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
No need for generic cache with only one user.
Keep a simple hash of deviceids in the filelayout driver.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Andy Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Use our own async error handler.
Mark the layout as failed and retry i/o through the MDS on specified errors.
Update the mds_offset in nfs_readpage_retry so that a failed short-read retry
to a DS gets correctly resent through the MDS.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Attempt a pNFS file layout read by setting up the nfs_read_data struct and
calling nfs_initiate_read with the data server rpc client and the
filelayout rpc call ops.
Error handling is implemented in a subsequent patch.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Dean Hildebrand <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Mingyang Guo <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
Signed-off-by: Ricardo Labiaga <[email protected]>
Tested-by: Guo Mingyang <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Prepare for filelayout_read_pagelist with helper functions that find the correct
data server, filehandle, and offset.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Dean Hildebrand <[email protected]>
Signed-off-by: Fred Isaman <[email protected]>
Signed-off-by: Marc Eshel <[email protected]>
Signed-off-by: Mike Sager <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
Signed-off-by: Tao Guo <[email protected]>
Signed-off-by: Tigran Mkrtchyan <[email protected]>
Signed-off-by: Tigran Mkrtchyan <[email protected]>
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Introduce a data server set_client and init session following the
nfs4_set_client and nfs4_init_session convention.
Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state
serializes access to creating an nfs_client struct with matching properties.
Use the new nfs_get_client() that initializes new clients.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|