Age | Commit message | Author | Files | Lines |
|
all remaining callers are passing 0; some just obscure that fact.
Signed-off-by: Al Viro <[email protected]>
|
|
Now that no one is using rw, remove it completely.
Signed-off-by: Omar Sandoval <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
The rw parameter to direct_IO is redundant with iov_iter->type, and
treated slightly differently just about everywhere it's used: some users
do rw & WRITE, and others do rw == WRITE where they should be doing a
bitwise check. Simplify this with the new iov_iter_rw() helper, which
always returns either READ or WRITE.
Signed-off-by: Omar Sandoval <[email protected]>
Signed-off-by: Al Viro <[email protected]>
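
To illustrate why the two styles of check mentioned above disagree, here is a minimal userspace sketch. All DEMO_* names and bit values are hypothetical stand-ins, not the kernel's definitions; the only point is that an equality test breaks as soon as any other bit is set, while a masked helper in the spirit of iov_iter_rw() always yields one of two values.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's direction values and flag bits. */
#define DEMO_READ      0u
#define DEMO_WRITE     1u
#define DEMO_RW_MASK   0x1u   /* assumed: the direction lives in bit 0 */
#define DEMO_FLAG_SYNC 0x10u  /* assumed: some unrelated flag bit */

/* Fragile: only correct when no other bits are set in rw. */
static int is_write_by_equality(unsigned int rw)
{
    return rw == DEMO_WRITE;
}

/* Robust: mask off everything except the direction bit first. */
static int is_write_by_mask(unsigned int rw)
{
    return (rw & DEMO_RW_MASK) == DEMO_WRITE;
}

/* Helper in the spirit of iov_iter_rw(): the result is always
 * DEMO_READ or DEMO_WRITE, so callers can safely compare with ==. */
static unsigned int demo_iter_rw(unsigned int type)
{
    return type & DEMO_RW_MASK;
}
```

With a helper like this, callers no longer need to remember which comparison is safe: `demo_iter_rw(x) == DEMO_WRITE` is correct whether or not extra flag bits are present.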
|
|
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.
Signed-off-by: Al Viro <[email protected]>
|
|
|
|
Signed-off-by: Trond Myklebust <[email protected]>
|
|
The LAYOUTCOMMIT operation means different things to different layout types.
For blocks and objects, it is both a data and metadata consistency operation.
For files and flexfiles, it is only a metadata consistency operation.
This patch separates out the 2 cases, allowing the files/flexfiles layout
drivers to optimise away the data consistency calls to layoutcommit.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
We must not send a close or delegreturn that would result in a
return-on-close of the layout without ensuring that we've also
sent the necessary layoutcommit.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If the caller does not specify the O_SYNC flag, then it is legitimate
to return from O_DIRECT without doing a pNFS layoutcommit operation.
However if the file is opened O_DIRECT|O_SYNC then we'd better get it
right.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
We don't just want to sync out buffered writes, but also O_DIRECT ones.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
File unlock needs to update both data and metadata on the NFS server
in order to act as a synchronisation point for other clients.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Then apply it to nfs_setattr() and nfs_getattr().
Signed-off-by: Trond Myklebust <[email protected]>
|
|
pnfs_set_layoutcommit() and pnfs_commit_set_layoutcommit() are 100% identical
except for the function arguments. Refactor to eliminate the difference.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If the NFS_INO_LAYOUTCOMMIT flag was unset, then we _must_ ensure that
we also reset the last write byte (lwb) for that layout. The current
code depends on us clearing the lwb when we clear NFS_INO_LAYOUTCOMMIT,
which is not the case when we call pnfs_clear_layoutcommit().
Signed-off-by: Trond Myklebust <[email protected]>
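
The dependency described above can be sketched in userspace as follows. The struct, the flag value and the function names are hypothetical and only model the idea: if the flag can be cleared without the lwb being reset, then the next caller that sets the flag has to overwrite the lwb rather than only extend it.

```c
#include <assert.h>

#define DEMO_INO_LAYOUTCOMMIT 0x1u  /* hypothetical flag bit */

/* Hypothetical model of the per-layout state. */
struct demo_layout {
    unsigned int flags;
    unsigned long long lwb;  /* last write byte */
};

/* Record that a layoutcommit is needed up to end_pos. */
static void demo_set_layoutcommit(struct demo_layout *lo,
                                  unsigned long long end_pos)
{
    if (!(lo->flags & DEMO_INO_LAYOUTCOMMIT)) {
        /* Flag was unset: any old lwb is stale, so overwrite it. */
        lo->flags |= DEMO_INO_LAYOUTCOMMIT;
        lo->lwb = end_pos;
    } else if (end_pos > lo->lwb) {
        /* Flag already set: only ever extend the range. */
        lo->lwb = end_pos;
    }
}

/* Models a path that clears the flag but leaves lwb untouched. */
static void demo_clear_flag_only(struct demo_layout *lo)
{
    lo->flags &= ~DEMO_INO_LAYOUTCOMMIT;
}

/* Write to 4096, clear the flag, then write to 512. */
static unsigned long long demo_lwb_after_reuse(void)
{
    struct demo_layout lo = { 0, 0 };

    demo_set_layoutcommit(&lo, 4096);
    demo_clear_flag_only(&lo);
    demo_set_layoutcommit(&lo, 512);
    return lo.lwb;
}
```

Without the overwrite in the first branch, the second write would inherit the stale value 4096 from the earlier commit cycle.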
|
|
Minor optimisation for the case where the layout has return-on-close
enabled.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
I appear to have missed this when adding the ftrace probes.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Make it easier to grep for these functions by name.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
The spec says that once all layouts that reference a given deviceid
have been returned, then we are only allowed to continue to cache
the deviceid if the metadata server supports notifications.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
We are only allowed to cache deviceinfo if the server supports notifications
and actually promises to call us back when changes occur. Right now, we
request those notifications, but then we don't check the server's reply.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
There really is no reason to do so.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Use of synchronize_rcu() when unmounting and potentially freeing a lot
of deviceids is problematic. There really is no reason why we can't just
use kfree_rcu() here.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Kinglong Mee reports that asynchronous delegations are being killed
by the call to rpc_shutdown_client() when unmounting. This can lead
to state leakage on the server until the client lease expires.
Reported-by: Kinglong Mee <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Most callers in the kernel want to perform synchronous file I/O, but
still have to bloat the stack with a full struct kiocb. Split out
the parts needed in filesystem code from those in the aio code, and
only allocate those needed to pass down arguments on the stack. The
aio code embeds the generic iocb in the one it allocates and can
easily get back to it by using container_of.
Also add a ->ki_complete method to struct kiocb. This is used to call
into the aio code and thus removes the dependency on aio for filesystems
implementing asynchronous operations. It will also allow other callers
to substitute their own completion callback.
We also add a new ->ki_flags field to work around the nasty layering
violation recently introduced in commit 5e33f6 ("usb: gadget: ffs: add
eventfd notification about ffs events").
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Al Viro <[email protected]>
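
The embedding pattern described above can be shown with a small userspace sketch. The demo_* structures and fields are hypothetical and far smaller than the real struct kiocb; the sketch only shows how a generic iocb embedded in an aio-specific request lets the completion callback recover the outer request with a container_of-style macro, while the "filesystem" side never sees the aio type.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic iocb: only what filesystem code needs. */
struct demo_kiocb {
    long long pos;
    /* Completion hook; NULL models a purely synchronous request. */
    void (*complete)(struct demo_kiocb *iocb, long res);
    int flags;
};

/* Hypothetical aio-side request that embeds the generic iocb. */
struct demo_aio_kiocb {
    struct demo_kiocb common;
    long result;
    int done;
};

/* aio completion: recover the outer request from the embedded iocb. */
static void demo_aio_complete(struct demo_kiocb *iocb, long res)
{
    struct demo_aio_kiocb *req =
        demo_container_of(iocb, struct demo_aio_kiocb, common);

    req->result = res;
    req->done = 1;
}

/* "Filesystem" side: finishes an I/O without knowing about aio. */
static void demo_fs_finish_io(struct demo_kiocb *iocb, long res)
{
    if (iocb->complete)
        iocb->complete(iocb, res);
}

/* Submit a request and report the completed byte count, or -1. */
static long demo_submit(long nbytes)
{
    struct demo_aio_kiocb req = { { 0, demo_aio_complete, 0 }, 0, 0 };

    demo_fs_finish_io(&req.common, nbytes);
    return req.done ? req.result : -1;
}
```

Because the callback is just a function pointer on the generic structure, another caller could install its own completion routine in the same slot.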
|
|
This follows up "nfs: fix dio deadlock when O_DIRECT flag is flipped"
and removes the unnecessary CONFIG_NFS_SWAP switch.
Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
There is no need to pass the total request length in the kiocb, as
it is already passed in through the iov_iter argument.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Do so on the assumption that for most use cases, that list will turn into
a more or less LRU-ordered list, and so the list traversals in
nfs_client_return_marked_delegations() are likely to be shorter before
hitting a candidate to return.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
A remount that alters security flavors can appear to succeed when it should
instead return -EINVAL. Check to see if the current security flavor exists
within the flavors specified in the remount options, and if not fail the
remount.
Signed-off-by: Benjamin Coddington <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
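
A rough sketch of the check described above, in userspace C. The function name, the plain integer flavor values and the rule that an empty list means "no change requested" are all assumptions made for illustration; they are not taken from the NFS client code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Return 0 if the flavor currently in use appears among the flavors
 * given in the remount options, and -EINVAL otherwise.
 * Assumption: an empty list means the remount did not mention
 * security flavors at all, so there is nothing to reject.
 */
static int demo_check_remount_flavors(unsigned int current_flavor,
                                      const unsigned int *requested,
                                      size_t nrequested)
{
    size_t i;

    if (nrequested == 0)
        return 0;

    for (i = 0; i < nrequested; i++) {
        if (requested[i] == current_flavor)
            return 0;
    }
    return -EINVAL;
}
```

The important property is that a mismatch is reported to the caller instead of being silently ignored, which is what made the remount appear to succeed.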
|
|
The function nfs4_pnfs_v3_ds_connect_unload is exported so it can be
used by other modules, but is also marked '__exit' and will be
discarded when built into the kernel, as pointed out by this
linker error:
`nfs4_pnfs_v3_ds_connect_unload' referenced in section `___ksymtab_gpl+nfs4_pnfs_v3_ds_connect_unload' of fs/built-in.o: defined in discarded section `.exit.text' of fs/built-in.o
This removes the __exit annotation to make it safe to call this function.
Signed-off-by: Arnd Bergmann <[email protected]>
Fixes: 5f01d9539496 ("nfs41: create NFSv3 DS connection if specified")
Signed-off-by: Trond Myklebust <[email protected]>
|
|
The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r@
type T;
identifier f;
@@
static T f (...) { ... }
@@
identifier r.f;
declarer name EXPORT_SYMBOL_GPL;
@@
-EXPORT_SYMBOL_GPL(f);
// </smpl>
Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If the call to exchange-id returns with the EXCHGID4_FLAG_CONFIRMED_R flag
set, then that means our lease was established by a previous mount instance.
Ensure that we detect this situation, and that we clear the state held by
that mount.
Reported-by: Jorge Mora <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
We do not want to allow a race with another NFS mount to cause
nfs41_walk_client_list() to establish a lease on our nfs_client before
we're done checking for trunking.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
nfs_vm_page_mkwrite() should wait until the page cache invalidation
is finished. This is the second patch in a 2 patch series to deprecate
the NFS client's reliance on nfs_release_page() in the context of
nfs_invalidate_mapping().
Signed-off-by: Trond Myklebust <[email protected]>
|
|
When invalidating the page cache for a regular file, we want to first
sync all dirty data to disk and then call invalidate_inode_pages2().
The latter relies on nfs_launder_page() and nfs_release_page() to deal
respectively with dirty pages, and unstable written pages.
Commit 9590544694bec ("NFS: avoid deadlocks with loop-back mounted
NFS filesystems.") changed the behaviour of nfs_release_page(), which
made it possible for invalidate_inode_pages2() to fail with an EBUSY.
Unfortunately, that error is then propagated back to read().
Let's therefore work around the problem for now by protecting the call
to sync the data and invalidate_inode_pages2() so that they are atomic
w.r.t. the addition of new writes.
Later on, we can revisit whether or not we still need nfs_launder_page()
and nfs_release_page().
Signed-off-by: Trond Myklebust <[email protected]>
|
|
In nfs_client_return_marked_delegations() and nfs_delegation_reap_unclaimed()
we want to optimise the loop traversal by skipping delegations that are
already in the process of being returned.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
This patch ensures that the superblock doesn't go ahead and disappear
underneath us while the state manager thread is returning delegations.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Ensure that nfs_inode_set_delegation() doesn't inadvertently detach a
delegation that is already in the process of being returned.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Trond Myklebust <[email protected]>
|
|
After 566fcec60 the client uses the "current stateid" from the
nfs4_state structure to close a file. This could potentially contain a
delegation stateid, which is disallowed by the protocol and causes
servers to return NFS4ERR_BAD_STATEID. This patch restores the
(correct) behavior of sending the open stateid to close a file.
Reported-by: Olga Kornievskaia <[email protected]>
Fixes: 566fcec60 (NFSv4: Fix an atomicity problem in CLOSE)
Signed-off-by: Anna Schumaker <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
put_rpccred() can sleep.
Fixes: 8f649c3762547 ("NFSv4: Fix the locking in nfs_inode_reclaim_delegation()")
Cc: [email protected] # 2.6.35+
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If the server does not return a valid set of attributes that we can
use to either create a file or refresh the inode, then there is no
value in calling nfs_prime_dcache().
However if we're just refreshing the inode using the attributes that
the server returned, then it shouldn't matter whether or not we have
a filehandle, as long as we check the fsid+fileid combination.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
When we call readdirplus, set the fileid normally returned by readdir
as the mounted-on-fileid, since that is commonly the case if there is
a mountpoint. To ensure that we get it right, we only set the flag if
the readdir fileid differs from the one returned in the readdirplus
attributes.
This again means that we can avoid the issues described in commit
2ef47eb1aee17 ("NFS: Fix use of nfs_attr_use_mounted_on_fileid()"),
which only fixed NFSv4.
Signed-off-by: Trond Myklebust <[email protected]>
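
The rule described above can be sketched as follows. The structures, flag bits and function names are hypothetical; the sketch only models the decision: the fileid from the plain readdir part of the entry is recorded as the mounted-on fileid, and the corresponding flag is set, only when it differs from the fileid in the readdirplus attributes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "valid" bits for the attribute structure. */
#define DEMO_ATTR_FILEID            0x1u
#define DEMO_ATTR_MOUNTED_ON_FILEID 0x2u

/* Hypothetical model of the readdirplus attributes. */
struct demo_rd_fattr {
    unsigned int valid;
    uint64_t fileid;
    uint64_t mounted_on_fileid;
};

/* Hypothetical directory entry: readdir fileid plus attributes. */
struct demo_rd_entry {
    uint64_t ino;
    struct demo_rd_fattr fattr;
};

static void demo_fixup_mounted_on(struct demo_rd_entry *e)
{
    if (!(e->fattr.valid & DEMO_ATTR_FILEID))
        return;                     /* nothing to compare against */
    if (e->ino == e->fattr.fileid)
        return;                     /* same object: not a mountpoint */

    e->fattr.mounted_on_fileid = e->ino;
    e->fattr.valid |= DEMO_ATTR_MOUNTED_ON_FILEID;
}

/* Convenience wrapper: build an entry and return its valid mask. */
static unsigned int demo_valid_after_fixup(uint64_t readdir_ino,
                                           uint64_t attr_fileid)
{
    struct demo_rd_entry e = { readdir_ino,
                               { DEMO_ATTR_FILEID, attr_fileid, 0 } };

    demo_fixup_mounted_on(&e);
    return e.fattr.valid;
}
```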
|
|
If we're traversing a directory which contains a submounted filesystem,
or one that has a referral, the NFS server that is processing the READDIR
request will often return information for the underlying (mounted-on)
directory. It may, or may not, also return filehandle information.
If this happens, and the lookup in nfs_prime_dcache() returns the
dentry for the submounted directory, the filehandle comparison will
fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
("vfs: Lazily remove mounts on unlinked files and directories."), this
means the entire subtree is unmounted.
The following minimal patch addresses this problem by punting on
the invalidation if there is a submount.
Kudos to Neil Brown <[email protected]> for having tracked down this
issue (see link).
Reported-by: Nix <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 3.18+
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Ensure that we don't regress the changes that were made to the
directory.
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
|
|
nfs_post_op_update_inode() is called after a self-induced attribute
update. Ensure that it also sets the barrier.
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
|
|
Prior to this patch, we used to always OK attribute updates that extended
the file size on the assumption that we might be performing writeback.
Now that we have attribute barriers to protect the writeback related updates,
we should remove this hack, as it can cause truncate() operations to
apparently be reverted if/when a readahead or getattr RPC call races
with our on-the-wire SETATTR.
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
|
|
Ensure that other operations that race with delegreturn and layoutcommit
cannot revert the attribute updates that were made on the server.
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
|
|
Ensure that other operations that race with our write RPC calls
cannot revert the file size updates that were made on the server.
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
|
|
Ensure that we update the attribute barrier even if there were no
invalidations, provided that this value is newer than the old one.
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
|
|
Ensure that other operations which raced with our setattr RPC call
cannot revert the file attribute changes that were made on the server.
To do so, we artificially bump the attribute generation counter on
the inode so that all calls to nfs_fattr_init() that precede ours
will be dropped.
The motivation for the patch came from Chuck Lever's reports of readaheads
racing with truncate operations and causing the file size to be reverted.
Reported-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
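
The barrier idea described above can be modelled in a few lines of userspace C. Everything here is a simplified assumption: a single global counter, a hypothetical demo_gen_fattr that takes a generation stamp when it is initialised, and an inode that rejects any attribute update stamped before its barrier. It is a sketch of the ordering argument, not of the NFS client's actual data structures.

```c
#include <assert.h>

/* Hypothetical global attribute generation counter. */
static unsigned long demo_attr_gencount;

struct demo_gen_fattr {
    unsigned long gencount;  /* stamped when the RPC is prepared */
    long long size;
};

struct demo_gen_inode {
    unsigned long attr_gencount;  /* the barrier */
    long long size;
};

static unsigned long demo_inc_gencount(void)
{
    return ++demo_attr_gencount;
}

/* Called when an RPC that will return attributes is set up. */
static void demo_fattr_init(struct demo_gen_fattr *f)
{
    f->gencount = demo_inc_gencount();
    f->size = 0;
}

/* Apply a reply only if it was initialised after the barrier. */
static int demo_refresh_inode(struct demo_gen_inode *inode,
                              const struct demo_gen_fattr *f)
{
    if (f->gencount <= inode->attr_gencount)
        return 0;  /* stale: initialised before the barrier */
    inode->size = f->size;
    inode->attr_gencount = f->gencount;
    return 1;
}

/* Our own setattr completed: apply it and bump the barrier so that
 * every fattr initialised earlier is dropped. */
static void demo_setattr_done(struct demo_gen_inode *inode,
                              long long new_size)
{
    inode->size = new_size;
    inode->attr_gencount = demo_inc_gencount();
}

/* A getattr prepared before a truncate replies after it. */
static long long demo_truncate_race(void)
{
    struct demo_gen_inode inode = { 0, 8192 };
    struct demo_gen_fattr stale;

    demo_fattr_init(&stale);
    stale.size = 8192;                    /* pre-truncate size */
    demo_setattr_done(&inode, 0);         /* truncate to 0 */
    demo_refresh_inode(&inode, &stale);   /* late reply is ignored */
    return inode.size;
}
```

In this model the late reply carries an older generation stamp than the barrier set by the truncate, so the file size stays at 0 instead of reverting to 8192.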
|