Age | Commit message | Author | Files | Lines |
|
very similar to ext3 counterpart...
Signed-off-by: Al Viro <[email protected]>
|
|
For those file systems (btrfs/ext4/ocfs2/tmpfs) that support
SEEK_DATA/SEEK_HOLE, we end up duplicating the logic in
lseek_execute() that updates the current file offset to the
desired offset if it is valid. ceph does something similar
in ceph_llseek().
To reduce the duplication, this patch makes lseek_execute()
publicly accessible so that it can be called directly from the
underlying file systems.
Thanks to Dave Chinner for this suggestion.
[AV: call it vfs_setpos(), don't bring the removed 'inode' argument back]
v1->v2:
- Add kernel-doc comments for lseek_execute()
- Call lseek_execute() in ceph->llseek()
Signed-off-by: Jie Liu <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: Ben Myers <[email protected]>
Cc: Ted Tso <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Sage Weil <[email protected]>
Signed-off-by: Al Viro <[email protected]>
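A minimal userspace sketch of the "validate, then update the offset" step described above. It is not the kernel helper itself: struct file_sketch and setpos_sketch() are hypothetical stand-ins, the offset type is modelled as int64_t, and the special case for files that allow unsigned offsets is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one field of struct file used here. */
struct file_sketch {
    int64_t f_pos;   /* current file offset */
};

/*
 * Validate a candidate offset and, if it is acceptable, make it the
 * current position. Returns the new offset or -EINVAL.
 */
static int64_t setpos_sketch(struct file_sketch *file, int64_t offset,
                             int64_t maxsize)
{
    assert(file != NULL);

    if (offset < 0)
        return -EINVAL;   /* negative offsets are rejected in this sketch */
    if (offset > maxsize)
        return -EINVAL;   /* beyond the largest offset the caller allows */

    if (offset != file->f_pos)
        file->f_pos = offset;

    return offset;
}
```

A file system's SEEK_DATA/SEEK_HOLE handler would first compute the candidate offset from its own extent information and then hand it to a helper of this shape, rather than repeating the checks itself.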
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
There's no reason we have to protect the blocked_hash and file_lock_list
with the same spinlock. With the tests I have, breaking it in two gives
a barely measurable performance benefit, but it seems reasonable to make
this locking as granular as possible.
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Currently, the hash that the locking code uses to add requests
to the blocked_hash is calculated from the fl_owner field alone. That's
valid in most cases except for server-side lockd, which validates the
owner of a lock based on both fl_owner and fl_pid.
In the case where you have a small number of NFS clients doing a lot
of locking between different processes, you could end up with all
the blocked requests sitting in a very small number of hash buckets.
Add a new lm_owner_key operation to the lock_manager_operations that
will generate an unsigned long to use as the key in the hashtable.
That function is only implemented for server-side lockd, and simply
XORs the fl_owner and fl_pid.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: J. Bruce Fields <[email protected]>
Signed-off-by: Al Viro <[email protected]>
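A rough sketch of the key selection described above, written as plain userspace C. The struct layouts and the *_sketch names are assumptions made for illustration; only the idea (an optional lm_owner_key callback, with lockd XORing fl_owner and fl_pid, and fl_owner alone as the default) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct lock_sketch;

/* Hypothetical subset of the lock manager operations. */
struct lock_manager_ops_sketch {
    unsigned long (*lm_owner_key)(const struct lock_sketch *fl);
};

/* Hypothetical subset of a file lock. */
struct lock_sketch {
    const void *fl_owner;
    unsigned int fl_pid;
    const struct lock_manager_ops_sketch *fl_lmops;
};

/* Server-side lockd flavour: mix the owner pointer with the pid. */
static unsigned long lockd_owner_key_sketch(const struct lock_sketch *fl)
{
    return (unsigned long)(uintptr_t)fl->fl_owner ^ (unsigned long)fl->fl_pid;
}

/* Use the lock manager's key if it provides one, else fl_owner alone. */
static unsigned long owner_key_sketch(const struct lock_sketch *fl)
{
    assert(fl != NULL);

    if (fl->fl_lmops && fl->fl_lmops->lm_owner_key)
        return fl->fl_lmops->lm_owner_key(fl);
    return (unsigned long)(uintptr_t)fl->fl_owner;
}
```

With this shape, two requests from the same NFS client but different pids get different keys, while locks without a lock manager callback keep hashing on the owner only.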
|
|
Break up the blocked_list into a hashtable, using the fl_owner as a key.
This speeds up searching the hash chains, which is especially significant
for deadlock detection.
Note that the initial implementation assumes that hashing on fl_owner is
sufficient. In most cases it should be, with the notable exception being
server-side lockd, which compares ownership using a tuple of the
nlm_host and the pid sent in the lock request. So, this may degrade to a
single hash bucket when you only have a single NFS client. That will be
addressed in a later patch.
The careful observer may note that this patch leaves the file_lock_list
alone. There's much less of a case for turning the file_lock_list into a
hashtable. The only user of that list is the code that generates
/proc/locks, and it always walks the entire list.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: J. Bruce Fields <[email protected]>
Signed-off-by: Al Viro <[email protected]>
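To illustrate why hashing on fl_owner shortens the chains that deadlock detection has to walk, here is a small userspace sketch. The bucket count, the multiplicative hash constant, and all names ending in _sketch are assumptions for illustration, not the kernel's hashtable API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKED_HASH_BITS_SKETCH 7
#define BLOCKED_HASH_SIZE_SKETCH (1u << BLOCKED_HASH_BITS_SKETCH)

/* Hypothetical blocked request, chained within one bucket. */
struct blocked_req_sketch {
    const void *fl_owner;
    struct blocked_req_sketch *next;
};

static struct blocked_req_sketch *blocked_hash_sketch[BLOCKED_HASH_SIZE_SKETCH];

/* Map an owner pointer to a bucket with a simple multiplicative hash. */
static unsigned int bucket_of_sketch(const void *owner)
{
    uint64_t key = (uint64_t)(uintptr_t)owner;

    return (unsigned int)((key * 0x9E3779B97F4A7C15ULL)
                          >> (64 - BLOCKED_HASH_BITS_SKETCH));
}

/* Add a blocked request to the head of its owner's bucket. */
static void blocked_insert_sketch(struct blocked_req_sketch *req)
{
    assert(req != NULL);

    unsigned int b = bucket_of_sketch(req->fl_owner);

    req->next = blocked_hash_sketch[b];
    blocked_hash_sketch[b] = req;
}

/* Search only the one bucket that this owner can hash to. */
static struct blocked_req_sketch *blocked_find_sketch(const void *owner)
{
    struct blocked_req_sketch *r = blocked_hash_sketch[bucket_of_sketch(owner)];

    for (; r != NULL; r = r->next)
        if (r->fl_owner == owner)
            return r;
    return NULL;
}
```

The lookup touches a single chain instead of every blocked request. If every request carries the same fl_owner, as in the single NFS client case mentioned above, all entries still land in one bucket.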
|
|
Testing has shown that iterating over the blocked_list for deadlock
detection turns out to be a bottleneck. In order to alleviate that,
begin the process of turning it into a hashtable. We start by turning
the fl_link into a hlist_node and the global lists into hlists. A later
patch will do the conversion of the blocked_list to a hashtable.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: J. Bruce Fields <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Since we always hold the i_lock when inserting a new waiter onto the
fl_block list, we can avoid taking the global lock at all if we find
that it's empty when we go to wake up blocked waiters.
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
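The fast path described above can be modelled in a few lines. This is a single-threaded sketch only: the global lock is replaced by a counter so the control flow can be observed, the caller is simply assumed to hold the per-inode lock, and every name ending in _sketch is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter queued on a lock's fl_block list. */
struct waiter_sketch {
    struct waiter_sketch *next;
    int woken;
};

/* Hypothetical lock holding a singly linked list of blocked waiters. */
struct blocker_sketch {
    struct waiter_sketch *fl_block;
};

/* Stand-in for the global spinlock: only counts acquisitions. */
static unsigned int global_lock_acquisitions_sketch;

static void global_lock_sketch(void)   { global_lock_acquisitions_sketch++; }
static void global_unlock_sketch(void) { }

/*
 * Wake everything blocked on this lock. The caller is assumed to hold the
 * per-inode lock, so nobody can add a waiter while we look at the list.
 */
static void wake_up_blocks_sketch(struct blocker_sketch *blocker)
{
    assert(blocker != NULL);

    /* Fast path: nothing is waiting, so the global lock is never touched. */
    if (blocker->fl_block == NULL)
        return;

    global_lock_sketch();
    while (blocker->fl_block != NULL) {
        struct waiter_sketch *w = blocker->fl_block;

        blocker->fl_block = w->next;   /* dequeue the waiter */
        w->next = NULL;
        w->woken = 1;                  /* stands in for the real wakeup */
    }
    global_unlock_sketch();
}
```

The emptiness check is only safe because insertions onto the list always happen under the same per-inode lock that the caller already holds.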
|
|
Having a global lock that protects all of this code is a clear
scalability problem. Instead of doing that, move most of the code to be
protected by the i_lock instead. The exceptions are the global lists
that the ->fl_link sits on, and the ->fl_block list.
->fl_link is what connects these structures to the
global lists, so we must ensure that we hold those locks when iterating
over or updating these lists.
Furthermore, sound deadlock detection requires that we hold the
blocked_list state steady while checking for loops. We also must ensure
that the search and update to the list are atomic.
For the checking and insertion side of the blocked_list, push the
acquisition of the global lock into __posix_lock_file and ensure that
checking and update of the blocked_list is done without dropping the
lock in between.
On the removal side, when waking up blocked lock waiters, take the
global lock before walking the blocked list and dequeue the waiters from
the global list prior to removal from the fl_block list.
With this, deadlock detection should be race free while we minimize
excessive file_lock_lock thrashing.
Finally, in order to avoid a lock inversion problem when handling
/proc/locks output we must ensure that manipulations of the fl_block
list are also protected by the file_lock_lock.
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Move the fl_link list handling routines into a separate set of helpers.
Also ensure that locks and requests are always put on global lists
last (after fully initializing them) and are taken off before uninitializing
them.
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: J. Bruce Fields <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: J. Bruce Fields <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
commit 66189be74 (CIFS: Fix VFS lock usage for oplocked files) exported
the locks_delete_block symbol. However, there's already an exported helper
function that provides this capability, so make cifs use that
instead and turn locks_delete_block back into a static function.
Note that if fl->fl_next == NULL then this lock has already been through
locks_delete_block(), so we should be OK to ignore an ENOENT error here
and simply not retry the lock.
Cc: Pavel Shilovsky <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: J. Bruce Fields <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Instances either don't look at it at all (the majority of cases) or
only want it to find the superblock (which can be had as dentry->d_sb).
A few cases that want more are actually safe with dentry->d_inode -
the only precaution needed is a check that it hasn't been replaced with
NULL by rmdir() or by an overwriting rename(), in which case it should
simply be treated as a cache miss.
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
"chunk_size" is an unsigned int and "pos" is an unsigned long. The
"& ~(chunk_size-1)" operation clears the high 32 bits unintentionally.
The ALIGN() macro does the correct thing.
Signed-off-by: Dan Carpenter <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
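The truncation described above can be reproduced in userspace. This sketch assumes a round-up alignment, a 32-bit int, and models "unsigned long" as uint64_t and "unsigned int" as uint32_t. The function names are hypothetical, and the second function only imitates what an ALIGN()-style macro does by widening the alignment to the type of the value before building the mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Problematic form: ~(chunk_size - 1) is evaluated as a 32-bit value and
 * then zero-extended, so the upper 32 bits of the mask are all zero.
 */
static uint64_t align_up_narrow_mask(uint64_t pos, uint32_t chunk_size)
{
    assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);

    return (pos + (chunk_size - 1)) & ~(chunk_size - 1);
}

/*
 * Corrected form: widen the alignment to the type of pos first, so the
 * inverted mask keeps the high bits set.
 */
static uint64_t align_up_wide_mask(uint64_t pos, uint32_t chunk_size)
{
    assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);

    uint64_t mask = (uint64_t)chunk_size - 1;

    return (pos + mask) & ~mask;
}
```

For pos = 0x100000123 and chunk_size = 0x1000, the narrow-mask version yields 0x1000 because bit 32 is cleared, while the wide-mask version yields 0x100001000.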
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
... especially since there's no way to get that sucker
on the list fsnotify_fasync() works with - the only thing
adding to it is fsnotify_fasync() itself and it's never
called for fanotify files while they are opened.
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
the only remaining caller (in ncpfs) is guaranteed to return 0 -
we only hit it if we'd just checked that there's no dentry with
such name.
Signed-off-by: Al Viro <[email protected]>
|
|
the fallback it's using for dcache misses is actually the
same value we would've used for inumber anyway.
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
all instances always return ERR_PTR(-E...) or NULL, anyway
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
we'd just checked that child->d_inode is non-NULL, for fuck sake!
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|