path: root/fs
Age | Commit message | Author | Files | Lines
2010-09-07 | fuse: flush background queue on connection close | Miklos Szeredi | 1 | -4/+12
David Bartley reported that fuse can hang in fuse_get_req_nofail() when the connection to the filesystem server is no longer active. If bg_queue is not empty then flush_bg_queue() called from request_end() can put more requests onto the pending queue. If this happens while ending requests on the processing queue then those background requests will be queued to the pending list and never ended. Another problem is that fuse_dev_release() didn't wake up processes sleeping on blocked_waitq. Solve this by: a) flushing the background queue before calling end_requests() on the pending and processing queues; b) setting blocked = 0 and waking up processes waiting on blocked_waitq. Thanks to David for an excellent bug report. Reported-by: David Bartley <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]> CC: [email protected]
2010-09-03 | sysfs: checking for NULL instead of ERR_PTR | Dan Carpenter | 1 | -1/+1
d_path() returns an ERR_PTR and it doesn't return NULL. Signed-off-by: Dan Carpenter <[email protected]> Cc: stable <[email protected]> Reviewed-by: "Eric W. Biederman" <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
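The ERR_PTR convention behind this fix can be sketched in userspace C. This is a minimal illustration, not the sysfs code: the `ERR_PTR`, `PTR_ERR` and `IS_ERR` helpers below are simplified stand-ins for the kernel macros, and `lookup_path()` and `path_status()` are hypothetical names introduced only for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel helpers (illustrative only). */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error codes occupy the last MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup that, like d_path(), reports failure via ERR_PTR. */
static char *lookup_path(char *buf, size_t len)
{
    if (len < 2)
        return ERR_PTR(-ENAMETOOLONG);
    buf[0] = '/';
    buf[1] = '\0';
    return buf;
}

/* Returns 0 on success or a negative errno. A "p == NULL" test here
 * would never fire, because the failure value is not NULL. */
static int path_status(char *buf, size_t len)
{
    char *p = lookup_path(buf, len);

    if (IS_ERR(p))
        return (int)PTR_ERR(p);
    return 0;
}
```

A caller that only checks for NULL would treat the encoded error as a valid pointer and dereference it, which is the class of bug this commit and the later ceph and 9p ERR_PTR commits address.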
2010-09-03 | Merge branch '2.6.36-xfs-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev | Alex Elder | 3 | -11/+11
2010-09-03 | xfs: Make fiemap work with sparse files | Tao Ma | 3 | -3/+17
In xfs_vn_fiemap, we set bmv_count to fi_extent_max + 1 and want to return fi_extent_max extents, but actually it won't work for a sparse file. The reason is that in xfs_getbmap we will calculate holes and set them in 'out', while 'out' is allocated with bmv_count (fi_extent_max + 1) entries, which doesn't account for holes. So in the worst case, if the 'out' vector looks like [hole, extent, hole, extent, hole, ... hole, extent, hole], we will only return half of fi_extent_max extents. This patch adds a new flag BMV_IF_NO_HOLES for bmv_iflags. With this flag set, we don't use 'out' in xfs_getbmap for a hole. The solution is a bit ugly: it simply doesn't increase the index of the 'out' vector for a hole. I felt that it is not easy to skip it at the very beginning since we have the complicated check and some functions like xfs_getbmapx_fix_eof_hole to adjust 'out'. Cc: Dave Chinner <[email protected]> Signed-off-by: Tao Ma <[email protected]> Signed-off-by: Alex Elder <[email protected]>
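The worst case described in this message can be checked with a toy model. The sketch below is not XFS code: `extents_returned()` is a hypothetical function that walks an alternating hole/extent mapping and counts how many real extents fit into a fixed number of output slots, with and without holes consuming a slot.

```c
/* Toy model: the mapping alternates hole, extent, hole, extent, ...
 * `slots` is the size of the output vector; `skip_holes` models the
 * effect of a flag such as BMV_IF_NO_HOLES (holes take no slot). */
static unsigned extents_returned(unsigned slots, int skip_holes)
{
    unsigned used = 0;     /* output slots consumed so far */
    unsigned extents = 0;  /* real extents reported */
    int is_hole = 1;       /* the mapping starts with a hole */

    while (used < slots) {
        if (is_hole) {
            if (!skip_holes)
                used++;    /* a hole occupies a slot */
        } else {
            used++;
            extents++;
        }
        is_hole = !is_hole;
    }
    return extents;
}
```

With 8 slots the model reports 4 extents when holes occupy slots and 8 when they are skipped, matching the "only half" observation above.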
2010-09-03 | xfs: prevent 32bit overflow in space reservation | Dave Chinner | 1 | -3/+10
If we attempt to preallocate more than 2^32 blocks of space in a single syscall, the transaction block reservation will overflow, leading to hangs in the superblock block accounting code. This is trivially reproduced with xfs_io. Fix the problem by capping the allocation reservation to the maximum number of blocks a single xfs_bmapi() call can allocate (2^21 blocks). Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
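The overflow and the cap can be illustrated in plain C. This is a sketch under assumptions: the function names and the `MAX_BLOCKS_PER_CALL` constant are hypothetical, and only the 2^21 figure comes from the commit message.

```c
#include <stdint.h>

/* Per-call allocation limit quoted in the commit message (2^21 blocks).
 * The macro name is made up for this example. */
#define MAX_BLOCKS_PER_CALL (UINT64_C(1) << 21)

/* Storing a 64-bit block count in a 32-bit reservation silently
 * drops the high bits. */
static uint32_t reserve_unchecked(uint64_t blocks)
{
    return (uint32_t)blocks;
}

/* Capping first keeps the value well inside the 32-bit range. */
static uint32_t reserve_capped(uint64_t blocks)
{
    if (blocks > MAX_BLOCKS_PER_CALL)
        blocks = MAX_BLOCKS_PER_CALL;
    return (uint32_t)blocks;
}
```

A request for 2^32 + 5 blocks truncates to 5 in the unchecked version, while the capped version reserves 2^21 blocks and leaves the remainder to later iterations.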
2010-09-02 | nfsd4: mask out non-access bits in nfs4_access_to_omode | J. Bruce Fields | 1 | -1/+1
This fixes an unnecessary BUG(). Signed-off-by: J. Bruce Fields <[email protected]>
2010-09-02 | xfs: Disallow 32bit project quota id | Arkadiusz Miśkiewicz | 1 | -0/+7
Currently the on-disk structure is able to keep only a 16bit project quota id, so disallow 32bit ones. This fixes a problem where parts of kernel structures holding the project quota id are 32bit while parts (on-disk) are 16bit variables, which causes project quota member files to be inaccessible for some operations (like mv/rm). Signed-off-by: Arkadiusz Miśkiewicz <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Alex Elder <[email protected]>
2010-09-02 | xfs: improve buffer cache hash scalability | Dave Chinner | 2 | -8/+1
When doing large parallel file creates on a 16p machine, a large amount of time is being spent in _xfs_buf_find(). A system wide profile with perf top shows this:

  1134740.00 19.3% _xfs_buf_find
   733142.00 12.5% __ticket_spin_lock

The problem is that the hash contains 45,000 buffers, and the hash table width is only 256 buffers. That means we've got around 200 buffers per chain, and searching it is quite expensive. The hash table size needs to increase.

Secondly, every time we do a lookup, we promote the buffer we find to the head of the hash chain. This is causing cachelines to be dirtied and causes invalidation of cachelines across all CPUs that may have walked the hash chain recently. Hence every walk of the hash chain is effectively a cold cache walk. Remove the promotion to avoid this invalidation.

The results are:

  1045043.00 21.2% __ticket_spin_lock
   326184.00  6.6% _xfs_buf_find

A 70% drop in the CPU usage when looking up buffers. Unfortunately that does not result in an increase in performance under this workload as contention on the inode_lock soaks up most of the reduction in CPU usage.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-30 | 9p: potential ERR_PTR() dereference | Dan Carpenter | 1 | -1/+2
p9_client_walk() can return error values if we run out of space or there is a problem with the network. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2010-08-30 | nilfs2: fix leak of shadow dat inode in error path of load_nilfs | Ryusuke Konishi | 1 | -0/+1
If load_nilfs() gets an error while doing recovery, it will fail to free the shadow inode of dat (nilfs->ns_gc_dat). This fixes the leak issue. Signed-off-by: Ryusuke Konishi <[email protected]>
2010-08-28 | Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify | Linus Torvalds | 3 | -36/+64
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
  fsnotify: drop two useless bools in the fnsotify main loop
  fsnotify: fix list walk order
  fanotify: Return EPERM when a process is not privileged
  fanotify: resize pid and reorder structure
  fanotify: drop duplicate pr_debug statement
  fanotify: flush outstanding perm requests on group destroy
  fsnotify: fix ignored mask handling between inode and vfsmount marks
  fanotify: add MAINTAINERS entry
  fsnotify: reset used_inode and used_vfsmount on each pass
  fanotify: do not dereference inode_mark when it is unset
2010-08-28 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6 | Linus Torvalds | 6 | -12/+30
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
  eCryptfs: Fix encrypted file name lookup regression
  ecryptfs: properly mark init functions
  fs/ecryptfs: Return -ENOMEM on memory allocation failure
2010-08-28 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client | Linus Torvalds | 13 | -107/+184
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix get_ticket_handler() error handling
  ceph: don't BUG on ENOMEM during mds reconnect
  ceph: ceph_mdsc_build_path() returns an ERR_PTR
  ceph: Fix warnings
  ceph: ceph_get_inode() returns an ERR_PTR
  ceph: initialize fields on new dentry_infos
  ceph: maintain i_head_snapc when any caps are dirty, not just for data
  ceph: fix osd request lru adjustment when sending request
  ceph: don't improperly set dir complete when holding EXCL cap
  mm: exporting account_page_dirty
  ceph: direct requests in snapped namespace based on nonsnap parent
  ceph: queue cap snap writeback for realm children on snap update
  ceph: include dirty xattrs state in snapped caps
  ceph: fix xattr cap writeback
  ceph: fix multiple mds session shutdown
2010-08-28 | Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux | Linus Torvalds | 3 | -24/+30
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix NULL dereference in nfsd_statfs()
  nfsd4: fix downgrade/lock logic
  nfsd4: typo fix in find_any_file
  nfsd4: bad BUG() in preprocess_stateid_op
2010-08-28 | writeback: Fix lost wake-up shutting down writeback thread | J. Bruce Fields | 1 | -1/+1
Setting the task state here may cause us to miss the wake up from kthread_stop(), so we need to recheck kthread_should_stop() or risk sleeping forever in the following schedule().

Symptom was an indefinite hang on an NFSv4 mount. (NFSv4 may create multiple mounts in a temporary namespace while traversing the mount path, and since the temporary namespace is immediately destroyed, it may end up destroying a mount very soon after it was created, possibly making this race more likely.)

INFO: task mount.nfs4:4314 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount.nfs4 D 0000000000000000 2880 4314 4313 0x00000000
 ffff88001ed6da28 0000000000000046 ffff88001ed6dfd8 ffff88001ed6dfd8
 ffff88001ed6c000 ffff88001ed6c000 ffff88001ed6c000 ffff88001e5003a0
 ffff88001ed6dfd8 ffff88001e5003a8 ffff88001ed6c000 ffff88001ed6dfd8
Call Trace:
 [<ffffffff8196090d>] schedule_timeout+0x1cd/0x2e0
 [<ffffffff8106a31c>] ? mark_held_locks+0x6c/0xa0
 [<ffffffff819639a0>] ? _raw_spin_unlock_irq+0x30/0x60
 [<ffffffff8106a5fd>] ? trace_hardirqs_on_caller+0x14d/0x190
 [<ffffffff819671fe>] ? sub_preempt_count+0xe/0xd0
 [<ffffffff8195fc80>] wait_for_common+0x120/0x190
 [<ffffffff81033c70>] ? default_wake_function+0x0/0x20
 [<ffffffff8195fdcd>] wait_for_completion+0x1d/0x20
 [<ffffffff810595fa>] kthread_stop+0x4a/0x150
 [<ffffffff81061a60>] ? thaw_process+0x70/0x80
 [<ffffffff810cc68a>] bdi_unregister+0x10a/0x1a0
 [<ffffffff81229dc9>] nfs_put_super+0x19/0x20
 [<ffffffff810ee8c4>] generic_shutdown_super+0x54/0xe0
 [<ffffffff810ee9b6>] kill_anon_super+0x16/0x60
 [<ffffffff8122d3b9>] nfs4_kill_super+0x39/0x90
 [<ffffffff810eda45>] deactivate_locked_super+0x45/0x60
 [<ffffffff810edfb9>] deactivate_super+0x49/0x70
 [<ffffffff81108294>] mntput_no_expire+0x84/0xe0
 [<ffffffff811084ef>] release_mounts+0x9f/0xc0
 [<ffffffff81108575>] put_mnt_ns+0x65/0x80
 [<ffffffff8122cc56>] nfs_follow_remote_path+0x1e6/0x420
 [<ffffffff8122cfbf>] nfs4_try_mount+0x6f/0xd0
 [<ffffffff8122d0c2>] nfs4_get_sb+0xa2/0x360
 [<ffffffff810edcb8>] vfs_kern_mount+0x88/0x1f0
 [<ffffffff810ede92>] do_kern_mount+0x52/0x130
 [<ffffffff81963d9a>] ? _lock_kernel+0x6a/0x170
 [<ffffffff81108e9e>] do_mount+0x26e/0x7f0
 [<ffffffff81106b3a>] ? copy_mount_options+0xea/0x190
 [<ffffffff811094b8>] sys_mount+0x98/0xf0
 [<ffffffff810024d8>] system_call_fastpath+0x16/0x1b
1 lock held by mount.nfs4/4314:
 #0: (&type->s_umount_key#24){+.+...}, at: [<ffffffff810edfb1>] deactivate_super+0x41/0x70

Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Acked-by: Artem Bityutskiy <[email protected]>
2010-08-27 | fsnotify: drop two useless bools in the fnsotify main loop | Eric Paris | 1 | -8/+5
The fsnotify main loop has 2 bools which indicated if we processed the inode or vfsmount mark in that particular pass through the loop. These bools can be replaced with the inode_group and vfsmount_group variables, which actually makes the code a little easier to understand. Signed-off-by: Eric Paris <[email protected]>
2010-08-27 | fsnotify: fix list walk order | Eric Paris | 1 | -6/+5
Marks were stored on the inode and vfsmount mark lists in order from highest memory address to lowest memory address. The code to walk those lists thought they were in order from lowest to highest, with unpredictable results when trying to match up marks from each. It was possible that extra events would be sent to userspace when inode marks ignoring events wouldn't get matched with the vfsmount marks. This problem only affected fanotify when using both vfsmount and inode marks simultaneously. Signed-off-by: Eric Paris <[email protected]>
2010-08-27 | fanotify: Return EPERM when a process is not privileged | Andreas Gruenbacher | 1 | -1/+1
The appropriate error code when privileged operations are denied is EPERM, not EACCES. Signed-off-by: Andreas Gruenbacher <[email protected]> Signed-off-by: Eric Paris <[email protected]>
2010-08-27 | eCryptfs: Fix encrypted file name lookup regression | Tyler Hicks | 2 | -8/+24
Fixes a regression caused by 21edad32205e97dc7ccb81a85234c77e760364c8. When file name encryption was enabled, ecryptfs_lookup() failed to use the encrypted and encoded version of the upper, plaintext, file name when performing a lookup in the lower file system. This made it impossible to look up existing encrypted file names, and any newly created files would have plaintext file names in the lower file system. https://bugs.launchpad.net/ecryptfs/+bug/623087 Signed-off-by: Tyler Hicks <[email protected]>
2010-08-27 | ecryptfs: properly mark init functions | Jerome Marchand | 4 | -4/+4
Some ecryptfs init functions are not prefixed by __init and thus are not freed after initialization. This patch saves about 1kB in the ecryptfs module. Signed-off-by: Jerome Marchand <[email protected]> Signed-off-by: Tyler Hicks <[email protected]>
2010-08-27 | fs/ecryptfs: Return -ENOMEM on memory allocation failure | Julia Lawall | 1 | -0/+2
In this code, 0 is returned on memory allocation failure, even though other failures return -ENOMEM or other similar values.

A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression ret;
expression x,e1,e2,e3;
@@

ret = 0
... when != ret = e1
*x = \(kmalloc\|kcalloc\|kzalloc\)(...)
... when != ret = e2
if (x == NULL) { ... when != ret = e3
  return ret;
}
// </smpl>

Signed-off-by: Julia Lawall <[email protected]>
Signed-off-by: Tyler Hicks <[email protected]>
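The bug pattern the semantic match looks for can be shown in a few lines of userspace C. This is a minimal sketch, not the eCryptfs code: `setup_buffer()` and its `fail_alloc` switch are hypothetical and exist only to make the failure path easy to exercise.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical setup routine. `fail_alloc` simulates an allocation
 * failure so the error path can be tested deterministically. */
static int setup_buffer(size_t len, int fail_alloc, char **result)
{
    int ret = 0;
    char *buf = fail_alloc ? NULL : malloc(len);

    if (buf == NULL) {
        /* Without this assignment the function would fall through
         * to "return ret" with ret still 0, reporting success. */
        ret = -ENOMEM;
        goto done;
    }
    *result = buf;
done:
    return ret;
}
```

The pattern is easy to miss in review because `ret` is initialised to 0 far above the allocation, which is why a mechanical search such as the semantic match above is useful.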
2010-08-26 | nfsd: fix NULL dereference in nfsd_statfs() | Takashi Iwai | 1 | -6/+8
The commit ebabe9a9001af0af56c0c2780ca1576246e7a74b ("pass a struct path to vfs_statfs") introduced the struct path initialization, and this seems to trigger an Oops on my machine. The fh_dentry field may be NULL and set later in fh_verify(), thus the initialization of path must be after fh_verify(). Signed-off-by: Takashi Iwai <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2010-08-26 | Merge commit 'v2.6.36-rc1' into HEAD | J. Bruce Fields | 587 | -14699/+18087
2010-08-26 | nfsd4: fix downgrade/lock logic | J. Bruce Fields | 2 | -16/+21
If we already had a RW open for a file, and get a readonly open, we were piggybacking on the existing RW open. That's inconsistent with the downgrade logic which blows away the RW open assuming you'll still have a readonly open. Also, make sure there is a readonly or writeonly open available for locking, again to prevent bad behavior in downgrade cases when any RW open may be lost. Signed-off-by: J. Bruce Fields <[email protected]>
2010-08-26 | nfsd4: typo fix in find_any_file | J. Bruce Fields | 1 | -1/+1
Signed-off-by: J. Bruce Fields <[email protected]>
2010-08-26 | nfsd4: bad BUG() in preprocess_stateid_op | J. Bruce Fields | 1 | -1/+0
It's OK for this function to return without setting filp--we do it in the special-stateid case. And there's a legitimate case where we can hit this, since we do permit reads on write-only stateid's. Signed-off-by: J. Bruce Fields <[email protected]>
2010-08-26 | Cannot allocate memory error on mount | Suresh Jayaraman | 1 | -1/+1
On 08/26/2010 01:56 AM, joe hefner wrote:
> On a recent Fedora (13), I am seeing a mount failure message that I can not explain. I have a Windows Server 2003 with a share set up for access only for a specific username (say userfoo). If I try to mount it from Linux, using userfoo and the correct password all is well. If I try with a bad password or with some other username (userbar), it fails with "Permission denied" as expected. If I try to mount as username = administrator, and give the correct administrator password, I would also expect "Permission denied", but I see "Cannot allocate memory" instead.
> fs/cifs/netmisc.c: Mapping smb error code 5 to POSIX err -13
> fs/cifs/cifssmb.c: Send error in QPathInfo = -13
> CIFS VFS: cifs_read_super: get root inode failed

Looks like the commit 0b8f18e3 assumed that cifs_get_inode_info() and friends fail only due to memory allocation error when the inode is NULL, which is not the case if CIFSSMBQPathInfo() fails and returns an error. Fix this by propagating the actual error code back.

Acked-by: Jeff Layton <[email protected]>
Signed-off-by: Suresh Jayaraman <[email protected]>
Signed-off-by: Steve French <[email protected]>
2010-08-26 | ceph: fix get_ticket_handler() error handling | Dan Carpenter | 1 | -6/+9
get_ticket_handler() returns a valid pointer or it returns ERR_PTR(-ENOMEM) if kzalloc() fails. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-08-26 | ceph: don't BUG on ENOMEM during mds reconnect | Sage Weil | 1 | -3/+4
We are in a position to return an error; do that instead. Signed-off-by: Sage Weil <[email protected]>
2010-08-26 | ceph: ceph_mdsc_build_path() returns an ERR_PTR | Dan Carpenter | 1 | -0/+4
ceph_mdsc_build_path() returns an ERR_PTR but this code is set up to handle NULL returns. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-08-26 | [CIFS] Eliminate unused variable warning | Steve French | 1 | -1/+2
CC: Shirish Pargaonkar <[email protected]> Signed-off-by: Steve French <[email protected]>
2010-08-25 | ceph: Fix warnings | Alan Cox | 1 | -5/+9
Just scrubbing some warnings so I can see real problem ones in the build noise. For 32bit we need to coax gcc politely into believing we really honestly intend the casts. Using (u64)(unsigned long) means we cast from a pointer to a type of the right size and then extend it. This stops the warning spew. Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Sage Weil <[email protected]>
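The two-step cast can be sketched in portable userspace C. This is an analogue, not the ceph code: the kernel idiom is `(u64)(unsigned long)ptr`, and the sketch substitutes `uintptr_t` as the pointer-sized integer type; the helper names are hypothetical.

```c
#include <stdint.h>

/* Convert to a pointer-sized integer first, then widen to 64 bits.
 * A direct pointer-to-u64 cast draws a size-mismatch warning from
 * gcc on 32-bit targets. */
static uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Reverse direction: narrow to pointer size, then convert. */
static void *u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

On 64-bit builds both steps are the same width and the intermediate cast is a no-op; it only matters where pointers are 32 bits wide.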
2010-08-25 | ceph: ceph_get_inode() returns an ERR_PTR | Dan Carpenter | 1 | -2/+2
ceph_get_inode() returns an ERR_PTR and it doesn't return a NULL. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-08-25 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 | Linus Torvalds | 15 | -279/+622
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  Eliminate sparse warning - bad constant expression
  cifs: check for NULL session password
  missing changes during ntlmv2/ntlmssp auth and sign
  [CIFS] Fix ntlmv2 auth with ntlmssp
  cifs: correction of unicode header files
  cifs: fix NULL pointer dereference in cifs_find_smb_ses
  cifs: consolidate error handling in several functions
  cifs: clean up error handling in cifs_mknod
2010-08-24 | ceph: initialize fields on new dentry_infos | Sage Weil | 1 | -1/+1
Signed-off-by: Sage Weil <[email protected]>
2010-08-24 | ceph: maintain i_head_snapc when any caps are dirty, not just for data | Sage Weil | 4 | -7/+26
We used to use i_head_snapc to keep track of which snapc the current epoch of dirty data was dirtied under. It is used by queue_cap_snap to set up the cap_snap. However, since we queue cap snaps for any dirty caps, not just for dirty file data, we need to keep a valid i_head_snapc anytime we have dirty|flushing caps. This fixes a NULL pointer deref in queue_cap_snap when writing back dirty caps without data (e.g., snaptest-authwb.sh). Signed-off-by: Sage Weil <[email protected]>
2010-08-24 | Eliminate sparse warning - bad constant expression | [email protected] | 2 | -72/+128
Eliminate sparse warning during usage of crypto_shash_* APIs:

  error: bad constant expression

Allocate memory for shash descriptors once, so that we do not kmalloc/kfree it for every signature generation (shash descriptor for md5 hash).

From ed7538619817777decc44b5660b52268077b74f3 Mon Sep 17 00:00:00 2001
From: Shirish Pargaonkar <[email protected]>
Date: Tue, 24 Aug 2010 11:47:43 -0500
Subject: [PATCH] eliminate sparse warnings during crypto_shash_* APis usage

Signed-off-by: Shirish Pargaonkar <[email protected]>
Signed-off-by: Steve French <[email protected]>
2010-08-24 | xfs: do not discard page cache data on EAGAIN | Christoph Hellwig | 1 | -3/+6
If xfs_map_blocks returns EAGAIN because of lock contention we must redirty the page and not discard the pagecache content and return an error from writepage. We used to do this correctly, but the logic got lost during the recent reshuffle of the writepage code. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Mike Gao <[email protected]> Tested-by: Mike Gao <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2010-08-24 | xfs: don't do memory allocation under the CIL context lock | Dave Chinner | 1 | -8/+26
Formatting items requires memory allocation when using delayed logging. Currently that memory allocation is done while holding the CIL context lock in read mode. This means that if memory allocation takes some time (e.g. enters reclaim), we cannot push on the CIL until the allocation(s) required by formatting complete. This can stall CIL pushes for some time, and once a push is stalled so are all new transaction commits. Fix this by splitting the item formatting into two steps. The first step, which does the allocation and memcpy() into the allocated buffer, is now done outside the CIL context lock, and only the CIL insert is done inside the CIL context lock. This avoids the stall issue. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-24 | xfs: Reduce log force overhead for delayed logging | Dave Chinner | 3 | -118/+147
Delayed logging adds some serialisation to the log force process to ensure that it does not dereference a bad commit context structure when determining if a CIL push is necessary or not. It does this by grabbing the CIL context lock exclusively, then dropping it before pushing the CIL if necessary. This causes serialisation of all log forces and pushes regardless of whether a force is necessary or not. As a result fsync heavy workloads (like dbench) can be significantly slower with delayed logging than without. To avoid this penalty, copy the current sequence from the context to the CIL structure when they are swapped. This allows us to do unlocked checks on the current sequence without having to worry about dereferencing context structures that may have already been freed. Hence we can remove the CIL context locking in the forcing code and only call into the push code if the current context matches the sequence we need to force. By passing the sequence into the push code, we can check the sequence again once we have the CIL lock held exclusive and abort if the sequence has already been pushed. This avoids a lock round-trip and unnecessary CIL pushes when we have racing push calls. The result is that the regression in dbench performance goes away - this change improves dbench performance on a ramdisk from ~2100MB/s to ~2500MB/s. This compares favourably to not using delayed logging, which returns ~2500MB/s for the same workload. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-24 | xfs: dummy transactions should not dirty VFS state | Dave Chinner | 4 | -51/+26
When we need to cover the log, we issue dummy transactions to ensure the current log tail is on disk. Unfortunately we currently use the root inode in the dummy transaction, and the act of committing the transaction dirties the inode at the VFS level. As a result, the VFS writeback of the dirty inode will prevent the filesystem from idling long enough for the log covering state machine to complete. The state machine gets stuck in a loop issuing new dummy transactions to cover the log and never makes progress. To avoid this problem, the dummy transactions should not cause externally visible state changes. To ensure this occurs, make sure that dummy transactions log an unchanging field in the superblock as its state is never propagated outside the filesystem. This allows the log covering state machine to complete successfully and the filesystem now correctly enters a fully idle state about 90s after the last modification was made. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-24 | xfs: ensure f_ffree returned by statfs() is non-negative | Stuart Brodsky | 1 | -1/+6
Because of delayed updates to the sb_icount field in the super block, it is possible to allocate over maxicount number of inodes. This causes the arithmetic to calculate a negative number of free inodes in user commands like df or stat -f. Since maxicount is a somewhat arbitrary number, a slight over allocation is not critical, but the value shown by user commands should be 0 or greater and never go negative. To do this the value in the stats buffer f_ffree is capped to never go negative. [ Modified to use max_t as per Christoph's comment. ] Signed-off-by: Stu Brodsky <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
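The clamp can be sketched as a small standalone function. This is an illustration under assumptions, not the XFS statfs code: `free_inodes()` and its parameters are hypothetical, and the ternary plays the role that `max_t` plays in the kernel.

```c
#include <stdint.h>

/* Hypothetical helper: free inode count reported to userspace.
 * `icount` may exceed `maxicount` because the on-disk counter is
 * updated lazily, so the signed difference can go negative. */
static uint64_t free_inodes(uint64_t maxicount, uint64_t icount,
                            uint64_t ifree)
{
    int64_t ffree = (int64_t)maxicount - (int64_t)(icount - ifree);

    /* Clamp at zero, as max_t(s64, ffree, 0) would in kernel code. */
    return ffree > 0 ? (uint64_t)ffree : 0;
}
```

Without the clamp, assigning the negative difference to an unsigned statfs field would show up in `df` as an enormous free inode count.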
2010-08-24 | xfs: handle negative wbc->nr_to_write during sync writeback | Dave Chinner | 1 | -2/+2
During data integrity (WB_SYNC_ALL) writeback, wbc->nr_to_write will go negative on inodes with more than 1024 dirty pages due to implementation details of write_cache_pages(). Currently XFS will abort page clustering in writeback once nr_to_write drops below zero, and so for data integrity writeback we will do very inefficient page at a time allocation and IO submission for inodes with large numbers of dirty pages. Fix this by only aborting the page clustering code when wbc->nr_to_write is negative and the sync mode is WB_SYNC_NONE. Cc: <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
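The change in the abort condition can be sketched as two predicates. This is a simplified model of the description above, not the XFS source: the enum values stand in for WB_SYNC_NONE and WB_SYNC_ALL, and both function names are hypothetical.

```c
/* Stand-ins for the kernel's WB_SYNC_NONE / WB_SYNC_ALL modes. */
enum sync_mode { SYNC_NONE, SYNC_ALL };

/* Old behaviour as described: stop clustering pages as soon as the
 * write budget goes negative, whatever the sync mode. */
static int stop_clustering_old(long nr_to_write)
{
    return nr_to_write < 0;
}

/* New behaviour as described: an exhausted budget only stops
 * clustering for background (non-integrity) writeback. */
static int stop_clustering_new(long nr_to_write, enum sync_mode mode)
{
    return nr_to_write < 0 && mode == SYNC_NONE;
}
```

For data integrity writeback every dirty page has to be written anyway, so honouring the budget there only fragments the I/O without reducing the work.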
2010-08-24 | xfs: fix untrusted inode number lookup | Dave Chinner | 1 | -6/+10
Commit 7124fe0a5b619d65b739477b3b55a20bf805b06d ("xfs: validate untrusted inode numbers during lookup") changes the inode lookup code to do btree lookups for untrusted inode numbers. This change made an invalid assumption about the alignment of inodes and hence incorrectly calculated the first inode in the cluster. As a result, some inode numbers were being incorrectly considered invalid when they were actually valid. The issue was not picked up by the xfstests suite because it always runs fsr and dump (the two utilities that utilise the bulkstat interface) on cache hot inodes and hence the lookup code in the cold cache path was not sufficiently exercised to uncover this intermittent problem. Fix the issue by relaxing the btree lookup criteria and then checking if the record returned contains the inode number we are looking up. If we get an incorrect record, then the inode number is invalid. Cc: <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-24 | xfs: ensure we mark all inodes in a freed cluster XFS_ISTALE | Dave Chinner | 1 | -23/+26
Under heavy parallel metadata loads (e.g. dbench), we can fail to mark all the inodes in a cluster being freed as XFS_ISTALE as we skip inodes we cannot get the XFS_ILOCK_EXCL or the flush lock on. When this happens and the inode cluster buffer has already been marked stale and freed, inode reclaim can try to write the inode out as it is dirty and not marked stale. This can result in writing the metadata to a freed extent, or in the case it has already been overwritten trigger a magic number check failure and return an EUCLEAN error such as:

  Filesystem "ram0": inode 0x442ba1 background reclaim flush failed with 117

Fix this by ensuring that we hoover up all in memory inodes in the cluster and mark them XFS_ISTALE when freeing the cluster.

Cc: <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-24 | xfs: unlock items before allowing the CIL to commit | Dave Chinner | 3 | -5/+17
When we commit a transaction using delayed logging, we need to unlock the items in the transaction before we unlock the CIL context and allow it to be checkpointed. If we unlock them after we release the CIL context lock, the CIL can checkpoint and complete before we free the log items. This breaks stale buffer item unlock and unpin processing as there is an implicit assumption that the unlock will occur before the unpin. Also, some log items need to store the LSN of the transaction commit in the item (inodes and EFIs) and so can race with other transaction completions if we don't prevent the CIL from checkpointing before the unlock occurs. Cc: <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2010-08-23 | cifs: check for NULL session password | Jeff Layton | 1 | -0/+1
It's possible for a cifsSesInfo struct to have a NULL password, so we need to check for that prior to running strncmp on it. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2010-08-23 | missing changes during ntlmv2/ntlmssp auth and sign | Shirish Pargaonkar | 2 | -5/+10
Signed-off-by: Shirish Pargaonkar <[email protected]> Signed-off-by: Steve French <[email protected]>
2010-08-23 | fs/bio-integrity.c: return -ENOMEM on kmalloc failure | Andrew Morton | 1 | -1/+1
Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2010-08-23 | bio-integrity.c: remove dependency on __GFP_NOFAIL | David Rientjes | 1 | -1/+1
The kmalloc() in bio_integrity_prep() is failable, so remove __GFP_NOFAIL from its mask. Signed-off-by: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Jens Axboe <[email protected]>