blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2009-03-06	xfs: prevent kernel crash due to corrupted inode log format	Christoph Hellwig	1	-4/+13
	Andras Korn reported an oops on log replay causes by a corrupted xfs_inode_log_format_t passing a 0 size to kmem_zalloc. This patch handles to small or too large numbers of log regions gracefully by rejecting the log replay with a useful error message. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Andras Korn <[email protected]> Reviewed-by: Eric Sandeen <[email protected]> Signed-off-by: Felix Blyakher <[email protected]>
2009-03-05	Squashfs: frag_size should be signed, as it can hold an error result	Roel Kluin	1	-2/+4
	Signed-off-by: Roel Kluin <[email protected]> Signed-off-by: Phillip Lougher <[email protected]>
2009-03-05	Squashfs: Fix oops when reading fsfuzzer corrupted filesystems	Phillip Lougher	4	-6/+15
	This fixes a code regression caused by the recent mainlining changes. The recent code changes call zlib_inflate repeatedly, decompressing into separate 4K buffers, this code didn't check for the possibility that zlib_inflate might ask for too many buffers when decompressing corrupted data. Signed-off-by: Phillip Lougher <[email protected]>
2009-03-04	ext4: fix ext4_free_inode() vs. ext4_claim_inode() race	Eric Sandeen	1	-3/+5
	I was seeing fsck errors on inode bitmaps after a 4 thread dbench run on a 4 cpu machine: Inode bitmap differences: -50736 -(50752--50753) etc... I believe that this is because ext4_free_inode() uses atomic bitops, and although ext4_new_inode() used to also use atomic bitops for synchronization, commit 393418676a7602e1d7d3f6e560159c65c8cbd50e changed this to use the sb_bgl_lock, so that we could also synchronize against read_inode_bitmap and initialization of uninit inode tables. However, that change left ext4_free_inode using atomic bitops, which I think leaves no synchronization between setting & unsetting bits in the inode table. The below patch fixes it for me, although I wonder if we're getting at all heavy-handed with this spinlock... Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2009-03-02	Merge branch 'for_linus' of ↵	Linus Torvalds	4	-5/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: don't call jbd2_journal_force_commit_nested without journal ext4: Reorder fs/Makefile so that ext2 root fs's are mounted using ext2 ext4: Remove duplicate call to ext4_commit_super() in ext4_freeze()
2009-02-27	Fix FREEZE/THAW compat_ioctl regression	Christoph Hellwig	1	-0/+3
	Commit 8e961870bb9804110d5c8211d5d9d500451c4518 removed the FREEZE/THAW handling in xfs_compat_ioctl but never added any compat handler back, so now any freeze/thaw request from a 32-bit binary ond 64-bit userspace will fail. As these ioctls are 32/64-bit compatible two simple COMPATIBLE_IOCTL entries in fs/compat_ioctl.c will do the job. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-02-27	EXPORT_SYMBOL(d_obtain_alias) rather than EXPORT_SYMBOL_GPL	Benny Halevy	1	-1/+1
	Commit 4ea3ada2955e4519befa98ff55dd62d6dfbd1705 declares d_obtain_alias() as EXPORT_SYMBOL_GPL where it's supposed to replace d_alloc_anon which was previously declared as EXPORT_SYMBOL and thus available to any loadable module. This patch reverts that. Signed-off-by: Benny Halevy <[email protected]> Acked-by: Linus Torvalds <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: "J. Bruce Fields" <[email protected]> Cc: Trond Myklebust <[email protected]> Acked-by: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-02-26	Merge git://git.infradead.org/mtd-2.6	Linus Torvalds	2	-16/+44
	* git://git.infradead.org/mtd-2.6: [MTD] [MAPS] Remove MODULE_DEVICE_TABLE() from ck804rom driver. [JFFS2] fix mount crash caused by removed nodes [JFFS2] force the jffs2 GC daemon to behave a bit better [MTD] [MAPS] blackfin async requires complex mappings [MTD] [MAPS] blackfin: fix memory leak in error path [MTD] [MAPS] physmap: fix wrong free and del_mtd_{partition,device} [MTD] slram: Handle negative devlength correctly [MTD] map_rom has NULL erase pointer [MTD] [LPDDR] qinfo_probe depends on lpddr
2009-02-26	ocfs2: add IO error check in ocfs2_get_sector()	wengang wang	1	-0/+7
	Check for IO error in ocfs2_get_sector(). Signed-off-by: Wengang Wang <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2: set gap to seperate entry and value when xattr in bucket	Tiger Yang	1	-8/+10
	This patch set a gap (4 bytes) between xattr entry and name/value when xattr in bucket. This gap use to seperate entry and name/value when a bucket is full. It had already been set when xattr in inode/block. Signed-off-by: Tiger Yang <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2: lock the metaecc process for xattr bucket	Tao Ma	3	-0/+8
	For other metadata in ocfs2, metaecc is checked in ocfs2_read_blocks with io_mutex held. While for xattr bucket, it is calculated by the whole buckets. So we have to add a spin_lock to prevent multiple processes calculating metaecc. Signed-off-by: Tao Ma <[email protected]> Tested-by: Tristan Ye <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2: Use the right access_* method in ctime update of xattr.	Tao Ma	1	-2/+3
	In ctime updating of xattr, it use the wrong type of access for inode, so use ocfs2_journal_access_di instead. Reported-and-Tested-by: Tristan Ye <[email protected]> Signed-off-by: Tao Ma <[email protected]> Acked-by: Joel Becker <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2/dlm: Make dlm_assert_master_handler() kill itself instead of the asserter	Sunil Mushran	1	-6/+6
	In dlm_assert_master_handler(), if we get an incorrect assert master from a node that, we reply with EINVAL asking the asserter to die. The problem is that an assert is sent after so many hoops, it is invariably the node that thinks the asserter is wrong, is actually wrong. So instead of killing the asserter, this patch kills the assertee. This patch papers over a race that is still being addressed. Signed-off-by: Sunil Mushran <[email protected]> Acked-by: Joel Becker <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2/dlm: Use ast_lock to protect ast_list	Sunil Mushran	1	-2/+2
	The code was using dlm->spinlock instead of dlm->ast_lock to protect the ast_list. This patch fixes the issue. Signed-off-by: Sunil Mushran <[email protected]> Acked-by: Joel Becker <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2: Cleanup the lockname print in dlmglue.c	Sunil Mushran	1	-3/+8
	The dentry lock has a different format than other locks. This patch fixes ocfs2_log_dlm_error() macro to make it print the dentry lock correctly. Signed-off-by: Sunil Mushran <[email protected]> Acked-by: Joel Becker <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2/dlm: Retract fix for race between purge and migrate	Sunil Mushran	1	-2/+1
	Mainline commit d4f7e650e55af6b235871126f747da88600e8040 attempts to delay the dlm_thread from sending the drop ref message if the lockres is being migrated. The problem is that we make the dlm_thread wait for the migration to complete. This causes a deadlock as dlm_thread also participates in the lockres migration process. A better fix for the original oss bugzilla#1012 is in testing. Signed-off-by: Sunil Mushran <[email protected]> Acked-by: Joel Becker <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	ocfs2: Access and dirty the buffer_head in mark_written.	Tao Ma	1	-1/+26
	In __ocfs2_mark_extent_written, when we meet with the situation of c_split_covers_rec, the old solution just replace the extent record and forget to access and dirty the buffer_head. This will cause a problem when the unwritten extent is in an extent block. So access and dirty it. Signed-off-by: Tao Ma <[email protected]> Signed-off-by: Mark Fasheh <[email protected]>
2009-02-26	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable	Linus Torvalds	6	-76/+308
	* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: try committing transaction before returning ENOSPC Btrfs: add better -ENOSPC handling
2009-02-26	block: fix bogus gcc warning for uninitialized var usage	Jens Axboe	1	-1/+1
	Newer gcc throw this warning: fs/bio.c: In function ?bio_alloc_bioset?: fs/bio.c:305: warning: ?p? may be used uninitialized in this function since it cannot figure out that 'p' is only ever used if 'bs' is non-NULL. Signed-off-by: Jens Axboe <[email protected]>
2009-02-26	ext4: don't call jbd2_journal_force_commit_nested without journal	Eric Sandeen	2	-2/+4
	Running without a journal, I oopsed when I ran out of space, because we called jbd2_journal_force_commit_nested() from ext4_should_retry_alloc() without a journal. This should take care of it, I think. Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2009-02-28	ext4: Reorder fs/Makefile so that ext2 root fs's are mounted using ext2	Theodore Ts'o	1	-2/+4
	In fs/Makefile, ext3 was placed before ext2 so that a root filesystem that possessed a journal, it would be mounted as ext3 instead of ext2. This was necessary because a cleanly unmounted ext3 filesystem was fully backwards compatible with ext2, and could be mounted by ext2 --- but it was desirable that it be mounted with ext3 so that the journaling would be enabled. The ext4 filesystem supports new incompatible features, so there is no danger of an ext4 filesystem being mistaken for an ext2 filesystem. At that point, the relative ordering of ext4 with respect to ext2 didn't matter until ext4 gained the ability to mount filesystems without a journal starting in 2.6.29-rc1. Now that this is the case, given that ext4 is before ext2, it means that root filesystems that were using the plain-jane ext2 format are getting mounted using the ext4 filesystem driver, which is a change in behavior which could be surprising to users. It's doubtful that there are that many ext2-only root filesystem users that would also have ext4 compiled into the kernel, but to adhere to the principle of least surprise, the correct ordering in fs/Makefile is ext3, followed by ext2, and finally ext4. Signed-off-by: "Theodore Ts'o" <[email protected]>
2009-02-28	ext4: Remove duplicate call to ext4_commit_super() in ext4_freeze()	Theodore Ts'o	1	-1/+0
	Commit c4be0c1d added error checking to ext4_freeze() when calling ext4_commit_super(). Unfortunately the patch failed to remove the original call to ext4_commit_super(), with the net result that when freezing the filesystem, the superblock gets written twice, the first time without error checking. Signed-off-by: "Theodore Ts'o" <[email protected]>
2009-02-24	Merge branch 'proc-linus' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc * 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: proc: fix PG_locked reporting in /proc/kpageflags
2009-02-24	Merge branch 'for_linus' of ↵	Linus Torvalds	2	-1/+15
	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Fix deadlock in ext4_write_begin() and ext4_da_write_begin() ext4: Add fallback for find_group_flex
2009-02-24	proc: fix PG_locked reporting in /proc/kpageflags	Helge Bahmann	1	-1/+1
	Expr always evaluates to zero. Cc: Matt Mackall <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Alexey Dobriyan <[email protected]>
2009-02-23	proc: proc_get_inode should de_put when inode already initialized	Krzysztof Sachanowicz	1	-1/+3
	de_get is called before every proc_get_inode, but corresponding de_put is called only when dropping last reference to an inode. This might cause something like remove_proc_entry: /proc/stats busy, count=14496 to be printed to the syslog. The fix is to call de_put in case of an already initialized inode in proc_get_inode. Signed-off-by: Krzysztof Sachanowicz <[email protected]> Tested-by: Marcin Pilipczuk <[email protected]> Acked-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-02-22	ext4: Fix deadlock in ext4_write_begin() and ext4_da_write_begin()	Jan Kara	1	-1/+8
	Functions ext4_write_begin() and ext4_da_write_begin() call grab_cache_page_write_begin() without AOP_FLAG_NOFS. Thus it can happen that page reclaim is triggered in that function and it recurses back into the filesystem (or some other filesystem). But this can lead to various problems as a transaction is already started at that point. Add the necessary flag. http://bugzilla.kernel.org/show_bug.cgi?id=11688 Signed-off-by: Jan Kara <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]>
2009-02-21	ext4: Add fallback for find_group_flex	Theodore Ts'o	1	-0/+7
	This is a workaround for find_group_flex() which badly needs to be replaced. One of its problems (besides ignoring the Orlov algorithm) is that it is a bit hyperactive about returning failure under suspicious circumstances. This can lead to spurious ENOSPC failures even when there are inodes still available. Work around this for now by retrying the search using find_group_other() if find_group_flex() returns -1. If find_group_other() succeeds when find_group_flex() has failed, log a warning message. A better block/inode allocator that will fix this problem for real has been queued up for the next merge window. Signed-off-by: "Theodore Ts'o" <[email protected]>
2009-02-21	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6	Linus Torvalds	10	-187/+458
	* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Fix multiuser mounts so server does not invalidate earlier security contexts [CIFS] improve posix semantics of file create [CIFS] Fix oops in cifs_strfromUCS_le mounting to servers which do not specify their OS cifs: posix fill in inode needed by posix open cifs: properly handle case where CIFSGetSrvInodeNumber fails cifs: refactor new_inode() calls and inode initialization [CIFS] Prevent OOPs when mounting with remote prefixpath. [CIFS] ipv6_addr_equal for address comparison
2009-02-21	[JFFS2] fix mount crash caused by removed nodes	Thomas Gleixner	1	-9/+33
	At scan time we observed following scenario: node A inserted node B inserted node C inserted -> sets overlapped flag on node B node A is removed due to CRC failure -> overlapped flag on node B remains while (tn->overlapped) tn = tn_prev(tn); ==> crash, when tn_prev(B) is referenced. When the ultimate node is removed at scan time and the overlapped flag is set on the penultimate node, then nothing updates the overlapped flag of that node. The overlapped iterators blindly expect that the ultimate node does not have the overlapped flag set, which causes the scan code to crash. It would be a huge overhead to go through the node chain on node removal and fix up the overlapped flags, so detecting such a case on the fly in the overlapped iterators is a simpler and reliable solution. Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: David Woodhouse <[email protected]>
2009-02-21	[CIFS] Fix multiuser mounts so server does not invalidate earlier security ↵	Steve French	5	-7/+105
	contexts When two different users mount the same Windows 2003 Server share using CIFS, the first session mounted can be invalidated. Some servers invalidate the first smb session when a second similar user (e.g. two users who get mapped by server to "guest") authenticates an smb session from the same client. By making sure that we set the 2nd and subsequent vc numbers to nonzero values, this ensures that we will not have this problem. Fixes Samba bug 6004, problem description follows: How to reproduce: - configure an "open share" (full permissions to Guest user) on Windows 2003 Server (I couldn't reproduce the problem with Samba server or Windows older than 2003) - mount the share twice with different users who will be authenticated as guest. noacl,noperm,user=john,dir_mode=0700,domain=DOMAIN,rw noacl,noperm,user=jeff,dir_mode=0700,domain=DOMAIN,rw Result: - just the mount point mounted last is accessible: Signed-off-by: Steve French <[email protected]>
2009-02-21	[CIFS] improve posix semantics of file create	Steve French	2	-103/+208
	Samba server added support for a new posix open/create/mkdir operation a year or so ago, and we added support to cifs for mkdir to use it, but had not added the corresponding code to file create. The following patch helps improve the performance of the cifs create path (to Samba and servers which support the cifs posix protocol extensions). Using Connectathon basic test1, with 2000 files, the performance improved about 15%, and also helped reduce network traffic (17% fewer SMBs sent over the wire) due to saving a network round trip for the SetPathInfo on every file create. It should also help the semantics (and probably the performance) of write (e.g. when posix byte range locks are on the file) on file handles opened with posix create, and adds support for a few flags which would have to be ignored otherwise. Signed-off-by: Steve French <[email protected]>
2009-02-21	[CIFS] Fix oops in cifs_strfromUCS_le mounting to servers which do not ↵	Steve French	2	-3/+4
	specify their OS Fixes kernel bug #10451 http://bugzilla.kernel.org/show_bug.cgi?id=10451 Certain NAS appliances do not set the operating system or network operating system fields in the session setup response on the wire. cifs was oopsing on the unexpected zero length response fields (when trying to null terminate a zero length field). This fixes the oops. Acked-by: Jeff Layton <[email protected]> CC: stable <[email protected]> Signed-off-by: Steve French <[email protected]>
2009-02-21	cifs: posix fill in inode needed by posix open	Jeff Layton	2	-1/+3
	function needed to prepare for posix open Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2009-02-21	cifs: properly handle case where CIFSGetSrvInodeNumber fails	Jeff Layton	3	-15/+14
	...if it does then we pass a pointer to an unintialized variable for the inode number to cifs_new_inode. Have it pass a NULL pointer instead. Also tweak the function prototypes to reduce the amount of casting. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2009-02-21	cifs: refactor new_inode() calls and inode initialization	Jeff Layton	3	-66/+86
	Move new inode creation into a separate routine and refactor the callers to take advantage of it. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2009-02-21	[CIFS] Prevent OOPs when mounting with remote prefixpath.	Igor Mammedov	3	-2/+48
	Fixes OOPs with message 'kernel BUG at fs/cifs/cifs_dfs_ref.c:274!'. Checks if the prefixpath in an accesible while we are still in cifs_mount and fails with reporting a error if we can't access the prefixpath Should fix Samba bugs 6086 and 5861 and kernel bug 12192 Signed-off-by: Igor Mammedov <[email protected]> Acked-by: Jeff Layton <[email protected]> Signed-off-by: Steve French <[email protected]>
2009-02-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable	Linus Torvalds	1	-4/+4
	* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: check file pointer in btrfs_sync_file
2009-02-20	Btrfs: try committing transaction before returning ENOSPC	Josef Bacik	1	-10/+47
	This fixes a problem where we could return -ENOSPC when we may actually have plenty of space, the space is just pinned. Instead of returning -ENOSPC immediately, commit the transaction first and then try and do the allocation again. This patch also does chunk allocation for metadata if we pass the 80% threshold for metadata space. This will help with stack usage since the chunk allocation will happen early on, instead of when the allocation is happening. Signed-off-by: Josef Bacik <[email protected]>
2009-02-20	Btrfs: add better -ENOSPC handling	Josef Bacik	6	-76/+271
	This is a step in the direction of better -ENOSPC handling. Instead of checking the global bytes counter we check the space_info bytes counters to make sure we have enough space. If we don't we go ahead and try to allocate a new chunk, and then if that fails we return -ENOSPC. This patch adds two counters to btrfs_space_info, bytes_delalloc and bytes_may_use. bytes_delalloc account for extents we've actually setup for delalloc and will be allocated at some point down the line. bytes_may_use is to keep track of how many bytes we may use for delalloc at some point. When we actually set the extent_bit for the delalloc bytes we subtract the reserved bytes from the bytes_may_use counter. This keeps us from not actually being able to allocate space for any delalloc bytes. Signed-off-by: Josef Bacik <[email protected]>
2009-02-20	Btrfs: check file pointer in btrfs_sync_file	Chris Mason	1	-4/+4
	fsync can be called by NFS with a null file pointer, and btrfs was oopsing in this case. Signed-off-by: Chris Mason <[email protected]>
2009-02-19	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs	Linus Torvalds	1	-3/+76
	* 'for-linus' of git://oss.sgi.com/xfs/xfs: Revert "[XFS] remove old vmap cache" Revert "[XFS] use scalable vmap API"
2009-02-19	Revert "[XFS] remove old vmap cache"	Felix Blyakher	1	-1/+74
	This reverts commit d2859751cd0bf586941ffa7308635a293f943c17. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Felix Blyakher <[email protected]>
2009-02-19	Revert "[XFS] use scalable vmap API"	Felix Blyakher	1	-3/+3
	This reverts commit 95f8e302c04c0b0c6de35ab399a5551605eeb006. This commit caused regression. We'll try to fix use of new vmap API for next release. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Felix Blyakher <[email protected]>
2009-02-18	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2	-3/+4
	* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list block: fix booting from partitioned md array block: revert part of 18ce3751ccd488c78d3827e9f6bf54e6322676fb cciss: PCI power management reset for kexec paride/pg.c: xs(): &&/\|\| confusion fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free block: fix bad definition of BIO_RW_SYNC bsg: Fix sense buffer bug in SG_IO
2009-02-18	inotify: fix GFP_KERNEL related deadlock	Ingo Molnar	1	-1/+1
	Enhanced lockdep coverage of __GFP_NOFS turned up this new lockdep assert: [ 1093.677775] [ 1093.677781] ================================= [ 1093.680031] [ INFO: inconsistent lock state ] [ 1093.680031] 2.6.29-rc5-tip-01504-gb49eca1-dirty #1 [ 1093.680031] --------------------------------- [ 1093.680031] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. [ 1093.680031] kswapd0/308 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 1093.680031] (&inode->inotify_mutex){+.+.?.}, at: [<c0205942>] inotify_inode_is_dead+0x20/0x80 [ 1093.680031] {RECLAIM_FS-ON-W} state was registered at: [ 1093.680031] [<c01696b9>] mark_held_locks+0x43/0x5b [ 1093.680031] [<c016baa4>] lockdep_trace_alloc+0x6c/0x6e [ 1093.680031] [<c01cf8b0>] kmem_cache_alloc+0x20/0x150 [ 1093.680031] [<c040d0ec>] idr_pre_get+0x27/0x6c [ 1093.680031] [<c02056e3>] inotify_handle_get_wd+0x25/0xad [ 1093.680031] [<c0205f43>] inotify_add_watch+0x7a/0x129 [ 1093.680031] [<c020679e>] sys_inotify_add_watch+0x20f/0x250 [ 1093.680031] [<c010389e>] sysenter_do_call+0x12/0x35 [ 1093.680031] [<ffffffff>] 0xffffffff [ 1093.680031] irq event stamp: 60417 [ 1093.680031] hardirqs last enabled at (60417): [<c018d5f5>] call_rcu+0x53/0x59 [ 1093.680031] hardirqs last disabled at (60416): [<c018d5b9>] call_rcu+0x17/0x59 [ 1093.680031] softirqs last enabled at (59656): [<c0146229>] __do_softirq+0x157/0x16b [ 1093.680031] softirqs last disabled at (59651): [<c0106293>] do_softirq+0x74/0x15d [ 1093.680031] [ 1093.680031] other info that might help us debug this: [ 1093.680031] 2 locks held by kswapd0/308: [ 1093.680031] #0: (shrinker_rwsem){++++..}, at: [<c01b0502>] shrink_slab+0x36/0x189 [ 1093.680031] #1: (&type->s_umount_key#4){+++++.}, at: [<c01e6d77>] shrink_dcache_memory+0x110/0x1fb [ 1093.680031] [ 1093.680031] stack backtrace: [ 1093.680031] Pid: 308, comm: kswapd0 Not tainted 2.6.29-rc5-tip-01504-gb49eca1-dirty #1 [ 1093.680031] Call Trace: [ 1093.680031] [<c016947a>] valid_state+0x12a/0x13d [ 1093.680031] [<c016954e>] mark_lock+0xc1/0x1e9 [ 1093.680031] [<c016a5b4>] ? check_usage_forwards+0x0/0x3f [ 1093.680031] [<c016ab74>] __lock_acquire+0x2c6/0xac8 [ 1093.680031] [<c01688d9>] ? register_lock_class+0x17/0x228 [ 1093.680031] [<c016b3d3>] lock_acquire+0x5d/0x7a [ 1093.680031] [<c0205942>] ? inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c08824c4>] __mutex_lock_common+0x3a/0x4cb [ 1093.680031] [<c0205942>] ? inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c08829ed>] mutex_lock_nested+0x2e/0x36 [ 1093.680031] [<c0205942>] ? inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c0205942>] inotify_inode_is_dead+0x20/0x80 [ 1093.680031] [<c01e6672>] dentry_iput+0x90/0xc2 [ 1093.680031] [<c01e67a3>] d_kill+0x21/0x45 [ 1093.680031] [<c01e6a46>] __shrink_dcache_sb+0x27f/0x355 [ 1093.680031] [<c01e6dc5>] shrink_dcache_memory+0x15e/0x1fb [ 1093.680031] [<c01b05ed>] shrink_slab+0x121/0x189 [ 1093.680031] [<c01b0d12>] kswapd+0x39f/0x561 [ 1093.680031] [<c01ae499>] ? isolate_pages_global+0x0/0x233 [ 1093.680031] [<c0157eae>] ? autoremove_wake_function+0x0/0x43 [ 1093.680031] [<c01b0973>] ? kswapd+0x0/0x561 [ 1093.680031] [<c0157daf>] kthread+0x41/0x82 [ 1093.680031] [<c0157d6e>] ? kthread+0x0/0x82 [ 1093.680031] [<c01043ab>] kernel_thread_helper+0x7/0x10 inotify_handle_get_wd() does idr_pre_get() which does a kmem_cache_alloc() without __GFP_FS - and is hence deadlockable under extreme MM pressure. Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: MinChan Kim <[email protected]> Cc: Nick Piggin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-02-18	vt: Declare PIO_CMAP/GIO_CMAP as compatbile ioctls.	Bill Nottingham	1	-0/+2
	Otherwise, these don't work when called from 32-bit userspace on 64-bit kernels. Cc: Jiri Kosina <[email protected]> Cc: Alan Cox <[email protected]> Cc: <[email protected]> [2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-02-18	fs/super.c: add lockdep annotation to s_umount	Peter Zijlstra	1	-1/+16
	Li Zefan said: Thread 1: for ((; ;)) { mount -t cpuset xxx /mnt > /dev/null 2>&1 cat /mnt/cpus > /dev/null 2>&1 umount /mnt > /dev/null 2>&1 } Thread 2: for ((; ;)) { mount -t cpuset xxx /mnt > /dev/null 2>&1 umount /mnt > /dev/null 2>&1 } (Note: It is irrelevant which cgroup subsys is used.) After a while a lockdep warning showed up: ============================================= [ INFO: possible recursive locking detected ] 2.6.28 #479 --------------------------------------------- mount/13554 is trying to acquire lock: (&type->s_umount_key#19){--..}, at: [<c049d888>] sget+0x5e/0x321 but task is already holding lock: (&type->s_umount_key#19){--..}, at: [<c049da0c>] sget+0x1e2/0x321 other info that might help us debug this: 1 lock held by mount/13554: #0: (&type->s_umount_key#19){--..}, at: [<c049da0c>] sget+0x1e2/0x321 stack backtrace: Pid: 13554, comm: mount Not tainted 2.6.28-mc #479 Call Trace: [<c044ad2e>] validate_chain+0x4c6/0xbbd [<c044ba9b>] __lock_acquire+0x676/0x700 [<c044bb82>] lock_acquire+0x5d/0x7a [<c049d888>] ? sget+0x5e/0x321 [<c061b9b8>] down_write+0x34/0x50 [<c049d888>] ? sget+0x5e/0x321 [<c049d888>] sget+0x5e/0x321 [<c045a2e7>] ? cgroup_set_super+0x0/0x3e [<c045959f>] ? cgroup_test_super+0x0/0x2f [<c045bcea>] cgroup_get_sb+0x98/0x2e7 [<c045cfb6>] cpuset_get_sb+0x4a/0x5f [<c049dfa4>] vfs_kern_mount+0x40/0x7b [<c049e02d>] do_kern_mount+0x37/0xbf [<c04af4a0>] do_mount+0x5c3/0x61a [<c04addd2>] ? copy_mount_options+0x2c/0x111 [<c04af560>] sys_mount+0x69/0xa0 [<c0403251>] sysenter_do_call+0x12/0x31 The cause is after alloc_super() and then retry, an old entry in list fs_supers is found, so grab_super(old) is called, but both functions hold s_umount lock: struct super_block *sget(...) { ... retry: spin_lock(&sb_lock); if (test) { list_for_each_entry(old, &type->fs_supers, s_instances) { if (!test(old, data)) continue; if (!grab_super(old)) <--- 2nd: down_write(&old->s_umount); goto retry; if (s) destroy_super(s); return old; } } if (!s) { spin_unlock(&sb_lock); s = alloc_super(type); <--- 1th: down_write(&s->s_umount) if (!s) return ERR_PTR(-ENOMEM); goto retry; } ... } It seems like a false positive, and seems like VFS but not cgroup needs to be fixed. Peter said: We can simply put the new s_umount instance in a but lockdep doesn't particularly cares about subclass order. If there's any issue with the callers of sget() assuming the s_umount lock being of sublcass 0, then there is another annotation we can use to fix that, but lets not bother with that if this is sufficient. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12673 Signed-off-by: Peter Zijlstra <[email protected]> Tested-by: Li Zefan <[email protected]> Reported-by: Li Zefan <[email protected]> Cc: Al Viro <[email protected]> Cc: Paul Menage <[email protected]> Cc: Arjan van de Ven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-02-18	mm: task dirty accounting fix	Nick Piggin	1	-0/+1
	YAMAMOTO-san noticed that task_dirty_inc doesn't seem to be called properly for cases where set_page_dirty is not used to dirty a page (eg. mark_buffer_dirty). Additionally, there is some inconsistency about when task_dirty_inc is called. It is used for dirty balancing, however it even gets called for __set_page_dirty_no_writeback. So rather than increment it in a set_page_dirty wrapper, move it down to exactly where the dirty page accounting stats are incremented. Cc: YAMAMOTO Takashi <[email protected]> Signed-off-by: Nick Piggin <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2009-02-18	timerfd: add flags check	Davide Libenzi	1	-6/+6
	As requested by Michael, add a missing check for valid flags in timerfd_settime(), and make it return EINVAL in case some extra bits are set. Michael said: If this is to be any use to userland apps that want to check flag support (perhaps it is too late already), then the sooner we get it into the kernel the better: 2.6.29 would be good; earlier stables as well would be even better. [[email protected]: remove unused TFD_FLAGS_SET] Acked-by: Michael Kerrisk <[email protected]> Signed-off-by: Davide Libenzi <[email protected]> Cc: <[email protected]> [2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>