aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-04-16block: let io_schedule() flush the plug inlineJens Axboe2-1/+14
Linus correctly observes that the most important dispatch cases are now done from kblockd, this isn't ideal for latency reasons. The original reason for switching dispatches out-of-line was to avoid too deep a stack, so by _only_ letting the "accidental" flush directly in schedule() be guarded by offload to kblockd, we should be able to get the best of both worlds. So add a blk_schedule_flush_plug() that offloads to kblockd, and only use that from the schedule() path. Signed-off-by: Jens Axboe <[email protected]>
2011-04-16Btrfs: avoid taking the chunk_mutex in do_chunk_allocJosef Bacik2-6/+28
Everytime we try to allocate disk space we try and see if we can pre-emptively allocate a chunk, but in the common case we don't allocate anything, so there is no sense in taking the chunk_mutex at all. So instead if we are allocating a chunk, mark it in the space_info so we don't get two people trying to allocate at the same time. Thanks, Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: Liu Bo <[email protected]>
2011-04-16Btrfs end_bio_extent_readpage should look for locked bitsChris Mason1-1/+1
A recent commit caches the extent state in end_bio_extent_readpage, but the search it does should look for locked extents. This fixes things to make it more effective. Signed-off-by: Chris Mason <[email protected]>
2011-04-15Merge branch 'for-linus' of ↵Linus Torvalds11-78/+84
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: nwname should be an unsigned int 9p: Fix sparse error fs/9p: Fix error reported by coccicheck 9p: revert tsyncfs related changes fs/9p: Use write_inode for data sync on server fs/9p: Fix revalidate to return correct value
2011-04-15Merge branch 'for-linus' of git://android.git.kernel.org/kernel/tegraLinus Torvalds2-6/+9
* 'for-linus' of git://android.git.kernel.org/kernel/tegra: arm: tegra: fix error check in tegra2_clocks.c ARM: tegra: gpio: Fix unused variable warnings
2011-04-15Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds28-52/+108
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6879/1: fix personality test wrt usage of domain handlers ARM: 6878/1: fix personality flag propagation across an exec ARM: 6877/1: the ADDR_NO_RANDOMIZE personality flag should be honored with mmap() ARM: 6876/1: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS ARM: pxa: convert incorrect IRQ_TO_IRQ() to irq_to_gpio() ARM: mmp: align NR_BUILTIN_GPIO with gpio interrupt number ARM: pxa: align NR_BUILTIN_GPIO with GPIO interrupt number ARM: pxa: always clear LPM bits for PXA168 MFPR pcmcia: limit pxa2xx_trizeps4 subdriver to trizeps4 platform pcmcia: limit pxa2xx_balloon3 subdriver to balloon3 platform ARM: pxafb: Fix access to nonexistent member of pxafb_info ARM: 6872/1: arch:common:Makefile Remove unused config in the Makefile. ARM: 6868/1: Preserve the VFP state during fork ARM: 6867/1: Introduce THREAD_NOTIFY_COPY for copy_thread() hooks ARM: 6866/1: Do not restrict HIGHPTE to !OUTER_CACHE ARM: 6865/1: perf: ensure pass through zero is counted on overflow ARM: 6864/1: hw_breakpoint: clear DBGVCR out of reset ARM: Only allow PM_SLEEP with CPUs which support suspend ARM: Make consolidated PM sleep code depend on PM_SLEEP
2011-04-15x86, amd: Disable GartTlbWlkErr when BIOS forgets itJoerg Roedel2-0/+23
This patch disables GartTlbWlk errors on AMD Fam10h CPUs if the BIOS forgets to do is (or is just too old). Letting these errors enabled can cause a sync-flood on the CPU causing a reboot. The AMD BKDG recommends disabling GART TLB Wlk Error completely. This patch is the fix for https://bugzilla.kernel.org/show_bug.cgi?id=33012 on my machine. Signed-off-by: Joerg Roedel <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Tested-by: Alexandre Demers <[email protected]> Cc: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2011-04-15vfs: Fix absolute RCU path walk failures due to uninitialized seq numberTim Chen1-0/+1
During RCU walk in path_lookupat and path_openat, the rcu lookup frequently failed if looking up an absolute path, because when root directory was looked up, seq number was not properly set in nameidata. We dropped out of RCU walk in nameidata_drop_rcu due to mismatch in directory entry's seq number. We reverted to slow path walk that need to take references. With the following patch, I saw a 50% increase in an exim mail server benchmark throughput on a 4-socket Nehalem-EX system. Signed-off-by: Tim Chen <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Cc: [email protected] (v2.6.38) Signed-off-by: Linus Torvalds <[email protected]>
2011-04-15net/9p: nwname should be an unsigned intHarsh Prateek Bora3-8/+8
Signed-off-by: Harsh Prateek Bora <[email protected]> Signed-off-by: Venkateswararao Jujjuri <[email protected]> Signed-off-by: Eric VAn Hensbergen <[email protected]>
2011-04-159p: Fix sparse errorAneesh Kumar K.V3-6/+14
Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Venkateswararao Jujjuri <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2011-04-15fs/9p: Fix error reported by coccicheckAneesh Kumar K.V1-1/+1
Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Venkateswararao Jujjuri <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2011-04-159p: revert tsyncfs related changesAneesh Kumar K.V6-62/+11
Now that we use write_inode to flush server cache related to fid, we don't need tsyncfs either fort dotl or dotu protocols. For dotu this helps to do a more efficient server flush. Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Venkateswararao Jujjuri <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2011-04-15fs/9p: Use write_inode for data sync on serverAneesh Kumar K.V1-0/+47
Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Venkateswararao Jujjuri <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2011-04-15fs/9p: Fix revalidate to return correct valueAneesh Kumar K.V1-1/+3
revalidate should return > 0 on success. Also return 0 on ENOENT to force do_revalidate to return NULL dentry; Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Venkateswararao Jujjuri <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2011-04-15Btrfs: don't force chunk allocation in find_free_extentChris Mason1-22/+73
find_free_extent likes to allocate in contiguous clusters, which makes writeback faster, especially on SSD storage. As the FS fragments, these clusters become harder to find and we have to decide between allocating a new chunk to make more clusters or giving up on the cluster to allocate from the free space we have. Right now it creates too many chunks, and you can end up with a whole FS that is mostly empty metadata chunks. This commit changes the allocation code to be more strict and only allocate new chunks when we've made good use of the chunks we already have. Signed-off-by: Chris Mason <[email protected]>
2011-04-15x86, NUMA: Fix fakenuma boot failureKOSAKI Motohiro1-0/+23
Currently, numa=fake boot parameter is broken. If it's used, kernel may panic due to devide by zero error depending on CPU configuration Call Trace: [<ffffffff8104ad4c>] find_busiest_group+0x38c/0xd30 [<ffffffff81086aff>] ? local_clock+0x6f/0x80 [<ffffffff81050533>] load_balance+0xa3/0x600 [<ffffffff81050f53>] idle_balance+0xf3/0x180 [<ffffffff81550092>] schedule+0x722/0x7d0 [<ffffffff81550538>] ? wait_for_common+0x128/0x190 [<ffffffff81550a65>] schedule_timeout+0x265/0x320 [<ffffffff81095815>] ? lock_release_holdtime+0x35/0x1a0 [<ffffffff81550538>] ? wait_for_common+0x128/0x190 [<ffffffff8109bb6c>] ? __lock_release+0x9c/0x1d0 [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40 [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40 [<ffffffff81550540>] wait_for_common+0x130/0x190 [<ffffffff81051920>] ? try_to_wake_up+0x510/0x510 [<ffffffff8155067d>] wait_for_completion+0x1d/0x20 [<ffffffff8107f36c>] kthread_create_on_node+0xac/0x150 [<ffffffff81077bb0>] ? process_scheduled_works+0x40/0x40 [<ffffffff8155045f>] ? wait_for_common+0x4f/0x190 [<ffffffff8107a283>] __alloc_workqueue_key+0x1a3/0x590 [<ffffffff81e0cce2>] cpuset_init_smp+0x6b/0x7b [<ffffffff81df3d07>] kernel_init+0xc3/0x182 [<ffffffff8155d5e4>] kernel_thread_helper+0x4/0x10 [<ffffffff81553cd4>] ? retint_restore_args+0x13/0x13 [<ffffffff81df3c44>] ? start_kernel+0x400/0x400 [<ffffffff8155d5e0>] ? gs_change+0x13/0x13 The divede by zero is caused by the following line, group->cpu_power==0: kernel/sched_fair.c::update_sg_lb_stats() /* Adjust by relative CPU power of the group */ sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power; This regression was caused by commit e23bba6044 ("x86-64, NUMA: Unify emulated distance mapping") because it changes cpu -> node mapping in the process of dropping fake_physnodes(). old) all cpus are assinged node 0 now) cpus are assigned round robin (the logic is implemented by numa_init_array()) Note: The change in behavior only happens if the system doesn't have neither ACPI SRAT table nor AMD northbridge NUMA information. Round robin assignment doesn't work because init_numa_sched_groups_power() assumes all logical cpus in the same physical cpu share the same node (then it only accounts for group_first_cpu()), and the simple round robin breaks the above assumption. Thus, this patch implements a reassignment of node-ids if buggy firmware or numa emulation makes wrong cpu node map. Tt enforce all logical cpus in the same physical cpu share the same node. Signed-off-by: KOSAKI Motohiro <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: Shaohui Zheng <[email protected]> Cc: David Rientjes <[email protected]> Cc: H. Peter Anvin <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-04-15Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds6-84/+78
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: only force kblockd unplugging from the schedule() path block: cleanup the block plug helper functions block, blk-sysfs: Use the variable directly instead of a function call block: move queue run on unplug to kblockd block: kill queue_sync_plugs() block: readd plug trace event block: add callback function for unplug notification block: add comment on why we save and disable interrupts in flush_plug_list() block: fixup block IO unplug trace call block: remove block_unplug_timer() trace point block: splice plug list to local context
2011-04-15Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6Linus Torvalds2-58/+97
* 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: fix compilation warnings when compiling with gcc 4.5 UBIFS: fix oops when R/O file-system is fsync'ed
2011-04-15futex: Set FLAGS_HAS_TIMEOUT during futex_wait restart setupDarren Hart1-1/+1
The FLAGS_HAS_TIMEOUT flag was not getting set, causing the restart_block to restart futex_wait() without a timeout after a signal. Commit b41277dc7a18ee332d in 2.6.38 introduced the regression by accidentally removing the the FLAGS_HAS_TIMEOUT assignment from futex_wait() during the setup of the restart block. Restore the originaly behavior. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=32922 Reported-by: Tim Smith <[email protected]> Reported-by: Torsten Hilbrich <[email protected]> Signed-off-by: Darren Hart <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: John Kacur <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/%3Cdaac0eb3af607f72b9a4d3126b2ba8fb5ed3b883.1302820917.git.dvhart%40linux.intel.com%3E Signed-off-by: Thomas Gleixner <[email protected]>
2011-04-15vfs: fix incorrect dentry_update_name_case() BUG_ON() testLinus Torvalds1-1/+1
The case we should be verifying when updating the dentry name is that the _parent_ inode (the directory) semaphore is held, not the semaphore for the dentry itself. It's the directory locking that rename and readdir() etc all care about. The comment just above even says so - but then the BUG_ON() still checked the dentry inode itself. Very few people noticed, because this helper function really isn't used for very much, so you had to be using ncpfs to ever hit it. I think I should just remove the BUG_ON (the function really has just one user), but let's run with it fixed for a while before getting rid of it entirely. Reported-and-tested-by: Bongani Hlope <[email protected]> Reported-and-tested-by: Bernd Feige <[email protected]> Cc: Petr Vandrovec <[email protected]>, Cc: Arnd Bergmann <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Nick Piggin <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-15block: only force kblockd unplugging from the schedule() pathJens Axboe2-8/+9
For the explicit unplugging, we'd prefer to kick things off immediately and not pay the penalty of the latency to switch to kblockd. So let blk_finish_plug() do the run inline, while the implicit-on-schedule-out unplug will punt to kblockd. Signed-off-by: Jens Axboe <[email protected]>
2011-04-15block: cleanup the block plug helper functionsChristoph Hellwig2-21/+9
It's a bit of a mess currently. task->plug is being cleared and reset in __blk_finish_plug(), and blk_finish_plug() is testing for a NULL plug which cannot happen even from schedule() anymore since it uses blk_needs_flush_plug() to determine whether to call into this function at all. So get rid of some of the cruft. Signed-off-by: Jens Axboe <[email protected]>
2011-04-14Merge branch 'for-linus' of ↵Linus Torvalds4-23/+42
git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin: Blackfin: SMP: fix cache flush loop Blackfin: time-ts: ack gptimer sooner to avoid missing short ints Blackfin: gptimers: fix thinko when disabling timers Blackfin: SMP: make all barriers handle cache issues
2011-04-14Merge branch 'for-linus' of ↵Linus Torvalds1-3/+9
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: fix linger request requeueing
2011-04-14mm/thp: use conventional format for boolean attributesBen Hutchings1-10/+14
The conventional format for boolean attributes in sysfs is numeric ("0" or "1" followed by new-line). Any boolean attribute can then be read and written using a generic function. Using the strings "yes [no]", "[yes] no" (read), "yes" and "no" (write) will frustrate this. [[email protected]: use kstrtoul()] [[email protected]: test_bit() doesn't return 1/0, per Neil] Signed-off-by: Ben Hutchings <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Hugh Dickins <[email protected]> Tested-by: David Rientjes <[email protected]> Cc: NeilBrown <[email protected]> Cc: <[email protected]> [2.6.38.x] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14ramfs: fix memleak on no-mmu archBob Liu1-0/+1
On no-mmu arch, there is a memleak during shmem test. The cause of this memleak is ramfs_nommu_expand_for_mapping() added page refcount to 2 which makes iput() can't free that pages. The simple test file is like this: int main(void) { int i; key_t k = ftok("/etc", 42); for ( i=0; i<100; ++i) { int id = shmget(k, 10000, 0644|IPC_CREAT); if (id == -1) { printf("shmget error\n"); } if(shmctl(id, IPC_RMID, NULL ) == -1) { printf("shm rm error\n"); return -1; } } printf("run ok...\n"); return 0; } And the result: root:/> free total used free shared buffers Mem: 60320 17912 42408 0 0 -/+ buffers: 17912 42408 root:/> shmem run ok... root:/> free total used free shared buffers Mem: 60320 19096 41224 0 0 -/+ buffers: 19096 41224 root:/> shmem run ok... root:/> free total used free shared buffers Mem: 60320 20296 40024 0 0 -/+ buffers: 20296 40024 ... After this patch the test result is:(no memleak anymore) root:/> free total used free shared buffers Mem: 60320 16668 43652 0 0 -/+ buffers: 16668 43652 root:/> shmem run ok... root:/> free total used free shared buffers Mem: 60320 16668 43652 0 0 -/+ buffers: 16668 43652 Signed-off-by: Bob Liu <[email protected]> Acked-by: Hugh Dickins <[email protected]> Signed-off-by: David Howells <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14um: disable CONFIG_CMPXCHG_LOCALRichard Weinberger1-0/+4
Commit 8a5ec0ba "Lockless (and preemptless) fastpaths for slub" makes use of this_cpu_cmpxchg_double() which needs this_cpu_cmpxchg16b_emu() on x86_64. Implementing cmpxchg16b emulation for UML would introduce too much complexity. So just disable it. Signed-off-by: Richard Weinberger <[email protected]> Reported-by: Sergei Trofimovich <[email protected]> Acked-by: Pekka Enberg <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14um: fix call tracer and bug handlerRichard Weinberger1-0/+6
Commit 1de1502c ("x86, um: now we can get rid of trivial uml headers") removed accidentally bug.h which broke UML's call tracer and bug handler. Without asm-generic/bug.h UML uses BUG() from arch/x86/ which makes use of ud2. UML cannot use ud2, it raises SIGILL in user mode. As UML has a different stack for handling signals the call trace will be cut off. Signed-off-by: Richard Weinberger <[email protected]> Reported-by: Sergei Trofimovich <[email protected]> Tested-by: Sergei Trofimovich <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14fs/fhandle.c: add <linux/personality.h> for ia64Jeff Mahoney1-0/+1
force_o_largefile() on ia64 is defined in <asm/fcntl.h> and requires <linux/personality.h>. Signed-off-by: Jeff Mahoney <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14MAINTAINERS: change mail adress of Hans J. KochHans J. Koch1-3/+3
My old mail address doesn't exist anymore. This patch changes all occurences in MAINTAINERS to my new address. Signed-off-by: Hans J. Koch <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14RapidIO/mpc85xx: fix possible mport registration problemsAlexandre Bounine3-4/+7
Fix a possible problem with mport registration left non-cleared after fsl_rio_setup() exits on link error. Abort mport initialization if registration failed. This patch is applicable to 2.6.39-rc1 only. The problem does not exist for earlier versions. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Kumar Gala <[email protected]> Cc: Matt Porter <[email protected]> Cc: Li Yang <[email protected]> Cc: Thomas Moll <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14RapidIO: add IDT CPS-1432 switch definitionsAlexandre Bounine2-0/+2
Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Kumar Gala <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14oom-kill: remove boost_dying_task_prio()KOSAKI Motohiro1-28/+0
This is an almost-revert of commit 93b43fa ("oom: give the dying task a higher priority"). That commit dramatically improved oom killer logic when a fork-bomb occurs. But I've found that it has nasty corner case. Now cpu cgroup has strange default RT runtime. It's 0! That said, if a process under cpu cgroup promote RT scheduling class, the process never run at all. If an admin inserts a !RT process into a cpu cgroup by setting rtruntime=0, usually it runs perfectly because a !RT task isn't affected by the rtruntime knob. But if it promotes an RT task via an explicit setscheduler() syscall or an OOM, the task can't run at all. In short, the oom killer doesn't work at all if admins are using cpu cgroup and don't touch the rtruntime knob. Eventually, kernel may hang up when oom kill occur. I and the original author Luis agreed to disable this logic. Signed-off-by: KOSAKI Motohiro <[email protected]> Acked-by: Luis Claudio R. Goncalves <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14vmscan: all_unreclaimable() use zone->all_unreclaimable as a nameKOSAKI Motohiro1-11/+13
all_unreclaimable check in direct reclaim has been introduced at 2.6.19 by following commit. 2006 Sep 25; commit 408d8544; oom: use unreclaimable info And it went through strange history. firstly, following commit broke the logic unintentionally. 2008 Apr 29; commit a41f24ea; page allocator: smarter retry of costly-order allocations Two years later, I've found obvious meaningless code fragment and restored original intention by following commit. 2010 Jun 04; commit bb21c7ce; vmscan: fix do_try_to_free_pages() return value when priority==0 But, the logic didn't works when 32bit highmem system goes hibernation and Minchan slightly changed the algorithm and fixed it . 2010 Sep 22: commit d1908362: vmscan: check all_unreclaimable in direct reclaim path But, recently, Andrey Vagin found the new corner case. Look, struct zone { .. int all_unreclaimable; .. unsigned long pages_scanned; .. } zone->all_unreclaimable and zone->pages_scanned are neigher atomic variables nor protected by lock. Therefore zones can become a state of zone->page_scanned=0 and zone->all_unreclaimable=1. In this case, current all_unreclaimable() return false even though zone->all_unreclaimabe=1. This resulted in the kernel hanging up when executing a loop of the form 1. fork 2. mmap 3. touch memory 4. read memory 5. munmmap as described in http://www.gossamer-threads.com/lists/linux/kernel/1348725#1348725 Is this ignorable minor issue? No. Unfortunately, x86 has very small dma zone and it become zone->all_unreclamble=1 easily. and if it become all_unreclaimable=1, it never restore all_unreclaimable=0. Why? if all_unreclaimable=1, vmscan only try DEF_PRIORITY reclaim and a-few-lru-pages>>DEF_PRIORITY always makes 0. that mean no page scan at all! Eventually, oom-killer never works on such systems. That said, we can't use zone->pages_scanned for this purpose. This patch restore all_unreclaimable() use zone->all_unreclaimable as old. and in addition, to add oom_killer_disabled check to avoid reintroduce the issue of commit d1908362 ("vmscan: check all_unreclaimable in direct reclaim path"). Reported-by: Andrey Vagin <[email protected]> Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14mm: check that we have the right vma in __access_remote_vm()Michael Ellerman1-1/+1
In __access_remote_vm() we need to check that we have found the right vma, not the following vma before we try to access it. Otherwise we might call the vma's access routine with an address which does not fall inside the vma. It was discovered on a current kernel but with an unreleased driver, from memory it was strace leading to a kernel bad access, but it obviously depends on what the access implementation does. Looking at other access implementations I only see: $ git grep -A 5 vm_operations|grep access arch/powerpc/platforms/cell/spufs/file.c- .access = spufs_mem_mmap_access, arch/x86/pci/i386.c- .access = generic_access_phys, drivers/char/mem.c- .access = generic_access_phys fs/sysfs/bin.c- .access = bin_access, The spufs one looks like it might behave badly given the wrong vma, it assumes vma->vm_file->private_data is a spu_context, and looks like it would probably blow up pretty quickly if it wasn't. generic_access_phys() only uses the vma to check vm_flags and get the mm, and then walks page tables using the address. So it should bail on the vm_flags check, or at worst let you access some other VM_IO mapping. And bin_access() just proxies to another access implementation. Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14brk: COMPAT_BRK: fix detection of randomized brkJiri Kosina3-2/+9
5520e89 ("brk: fix min_brk lower bound computation for COMPAT_BRK") tried to get the whole logic of brk randomization for legacy (libc5-based) applications finally right. It turns out that the way to detect whether brk has actually been randomized in the end or not introduced by that patch still doesn't work for those binaries, as reported by Geert: : /sbin/init from my old m68k ramdisk exists prematurely. : : Before the patch: : : | brk(0x80005c8e) = 0x80006000 : : After the patch: : : | brk(0x80005c8e) = 0x80005c8e : : Old libc5 considers brk() to have failed if the return value is not : identical to the requested value. I don't like it, but currently see no better option than a bit flag in task_struct to catch the CONFIG_COMPAT_BRK && randomize_va_space == 2 case. Signed-off-by: Jiri Kosina <[email protected]> Tested-by: Geert Uytterhoeven <[email protected]> Reported-by: Geert Uytterhoeven <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14drivers/misc/sgi-gru/grufile.c: fix the wrong members of gru_chipWanlong Gao1-4/+4
Fix the wrong members and the wrong function's definition, since the irq_chip had changed. Signed-off-by: Wanlong Gao <[email protected]> Cc: Jack Steiner <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14tmpfs: fix off-by-one in max_blocks checksHugh Dickins1-2/+4
If you fill up a tmpfs, df was showing tmpfs 460800 - - - /tmp because of an off-by-one in the max_blocks checks. Fix it so df shows tmpfs 460800 460800 0 100% /tmp Signed-off-by: Hugh Dickins <[email protected]> Cc: Tim Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14MAINTAINERS: update STABLE BRANCH infoRandy Dunlap1-1/+0
Drop Chris Wright from STABLE maintainers. He hasn't done STABLE release work for quite some time. Signed-off-by: Randy Dunlap <[email protected]> Acked-by: Chris Wright <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14mm: add VM counters for transparent hugepagesAndi Kleen3-4/+37
I found it difficult to make sense of transparent huge pages without having any counters for its actions. Add some counters to vmstat for allocation of transparent hugepages and fallback to smaller pages. Optional patch, but useful for development and understanding the system. Contains improvements from Andrea Arcangeli and Johannes Weiner [[email protected]: coding-style fixes] [[email protected]: fix vmstat_text[] entries] Signed-off-by: Andi Kleen <[email protected]> Acked-by: Andrea Arcangeli <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14MAINTAINERS: update various tty patternsJoe Perches1-16/+13
Commits 4a6514e6d0 ("tty: move obsolete and broken tty drivers to drivers/staging/tty/") and a6afd9f3e8 ("tty: move a number of tty drivers from drivers/char/ to drivers/tty/") moved files around. Update patterns and orphan some files that were moved to staging. Signed-off-by: Joe Perches <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14MAINTAINERS: update m68knommu patternsJoe Perches1-1/+2
Commit 66d857b08b ("m68k: merge m68k and m68knommu arch directories") moved the files around. Signed-off-by: Joe Perches <[email protected]> Acked-by: Greg Ungerer <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14MAINTAINERS: add ARM/ts78xx-setup platform maintainerAlexander Clouter1-0/+7
Signed-off-by: Alexander Clouter <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14kstrtox: simpler code in _kstrtoull()Alexey Dobriyan1-6/+3
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14kstrtox: fix compile warnings in testAlexey Dobriyan1-16/+16
Fix the following warnings: CC [M] lib/test-kstrtox.o lib/test-kstrtox.c: In function 'test_kstrtou64_ok': lib/test-kstrtox.c:318: warning: this decimal constant is unsigned only in ISO C90 ... Signed-off-by: Alexey Dobriyan <[email protected]> Reported-by: Geert Uytterhoeven <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14leds/leds-regulator.c: fix handling of already enabled regulatorsAntonio Ospite1-0/+4
Make the driver aware of the initial status of the regulator. The leds-regulator driver was ignoring the initial status of the regulator; this resulted in rdev->use_count being incremented to 2 after calling regulator_led_set_value() in the .probe method when a regulator was already enabled at insmod time, which made it impossible to ever disable the regulator. Signed-off-by: Antonio Ospite <[email protected]> Cc: Richard Purdie <[email protected]> Cc: Antonio Ospite <[email protected]> Acked-by: Mark Brown <[email protected]> Cc: Liam Girdwood <[email protected]> Cc: Daniel Ribeiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14vmstat: update comment regarding stat_thresholdChristoph Lameter1-3/+6
Signed-off-by: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14mm/page_alloc.c: silence build_all_zonelists() section mismatchPaul Mundt1-1/+1
The memory hotplug case involves calling to build_all_zonelists() which in turns calls in to setup_zone_pageset(). The latter is marked __meminit while build_all_zonelists() itself has no particular annotation. build_all_zonelists() is only handed a non-NULL pointer in the case of memory hotplug through an existing __meminit path, so the setup_zone_pageset() reference is always safe. The options as such are either to flag build_all_zonelists() as __ref (as per __build_all_zonelists()), or to simply discard the __meminit annotation from setup_zone_pageset(). Signed-off-by: Paul Mundt <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14fs/partitions/ldm.c: fix oops caused by corrupted partition tableTimo Warns1-4/+12
The kernel automatically evaluates partition tables of storage devices. The code for evaluating LDM partitions (in fs/partitions/ldm.c) contains a bug that causes a kernel oops on certain corrupted LDM partitions. A kernel subsystem seems to crash, because, after the oops, the kernel no longer recognizes newly connected storage devices. The patch validates the value of vblk_size. [[email protected]: coding-style fixes] Signed-off-by: Timo Warns <[email protected]> Cc: Eugene Teo <[email protected]> Cc: Harvey Harrison <[email protected]> Cc: Richard Russon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-04-14drivers/rtc/rtc-mc13xxx.c: fix unterminated platform_device_id tableAxel Lin1-0/+1
The platform_device_id table is supposed to be zero-terminated. Signed-off-by: Axel Lin <[email protected]> Acked-by: Uwe Kleine-König <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>