aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-05-25memblock/nobootmem: allow alloc_bootmem() to take 0 as low limitYinghai Lu1-9/+16
The bootmem wrapper with memblock supports top-down now, so we do not need to set the low limit to __pa(MAX_DMA_ADDRESS). The logic should be: good to allocate above __pa(MAX_DMA_ADDRESS), but it is ok if we can not find memory above 16M on system that has a small amount of RAM. Signed-off-by: Yinghai LU <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Olaf Hering <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Lucas De Marchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: delete non-atomic mm counter implementationMatt Fleming2-43/+10
The problem with having two different types of counters is that developers adding new code need to keep in mind whether it's safe to use both the atomic and non-atomic implementations. For example, when adding new callers of the *_mm_counter() functions a developer needs to ensure that those paths are always executed with page_table_lock held, in case we're using the non-atomic implementation of mm counters. Hugh Dickins introduced the atomic mm counters in commit f412ac08c986 ("[PATCH] mm: fix rss and mmlist locking"). When asked why he left the non-atomic counters around he said, | The only reason was to avoid adding costly atomic operations into a | configuration that had no need for them there: the page_table_lock | sufficed. | | Certainly it would be simpler just to delete the non-atomic variant. | | And I think it's fair to say that any configuration on which we're | measuring performance to that degree (rather than "does it boot fast?" | type measurements), would already be going the split ptlocks route. Removing the non-atomic counters eases the maintenance burden because developers no longer have to mindful of the two implementations when using *_mm_counter(). Note that all architectures provide a means of atomically updating atomic_long_t variables, even if they have to revert to the generic spinlock implementation because they don't support 64-bit atomic instructions (see lib/atomic64.c). Signed-off-by: Matt Fleming <[email protected]> Acked-by: Dave Hansen <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: fail GFP_DMA allocations when ZONE_DMA is not configuredDavid Rientjes1-0/+4
The page allocator will improperly return a page from ZONE_NORMAL even when __GFP_DMA is passed if CONFIG_ZONE_DMA is disabled. The caller expects DMA memory, perhaps for ISA devices with 16-bit address registers, and may get higher memory resulting in undefined behavior. This patch causes the page allocator to return NULL in such circumstances with a warning emitted to the kernel log on the first occurrence. Signed-off-by: David Rientjes <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: do not define PFN_SECTION_SHIFT if !CONFIG_SPARSEMEMDaniel Kiper1-4/+0
Do not define PFN_SECTION_SHIFT if !CONFIG_SPARSEMEM. Signed-off-by: Daniel Kiper <[email protected]> Acked-by: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: pfn_to_section_nr()/section_nr_to_pfn() is valid only in ↵Daniel Kiper1-3/+3
CONFIG_SPARSEMEM context pfn_to_section_nr()/section_nr_to_pfn() is valid only in CONFIG_SPARSEMEM context. Move it to proper place. Signed-off-by: Daniel Kiper <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: enable set_page_section() only if CONFIG_SPARSEMEM and ↵Daniel Kiper1-6/+8
!CONFIG_SPARSEMEM_VMEMMAP set_page_section() is meaningful only in CONFIG_SPARSEMEM and !CONFIG_SPARSEMEM_VMEMMAP context. Move it to proper place and amend accordingly functions which are using it. Signed-off-by: Daniel Kiper <[email protected]> Acked-by: Dave Hansen <[email protected]> Acked-by: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: remove dependency on CONFIG_FLATMEM from online_page()Daniel Kiper1-4/+0
online_pages() is only compiled for CONFIG_MEMORY_HOTPLUG_SPARSE, so there is no need to support CONFIG_FLATMEM code within it. This patch removes code that is never used. Signed-off-by: Daniel Kiper <[email protected]> Acked-by: Dave Hansen <[email protected]> Acked-by: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: filter unevictable page out in deactivate_page()Minchan Kim1-0/+7
It's pointless that deactive_page's operates on unevictable pages. This patch removes unnecessary overhead which might be a bit problem in case that there are many unevictable page in system(ex, mprotect workload) [[email protected]: tidy up comment] Reviewed-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Minchan Kim <[email protected]> Reviewed-by: Rik van Riel<[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25readahead: trigger mmap sequential readahead on PG_readaheadWu Fengguang1-4/+2
Previously the mmap sequential readahead is triggered by updating ra->prev_pos on each page fault and compare it with current page offset. It costs dirtying the cache line on each _minor_ page fault. So remove the ra->prev_pos recording, and instead tag PG_readahead to trigger the possible sequential readahead. It's not only more simple, but also will work more reliably and reduce cache line bouncing on concurrent page faults on shared struct file. In the mosbench exim benchmark which does multi-threaded page faults on shared struct file, the ra->mmap_miss and ra->prev_pos updates are found to cause excessive cache line bouncing on tmpfs, which actually disabled readahead totally (shmem_backing_dev_info.ra_pages == 0). So remove the ra->prev_pos recording, and instead tag PG_readahead to trigger the possible sequential readahead. It's not only more simple, but also will work more reliably on concurrent reads on shared struct file. Signed-off-by: Wu Fengguang <[email protected]> Tested-by: Tim Chen <[email protected]> Reported-by: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25readahead: reduce unnecessary mmap_miss increasesAndi Kleen1-1/+2
The original INT_MAX is too large, reduce it to - avoid unnecessarily dirtying/bouncing the cache line - restore mmap read-around faster on changed access pattern Background: in the mosbench exim benchmark which does multi-threaded page faults on shared struct file, the ra->mmap_miss updates are found to cause excessive cache line bouncing on tmpfs. The ra state updates are needless for tmpfs because it actually disabled readahead totally (shmem_backing_dev_info.ra_pages == 0). Tested-by: Tim Chen <[email protected]> Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Wu Fengguang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25readahead: return early when readahead is disabledWu Fengguang1-6/+6
Reduce readahead overheads by returning early in do_sync_mmap_readahead(). tmpfs has ra_pages=0 and it can page fault really fast (not constraint by IO if not swapping). Signed-off-by: Wu Fengguang <[email protected]> Tested-by: Tim Chen <[email protected]> Reported-by: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25vmscan: change shrinker API by passing shrink_control structYing Han21-61/+95
Change each shrinker's API by consolidating the existing parameters into shrink_control struct. This will simplify any further features added w/o touching each file of shrinker. [[email protected]: fix build] [[email protected]: fix warning] [[email protected]: fix up new shrinker API] [[email protected]: fix xfs warning] [[email protected]: update gfs2] Signed-off-by: Ying Han <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Minchan Kim <[email protected]> Acked-by: Pavel Emelyanov <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Steven Whitehouse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25vmscan: change shrink_slab() interfaces by passing shrink_controlYing Han4-17/+55
Consolidate the existing parameters to shrink_slab() into a new shrink_control struct. This is needed later to pass the same struct to shrinkers. Signed-off-by: Ying Han <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Minchan Kim <[email protected]> Acked-by: Pavel Emelyanov <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25readahead: readahead page allocations are OK to failWu Fengguang2-1/+7
Pass __GFP_NORETRY|__GFP_NOWARN for readahead page allocations. readahead page allocations are completely optional. They are OK to fail and in particular shall not trigger OOM on themselves. Reported-by: Dave Young <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Wu Fengguang <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: check if any page in a pageblock is reserved before marking it ↵Arve Hjønnevåg1-2/+17
MIGRATE_RESERVE This fixes a problem where the first pageblock got marked MIGRATE_RESERVE even though it only had a few free pages. eg, On current ARM port, The kernel starts at offset 0x8000 to leave room for boot parameters, and the memory is freed later. This in turn caused no contiguous memory to be reserved and frequent kswapd wakeups that emptied the caches to get more contiguous memory. Unfortunatelly, ARM needs order-2 allocation for pgd (see arm/mm/pgd.c#pgd_alloc()). Therefore the issue is not minor nor easy avoidable. [[email protected]: added some explanation] [[email protected]: add !pfn_valid_within() to check] [[email protected]: check end_pfn in pageblock_is_reserved] Signed-off-by: John Stultz <[email protected]> Signed-off-by: Arve Hjønnevåg <[email protected]> Signed-off-by: KOSAKI Motohiro <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25m32r, mm: set all online nodes in N_NORMAL_MEMORYDavid Rientjes1-0/+1
For m32r, N_NORMAL_MEMORY represents all nodes that have present memory since it does not support HIGHMEM. This patch sets the bit at the time the node is initialized. If N_NORMAL_MEMORY is not accurate, slub may encounter errors since it uses this nodemask to setup per-cache kmem_cache_node data structures. Signed-off-by: David Rientjes <[email protected]> Cc: Hirokazu Takata <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25alpha, mm: set all online nodes in N_NORMAL_MEMORYDavid Rientjes1-0/+1
For alpha, N_NORMAL_MEMORY represents all nodes that have present memory since it does not support HIGHMEM. This patch sets the bit at the time the node is initialized. If N_NORMAL_MEMORY is not accurate, slub may encounter errors since it uses this nodemask to setup per-cache kmem_cache_node data structures. Signed-off-by: David Rientjes <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Matt Turner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: strictly require elevated page refcount in isolate_lru_page()Konstantin Khlebnikov1-1/+4
isolate_lru_page() must be called only with stable reference to the page, this is what is written in the comment above it, this is reasonable. current isolate_lru_page() users and its page extra reference sources: mm/huge_memory.c: __collapse_huge_page_isolate() - reference from pte mm/memcontrol.c: mem_cgroup_move_parent() - get_page_unless_zero() mem_cgroup_move_charge_pte_range() - reference from pte mm/memory-failure.c: soft_offline_page() - fixed, reference from get_any_page() delete_from_lru_cache() - reference from caller or get_page_unless_zero() [ seems like there bug, because __memory_failure() can call page_action() for hpages tail, but it is ok for isolate_lru_page(), tail getted and not in lru] mm/memory_hotplug.c: do_migrate_range() - fixed, get_page_unless_zero() mm/mempolicy.c: migrate_page_add() - reference from pte mm/migrate.c: do_move_page_to_node_array() - reference from follow_page() mlock.c: - various external references mm/vmscan.c: putback_lru_page() - reference from isolate_lru_page() It seems that all isolate_lru_page() users are ready now for this restriction. So, let's replace redundant get_page_unless_zero() with get_page() and add page initial reference count check with VM_BUG_ON() Signed-off-by: Konstantin Khlebnikov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mem-hwpoison: fix page refcount around isolate_lru_page()Konstantin Khlebnikov1-5/+6
Drop first page reference only after calling isolate_lru_page() to keep page stable reference while isolating. Signed-off-by: Konstantin Khlebnikov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mem-hotplug: call isolate_lru_page with elevated refcountKonstantin Khlebnikov1-1/+3
isolate_lru_page() must be called only with stable reference to page. So, let's grab normal page reference. Signed-off-by: Konstantin Khlebnikov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Lee Schermerhorn <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: print vmalloc() state after allocation failuresDave Hansen1-2/+7
I was tracking down a page allocation failure that ended up in vmalloc(). Since vmalloc() uses 0-order pages, if somebody asks for an insane amount of memory, we'll still get a warning with "order:0" in it. That's not very useful. During recovery, vmalloc() also nicely frees all of the memory that it got up to the point of the failure. That is wonderful, but it also quickly hides any issues. We have a much different sitation if vmalloc() repeatedly fails 10GB in to: vmalloc(100 * 1<<30); versus repeatedly failing 4096 bytes in to a: vmalloc(8192); This patch will print out messages that look like this: [ 68.123503] vmalloc: allocation failure, allocated 6680576 of 13426688 bytes [ 68.124218] bash: page allocation failure: order:0, mode:0xd2 [ 68.124811] Pid: 3770, comm: bash Not tainted 2.6.39-rc3-00082-g85f2e68-dirty #333 [ 68.125579] Call Trace: [ 68.125853] [<ffffffff810f6da6>] warn_alloc_failed+0x146/0x170 [ 68.126464] [<ffffffff8107e05c>] ? printk+0x6c/0x70 [ 68.126791] [<ffffffff8112b5d4>] ? alloc_pages_current+0x94/0xe0 [ 68.127661] [<ffffffff8111ed37>] __vmalloc_node_range+0x237/0x290 ... The 'order' variable is added for clarity when calling warn_alloc_failed() to avoid having an unexplained '0' as an argument. The 'tmp_mask' is because adding an open-coded '| __GFP_NOWARN' would take us over 80 columns for the alloc_pages_node() call. If we are going to add a line, it might as well be one that makes the sucker easier to read. As a side issue, I also noticed that ctl_ioctl() does vmalloc() based solely on an unverified value passed in from userspace. Granted, it's under CAP_SYS_ADMIN, but it still frightens me a bit. Signed-off-by: Dave Hansen <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: David Rientjes <[email protected]> Cc: Michal Nazarewicz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: break out page allocation warning codeDave Hansen2-21/+43
This originally started as a simple patch to give vmalloc() some more verbose output on failure on top of the plain page allocator messages. Johannes suggested that it might be nicer to lead with the vmalloc() info _before_ the page allocator messages. But, I do think there's a lot of value in what __alloc_pages_slowpath() does with its filtering and so forth. This patch creates a new function which other allocators can call instead of relying on the internal page allocator warnings. It also gives this function private rate-limiting which separates it from other printk_ratelimit() users. Signed-off-by: Dave Hansen <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: David Rientjes <[email protected]> Cc: Michal Nazarewicz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: convert mm->cpu_vm_cpumask into cpumask_var_tKOSAKI Motohiro7-9/+44
cpumask_t is very big struct and cpu_vm_mask is placed wrong position. It might lead to reduce cache hit ratio. This patch has two change. 1) Move the place of cpumask into last of mm_struct. Because usually cpumask is accessed only front bits when the system has cpu-hotplug capability 2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory footprint if cpumask_size() will use nr_cpumask_bits properly in future. In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var. It may help to detect out of tree cpu_vm_mask users. This patch has no functional change. [[email protected]: build fix] [[email protected]: coding-style fixes] Signed-off-by: KOSAKI Motohiro <[email protected]> Cc: David Howells <[email protected]> Cc: Koichi Yasutake <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Chris Metcalf <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: thp: optimize memcg charge in khugepagedAndrea Arcangeli1-10/+11
We don't need to hold the mmmap_sem through mem_cgroup_newpage_charge(), the mmap_sem is only hold for keeping the vma stable and we don't need the vma stable anymore after we return from alloc_hugepage_vma(). Signed-off-by: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: David Rientjes <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: uninline large generic tlb.h functionsPeter Zijlstra2-124/+135
Some of these functions have grown beyond inline sanity, move them out-of-line. Signed-off-by: Peter Zijlstra <[email protected]> Requested-by: Andrew Morton <[email protected]> Requested-by: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: optimize page_lock_anon_vma() fast-pathPeter Zijlstra1-4/+82
Optimize the page_lock_anon_vma() fast path to be one atomic op, instead of two. Signed-off-by: Peter Zijlstra <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: convert anon_vma->lock to a mutexPeter Zijlstra6-25/+21
Straightforward conversion of anon_vma->lock to a mutex. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Hugh Dickins <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: use refcounts for page_lock_anon_vma()Peter Zijlstra2-28/+31
Convert page_lock_anon_vma() over to use refcounts. This is done to prepare for the conversion of anon_vma from spinlock to mutex. Sadly this inceases the cost of page_lock_anon_vma() from one to two atomics, a follow up patch addresses this, lets keep that simple for now. Signed-off-by: Peter Zijlstra <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: improve page_lock_anon_vma() commentPeter Zijlstra1-2/+16
A slightly more verbose comment to go along with the trickery in page_lock_anon_vma(). Signed-off-by: Peter Zijlstra <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: revert page_lock_anon_vma() lock annotationPeter Zijlstra2-17/+2
Its beyond ugly and gets in the way. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: Convert i_mmap_lock to a mutexPeter Zijlstra17-58/+58
Straightforward conversion of i_mmap_lock to a mutex. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: Remove i_mmap_lock lockbreakPeter Zijlstra8-188/+28
Hugh says: "The only significant loser, I think, would be page reclaim (when concurrent with truncation): could spin for a long time waiting for the i_mmap_mutex it expects would soon be dropped? " Counter points: - cpu contention makes the spin stop (need_resched()) - zap pages should be freeing pages at a higher rate than reclaim ever can I think the simplification of the truncate code is definitely worth it. Effectively reverts: 2aa15890f3c ("mm: prevent concurrent unmap_mapping_range() on the same inode") and takes out the code that caused its problem. Signed-off-by: Peter Zijlstra <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25lockdep, mutex: provide mutex_lock_nest_lockPeter Zijlstra3-8/+29
In order to convert i_mmap_lock to a mutex we need a mutex equivalent to spin_lock_nest_lock(), thus provide the mutex_lock_nest_lock() annotation. As with spin_lock_nest_lock(), mutex_lock_nest_lock() allows annotation of the locking pattern where an outer lock serializes the acquisition order of nested locks. That is, if every time you lock multiple locks A, say A1 and A2 you first acquire N, the order of acquiring A1 and A2 is irrelevant. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: extended batches for generic mmu_gatherPeter Zijlstra2-47/+84
Instead of using a single batch (the small on-stack, or an allocated page), try and extend the batch every time it runs out and only flush once either the extend fails or we're done. Signed-off-by: Peter Zijlstra <[email protected]> Requested-by: Nick Piggin <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm, powerpc: move the RCU page-table freeing into generic codePeter Zijlstra10-125/+150
In case other architectures require RCU freed page-tables to implement gup_fast() and software filled hashes and similar things, provide the means to do so by moving the logic into generic code. Signed-off-by: Peter Zijlstra <[email protected]> Requested-by: David Miller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: now that all old mmu_gather code is gone, remove the storagePeter Zijlstra21-41/+0
Fold all the mmu_gather rework patches into one for submission Signed-off-by: Peter Zijlstra <[email protected]> Reported-by: Hugh Dickins <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25um: mmu_gather reworkPeter Zijlstra1-18/+11
Fix up the um mmu_gather code to conform to the new API. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25ia64: mmu_gather reworkPeter Zijlstra1-20/+46
Fix up the ia64 mmu_gather code to conform to the new API. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Tony Luck <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25sh: mmu_gather reworkPeter Zijlstra1-11/+17
Fix up the sh mmu_gather code to conform to the new API. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Paul Mundt <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25arm: mmu_gather reworkPeter Zijlstra1-16/+37
Fix up the arm mmu_gather code to conform to the new API. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25s390: mmu_gather reworkPeter Zijlstra1-25/+37
Adapt the stand-alone s390 mmu_gather implementation to the new preemptible mmu_gather interface. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25sparc: mmu_gather reworkPeter Zijlstra6-116/+63
Rework the sparc mmu_gather usage to conform to the new world order :-) Sparc mmu_gather does two things: - tracks vaddrs to unhash - tracks pages to free Split these two things like powerpc has done and keep the vaddrs in per-cpu data structures and flush them on context switch. The remaining bits can then use the generic mmu_gather. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: David Miller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25powerpc: mmu_gather reworkPeter Zijlstra8-17/+46
Fix up powerpc to the new mmu_gather stuff. PPC has an extra batching queue to RCU free the actual pagetable allocations, use the ARCH extentions for that for now. For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the hardware hash-table, keep using per-cpu arrays but flush on context switch and use a TLF bit to track the lazy_mmu state. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: mmu_gather reworkPeter Zijlstra5-65/+107
Rework the existing mmu_gather infrastructure. The direct purpose of these patches was to allow preemptible mmu_gather, but even without that I think these patches provide an improvement to the status quo. The first 9 patches rework the mmu_gather infrastructure. For review purpose I've split them into generic and per-arch patches with the last of those a generic cleanup. The next patch provides generic RCU page-table freeing, and the followup is a patch converting s390 to use this. I've also got 4 patches from DaveM lined up (not included in this series) that uses this to implement gup_fast() for sparc64. Then there is one patch that extends the generic mmu_gather batching. After that follow the mm preemptibility patches, these make part of the mm a lot more preemptible. It converts i_mmap_lock and anon_vma->lock to mutexes which together with the mmu_gather rework makes mmu_gather preemptible as well. Making i_mmap_lock a mutex also enables a clean-up of the truncate code. This also allows for preemptible mmu_notifiers, something that XPMEM I think wants. Furthermore, it removes the new and universially detested unmap_mutex. This patch: Remove the first obstacle towards a fully preemptible mmu_gather. The current scheme assumes mmu_gather is always done with preemption disabled and uses per-cpu storage for the page batches. Change this to try and allocate a page for batching and in case of failure, use a small on-stack array to make some progress. Preemptible mmu_gather is desired in general and usable once i_mmap_lock becomes a mutex. Doing it before the mutex conversion saves us from having to rework the code by moving the mmu_gather bits inside the pte_lock. Also avoid flushing the tlb batches from under the pte lock, this is useful even without the i_mmap_lock conversion as it significantly reduces pte lock hold times. [[email protected]: fix comment tpyo] Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Miller <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Russell King <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Tony Luck <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Hugh Dickins <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm: make expand_downwards() symmetrical with expand_upwards()Michal Hocko4-11/+8
Currently we have expand_upwards exported while expand_downwards is accessible only via expand_stack or expand_stack_downwards. check_stack_guard_page is a nice example of the asymmetry. It uses expand_stack for VM_GROWSDOWN while expand_upwards is called for VM_GROWSUP case. Let's clean this up by exporting both functions and make those names consistent. Let's use expand_{upwards,downwards} because expanding doesn't always involve stack manipulation (an example is ia64_do_page_fault which uses expand_upwards for registers backing store expansion). expand_downwards has to be defined for both CONFIG_STACK_GROWS{UP,DOWN} because get_arg_page calls the downwards version in the early process initialization phase for growsup configuration. Signed-off-by: Michal Hocko <[email protected]> Acked-by: Hugh Dickins <[email protected]> Cc: James Bottomley <[email protected]> Cc: "Luck, Tony" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm/vmalloc: remove guard page from between vmap blocksJohannes Weiner1-3/+3
The vmap allocator is used to, among other things, allocate per-cpu vmap blocks, where each vmap block is naturally aligned to its own size. Obviously, leaving a guard page after each vmap area forbids packing vmap blocks efficiently and can make the kernel run out of possible vmap blocks long before overall vmap space is exhausted. The new interface to map a user-supplied page array into linear vmalloc space (vm_map_ram) insists on allocating from a vmap block (instead of falling back to a custom area) when the area size is below a certain threshold. With heavy users of this interface (e.g. XFS) and limited vmalloc space on 32-bit, vmap block exhaustion is a real problem. Remove the guard page from the core vmap allocator. vmalloc and the old vmap interface enforce a guard page on their own at a higher level. Note that without this patch, we had accidental guard pages after those vm_map_ram areas that happened to be at the end of a vmap block, but not between every area. This patch removes this accidental guard page only. If we want guard pages after every vm_map_ram area, this should be done separately. And just like with vmalloc and the old interface on a different level, not in the core allocator. Mel pointed out: "If necessary, the guard page could be reintroduced as a debugging-only option (CONFIG_DEBUG_PAGEALLOC?). Otherwise it seems reasonable." Signed-off-by: Johannes Weiner <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Dave Chinner <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25include/linux/gfp.h: convert BUG_ON() into VM_BUG_ON()Dave Hansen1-3/+1
VM_BUG_ON() if effectively a BUG_ON() undef #ifdef CONFIG_DEBUG_VM. That is exactly what we have here now, and two different folks have suggested doing it this way. Signed-off-by: Dave Hansen <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25include/linux/gfp.h: work around apparent sparse confusionDave Hansen1-6/+1
Running sparse on page_alloc.c today, it errors out: include/linux/gfp.h:254:17: error: bad constant expression include/linux/gfp.h:254:17: error: cannot size expression which is a line in gfp_zone(): BUILD_BUG_ON((GFP_ZONE_BAD >> bit) & 1); That's really unfortunate, because it ends up hiding all of the other legitimate sparse messages like this: mm/page_alloc.c:5315:59: warning: incorrect type in argument 1 (different base types) mm/page_alloc.c:5315:59: expected unsigned long [unsigned] [usertype] size mm/page_alloc.c:5315:59: got restricted gfp_t [usertype] <noident> ... Having sparse be able to catch these very oopsable bugs is a lot more important than keeping a BUILD_BUG_ON(). Kill the BUILD_BUG_ON(). Compiles on x86_64 with and without CONFIG_DEBUG_VM=y. defconfig boots fine for me. Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25oom: replace PF_OOM_ORIGIN with toggling oom_score_adjDavid Rientjes5-14/+38
There's a kernel-wide shortage of per-process flags, so it's always helpful to trim one when possible without incurring a significant penalty. It's even more important when you're planning on adding a per- process flag yourself, which I plan to do shortly for transparent hugepages. PF_OOM_ORIGIN is used by ksm and swapoff to prefer current since it has a tendency to allocate large amounts of memory and should be preferred for killing over other tasks. We'd rather immediately kill the task making the errant syscall rather than penalizing an innocent task. This patch removes PF_OOM_ORIGIN since its behavior is equivalent to setting the process's oom_score_adj to OOM_SCORE_ADJ_MAX. The process's old oom_score_adj is stored and then set to OOM_SCORE_ADJ_MAX during the time it used to have PF_OOM_ORIGIN. The old value is then reinstated when the process should no longer be considered a high priority for oom killing. Signed-off-by: David Rientjes <[email protected]> Reviewed-by: KOSAKI Motohiro <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Izik Eidus <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-25mm/compaction: reverse the change that forbade sync migraton with ↵Andrea Arcangeli1-1/+1
__GFP_NO_KSWAPD It's uncertain this has been beneficial, so it's safer to undo it. All other compaction users would still go in synchronous mode if a first attempt at async compaction failed. Hopefully we don't need to force special behavior for THP (which is the only __GFP_NO_KSWAPD user so far and it's the easier to exercise and to be noticeable). This also make __GFP_NO_KSWAPD return to its original strict semantics specific to bypass kswapd, as THP allocations have khugepaged for the async THP allocations/compactions. Signed-off-by: Andrea Arcangeli <[email protected]> Cc: Alex Villacis Lasso <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>