path: root/mm
Age | Commit message | Author | Files | Lines
2022-05-13 | mm/memory-failure.c: simplify num_poisoned_pages_dec | zhenwei pi | 2 | -29/+9
Don't decrease the number of poisoned pages in page_alloc.c; let memory-failure.c alone increment and decrement the poisoned-page count. Also simplify unpoison_memory(): only decrease the number of poisoned pages when: - TestClearPageHWPoison() succeeds - put_page_back_buddy() succeeds After decreasing, print the necessary log. Finally, remove clear_page_hwpoison() and unpoison_taken_off_page(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: zhenwei pi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/memory-failure.c: move clear_hwpoisoned_pages | zhenwei pi | 3 | -27/+32
Patch series "memory-failure: fix hwpoison_filter", v2. As well known, the memory failure mechanism handles memory corrupted event, and try to send SIGBUS to the user process which uses this corrupted page. For the virtualization case, QEMU catches SIGBUS and tries to inject MCE into the guest, and the guest handles memory failure again. Thus the guest gets the minimal effect from hardware memory corruption. The further step I'm working on: 1, try to modify code to decrease poisoned pages in a single place (mm/memofy-failure.c: simplify num_poisoned_pages_dec in this series). 2, try to use page_handle_poison() to handle SetPageHWPoison() and num_poisoned_pages_inc() together. It would be best to call num_poisoned_pages_inc() in a single place too. 3, introduce memory failure notifier list in memory-failure.c: notify the corrupted PFN to someone who registers this list. If I can complete [1] and [2] part, [3] will be quite easy(just call notifier list after increasing poisoned page). 4, introduce memory recover VQ for memory balloon device, and registers memory failure notifier list. During the guest kernel handles memory failure, balloon device gets notified by memory failure notifier list, and tells the host to recover the corrupted PFN(GPA) by the new VQ. 5, host side remaps the corrupted page(HVA), and tells the guest side to unpoison the PFN(GPA). Then the guest fixes the corrupted page(GPA) dynamically. This patch (of 5): clear_hwpoisoned_pages() clears HWPoison flag and decreases the number of poisoned pages, this actually works as part of memory failure. Move this function from sparse.c to memory-failure.c, finally there is no CONFIG_MEMORY_FAILURE in sparse.c. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: zhenwei pi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/page_owner: use strscpy() instead of strlcpy() | Eric Dumazet | 1 | -1/+1
current->comm[] is not a string (no guarantee for a zero byte in it). strlcpy(s1, s2, l) is calling strlen(s2), potentially causing out-of-bound access, as reported by syzbot: detected buffer overflow in __fortify_strlen ------------[ cut here ]------------ kernel BUG at lib/string_helpers.c:980! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 4087 Comm: dhcpcd-run-hooks Not tainted 5.18.0-rc3-syzkaller-01537-g20b87e7c29df #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:fortify_panic+0x18/0x1a lib/string_helpers.c:980 Code: 8c e8 c5 ba e1 fa e9 23 0f bf fa e8 0b 5d 8c f8 eb db 55 48 89 fd e8 e0 49 40 f8 48 89 ee 48 c7 c7 80 f5 26 8a e8 99 09 f1 ff <0f> 0b e8 ca 49 40 f8 48 8b 54 24 18 4c 89 f1 48 c7 c7 00 00 27 8a RSP: 0018:ffffc900000074a8 EFLAGS: 00010286 RAX: 000000000000002c RBX: ffff88801226b728 RCX: 0000000000000000 RDX: ffff8880198e0000 RSI: ffffffff81600458 RDI: fffff52000000e87 RBP: ffffffff89da2aa0 R08: 000000000000002c R09: 0000000000000000 R10: ffffffff815fae2e R11: 0000000000000000 R12: ffff88801226b700 R13: ffff8880198e0830 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5876ad6ff8 CR3: 000000001a48c000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Call Trace: <IRQ> __fortify_strlen include/linux/fortify-string.h:128 [inline] strlcpy include/linux/fortify-string.h:143 [inline] __set_page_owner_handle+0x2b1/0x3e0 mm/page_owner.c:171 __set_page_owner+0x3e/0x50 mm/page_owner.c:190 prep_new_page mm/page_alloc.c:2441 [inline] get_page_from_freelist+0xba2/0x3e00 mm/page_alloc.c:4182 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5408 alloc_pages+0x1aa/0x310 mm/mempolicy.c:2272 alloc_slab_page mm/slub.c:1799 [inline] allocate_slab+0x26c/0x3c0 mm/slub.c:1944 new_slab mm/slub.c:2004 [inline] ___slab_alloc+0x8df/0xf20 mm/slub.c:3005 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3092 slab_alloc_node mm/slub.c:3183 [inline] slab_alloc mm/slub.c:3225 [inline] __kmem_cache_alloc_lru mm/slub.c:3232 [inline] kmem_cache_alloc+0x360/0x3b0 mm/slub.c:3242 dst_alloc+0x146/0x1f0 net/core/dst.c:92 Link: https://lkml.kernel.org/r/[email protected] Fixes: 865ed6a32786 ("mm/page_owner: record task command name") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Acked-by: Waiman Long <[email protected]> Acked-by: Shakeel Butt <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
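The class of bug here can be shown in ordinary userspace C (hedged illustration: this is not the kernel code or the actual one-line diff; strnlen()/memcpy() stand in for strscpy(), and the 16-byte array mimics a current->comm that lacks a terminating NUL):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            char comm[16];                     /* like current->comm: may have no '\0' */
            char dst[16];

            memset(comm, 'A', sizeof(comm));   /* completely full, no terminator */

            /* A strlcpy()-style copy calls strlen(comm) first, reading past
             * the end of comm before any truncation happens -- the bug above. */

            /* A strscpy()-style copy never reads beyond the destination size
             * and always NUL-terminates the result. */
            size_t n = strnlen(comm, sizeof(dst) - 1);
            memcpy(dst, comm, n);
            dst[n] = '\0';

            printf("copied %zu bytes\n", n);
            return 0;
    }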
2022-05-13 | kasan: give better names to shadow values | Andrey Konovalov | 5 | -21/+21
Rename KASAN_KMALLOC_* shadow values to KASAN_SLAB_*, as they are used for all slab allocations, not only for kmalloc. Also rename KASAN_FREE_PAGE to KASAN_PAGE_FREE to be consistent with KASAN_PAGE_REDZONE and KASAN_SLAB_FREE. Link: https://lkml.kernel.org/r/bebcaf4eafdb0cabae0401a69c0af956aa87fcaa.1652111464.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | kasan: use tabs to align shadow values | Andrey Konovalov | 1 | -16/+16
Consistently use tabs instead of spaces to align shadow value definitions. Link: https://lkml.kernel.org/r/00e7e66b5fc375d58200dc1489949b3edcd096b7.1652111464.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | kasan: clean up comments in internal kasan.h | Andrey Konovalov | 1 | -41/+33
Clean up comments in mm/kasan/kasan.h: clarify, unify styles, fix punctuation, etc. Link: https://lkml.kernel.org/r/a0680ff30035b56cb7bdd5f59fd400e71712ceb5.1652111464.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmalloc: use raw_cpu_ptr() for vmap_block_queue access | Sebastian Andrzej Siewior | 1 | -4/+2
The per-CPU resource vmap_block_queue is accessed via get_cpu_var(). That macro disables preemption and then loads the pointer from the current CPU. This doesn't work on PREEMPT_RT because a spinlock_t is later accessed within the preempt-disable section. There is no need to disable preemption while accessing the per-CPU struct vmap_block_queue because the list is protected with a spinlock_t. The per-CPU struct is also accessed cross-CPU in purge_fragmented_blocks(). It is possible that by using raw_cpu_ptr() the code migrates to another CPU and uses the struct from another CPU. This is fine because the list is locked and the locked section is very short. Use raw_cpu_ptr() to access vmap_block_queue. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
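A minimal sketch of the access pattern described above (assumption: the struct layout and the helper shown here are simplified stand-ins, not the actual mm/vmalloc.c code):

    struct vmap_block_queue {
            spinlock_t lock;
            struct list_head free;
    };
    static DEFINE_PER_CPU(struct vmap_block_queue, vmap_block_queue);

    static void queue_block(struct list_head *entry)
    {
            /* Before: vbq = &get_cpu_var(vmap_block_queue); -- disables preemption,
             * which PREEMPT_RT cannot tolerate around the spinlock_t below. */
            struct vmap_block_queue *vbq = raw_cpu_ptr(&vmap_block_queue);

            spin_lock(&vbq->lock);          /* the list itself is lock-protected */
            list_add_tail(entry, &vbq->free);
            spin_unlock(&vbq->lock);
            /* Before: put_cpu_var(vmap_block_queue); */
    }

Migrating to another CPU between raw_cpu_ptr() and spin_lock() only means the block lands on another CPU's list, which is acceptable because the locked section is short and the list is already accessed cross-CPU.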
2022-05-13 | tracing: incorrect gfp_t conversion | Vasily Averin | 1 | -1/+1
Fixes the following sparse warnings: include/trace/events/*: sparse: cast to restricted gfp_t include/trace/events/*: sparse: restricted gfp_t degrades to integer gfp_t type is bitwise and requires __force attributes for any casts. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
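A minimal sketch of the sparse annotation involved (the typedef mirrors the kernel's __bitwise declaration of gfp_t; the two helpers are illustrative, not part of the patch):

    typedef unsigned int __bitwise gfp_t;

    /* Storing the flags in a plain integer (as a trace event does) and
     * reading them back both need __force; without it sparse reports
     * "restricted gfp_t degrades to integer" / "cast to restricted gfp_t". */
    static unsigned long store_gfp(gfp_t gfp)
    {
            return (__force unsigned long)gfp;
    }

    static gfp_t load_gfp(unsigned long raw)
    {
            return (__force gfp_t)raw;
    }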
2022-05-13 | percpu: improve percpu_alloc_percpu event trace | Vasily Averin | 2 | -6/+7
Add call_site, bytes_alloc and gfp_flags fields to the output of the percpu_alloc_percpu ftrace event: mkdir-4393 [001] 169.334788: percpu_alloc_percpu: call_site=mem_cgroup_css_alloc+0xa6 reserved=0 is_atomic=0 size=2408 align=8 base_addr=0xffffc7117fc00000 off=402176 ptr=0x3dc867a62300 bytes_alloc=14448 gfp_flags=GFP_KERNEL_ACCOUNT This is required to track memcg-accounted percpu allocations. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/damon/reclaim: use resource_size function on resource object | Jiapeng Chong | 1 | -1/+1
Fix the following coccicheck warnings: ./mm/damon/reclaim.c:241:30-33: WARNING: Suspicious code. resource_size is maybe missing with res. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jiapeng Chong <[email protected]> Reported-by: Abaci Robot <[email protected]> Reviewed-by: SeongJae Park <[email protected]> Cc: "Boehme, Markus" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
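For context, the helper the warning points at computes the inclusive length of a resource (this is its definition in include/linux/ioport.h, to the best of my knowledge), which is why coccicheck prefers it over open-coded arithmetic on ->start/->end:

    static inline resource_size_t resource_size(const struct resource *res)
    {
            return res->end - res->start + 1;   /* ->end is inclusive */
    }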
2022-05-13 | mm: functions may simplify the use of return values | Li kunyu | 1 | -5/+2
p4d_clear_huge() can be changed to a void return type, which simplifies its use; its caller, vunmap_p4d_range(), saves a few steps this way. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Li kunyu <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: page_table_check: move pxx_user_accessible_page into x86 | Kefeng Wang | 1 | -17/+0
The pxx_user_accessible_page() helpers check PTE bits; this is architecture-specific code, so move them into x86's pgtable.h. These helpers are being moved out to make the page table check framework platform independent. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: Tong Tiangen <[email protected]> Acked-by: Pasha Tatashin <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: page_table_check: using PxD_SIZE instead of PxD_PAGE_SIZE | Tong Tiangen | 1 | -4/+4
Patch series "mm: page_table_check: add support on arm64 and riscv", v7. Page table check performs extra verifications at the time when new pages become accessible from the userspace by getting their page table entries (PTEs PMDs etc.) added into the table. It is supported on X86[1]. This patchset made some simple changes and make it easier to support new architecture, then we support this feature on ARM64 and RISCV. [1]https://lore.kernel.org/lkml/[email protected]/ This patch (of 6): Compared with PxD_PAGE_SIZE, which is defined and used only on X86, PxD_SIZE is more common in each architecture. Therefore, it is more reasonable to use PxD_SIZE instead of PxD_PAGE_SIZE in page_table_check.c. At the same time, it is easier to support page table check in other architectures. The substitution has no functional impact on the x86. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tong Tiangen <[email protected]> Suggested-by: Anshuman Khandual <[email protected]> Acked-by: Pasha Tatashin <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Kefeng Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/migrate: convert move_to_new_page() into move_to_new_folio() | Matthew Wilcox (Oracle) | 1 | -29/+29
Pass in the folios that we already have in each caller. Saves a lot of calls to compound_head(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio() | Matthew Wilcox (Oracle) | 1 | -59/+51
shmem_swapin_page() only brings in order-0 pages, which are folios by definition. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: convert shmem_getpage_gfp to use a folio | Matthew Wilcox (Oracle) | 1 | -52/+43
Rename shmem_alloc_and_acct_page() to shmem_alloc_and_acct_folio() and have it return a folio, then use a folio throughout shmem_getpage_gfp(). shmem_getpage_gfp() itself continues to return a struct page. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: convert shmem_alloc_and_acct_page to use a folio | Matthew Wilcox (Oracle) | 1 | -9/+9
Convert shmem_alloc_hugepage() to return the folio that it uses and use a folio throughout shmem_alloc_and_acct_page(). Continue to return a page from shmem_alloc_and_acct_page() for now. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: add shmem_alloc_folio() | Matthew Wilcox (Oracle) | 1 | -4/+10
Call vma_alloc_folio() directly instead of alloc_page_vma(). Add a shmem_alloc_page() wrapper to avoid changing the callers. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: turn shmem_should_replace_page into shmem_should_replace_folio | Matthew Wilcox (Oracle) | 1 | -4/+4
This is a straightforward conversion. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: convert shmem_add_to_page_cache to take a folio | Matthew Wilcox (Oracle) | 1 | -26/+31
Shrinks shmem_add_to_page_cache() by 16 bytes. All the callers grow, but this is temporary as they will all be converted to folios soon. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: use a folio in shmem_unused_huge_shrink | Matthew Wilcox (Oracle) | 1 | -11/+12
When calling split_huge_page() we usually have to find the precise page, but that's not necessary here because we only need to unlock and put the folio afterwards. Saves 231 bytes of text (20% of this function). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | vmscan: remove remaining uses of page in shrink_page_list | Matthew Wilcox (Oracle) | 1 | -62/+60
These are all straightforward conversions to the folio API. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
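The typical shape of these conversions, as a hedged illustration rather than the actual diff (the condition below is just an example, not taken from shrink_page_list): do the page-to-folio lookup once, then use folio helpers so compound_head() is no longer hidden inside every PageXXX() call, and count whole folios rather than single pages.

    /* Before (per-page; each helper calls compound_head() internally):
     *      if (PageDirty(page) && !PageSwapBacked(page))
     *              nr_reclaimed++;
     */
    struct folio *folio = page_folio(page);        /* one compound_head() lookup */

    if (folio_test_dirty(folio) && !folio_test_swapbacked(folio))
            nr_reclaimed += folio_nr_pages(folio); /* correct for large folios/THPs */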
2022-05-13 | vmscan: convert the activate_locked portion of shrink_page_list to folios | Matthew Wilcox (Oracle) | 1 | -8/+9
This accounts the number of pages activated correctly for large folios. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | vmscan: move initialisation of mapping down | Matthew Wilcox (Oracle) | 1 | -5/+2
Now that we don't interrogate the BDI for congestion, we can delay looking up the folio's mapping until we've got further through the function, reducing register pressure and saving a call to folio_mapping for folios we're adding to the swap cache. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | vmscan: convert lazy freeing to folios | Matthew Wilcox (Oracle) | 1 | -9/+9
Remove a hidden call to compound_head(), and account nr_pages instead of a single page. This matches the code in lru_lazyfree_fn() that accounts nr_pages to PGLAZYFREE. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | vmscan: convert page buffer handling to use folios | Matthew Wilcox (Oracle) | 1 | -23/+25
This mostly just removes calls to compound_head() although nr_reclaimed should be incremented by the number of pages, not just 1. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | vmscan: convert dirty page handling to folios | Matthew Wilcox (Oracle) | 1 | -22/+26
Mostly this just eliminates calls to compound_head(), but NR_VMSCAN_IMMEDIATE was being incremented by 1 instead of by nr_pages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | swap: convert add_to_swap() to take a folio | Matthew Wilcox (Oracle) | 3 | -28/+31
The only caller already has a folio available, so this saves a conversion. Also convert the return type to boolean. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | swap: turn get_swap_page() into folio_alloc_swap() | Matthew Wilcox (Oracle) | 5 | -25/+28
This removes an assumption that a large folio is HPAGE_PMD_NR pages in size. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | vmscan: convert the writeback handling in shrink_page_list() to folios | Matthew Wilcox (Oracle) | 1 | -36/+42
Slightly more efficient due to fewer calls to compound_head(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | vmscan: use folio_mapped() in shrink_page_list() | Matthew Wilcox (Oracle) | 1 | -8/+8
Remove some legacy function calls. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: remove alloc_pages_vma() | Matthew Wilcox (Oracle) | 1 | -26/+25
All callers have now been converted to use vma_alloc_folio(), so convert the body of alloc_pages_vma() to allocate folios instead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/huge_memory: convert do_huge_pmd_anonymous_page() to use vma_alloc_folio() | Matthew Wilcox (Oracle) | 1 | -5/+4
Remove the use of this old API, eliminating a call to prep_transhuge_page(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | shmem: convert shmem_alloc_hugepage() to use vma_alloc_folio() | Matthew Wilcox (Oracle) | 1 | -6/+4
Patch series "Folio patches for 5.19", v2. This patch (of 26): For now, return the head page of the folio, but remove use of the old alloc_pages_vma() API. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/shmem: remove duplicate include in memory.c | Wan Jiabing | 1 | -1/+0
Fix the following checkincludes.pl warning: mm/memory.c: linux/mm_inline.h is included more than once. The include is already on line 44, so remove the duplicate here. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wan Jiabing <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmscan: don't use NUMA_NO_NODE as indicator of page on different node | Wei Yang | 1 | -4/+3
Now we are sure there is at least one page on page_list, so it is safe to get its nid. This means it is not necessary to use NUMA_NO_NODE as an indicator for the beginning of the iteration or for a page on a different node. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmscan: filter empty page_list at the beginning | Wei Yang | 1 | -4/+6
node_page_list will always be non-empty when the loop finishes, unless page_list itself is empty. Handle an empty page_list before doing any real work, including touching the PF_MEMALLOC flag. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
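A hedged sketch of the resulting control flow (names per my reading of reclaim_pages(), not the exact diff): bail out before the PF_MEMALLOC dance when there is nothing to reclaim.

    unsigned int nr_reclaimed = 0;
    unsigned int noreclaim_flag;

    if (list_empty(page_list))
            return nr_reclaimed;                    /* nothing to do */

    noreclaim_flag = memalloc_noreclaim_save();     /* sets PF_MEMALLOC */
    /* ... per-node batching loop over page_list ... */
    memalloc_noreclaim_restore(noreclaim_flag);

    return nr_reclaimed;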
2022-05-13 | mm/vmscan: use helper folio_is_file_lru() | Miaohe Lin | 1 | -2/+2
Use helper folio_is_file_lru() to check whether folio is file lru. Minor readability improvement. [[email protected]: use folio_is_file_lru()] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Huang, Ying <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmscan: remove obsolete comment in kswapd_run | Miaohe Lin | 1 | -1/+0
Since commit 6b700b5b3c59 ("mm/vmscan.c: remove cpu online notification for now"), cpu online notification is removed. So kswapd won't move to proper cpus if cpus are hot-added. Remove this obsolete comment. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Huang, Ying <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmscan: take all base pages of THP into account when race with speculative reference | Miaohe Lin | 1 | -1/+1
If the page has buffers, shrink_page_list will try to free the buffer mappings associated with the page and try to free the page as well. In the rare race with a speculative reference, the page will be freed shortly by that speculative reference. But nr_reclaimed is not incremented correctly when we come across a THP; we need to account for all of its base pages in this case. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Huang, Ying <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmscan: introduce helper function reclaim_page_list() | Miaohe Lin | 1 | -25/+25
Introduce helper function reclaim_page_list() to eliminate the duplicated code of doing shrink_page_list() and putback_lru_page. Also we can separate node reclaim from node page list operation this way. No functional change intended. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Huang, Ying <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback | Miaohe Lin | 1 | -1/+4
Patch series "A few cleanup and fixup patches for vmscan". This series contains a few patches to remove an obsolete comment, introduce a helper to remove duplicated code, and so on. Also we take all base pages of a THP into account in a rare race condition. More details can be found in the respective changelogs. This patch (of 6): The MADV_FREE pages check in folio_check_dirty_writeback is a bit hard to follow. Add a comment to make the code clear. Link: https://lkml.kernel.org/r/[email protected] Suggested-by: Huang, Ying <[email protected]> Signed-off-by: Miaohe Lin <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm/vmscan: not necessary to re-init the list for each iteration | Wei Yang | 1 | -3/+1
node_page_list is defined with LIST_HEAD and is drained until list_empty() holds, so it is not necessary to re-initialize it on each iteration. [[email protected]: remove unneeded braces] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: convert sysfs input to bool using kstrtobool() | Jagdish Gediya | 2 | -12/+10
Sysfs input conversion to the corresponding bool value, e.g. "false" or "0" to false and "true" or "1" to true, is currently handled through strncmp at multiple places. Use kstrtobool() to convert sysfs input to a bool value. [[email protected]: propagate kstrtobool() return value, per Andy] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jagdish Gediya <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Richard Fitzgerald <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
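A hedged sketch of the sysfs store pattern this converts to (the attribute and the variable are made up for illustration; kstrtobool() is the real API, returning 0 on success or -EINVAL):

    static bool feature_enabled;

    static ssize_t enabled_store(struct kobject *kobj, struct kobj_attribute *attr,
                                 const char *buf, size_t count)
    {
            bool enable;
            int err;

            err = kstrtobool(buf, &enable);   /* replaces strncmp(buf, "true"/"false"/"1"/"0") chains */
            if (err)
                    return err;               /* propagate the error, per the note above */

            feature_enabled = enable;
            return count;
    }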
2022-05-13 | mm/vmscan: take min_slab_pages into account when try to call shrink_node | Miaohe Lin | 1 | -1/+2
Since commit 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()"), slab reclaim and lru page reclaim are done together in shrink_node(). So we should take min_slab_pages into account when deciding whether to call shrink_node(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Cc: Huang Ying <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: cma: use pageblock_order as the single alignment | Zi Yan | 1 | -2/+2
Now alloc_contig_range() works at pageblock granularity. Change CMA allocation, which uses alloc_contig_range(), to use pageblock_nr_pages alignment. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zi Yan <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Eric Ren <[email protected]> Cc: kernel test robot <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: page_isolation: enable arbitrary range page isolation. | Zi Yan | 2 | -31/+18
Now start_isolate_page_range() is ready to handle arbitrary range isolation, so move the alignment check/adjustment into the function body. Do the same for its counterpart undo_isolate_page_range(). alloc_contig_range(), its caller, can pass an arbitrary range instead of a MAX_ORDER_NR_PAGES aligned one. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zi Yan <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Eric Ren <[email protected]> Cc: kernel test robot <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: make alloc_contig_range work at pageblock granularity | Zi Yan | 4 | -16/+240
alloc_contig_range() worked at MAX_ORDER_NR_PAGES granularity to avoid merging pageblocks with different migratetypes. It might unnecessarily convert extra pageblocks at the beginning and at the end of the range. Change alloc_contig_range() to work at pageblock granularity. Special handling is needed for free pages and in-use pages across the boundaries of the range specified by alloc_contig_range(), because these partially isolated pages cause free page accounting issues. The free pages will be split and freed into separate migratetype lists; the in-use pages will be migrated, then the freed pages will be handled in the aforementioned way. [[email protected]: fix deadlock/crash] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zi Yan <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Eric Ren <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: page_isolation: check specified range for unmovable pages | Zi Yan | 1 | -13/+34
Enable set_migratetype_isolate() to check the specified range for unmovable pages during isolation, in preparation for arbitrary range page isolation. The functionality will take effect in upcoming commits by adjusting the callers of start_isolate_page_range(), which uses set_migratetype_isolate(). For example, alloc_contig_range(), which calls start_isolate_page_range(), accepts unaligned ranges, but because page isolation is currently done at MAX_ORDER_NR_PAGES granularity, pages that are outside the specified range but within MAX_ORDER_NR_PAGES alignment might be attempted for isolation, and the failure to isolate these unrelated pages fails the whole operation undesirably. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zi Yan <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Eric Ren <[email protected]> Cc: kernel test robot <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13 | mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c | Zi Yan | 2 | -119/+119
Patch series "Use pageblock_order for cma and alloc_contig_range alignment", v11. This patchset tries to remove the MAX_ORDER-1 alignment requirement for CMA and alloc_contig_range(). It prepares for my upcoming changes to make MAX_ORDER adjustable at boot time[1]. The MAX_ORDER - 1 alignment requirement comes from that alloc_contig_range() isolates pageblocks to remove free memory from buddy allocator but isolating only a subset of pageblocks within a page spanning across multiple pageblocks causes free page accounting issues. Isolated page might not be put into the right free list, since the code assumes the migratetype of the first pageblock as the whole free page migratetype. This is based on the discussion at [2]. To remove the requirement, this patchset: 1. isolates pages at pageblock granularity instead of max(MAX_ORDER_NR_PAEGS, pageblock_nr_pages); 2. splits free pages across the specified range or migrates in-use pages across the specified range then splits the freed page to avoid free page accounting issues (it happens when multiple pageblocks within a single page have different migratetypes); 3. only checks unmovable pages within the range instead of MAX_ORDER - 1 aligned range during isolation to avoid alloc_contig_range() failure when pageblocks within a MAX_ORDER - 1 aligned range are allocated separately. 4. returns pages not in the range as it did before. One optimization might come later: 1. make MIGRATE_ISOLATE a separate bit to be able to restore the original migratetypes when isolation fails in the middle of the range. [1] https://lore.kernel.org/linux-mm/[email protected]/ [2] https://lore.kernel.org/linux-mm/[email protected]/ This patch (of 6): has_unmovable_pages() is only used in mm/page_isolation.c. Move it from mm/page_alloc.c and make it static. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zi Yan <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Eric Ren <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Minchan Kim <[email protected]> Cc: kernel test robot <[email protected]> Signed-off-by: Andrew Morton <[email protected]>