|
Indirect calls are expensive, thanks to Spectre. Test for
TRANSHUGE_PAGE_DTOR and destroy the folio appropriately. Move the
free_compound_page() call into destroy_large_folio() to simplify later
patches.
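As an illustration of the shape of this change, here is a small stand-alone
sketch (plain user-space C, not the kernel code; the enum names mirror the
kernel's compound_dtor ids, but the bodies are stubs) of replacing the
indirect destructor call with a direct test:

#include <stdio.h>

enum compound_dtor_id {
        NULL_COMPOUND_DTOR,
        COMPOUND_PAGE_DTOR,
        HUGETLB_PAGE_DTOR,
        TRANSHUGE_PAGE_DTOR,
};

struct folio { enum compound_dtor_id dtor; };

static void free_transhuge_folio(struct folio *folio) { printf("transhuge path\n"); }
static void free_compound_page(struct folio *folio)   { printf("generic compound path\n"); }

/* Before: an indirect call through a table indexed by folio->dtor.
 * After: test the id and branch directly, avoiding a retpoline. */
static void destroy_large_folio(struct folio *folio)
{
        if (folio->dtor == TRANSHUGE_PAGE_DTOR) {
                free_transhuge_folio(folio);
                return;
        }
        free_compound_page(folio);
}

int main(void)
{
        struct folio thp = { TRANSHUGE_PAGE_DTOR }, other = { COMPOUND_PAGE_DTOR };

        destroy_large_folio(&thp);
        destroy_large_folio(&other);
        return 0;
}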
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Yanteng Si <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Pass a folio instead of the head page to save a few instructions. Update
the documentation, at least in English.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Sidhartha Kumar <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Indirect calls are expensive, thanks to Spectre. Call free_huge_page()
directly if the folio belongs to hugetlb.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Cc: Yanteng Si <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Commit 0b9d705297b2 ("mm: numa: Support NUMA hinting page faults from
gup/gup_fast") from 2012 documented the following as the primary reason
why we would want to handle NUMA hinting faults from GUP:
KVM secondary MMU page faults will trigger the NUMA hinting page
faults through gup_fast -> get_user_pages -> follow_page ->
handle_mm_fault.
That is still the case today, and the relevant KVM code has been converted
to manually set FOLL_HONOR_NUMA_FAULT. So let's stop setting
FOLL_HONOR_NUMA_FAULT for all GUP users and hope that few other users that
really require such handling for autonuma remain.
Possible interaction with MMU notifiers:
Assume a driver obtains a page using get_user_pages() to map it into
a secondary MMU, and uses the MMU notifier framework to get notified on
changes.
Assume get_user_pages() succeeded on a PROT_NONE-mapped page (because
FOLL_HONOR_NUMA_FAULT is not set) in an accessible VMA and the page is
mapped into a secondary MMU. Once user space turns that mapping
inaccessible using mprotect(PROT_NONE), the actual PTE in the page table
might not change. If the MMU notifier were smart and optimized for that
case ("why notify if the PTE didn't change?"), that could be problematic.
At least change_pmd_range() with MMU_NOTIFY_PROTECTION_VMA for now does an
unconditional mmu_notifier_invalidate_range_start() ->
mmu_notifier_invalidate_range_end() and should be fine.
Note that even if a PTE in an accessible VMA is pte_protnone(), the
underlying page might be accessed by a secondary MMU that does not set
FOLL_HONOR_NUMA_FAULT, and test_young() MMU notifiers would return "true".
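As a rough sketch of the decision being changed (stand-alone user-space C
with simplified stand-in types and flag values, not the real
follow_page_pte() code), GUP now only backs off on a pte_protnone() PTE in
an accessible VMA when the caller passed FOLL_HONOR_NUMA_FAULT:

#include <stdbool.h>
#include <stdio.h>

#define FOLL_HONOR_NUMA_FAULT   0x1     /* stand-in for the real flag value */

struct pte { bool present; bool protnone; };

/* Returns true when GUP should back off and let handle_mm_fault() run,
 * which is what actually triggers the NUMA hinting fault. */
static bool gup_must_fault(struct pte pte, unsigned int foll_flags)
{
        if (!pte.present)
                return true;
        if (pte.protnone && (foll_flags & FOLL_HONOR_NUMA_FAULT))
                return true;    /* honor the NUMA hinting fault */
        return false;           /* use the page, skipping autonuma handling */
}

int main(void)
{
        struct pte numa_pte = { .present = true, .protnone = true };

        printf("KVM-style caller (FOLL_HONOR_NUMA_FAULT): fault=%d\n",
               gup_must_fault(numa_pte, FOLL_HONOR_NUMA_FAULT));
        printf("plain GUP caller after this change:       fault=%d\n",
               gup_must_fault(numa_pte, 0));
        return 0;
}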
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: liubo <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Sparse is not happy to see a non-static variable without a declaration:
lib/vsprintf.c:61:6: warning: symbol 'no_hash_pointers' was not declared.
Should it be static?
Declare the respective variable in sprintf.h. Also add a comment to
discourage its use if there is no real need.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Acked-by: Marco Elver <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The old name is confusing because it implies the completion of the earlier
kmemleak_init(); the new name, kmemleak_late_initialized, represents the
completion of kmemleak_late_init().
No functional changes.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Xiaolei Wang <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Zhaoyang Huang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "mm/kmemleak: use object_cache instead of
kmemleak_initialized", v3.
Use object_cache instead of kmemleak_initialized for the check in
set_track_prepare(), so that memory leaks after kmemleak_init() can be
recorded, and rename kmemleak_initialized to kmemleak_late_initialized:
unreferenced object 0xc674ca80 (size 64):
comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
hex dump (first 32 bytes):
80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........
This patch (of 2):
kmemleak_initialized is only set in kmemleak_late_init(), which means that
objects leaked before kmemleak_late_init() have no call trace recorded.
Use object_cache instead of kmemleak_initialized for the check in
set_track_prepare(), so that a call trace is recorded when memory is
leaked by code that runs between kmemleak_init() and kmemleak_late_init():
unreferenced object 0xc674ca80 (size 64):
comm "swapper/0", pid 1, jiffies 4294938337 (age 204.880s)
hex dump (first 32 bytes):
80 55 75 c6 80 54 75 c6 00 55 75 c6 80 52 75 c6 .Uu..Tu..Uu..Ru.
00 53 75 c6 00 00 00 00 00 00 00 00 00 00 00 00 .Su..........
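A rough stand-in for the check being changed (stand-alone user-space C
with simplified names, not the kernel's set_track_prepare()):

#include <stdbool.h>
#include <stdio.h>

static void *object_cache;              /* non-NULL once kmemleak_init() has run */
static bool kmemleak_late_initialized;  /* only set in kmemleak_late_init() */

static unsigned long set_track_prepare(void)
{
        if (!object_cache)      /* was: if (!kmemleak_late_initialized) */
                return 0;       /* too early, no backtrace can be recorded */
        return 0xdead0001;      /* pretend this is a stack_depot handle */
}

int main(void)
{
        object_cache = &object_cache;   /* simulate: kmemleak_init() done */
        /* kmemleak_late_init() has not run yet, but a trace is recorded now */
        printf("late initialized: %d, trace handle: %#lx\n",
               kmemleak_late_initialized, set_track_prepare());
        return 0;
}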
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace")
Signed-off-by: Xiaolei Wang <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Zhaoyang Huang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
KSM currently maintains several statistics, which let you determine how
successful KSM is at sharing pages. However, they do not include a metric
for how much work it does.
This commit adds the pages scanned metric. This allows the administrator
to determine how many pages have been scanned over a period of time.
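A minimal way to read the new counter from user space is sketched below;
the sysfs path is assumed to mirror the existing KSM knobs rather than
quoted from this patch:

#include <stdio.h>

int main(void)
{
        unsigned long long pages_scanned;
        FILE *f = fopen("/sys/kernel/mm/ksm/pages_scanned", "r"); /* assumed path */

        if (!f) {
                perror("pages_scanned");
                return 1;
        }
        if (fscanf(f, "%llu", &pages_scanned) == 1)
                printf("KSM has scanned %llu pages\n", pages_scanned);
        fclose(f);
        return 0;
}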
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Stefan Roesch <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
By making maybe_unlock_mmap_for_io() handle the VMA lock correctly, we
make fault_dirty_shared_page() safe to be called without the mmap lock
held.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reported-by: David Hildenbrand <[email protected]>
Tested-by: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Saves four implicit calls to compound_head().
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "mm,thp: fix sloppy text output".
Three independent trivial patches, fixing sloppy text output which has
annoyed me; but might risk surprising a parser, so any can be dropped.
This patch (of 3):
The SysRq-m or OOM Mem-Info dmesg showed (long lines containing) ...
shmem:NkB shmem_thp: NkB shmem_pmdmapped: NkB anon_thp: NkB ...
Delete the space after the colon after shmem_thp, shmem_pmdmapped,
anon_thp: as the shmem example shows, no other fields have a space after
the colon in this output.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Hugh Dickins <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
This sysctl has the very unusual behaviour of not allowing any user (even
CAP_SYS_ADMIN) to reduce the restriction setting, meaning that if you were
to set this sysctl to a more restrictive option in the host pidns you
would need to reboot your machine in order to reset it.
The justification given in [1] is that this is a security feature and thus
it should not be possible to disable. Aside from the fact that we have
plenty of security-related sysctls that can be disabled after being
enabled (fs.protected_symlinks for instance), the protection provided by
the sysctl is to stop users from being able to create a binary and then
execute it. A user with CAP_SYS_ADMIN can trivially do this without
memfd_create(2):
% cat mount-memfd.c
#define _GNU_SOURCE /* asprintf() */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mount.h> /* fsopen()/fsconfig()/fsmount() wrappers, glibc >= 2.36 */
#include <linux/mount.h>
#define SHELLCODE "#!/bin/echo this file was executed from this totally private tmpfs:"
int main(void)
{
        int fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
        assert(fsfd >= 0);
        assert(!fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 2));
        int dfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0);
        assert(dfd >= 0);
        int execfd = openat(dfd, "exe", O_CREAT | O_RDWR | O_CLOEXEC, 0755);
        assert(execfd >= 0);
        assert(write(execfd, SHELLCODE, strlen(SHELLCODE)) == strlen(SHELLCODE));
        assert(!close(execfd));
        char *execpath = NULL;
        char *argv[] = { "bad-exe", NULL }, *envp[] = { NULL };
        execfd = openat(dfd, "exe", O_PATH | O_CLOEXEC);
        assert(execfd >= 0);
        assert(asprintf(&execpath, "/proc/self/fd/%d", execfd) > 0);
        assert(!execve(execpath, argv, envp));
}
% ./mount-memfd
this file was executed from this totally private tmpfs: /proc/self/fd/5
%
Given that it is possible for CAP_SYS_ADMIN users to create executable
binaries without memfd_create(2) and without touching the host filesystem
(not to mention the many other things a CAP_SYS_ADMIN process would be
able to do that would be equivalent or worse), it seems strange to cause a
fair amount of headache to admins when there doesn't appear to be an
actual security benefit to blocking this. There appear to be concerns
about confused-deputy-esque attacks[2] but a confused deputy that can
write to arbitrary sysctls is a bigger security issue than executable
memfds.
/* New API */
The primary requirement from the original author appears to be more based
on the need to be able to restrict an entire system in a hierarchical
manner[3], such that child namespaces cannot re-enable executable memfds.
So, implement that behaviour explicitly -- the vm.memfd_noexec scope is
evaluated up the pidns tree to &init_pid_ns and you have the most
restrictive value applied to you. The lowest value you can now set
vm.memfd_noexec to is whatever limit applies to your parent.
Note that a pidns will inherit a copy of the parent pidns's effective
vm.memfd_noexec setting at unshare() time. This matches the existing
behaviour, and it also ensures that a pidns will never have its
vm.memfd_noexec setting *lowered* behind its back (but it will be raised
if the parent raises theirs).
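A stand-alone sketch of the hierarchical semantics described above
(illustrative types only, not the kernel's pid_namespace code): the
effective value is the most restrictive setting found walking from the
current pidns up to the root.

#include <stdio.h>

struct pidns {
        int memfd_noexec_scope;         /* 0, 1 or 2 */
        struct pidns *parent;           /* NULL for the root namespace */
};

static int pidns_memfd_noexec_scope(const struct pidns *ns)
{
        int scope = 0;

        for (; ns; ns = ns->parent)
                if (ns->memfd_noexec_scope > scope)
                        scope = ns->memfd_noexec_scope;
        return scope;
}

int main(void)
{
        struct pidns root  = { .memfd_noexec_scope = 2, .parent = NULL };
        struct pidns child = { .memfd_noexec_scope = 0, .parent = &root };

        /* a child can never be less restrictive than its parents */
        printf("effective scope in child: %d\n", pidns_memfd_noexec_scope(&child));
        return 0;
}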
/* Backwards Compatibility */
As the previous version of the sysctl didn't allow you to lower the
setting at all, there are no backwards compatibility issues with this
aspect of the change.
However, it should be noted that the setting is now completely
hierarchical. Previously, a cloned pidns would just copy the current
pidns setting, meaning that if the parent's vm.memfd_noexec was changed it
wouldn't propagate to existing pid namespaces. Now, the restriction
applies recursively. This is a uAPI change, however:
* The sysctl is very new, having been merged in 6.3.
* Several aspects of the sysctl were broken up until this patchset and
the other patchset by Jeff Xu last month.
And thus it seems incredibly unlikely that any real users would run into
this issue. In the worst case, if this causes userspace issues we could
make it so that modifying the setting follows the hierarchical rules but
the restriction checking uses the cached copy.
[1]: https://lore.kernel.org/CABi2SkWnAgHK1i6iqSqPMYuNEhtHBkO8jUuCvmG3RmUB5TKHJw@mail.gmail.com/
[2]: https://lore.kernel.org/CALmYWFs_dNCzw_pW1yRAo4bGCPEtykroEQaowNULp7svwMLjOg@mail.gmail.com/
[3]: https://lore.kernel.org/CALmYWFuahdUF7cT4cm7_TGLqPanuHXJ-hVSfZt7vpTnc18DPrw@mail.gmail.com/
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
Signed-off-by: Aleksa Sarai <[email protected]>
Cc: Dominique Martinet <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Daniel Verkamp <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In order to incentivise userspace to switch to passing MFD_EXEC and
MFD_NOEXEC_SEAL, we need to provide a warning on each attempt to call
memfd_create() without the new flags. pr_warn_once() is not useful
because on most systems the one warning is burned up during the boot
process (on my system, systemd does this within the first second of boot)
and thus userspace will in practice never see the warnings to push them to
switch to the new flags.
The original patchset[1] used pr_warn_ratelimited(), however there were
concerns about the degree of spam in the kernel log[2,3]. The resulting
inability to detect every case was flagged as an issue at the time[4].
While we could come up with an alternative rate-limiting scheme such as
only outputting the message if vm.memfd_noexec has been modified, or only
outputting the message once for a given task, these alternatives have
downsides that don't make sense given how low-stakes a single kernel
warning message is. Switching to pr_info_ratelimited() instead should be
fine -- it's possible some monitoring tool will be unhappy with a stream
of warning-level messages but there's already plenty of info-level message
spam in dmesg.
[1]: https://lore.kernel.org/[email protected]/
[2]: https://lore.kernel.org/202212161233.85C9783FB@keescook/
[3]: https://lore.kernel.org/Y5yS8wCnuYGLHMj4@x1n/
[4]: https://lore.kernel.org/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
Signed-off-by: Aleksa Sarai <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Daniel Verkamp <[email protected]>
Cc: Dominique Martinet <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Given the difficulty of auditing all of userspace to figure out whether
every memfd_create() user has switched to passing MFD_EXEC and
MFD_NOEXEC_SEAL flags, it seems far less disruptive to make it possible
for older programs that don't make use of executable memfds to run under
vm.memfd_noexec=2. Otherwise, a small dependency change can result in
spurious errors. For programs that don't use executable memfds, passing
MFD_NOEXEC_SEAL is functionally a no-op, so implying it by default under
vm.memfd_noexec=2 has the same effect for them.
In addition, every failure under vm.memfd_noexec=2 needs to print to the
kernel log so that userspace can figure out where the error came from.
The concerns about pr_warn_ratelimited() spam that caused the switch to
pr_warn_once()[1,2] do not apply to the vm.memfd_noexec=2 case.
This is a user-visible API change, but as it allows programs to do
something that would be blocked before, and the sysctl itself was broken
and recently released, it seems unlikely this will cause any issues.
[1]: https://lore.kernel.org/Y5yS8wCnuYGLHMj4@x1n/
[2]: https://lore.kernel.org/202212161233.85C9783FB@keescook/
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
Signed-off-by: Aleksa Sarai <[email protected]>
Cc: Dominique Martinet <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Daniel Verkamp <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
This removes some direct accesses to struct page, working towards
splitting out struct ptdesc from struct page.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
This removes some direct accesses to struct page, working towards
splitting out struct ptdesc from struct page.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Since commit e774a7bc7f0a ("mm: zswap: remove page reclaim logic from
z3fold"), zpool and zpool_ops have been removed, so also remove the
corresponding comments.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Xiu Jianfeng <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
We have get_pageblock_migratetype() and get_pfnblock_migratetype() to get
the migratetype of a page. get_pfnblock_migratetype() accepts both the
page and its pfn from the caller, while get_pageblock_migratetype() only
accepts the page and derives the pfn from it with page_to_pfn().
In cases where we already have the pfn of the page recorded, we can simply
call get_pfnblock_migratetype() to avoid a page_to_pfn() call.
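The point is illustrated by the following self-contained sketch
(user-space C with fake conversions standing in for page_to_pfn() and the
pageblock bitmap lookup):

#include <stdio.h>

struct page { unsigned long pfn; };

static unsigned long page_to_pfn(const struct page *page)
{
        return page->pfn;                       /* stand-in conversion */
}

static int get_pfnblock_migratetype(const struct page *page, unsigned long pfn)
{
        (void)page;
        return (int)(pfn >> 9) & 7;             /* fake bitmap lookup keyed by pfn */
}

static int get_pageblock_migratetype(const struct page *page)
{
        /* has to derive the pfn itself */
        return get_pfnblock_migratetype(page, page_to_pfn(page));
}

int main(void)
{
        struct page page = { .pfn = 0x1234 };
        unsigned long pfn = page_to_pfn(&page); /* pfn already recorded */

        /* preferred when the pfn is at hand: no second page_to_pfn() */
        printf("%d %d\n", get_pfnblock_migratetype(&page, pfn),
               get_pageblock_migratetype(&page));
        return 0;
}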
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kemeng Shi <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "Two minor cleanups for get pageblock migratetype".
This series contains two minor cleanups for get pageblock migratetype.
More details can be found in respective patches.
This patch (of 2):
get_pfnblock_flags_mask() just calls the inline inner
__get_pfnblock_flags_mask() without doing any extra work. Open-code
__get_pfnblock_flags_mask() in get_pfnblock_flags_mask(), replace the
remaining calls to __get_pfnblock_flags_mask() with calls to
get_pfnblock_flags_mask(), and remove the now-unnecessary
__get_pfnblock_flags_mask().
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kemeng Shi <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Just remove the redundant parameter alloc_order from
reserve_highatomic_pageblock(). No functional modification involved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
This patch is based on commit 5da226dbfce3 ("mm: skip CMA pages when
they are not available"), which skips reclaim of CMA pages when they are
not eligible for the current allocation context. In MGLRU, such pages are
added to the tail of the immediate generation to maintain better LRU
order, unlike the conventional LRU where such pages are directly added to
the head of the LRU list (akin to adding to the head of the youngest
generation in MGLRU).
There is no observable issue without this patch on MGLRU, but logically it
makes sense to skip CMA page reclaim when those pages cannot satisfy the
current allocation context.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Signed-off-by: Charan Teja Kalla <[email protected]>
Reviewed-by: Kalesh Singh <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Yu Zhao <[email protected]>
Cc: Zhaoyang Huang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Parameter pgdat is not used in fragmentation_score_wmark. Just remove it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Reviewed-by: Baolin Wang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
We get batch from the pcp and just pass it to nr_pcp_free() immediately.
Get batch from the pcp inside nr_pcp_free() instead and remove the
unnecessary batch parameter.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kemeng Shi <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "Two minor cleanups for pcp list in page_alloc".
There are two minor cleanups for pcp list in page_alloc. More details
can be found in respective patches.
This patch (of 2):
After commit fd56eef258a17 ("mm/page_alloc: simplify how many pages are
selected per pcp list during bulk free"), we drain all pages in the
selected pcp list, and we have ensured the passed count is < pcp->count.
The search therefore finishes before wrapping around, so tracking the
range of active PCP lists, which was intended for the wrap-around case, is
no longer needed.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kemeng Shi <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
With memmap on memory, some architectures need more details about the
altmap, such as base_pfn, end_pfn, etc., to unmap the vmemmap memory.
Instead of computing them again when we remove a memory block, embed the
vmem_altmap details in struct memory_block if we are using the memmap on
memory feature.
[[email protected]: fix error return code in add_memory_resource()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Yang Yingliang <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vishal Verma <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Currently, the memmap_on_memory feature is only supported with memory
block sizes that result in vmemmap pages covering full pageblocks. This
is because the memory onlining/offlining code requires applicable ranges
to be pageblock-aligned, for example, to set the migratetypes properly.
This patch lifts that restriction by reserving more pages than required
for vmemmap space, which allows the start address to be pageblock-aligned
with different memory block sizes. Using this facility implies the kernel
will be reserving some pages for every memory block. This makes the
memmap on memory feature widely useful with different memory block size
values.
For example: with a 64K page size and a 256MiB memory block size, we
require 4 pages to map the vmemmap pages. To align things correctly we
end up adding a reserve of 28 pages, i.e. for every 4096 pages, 28 pages
get reserved.
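The arithmetic in the example can be reproduced with the short program
below; it assumes a 64-byte struct page and that the reservation is simply
rounded up to a pageblock boundary of 32 pages in this configuration (both
assumptions for illustration, not taken from the patch itself):

#include <stdio.h>

int main(void)
{
        unsigned long page_size    = 64UL * 1024;               /* 64K pages */
        unsigned long block_size   = 256UL * 1024 * 1024;       /* 256MiB memory block */
        unsigned long sizeof_page  = 64;                        /* assumed struct page size */
        unsigned long pageblock_nr = 32;                        /* assumed pageblock_nr_pages */

        unsigned long nr_pages   = block_size / page_size;              /* 4096 */
        unsigned long vmemmap_sz = nr_pages * sizeof_page;              /* 256KiB */
        unsigned long vmemmap_pg = vmemmap_sz / page_size;              /* 4 */
        unsigned long aligned    = (vmemmap_pg + pageblock_nr - 1) /
                                   pageblock_nr * pageblock_nr;         /* 32 */

        printf("vmemmap pages: %lu, extra reserved pages: %lu\n",
               vmemmap_pg, aligned - vmemmap_pg);       /* 4 and 28 */
        return 0;
}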
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vishal Verma <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Some architectures would want different restrictions. Hence add an
architecture-specific override.
The PMD_SIZE check is moved there.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vishal Verma <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
If not supported, fall back to not using memmap on memory. This avoids
the need for callers to do the fallback themselves.
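A sketch of the fallback flow (stand-alone C with stub names, not the
actual add_memory_resource() code):

#include <stdbool.h>
#include <stdio.h>

static bool mhp_supports_memmap_on_memory(unsigned long size)
{
        (void)size;
        return false;           /* pretend the alignment rules are not met */
}

static int add_memory_block(unsigned long size, bool want_memmap_on_memory)
{
        bool use_altmap = want_memmap_on_memory &&
                          mhp_supports_memmap_on_memory(size);

        /* the caller no longer has to retry without the flag itself */
        printf("adding block, memmap placed %s the hot-added range\n",
               use_altmap ? "inside" : "outside");
        return 0;
}

int main(void)
{
        return add_memory_block(256UL << 20, true);
}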
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vishal Verma <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "Add support for memmap on memory feature on ppc64", v8.
This patch series updates the memmap on memory feature to fall back to
memmap allocation outside the memory block if the alignment rules are
not met. This makes the feature more useful on architectures like
ppc64, where the alignment rules are different with a 64K page size.
This patch (of 6):
Instead of adding a menu entry listing all supported architectures, add
an mm/Kconfig variable and select it from the supported architectures.
No functional change in this patch.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vishal Verma <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The check for root memcg will be done in wb_get_lookup(), so remove the
redundant one to simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jinliang Zheng <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Since commit 0bb488498c98 ("mm: zswap: remove zswap_header"), 'offset'
has been replaced by swpentry; update the comment for it, and also add a
comment for 'objcg'.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Xiu Jianfeng <[email protected]>
Reviewed-by: Yosry Ahmed <[email protected]>
Acked-by: Nhat Pham <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
It is better not to expose too many internal variables of memtest, so add
a helper, memtest_report_meminfo(), to show the memtest results.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Tomas Mudrunka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
It's more readable to use helper macro BITS_PER_LONG and BITS_PER_BYTE.
No functional change intended.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
It's more convenient to use helper macro llist_for_each_entry_safe().
No functional change intended.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Otherwise the kernel ends up with multiple copies:
$ nm vmlinux | grep dummy_vm_ops
ffffffff81e4ea00 d dummy_vm_ops.2
ffffffff81e11760 d dummy_vm_ops.254
ffffffff81e406e0 d dummy_vm_ops.4
ffffffff81e3c780 d dummy_vm_ops.7
While here prefix it with vma_.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mateusz Guzik <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
vma_prepare() is currently the central place where vmas are being locked
before vma_complete() applies changes to them. While this is convenient,
it also obscures vma locking and makes it harder to follow the locking
rules. Move vma locking out of vma_prepare() and take vma locks
explicitly at the locations where vmas are being modified. Move vma
locking and replace it with an assertion inside dup_anon_vma() to further
clarify the locking pattern inside vma_merge().
Link: https://lkml.kernel.org/r/[email protected]
Suggested-by: Linus Torvalds <[email protected]>
Suggested-by: Liam R. Howlett <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Cc: Jann Horn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
While it's not strictly necessary to lock a newly created vma before
adding it into the vma tree (as long as no further changes are performed
to it), it seems like a good policy to lock it and prevent accidental
changes after it becomes visible to the page faults. Lock the vma before
adding it into the vma tree.
[[email protected]: fix reject fixing in vma_link(), per Jann]
Link: https://lkml.kernel.org/r/[email protected]
Suggested-by: Jann Horn <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Jann Horn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Implicit vma locking inside vm_flags_reset() and vm_flags_reset_once() is
not obvious and makes it hard to understand where vma locking is happening.
Also, in some cases (like in dup_userfaultfd()) the vma should be locked
earlier than the vm_flags modification. To make locking more visible, change these
functions to assert that the vma write lock is taken and explicitly lock
the vma beforehand. Fix userfaultfd functions which should lock the vma
earlier.
Link: https://lkml.kernel.org/r/[email protected]
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Vma write lock assertion always includes mmap write lock assertion and
additional vma lock checks when per-VMA locks are enabled. Replace
weaker mmap_assert_write_locked() assertions with stronger
vma_assert_write_locked() ones when we are operating on a vma which
is expected to be locked.
Link: https://lkml.kernel.org/r/[email protected]
Suggested-by: Jann Horn <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Use helper macro K() to improve code readability. No functional
modification involved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Use helper macro K() to improve code readability. No functional
modification involved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Use helper macro K() to improve code readability. No functional
modification involved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Use helper macro K() to improve code readability. No functional
modification involved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Use helper macro K() to improve code readability. No functional
modification involved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Use helper macro K() to improve code readability. No functional
modification involved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "cleanup with helper macro K()".
Use helper macro K() to improve code readability. No functional
modification involved. Remove redundant K() macro definition.
This patch (of 7):
Since commit eb8589b4f8c1 ("mm: move mem_init_print_info() to mm_init.c"),
the K() macro definition has been moved to mm/internal.h. Therefore, the
definitions in mm/memcontrol.c, mm/backing-dev.c and mm/oom_kill.c are
redundant. Drop redundant definitions.
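For reference, a user-space rendering of the K() helper this series relies
on; the in-tree macro lives in mm/internal.h and converts a page count to
kibibytes (the 4K page size below is assumed for the demo):

#include <stdio.h>

#define PAGE_SHIFT 12                           /* assuming 4K pages */
#define K(x) ((x) << (PAGE_SHIFT - 10))         /* pages -> kB */

int main(void)
{
        unsigned long nr_pages = 300;

        printf("%lu pages = %lu kB\n", nr_pages, K(nr_pages)); /* 1200 kB */
        return 0;
}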
[[email protected]: oom_kill.c: remove "#undef K", per Kefeng]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: ZhangPeng <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
On a system with kernelcore=mirror enabled but no mirrored memory
reported by EFI, the kernel can OOM during startup since all memory
besides zone DMA ends up in the movable zone, which prevents the kernel
from using it.
Zone DMA/DMA32 initialization is independent of mirrored memory and their
max pfn is set in zone_sizes_init(). Since the kernel can fall back to
zone DMA/DMA32 if there is no memory in zone Normal, these zones are
treated as mirrored memory no matter what their memory attributes are.
To solve this problem, disable kernelcore=mirror when no real mirrored
memory exists.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ma Wupeng <[email protected]>
Suggested-by: Kefeng Wang <[email protected]>
Suggested-by: Mike Rapoport <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Kefeng Wang <[email protected]>
Cc: Levi Yun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Keep the same logic as update_pageblock_skip(): only set skip if
no_set_skip_hint is false, which is more reasonable.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Remove the unnecessary return from a void function.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kemeng Shi <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Baolin Wang <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|