Age | Commit message | Author | Files changed | Lines (-/+)
2022-05-19arm64/mm: fix page table check compile error for CONFIG_PGTABLE_LEVELS=2Tong Tiangen1-16/+17
If CONFIG_PGTABLE_LEVELS=2 and CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y, then we trigger a compile error: error: implicit declaration of function 'pte_user_accessible_page' Move the definition of the page table check helper out of the CONFIG_PGTABLE_LEVELS > 2 branch. Link: https://lkml.kernel.org/r/[email protected] Fixes: daf214c14dbe ("arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK") Signed-off-by: Tong Tiangen <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Will Deacon <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Guohanjun <[email protected]> Cc: Xie XiuQi <[email protected]> Cc: kernel test robot <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-19riscv/mm: fix two page table check related issuesTong Tiangen3-5/+7
Two page table check related issues have been fixed here. 1. Enabling CONFIG_PAGE_TABLE_CHECK on riscv32 triggers a compile error[1]: error: implicit declaration of function 'pud_leaf' Add a pud_leaf() definition to include/asm-generic/pgtable-nopmd.h to fix this issue. 2. To keep consistent with the other pud_xxx() helpers, move pud_user() to pgtable-64.h and add pud_user() to pgtable-nopmd.h. [1] https://lore.kernel.org/linux-mm/[email protected]/T/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 856eed79f8d3 ("riscv/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK") Signed-off-by: Tong Tiangen <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Albert Ou <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Guohanjun <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xie XiuQi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
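For illustration, the generic stub can be as simple as the sketch below: with a folded PUD there can never be a leaf (huge) entry at that level, so the helper unconditionally reports none.

    /* A folded PUD never holds a leaf (huge) entry. */
    static inline int pud_leaf(pud_t pud)
    {
        return 0;
    }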
2022-05-13mm, compaction: fast_find_migrateblock() should return pfn in the target zoneRei Yamamoto1-0/+2
At present, pages not in the target zone are added to the cc->migratepages list in isolate_migratepages_block(). As a result, pages may migrate between nodes unintentionally. This would be a serious problem for older kernels without commit a984226f457f849e ("mm: memcontrol: remove the pgdata parameter of mem_cgroup_page_lruvec"), because it can corrupt the lru list by handling pages in the list without holding the proper lru_lock. Avoid returning a pfn outside the target zone in the case that it is not aligned with a pageblock boundary. Otherwise isolate_migratepages_block() will handle pages not in the target zone. Link: https://lkml.kernel.org/r/[email protected] Fixes: 70b44595eafe ("mm, compaction: use free lists to quickly locate a migration source") Signed-off-by: Rei Yamamoto <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Acked-by: Mel Gorman <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Don Dutile <[email protected]> Cc: Wonhyuk Yang <[email protected]> Cc: Rei Yamamoto <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/damon: add documentation for Enum valueGautam Menghani1-0/+1
Fix the warning - "Enum value 'NR_DAMON_OPS' not described in enum 'damon_ops_id'" generated by the command "make pdfdocs" Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gautam Menghani <[email protected]> Reviewed-by: SeongJae Park <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/memcontrol: export memcg->watermark via sysfs for v2 memcgGanesan Rajagopal2-0/+20
We run a lot of automated tests when building our software and run into OOM scenarios when the tests run unbounded. v1 memcg exports memcg->watermark as "memory.max_usage_in_bytes" in sysfs. We use this metric to heuristically limit the number of tests that can run in parallel based on per test historical data. This metric is currently not exported for v2 memcg and there is no other easy way of getting this information. The getrusage() syscall returns "ru_maxrss", which can be used as an approximation, but that's the max RSS of a single child process across all children instead of the aggregated max for all child processes. The only workaround is to periodically poll "memory.current", but that's not practical for short-lived one-off cgroups. Hence, expose memcg->watermark as "memory.peak" for v2 memcg. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ganesan Rajagopal <[email protected]> Acked-by: Shakeel Butt <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Roman Gushchin <[email protected]> Reviewed-by: Michal Koutný <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctlMuchun Song4-15/+133
We must add hugetlb_free_vmemmap=on (or "off") to the boot cmdline and reboot the server to enable or disable the feature of optimizing vmemmap pages associated with HugeTLB pages. However, rebooting usually takes a long time, so add a sysctl to enable or disable the feature at runtime without rebooting.

Why do we need this? There are three use cases.

1) The feature of minimizing the overhead of struct page associated with each HugeTLB page is disabled by default without passing "hugetlb_free_vmemmap=on" to the boot cmdline. When we (ByteDance) deliver the servers to users who want to enable this feature, they have to configure grub (change the boot cmdline) and reboot the servers, whereas rebooting usually takes a long time (we have thousands of servers). It's a very bad experience for the users, so we need an approach to enable this feature at runtime. This is a use case in our practical environment.

2) In some use cases, HugeTLB pages are allocated 'on the fly' instead of being pulled from the HugeTLB pool; those workloads would be affected with this feature enabled. Such workloads can be identified by the fact that they never explicitly allocate huge pages with 'nr_hugepages' but only set 'nr_overcommit_hugepages' and then let the pages be allocated from the buddy allocator at fault time. We can confirm this is a real use case from commit 099730d67417. For those workloads, the page fault time could be ~2x slower than before. We suspect those users would want to disable this feature if the system had enabled it before and they don't think the memory savings benefit is enough to make up for the performance drop.

3) A workload that wants vmemmap pages to be optimized and a workload that sets 'nr_overcommit_hugepages' and does not want the extra overhead at fault time, when the overcommitted pages are allocated from the buddy allocator, may be deployed on the same server. The user can enable this feature, set 'nr_hugepages' and 'nr_overcommit_hugepages', and then disable the feature. In this case, the overcommitted HugeTLB pages will not encounter the extra overhead at fault time.

Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Kees Cook <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Xiongchun Duan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
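For illustration only, registering such a runtime knob can look roughly like the sketch below. The backing flag name (vmemmap_optimize_enabled) and the use of the generic proc_dobool handler are assumptions made for the sketch; the actual patch wires up its own handler.

    #include <linux/sysctl.h>

    static bool vmemmap_optimize_enabled;   /* hypothetical backing flag */

    static struct ctl_table hugetlb_vmemmap_sysctls[] = {
        {
            .procname     = "hugetlb_optimize_vmemmap",
            .data         = &vmemmap_optimize_enabled,
            .maxlen       = sizeof(vmemmap_optimize_enabled),
            .mode         = 0644,
            .proc_handler = proc_dobool,    /* reads/writes the bool as 0/1 */
        },
        { }
    };

    static int __init hugetlb_vmemmap_sysctls_init(void)
    {
        register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
        return 0;
    }
    late_initcall(hugetlb_vmemmap_sysctls_init);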
2022-05-13mm: hugetlb_vmemmap: use kstrtobool for hugetlb_vmemmap param parsingMuchun Song2-8/+8
Use kstrtobool rather than open coding "on" and "off" parsing in mm/hugetlb_vmemmap.c; it is more robust, handling all kinds of inputs such as 'Yy1Nn0' or [oO][NnFf] for "on" and "off". Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kees Cook <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Xiongchun Duan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
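A sketch of what the cmdline handler can look like once kstrtobool() does the parsing (the flag name is a stand-in carried over from the sysctl sketch above):

    static int __init hugetlb_vmemmap_early_param(char *buf)
    {
        /* kstrtobool() accepts "on"/"off", "y"/"n", "1"/"0", etc. */
        return kstrtobool(buf, &vmemmap_optimize_enabled);
    }
    early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_early_param);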
2022-05-13mm: memory_hotplug: override memmap_on_memory when hugetlb_free_vmemmap=onMuchun Song1-6/+26
Optimizing HugeTLB vmemmap pages is not compatible with allocating memmap on hot added memory. If "hugetlb_free_vmemmap=on" and "memory_hotplug.memmap_on_memory" are both passed on the kernel command line, optimizing hugetlb pages takes precedence. However, the global variable memmap_on_memory will still be set to 1, even though we will not try to allocate memmap on hot added memory. Also introduce the mhp_memmap_on_memory() helper to move the definition of "memmap_on_memory" into the scope of CONFIG_MHP_MEMMAP_ON_MEMORY. In the next patch, mhp_memmap_on_memory() will also be exported to be used in hugetlb_vmemmap.c. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Acked-by: Mike Kravetz <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kees Cook <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Xiongchun Duan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
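A sketch of the helper described above; configurations without CONFIG_MHP_MEMMAP_ON_MEMORY simply report false:

    #ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
    static bool memmap_on_memory __ro_after_init;

    static inline bool mhp_memmap_on_memory(void)
    {
        return memmap_on_memory;
    }
    #else
    static inline bool mhp_memmap_on_memory(void)
    {
        return false;
    }
    #endif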
2022-05-13mm: hugetlb_vmemmap: disable hugetlb_optimize_vmemmap when struct page crosses page boundariesMuchun Song1-6/+6
Patch series "add hugetlb_optimize_vmemmap sysctl", v11. This series aims to add a hugetlb_optimize_vmemmap sysctl to enable or disable the feature of optimizing vmemmap pages associated with HugeTLB pages.

This patch (of 4): If the size of "struct page" is not a power of two but the feature of minimizing the overhead of struct page associated with each HugeTLB page is enabled, then the vmemmap pages of HugeTLB will be corrupted after remapping (a panic is about to happen in theory). But this only occurs when !CONFIG_MEMCG && !CONFIG_SLUB on x86_64, which is not a conventional configuration nowadays. So it is not a real world issue, just the result of a code review. But we cannot prevent anyone from building that combined configuration, so hugetlb_optimize_vmemmap should be disabled in this case to fix the issue.

Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kees Cook <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Xiongchun Duan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
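The guard itself is conceptually a one-liner (a sketch; the wrapping function name is illustrative):

    #include <linux/log2.h>

    static __init bool vmemmap_optimizable(void)
    {
        /*
         * If struct page is not a power-of-2 size, a single struct page
         * can straddle a page boundary, and remapping the vmemmap would
         * corrupt it: refuse to enable the optimization.
         */
        return is_power_of_2(sizeof(struct page));
    }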
2022-05-13mm: rmap: fix CONT-PTE/PMD size hugetlb issue when unmappingBaolin Wang1-17/+22
Some architectures (like ARM64) support CONT-PTE/PMD size hugetlb, which means they can support not only PMD/PUD size hugetlb (2M and 1G), but also CONT-PTE/PMD sizes (64K and 32M) when a 4K base page size is specified. When unmapping a hugetlb page, we will get the relevant page table entry by huge_pte_offset() only once to nuke it. This is correct for PMD or PUD size hugetlb, since they always contain only one pmd entry or pud entry in the page table. However, this is incorrect for CONT-PTE and CONT-PMD size hugetlb, since they can contain several contiguous pte or pmd entries with the same page table attributes, so we will nuke only one pte or pmd entry for such a hugetlb page. Moreover, try_to_unmap() is now only passed a hugetlb page in the case where the hugetlb page is poisoned, which means we will unmap only one pte entry for a CONT-PTE or CONT-PMD size poisoned hugetlb page; the other subpages remain accessible, which may cause serious issues. So change to use huge_ptep_clear_flush() to nuke the hugetlb page table to fix this issue; it already considers CONT-PTE and CONT-PMD size hugetlb. We've already used set_huge_swap_pte_at() to set a poisoned swap entry for a poisoned hugetlb page. Meanwhile, add a VM_BUG_ON() to make sure the hugetlb page passed to try_to_unmap() is poisoned. Link: https://lkml.kernel.org/r/0a2e547238cad5bc153a85c3e9658cb9d55f9cac.1652270205.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/730ea4b6d292f32fb10b7a4e87dad49b0eb30474.1652147571.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <[email protected]> Reviewed-by: Muchun Song <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David S. Miller <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: James Bottomley <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Rich Felker <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
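The shape of the fix in try_to_unmap_one() is roughly the following sketch (not the literal diff):

    if (folio_test_hugetlb(folio)) {
        /*
         * huge_ptep_clear_flush() nukes all the contiguous entries
         * (CONT-PTE/CONT-PMD) backing the huge page in one go.
         */
        pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
    } else {
        pteval = ptep_clear_flush(vma, address, pvmw.pte);
    }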
2022-05-13mm: rmap: fix CONT-PTE/PMD size hugetlb issue when migrationBaolin Wang2-6/+29
Some architectures (like ARM64) support CONT-PTE/PMD size hugetlb, which means they can support not only PMD/PUD size hugetlb (2M and 1G), but also CONT-PTE/PMD sizes (64K and 32M) when a 4K base page size is specified. When migrating a hugetlb page, we will get the relevant page table entry by huge_pte_offset() only once to nuke it and remap it with a migration pte entry. This is correct for PMD or PUD size hugetlb, since they always contain only one pmd entry or pud entry in the page table. However, this is incorrect for CONT-PTE and CONT-PMD size hugetlb, since they can contain several contiguous pte or pmd entries with the same page table attributes. So we will nuke or remap only one pte or pmd entry for such a hugetlb page, which is not expected for hugetlb migration. The problem is that we can still continue to modify the subpages' data of a hugetlb page while migrating it, which can cause a serious data consistency issue, since we did not nuke the page table entries and set a migration pte for the subpages of the hugetlb page. To fix this issue, change to use huge_ptep_clear_flush() to nuke a hugetlb page table, and remap it with set_huge_pte_at() and set_huge_swap_pte_at() when migrating a hugetlb page; these already consider the CONT-PTE or CONT-PMD size hugetlb. [[email protected]: fix nommu build] [[email protected]: fix build errors for !CONFIG_MMU] Link: https://lkml.kernel.org/r/a4baca670aca637e7198d9ae4543b8873cb224dc.1652270205.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/ea5abf529f0997b5430961012bfda6166c1efc8c.1652147571.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <[email protected]> Reviewed-by: Muchun Song <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David S. Miller <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: James Bottomley <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Rich Felker <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm: change huge_ptep_clear_flush() to return the original pteBaolin Wang10-28/+36
Patch series "Fix CONT-PTE/PMD size hugetlb issue when unmapping or migrating", v4. presently, migrating a hugetlb page or unmapping a poisoned hugetlb page, we'll use ptep_clear_flush() and set_pte_at() to nuke the page table entry and remap it, and this is incorrect for CONT-PTE or CONT-PMD size hugetlb page, which will cause potential data consistent issue. This patch set will change to use hugetlb related APIs to fix this issue. Note: Mike pointed out the huge_ptep_get() will only return the one specific value, and it would not take into account the dirty or young bits of CONT-PTE/PMDs like the huge_ptep_get_and_clear() [1]. This inconsistent issue is not introduced by this patch set, and this issue will be addressed in another thread [2]. Meanwhile the uffd for hugetlb case [3] pointed out by Gerald also needs another patch to address. [1] https://lore.kernel.org/linux-mm/[email protected]/ [2] https://lore.kernel.org/all/[email protected]/ [3] https://lore.kernel.org/linux-mm/20220503120343.6264e126@thinkpad/ This patch (of 3): It is incorrect to use ptep_clear_flush() to nuke a hugetlb page table when unmapping or migrating a hugetlb page, and will change to use huge_ptep_clear_flush() instead in the following patches. So this is a preparation patch, which changes the huge_ptep_clear_flush() to return the original pte to help to nuke a hugetlb page table. [[email protected]: fix build in several more architectures] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fixup] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/20f77ddab90baa249bd24504c413189b82acde69.1652270205.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/dcf065868cce35bceaf138613ad27f17bb7c0c19.1652147571.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Acked-by: Mike Kravetz <[email protected]> Reviewed-by: Muchun Song <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: James Bottomley <[email protected]> Cc: Helge Deller <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Rich Felker <[email protected]> Cc: David S. Miller <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Gerald Schaefer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13Documentation/vm: rework "Temporary Virtual Mappings" sectionFabio M. De Francesco1-11/+59
Extend and rework the "Temporary Virtual Mappings" section of the highmem.rst documentation. Although local kmaps were introduced by Thomas Gleixner in October 2020, the documentation was still missing information about them. These additions rely largely on Gleixner's patches, Jonathan Corbet's LWN articles, comments by Ira Weiny and Matthew Wilcox, and in-code comments from ./include/linux/highmem.h. 1) Add a paragraph to document kmap_local_page(). 2) Reorder the list of functions by decreasing order of preference of use. 3) Rework part of the kmap() entry in the list. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Fabio M. De Francesco <[email protected]> Suggested-by: Ira Weiny <[email protected]> Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Peter Collingbourne <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
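For reference, the preferred pattern that the new kmap_local_page() paragraph documents looks like this sketch (mappings are CPU-local and must be released in LIFO order):

    static void copy_page_out(struct page *page, void *buf)
    {
        void *addr = kmap_local_page(page); /* cheap per-CPU mapping, no global lock */

        memcpy(buf, addr, PAGE_SIZE);       /* use the mapping only within this section */
        kunmap_local(addr);
    }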
2022-05-13Documentation/vm: move "Using kmap-atomic" to highmem.hFabio M. De Francesco2-35/+31
The use of kmap_atomic() in new code is being deprecated in favor of kmap_local_page(). For this reason, the "Using kmap_atomic" section in highmem.rst is obsolete and unnecessary, but it can still help developers if moved to the kdocs in highmem.h. Therefore, move the relevant parts of this section from highmem.rst and merge them with the kdocs in highmem.h. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Fabio M. De Francesco <[email protected]> Suggested-by: Ira Weiny <[email protected]> Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Peter Collingbourne <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13Documentation/vm: include kdocs from highmem*.h into highmem.rstFabio M. De Francesco1-0/+7
kernel-docs that are in include/linux/highmem.h and in include/linux/highmem-internal.h should be included in highmem.rst. Use kdocs directives to include the above-mentioned comments into highmem.rst. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Fabio M. De Francesco <[email protected]> Acked-by: Mike Rapoport <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Suggested-by: Ira Weiny <[email protected]> Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Peter Collingbourne <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/highmem: fix kernel-doc warnings in highmem*.hFabio M. De Francesco2-17/+23
Patch series "Extend and reorganize Highmem's documentation", v4. This series has the purpose to extend and reorganize Highmem's documentation. This is a work in progress because some information should still be moved from highmem.rst to highmem.h and highmem-internal.h. Specifically I'm talking about moving the "how to" information to the relevant headers, as it as been suggested by Ira Weiny (Intel). Also, this is a work in progress because some kdocs in highmem.h and highmem-internal.h should be improved. This patch (of 4): `scripts/kernel-doc -v -none include/linux/highmem*` reports the following warnings: include/linux/highmem.h:160: warning: expecting prototype for kunmap_atomic(). Prototype was for nr_free_highpages() instead include/linux/highmem.h:204: warning: No description found for return value of 'alloc_zeroed_user_highpage_movable' include/linux/highmem-internal.h:256: warning: Function parameter or member '__addr' not described in 'kunmap_atomic' include/linux/highmem-internal.h:256: warning: Excess function parameter 'addr' description in 'kunmap_atomic' Fix these warnings by (1) moving the kernel-doc comments from highmem.h to highmem-internal.h (which is the file were the kunmap_atomic() macro is actually defined), (2) extending and merging it with the comment which was already in highmem-internal.h, and (3) using correct parameter names (4) correcting a few technical inaccuracies in comments, and (5) adding a deprecation notice in kunmap_atomic() for consistency with kmap_atomic(). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Fabio M. De Francesco <[email protected]> Reviewed-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Peter Collingbourne <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/memory-failure.c: simplify num_poisoned_pages_inc/deczhenwei pi1-8/+3
Originally, num_poisoned_pages_inc() was called in the memory failure routine and num_poisoned_pages_dec() rolled the count back if the event was filtered or cancelled. As suggested by Naoya, call num_poisoned_pages_inc() only in action_result(), which makes this clear and simple. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: zhenwei pi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/hwpoison: disable hwpoison filter during removingzhenwei pi1-0/+1
The hwpoison filter is enabled by the hwpoison-inject module; after removing this module, the hwpoison filter still works. What is worse, the user cannot find the debugfs entries to know this. Disable the hwpoison filter when removing the hwpoison-inject module. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: zhenwei pi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/memory-failure.c: add hwpoison_filter for soft offlinezhenwei pi1-2/+14
hwpoison_filter is missing in the soft offline path, which leads to an issue: after enabling the corrupt filter, the user process still has a chance to inject a hwpoison fault by madvise(addr, len, MADV_SOFT_OFFLINE) at a PFN which is expected to be rejected. Also make a minor change to a comment in memory_failure(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: zhenwei pi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/memory-failure.c: simplify num_poisoned_pages_deczhenwei pi2-29/+9
Don't decrease the number of poisoned pages in page_alloc.c; let memory-failure.c do the inc/dec of poisoned pages only. Also simplify unpoison_memory(): only decrease the number of poisoned pages when
- TestClearPageHWPoison() succeeds
- put_page_back_buddy() succeeds
After decreasing, print the necessary log. Finally, remove clear_page_hwpoison() and unpoison_taken_off_page(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: zhenwei pi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/memory-failure.c: move clear_hwpoisoned_pageszhenwei pi3-27/+32
Patch series "memory-failure: fix hwpoison_filter", v2. As well known, the memory failure mechanism handles memory corrupted event, and try to send SIGBUS to the user process which uses this corrupted page. For the virtualization case, QEMU catches SIGBUS and tries to inject MCE into the guest, and the guest handles memory failure again. Thus the guest gets the minimal effect from hardware memory corruption. The further step I'm working on: 1, try to modify code to decrease poisoned pages in a single place (mm/memofy-failure.c: simplify num_poisoned_pages_dec in this series). 2, try to use page_handle_poison() to handle SetPageHWPoison() and num_poisoned_pages_inc() together. It would be best to call num_poisoned_pages_inc() in a single place too. 3, introduce memory failure notifier list in memory-failure.c: notify the corrupted PFN to someone who registers this list. If I can complete [1] and [2] part, [3] will be quite easy(just call notifier list after increasing poisoned page). 4, introduce memory recover VQ for memory balloon device, and registers memory failure notifier list. During the guest kernel handles memory failure, balloon device gets notified by memory failure notifier list, and tells the host to recover the corrupted PFN(GPA) by the new VQ. 5, host side remaps the corrupted page(HVA), and tells the guest side to unpoison the PFN(GPA). Then the guest fixes the corrupted page(GPA) dynamically. This patch (of 5): clear_hwpoisoned_pages() clears HWPoison flag and decreases the number of poisoned pages, this actually works as part of memory failure. Move this function from sparse.c to memory-failure.c, finally there is no CONFIG_MEMORY_FAILURE in sparse.c. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: zhenwei pi <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/page_owner: use strscpy() instead of strlcpy()Eric Dumazet1-1/+1
current->comm[] is not a string (there is no guarantee of a zero byte in it). strlcpy(s1, s2, l) calls strlen(s2), potentially causing an out-of-bounds access, as reported by syzbot:

detected buffer overflow in __fortify_strlen
------------[ cut here ]------------
kernel BUG at lib/string_helpers.c:980!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 4087 Comm: dhcpcd-run-hooks Not tainted 5.18.0-rc3-syzkaller-01537-g20b87e7c29df #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:fortify_panic+0x18/0x1a lib/string_helpers.c:980
Code: 8c e8 c5 ba e1 fa e9 23 0f bf fa e8 0b 5d 8c f8 eb db 55 48 89 fd e8 e0 49 40 f8 48 89 ee 48 c7 c7 80 f5 26 8a e8 99 09 f1 ff <0f> 0b e8 ca 49 40 f8 48 8b 54 24 18 4c 89 f1 48 c7 c7 00 00 27 8a
RSP: 0018:ffffc900000074a8 EFLAGS: 00010286
RAX: 000000000000002c RBX: ffff88801226b728 RCX: 0000000000000000
RDX: ffff8880198e0000 RSI: ffffffff81600458 RDI: fffff52000000e87
RBP: ffffffff89da2aa0 R08: 000000000000002c R09: 0000000000000000
R10: ffffffff815fae2e R11: 0000000000000000 R12: ffff88801226b700
R13: ffff8880198e0830 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f5876ad6ff8 CR3: 000000001a48c000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Call Trace:
 <IRQ>
 __fortify_strlen include/linux/fortify-string.h:128 [inline]
 strlcpy include/linux/fortify-string.h:143 [inline]
 __set_page_owner_handle+0x2b1/0x3e0 mm/page_owner.c:171
 __set_page_owner+0x3e/0x50 mm/page_owner.c:190
 prep_new_page mm/page_alloc.c:2441 [inline]
 get_page_from_freelist+0xba2/0x3e00 mm/page_alloc.c:4182
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5408
 alloc_pages+0x1aa/0x310 mm/mempolicy.c:2272
 alloc_slab_page mm/slub.c:1799 [inline]
 allocate_slab+0x26c/0x3c0 mm/slub.c:1944
 new_slab mm/slub.c:2004 [inline]
 ___slab_alloc+0x8df/0xf20 mm/slub.c:3005
 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3092
 slab_alloc_node mm/slub.c:3183 [inline]
 slab_alloc mm/slub.c:3225 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3232 [inline]
 kmem_cache_alloc+0x360/0x3b0 mm/slub.c:3242
 dst_alloc+0x146/0x1f0 net/core/dst.c:92

Link: https://lkml.kernel.org/r/[email protected] Fixes: 865ed6a32786 ("mm/page_owner: record task command name") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Acked-by: Waiman Long <[email protected]> Acked-by: Shakeel Butt <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
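The one-line fix is to bound the copy by the destination size for reads as well as writes; a sketch of the substitution in __set_page_owner_handle() (field names as in mm/page_owner.c at the time):

    /* before: strlcpy() runs strlen() over the source, which may overrun comm[] */
    strlcpy(page_owner->comm, current->comm, sizeof(page_owner->comm));

    /* after: strscpy() never reads past the destination-sized window */
    strscpy(page_owner->comm, current->comm, sizeof(page_owner->comm));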
2022-05-13kasan: clean-up kconfig options descriptionsAndrey Konovalov1-86/+82
Various readability clean-ups of KASAN Kconfig options. No functional changes. Link: https://lkml.kernel.org/r/c160840dd9e4b1ad5529ecfdb0bba35d9a14d826.1652203271.git.andreyknvl@google.com Link: https://lkml.kernel.org/r/47afaecec29221347bee49f58c258ac1ced3b429.1652123204.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13kasan: move boot parameters section in documentationAndrey Konovalov1-41/+41
Move the "Boot parameters" section in KASAN documentation next to the section that describes KASAN build options. No content changes. Link: https://lkml.kernel.org/r/870628e1293b4f44edf7cbcb92374ff9eb7503d7.1652203271.git.andreyknvl@google.com Link: https://lkml.kernel.org/r/ec9c923f35e7c5312836c4624a7f317dc1ee2c1c.1652123204.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13kasan: update documentationAndrey Konovalov1-60/+90
Do assorted clean-ups and improvements to KASAN documentation, including: - Describe each mode in a dedicated paragraph. - Split out a Support section that describes in detail which compilers, architectures and memory types each mode requires/supports. - Capitalize the first letter in the names of each KASAN mode. [[email protected]: rewording, per Marco] Link: https://lkml.kernel.org/r/896b2d914d6b50d677fd7b38f76967cc705c01ba.1652203271.git.andreyknvl@google.com Link: https://lkml.kernel.org/r/5bd58ebebf066593ce0e1d265d60278b5f5a1874.1652123204.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13kasan: give better names to shadow valuesAndrey Konovalov5-21/+21
Rename KASAN_KMALLOC_* shadow values to KASAN_SLAB_*, as they are used for all slab allocations, not only for kmalloc. Also rename KASAN_FREE_PAGE to KASAN_PAGE_FREE to be consistent with KASAN_PAGE_REDZONE and KASAN_SLAB_FREE. Link: https://lkml.kernel.org/r/bebcaf4eafdb0cabae0401a69c0af956aa87fcaa.1652111464.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13kasan: use tabs to align shadow valuesAndrey Konovalov1-16/+16
Consistently use tabs instead of spaces to align shadow value definitions. Link: https://lkml.kernel.org/r/00e7e66b5fc375d58200dc1489949b3edcd096b7.1652111464.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13kasan: clean up comments in internal kasan.hAndrey Konovalov1-41/+33
Clean up comments in mm/kasan/kasan.h: clarify, unify styles, fix punctuation, etc. Link: https://lkml.kernel.org/r/a0680ff30035b56cb7bdd5f59fd400e71712ceb5.1652111464.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Cc: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/vmalloc: use raw_cpu_ptr() for vmap_block_queue accessSebastian Andrzej Siewior1-4/+2
The per-CPU resource vmap_block_queue is accessed via get_cpu_var(). That macro disables preemption and then loads the pointer from the current CPU. This doesn't work on PREEMPT_RT because a spinlock_t is later accessed within the preempt-disabled section. There is no need to disable preemption while accessing the per-CPU struct vmap_block_queue because the list is protected with a spinlock_t. The per-CPU struct is also accessed cross-CPU in purge_fragmented_blocks(). It is possible that by using raw_cpu_ptr() the code migrates to another CPU and uses the struct from another CPU. This is fine because the list is locked and the locked section is very short. Use raw_cpu_ptr() to access vmap_block_queue. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
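The resulting access pattern is roughly this sketch (vb is the vmap_block being queued; names follow mm/vmalloc.c):

    struct vmap_block_queue *vbq;

    /*
     * No need to pin the CPU: vbq->lock protects the list, and a rare
     * migration between the load and the lock only means briefly using
     * another CPU's queue, which is harmless.
     */
    vbq = raw_cpu_ptr(&vmap_block_queue);
    spin_lock(&vbq->lock);
    list_add_tail_rcu(&vb->free_list, &vbq->free);
    spin_unlock(&vbq->lock);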
2022-05-13tracing: incorrect gfp_t conversionVasily Averin7-59/+59
Fix the following sparse warnings: include/trace/events/*: sparse: cast to restricted gfp_t include/trace/events/*: sparse: restricted gfp_t degrades to integer The gfp_t type is bitwise and requires __force attributes for any casts. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
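The pattern applied throughout the trace events is a __force cast at the sparse boundary; a fragment sketch from a TP_fast_assign() block:

    TP_fast_assign(
        /* gfp_t is __bitwise: a cast to a plain integer needs __force */
        __entry->gfp_flags = (__force unsigned long)gfp_flags;
    ),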
2022-05-13zram: remove double compression logicAlexey Romanov2-33/+10
The second trial allocation under the per-cpu presumption has been used to prevent a regression in allocation failures. However, it complicates maintenance without significant benefit: the slowpath branch is executed extremely rarely, and it is hard even to get there. Therefore, delete this branch. Since b09ab054b69b ("zram: support BDI_CAP_STABLE_WRITES"), zram has used QUEUE_FLAG_STABLE_WRITES to prevent buffer changes between the 1st and 2nd memory allocations. Since we remove the second trial memory allocation logic, we can remove the STABLE_WRITES flag because there is no longer a buffer that could be modified under us. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alexey Romanov <[email protected]> Signed-off-by: Dmitry Rokosov <[email protected]> Acked-by: Minchan Kim <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Cc: Nitin Gupta <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13percpu: improve percpu_alloc_percpu event traceVasily Averin3-12/+24
Add call_site, bytes_alloc and gfp_flags fields to the output of the percpu_alloc_percpu ftrace event: mkdir-4393 [001] 169.334788: percpu_alloc_percpu: call_site=mem_cgroup_css_alloc+0xa6 reserved=0 is_atomic=0 size=2408 align=8 base_addr=0xffffc7117fc00000 off=402176 ptr=0x3dc867a62300 bytes_alloc=14448 gfp_flags=GFP_KERNEL_ACCOUNT This is required to track memcg-accounted percpu allocations. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13docs: vm/page_owner: tweak literal block in STANDARD FORMAT SPECIFIERSAkira Yokosawa1-4/+4
A semantic conflict between commit 5603f9bdea68 ("docs: vm/page_owner: use literal blocks for param description") and a change queued for v5.19 authored by Jiajian Ye ("tools/vm/page_owner_sort.c: support sorting blocks by multiple keys") results in a warning from "make htmldocs" saying: [...]/vm/page_owner.rst:176: WARNING: Literal block expected; none found. This is because a literal block in ReST ends at a line which has the same indentation as the paragraph preceding it, in this case the one with no indentation. Indent the two "For --xxxx option:" lines by two columns and make the whole section a literal block. While at it, fix the white-space indentation of the "ator" keys. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Akira Yokosawa <[email protected]> Reported-by: Shenghong Han <[email protected]> Cc: Jiajian Ye <[email protected]> Cc: Chongxi Zhao <[email protected]> Cc: Yinan Zhang <[email protected]> Cc: Yixuan Cao <[email protected]> Cc: Yongqiang Liu <[email protected]> Cc: Yuhong Feng <[email protected]> Cc: Haowen Bai <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/damon/reclaim: use resource_size function on resource objectJiapeng Chong1-1/+1
Fix the following coccicheck warnings: ./mm/damon/reclaim.c:241:30-33: WARNING: Suspicious code. resource_size is maybe missing with res. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jiapeng Chong <[email protected]> Reported-by: Abaci Robot <[email protected]> Reviewed-by: SeongJae Park <[email protected]> Cc: "Boehme, Markus" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm: functions may simplify the use of return valuesLi kunyu3-16/+6
p4d_clear_huge() can be changed to a void return type, since its return value is never used; vunmap_p4d_range() saves a few steps as a result. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Li kunyu <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13riscv/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECKTong Tiangen2-6/+66
As with commit d283d422c6c4 ("x86: mm: add x86_64 support for page table check"), enable ARCH_SUPPORTS_PAGE_TABLE_CHECK on riscv. Add additional page table check stubs for page table helpers; these stubs can be used to check the existing page table entries. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tong Tiangen <[email protected]> Reviewed-by: Pasha Tatashin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECKKefeng Wang2-6/+56
As with commit d283d422c6c4 ("x86: mm: add x86_64 support for page table check"), enable ARCH_SUPPORTS_PAGE_TABLE_CHECK on arm64. Add additional page table check stubs for page table helpers; these stubs can be used to check the existing page table entries. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: Tong Tiangen <[email protected]> Reviewed-by: Pasha Tatashin <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
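The wiring has the same shape on each architecture: the setter helpers call into the check stubs before installing the entry. A simplified sketch of the arm64 PTE-level hook:

    static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
                                  pte_t *ptep, pte_t pte)
    {
        page_table_check_pte_set(mm, addr, ptep, pte);
        __set_pte_at(mm, addr, ptep, pte);  /* the former set_pte_at() body */
    }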
2022-05-13mm: remove __HAVE_ARCH_PTEP_CLEAR in pgtable.hTong Tiangen1-2/+0
Currently, no architecture defines __HAVE_ARCH_PTEP_CLEAR; the generic ptep_clear() is the only definition for all architectures, so drop the "#ifndef __HAVE_ARCH_PTEP_CLEAR" guard. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tong Tiangen <[email protected]> Suggested-by: Anshuman Khandual <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm: page_table_check: add hooks to public helpersTong Tiangen2-18/+15
Move ptep_clear() to include/linux/pgtable.h and add page table check related hooks to some helpers, in preparation for supporting the page table check feature on new architectures. Also optimize the implementation of ptep_clear(): the hooks call the page table check stubs, and the interface control belongs in the stubs, so there is no rationale for doing an IS_ENABLED() check here. For architectures that do not enable CONFIG_PAGE_TABLE_CHECK, fallback page table check stubs[1] are called when their page table helpers[2] in include/linux/pgtable.h are invoked. [1] page table check stubs defined in include/linux/page_table_check.h [2] ptep_clear() ptep_get_and_clear() pmdp_huge_get_and_clear() pudp_huge_get_and_clear() Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tong Tiangen <[email protected]> Acked-by: Pasha Tatashin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
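After the move, the generic helper reads roughly as below (a sketch; the check stub compiles to a no-op when CONFIG_PAGE_TABLE_CHECK is off):

    static inline void ptep_clear(struct mm_struct *mm, unsigned long addr,
                                  pte_t *ptep)
    {
        page_table_check_pte_clear(mm, addr, *ptep);
        pte_clear(mm, addr, ptep);
    }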
2022-05-13mm: page_table_check: move pxx_user_accessible_page into x86Kefeng Wang2-17/+17
The pxx_user_accessible_page() helpers check PTE bits and are architecture-specific code, so move them into x86's pgtable.h. These helpers are being moved out to make the page table check framework platform independent. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: Tong Tiangen <[email protected]> Acked-by: Pasha Tatashin <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
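On x86 the PTE-level helper boils down to testing the present and user bits; a sketch of the moved definition:

    static inline bool pte_user_accessible_page(pte_t pte)
    {
        return (pte_val(pte) & _PAGE_PRESENT) && (pte_val(pte) & _PAGE_USER);
    }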
2022-05-13mm: page_table_check: using PxD_SIZE instead of PxD_PAGE_SIZETong Tiangen1-4/+4
Patch series "mm: page_table_check: add support on arm64 and riscv", v7. Page table check performs extra verifications at the time when new pages become accessible from the userspace by getting their page table entries (PTEs PMDs etc.) added into the table. It is supported on X86[1]. This patchset made some simple changes and make it easier to support new architecture, then we support this feature on ARM64 and RISCV. [1]https://lore.kernel.org/lkml/[email protected]/ This patch (of 6): Compared with PxD_PAGE_SIZE, which is defined and used only on X86, PxD_SIZE is more common in each architecture. Therefore, it is more reasonable to use PxD_SIZE instead of PxD_PAGE_SIZE in page_table_check.c. At the same time, it is easier to support page table check in other architectures. The substitution has no functional impact on the x86. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tong Tiangen <[email protected]> Suggested-by: Anshuman Khandual <[email protected]> Acked-by: Pasha Tatashin <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Kefeng Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/migrate: convert move_to_new_page() into move_to_new_folio()Matthew Wilcox (Oracle)1-29/+29
Pass in the folios that we already have in each caller. Saves a lot of calls to compound_head(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm: add folio_test_movable()Matthew Wilcox (Oracle)1-0/+5
This is the folio equivalent of PageMovable() which is needed to convert mm/migrate.c to folios. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
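A sketch of the wrapper; for now it simply delegates to the existing page-level test:

    static inline bool folio_test_movable(struct folio *folio)
    {
        return PageMovable(&folio->page);
    }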
2022-05-13mm: add folio_mapping_flags()Matthew Wilcox (Oracle)1-0/+5
This is the equivalent of PageMappingFlags and is needed for converting mm/migrate.c to folios. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()Matthew Wilcox (Oracle)3-63/+55
shmem_swapin_page() only brings in order-0 pages, which are folios by definition. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/shmem: convert shmem_getpage_gfp to use a folioMatthew Wilcox (Oracle)1-52/+43
Rename shmem_alloc_and_acct_page() to shmem_alloc_and_acct_folio() and have it return a folio, then use a folio throughout shmem_getpage_gfp(). It continues to return a struct page. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/shmem: convert shmem_alloc_and_acct_page to use a folioMatthew Wilcox (Oracle)1-9/+9
Convert shmem_alloc_hugepage() to return the folio that it uses and use a folio throughout shmem_alloc_and_acct_page(). Continue to return a page from shmem_alloc_and_acct_page() for now. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/shmem: add shmem_alloc_folio()Matthew Wilcox (Oracle)1-4/+10
Call vma_alloc_folio() directly instead of alloc_page_vma(). Add a shmem_alloc_page() wrapper to avoid changing the callers. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
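A sketch of the new helper plus the compatibility wrapper (simplified from the patch; the pseudo-vma plumbing mirrors the existing shmem allocators):

    static struct folio *shmem_alloc_folio(gfp_t gfp,
            struct shmem_inode_info *info, pgoff_t index)
    {
        struct vm_area_struct pvma;
        struct folio *folio;

        shmem_pseudo_vma_init(&pvma, info, index);
        folio = vma_alloc_folio(gfp, 0, &pvma, 0, false);
        shmem_pseudo_vma_destroy(&pvma);

        return folio;
    }

    /* keeps existing callers unchanged; page is the first member of folio */
    static struct page *shmem_alloc_page(gfp_t gfp,
            struct shmem_inode_info *info, pgoff_t index)
    {
        return &shmem_alloc_folio(gfp, info, index)->page;
    }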
2022-05-13mm/shmem: turn shmem_should_replace_page into shmem_should_replace_folioMatthew Wilcox (Oracle)1-4/+4
This is a straightforward conversion. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13mm/shmem: convert shmem_add_to_page_cache to take a folioMatthew Wilcox (Oracle)1-26/+31
Shrinks shmem_add_to_page_cache() by 16 bytes. All the callers grow, but this is temporary as they will all be converted to folios soon. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>