2020-10-13  include/linux/gfp.h: clarify usage of GFP_ATOMIC in !preemptible contexts  (Michal Hocko; 1 file, -1/+3)
There is a general understanding that GFP_ATOMIC/GFP_NOWAIT are to be used from atomic contexts, e.g. from within a spin lock or from IRQ context. This is correct, but there are some atomic contexts where the above doesn't hold. One of them is NMI context. The page allocator has never supported it, and the general fear of this context has kept anybody from even trying to use the allocator there. Good, but let's be more specific about that. Another such context, and one where people seem to be more daring, is raw_spin_lock. That is mostly because it simply resembles a regular spin lock, which is supported by the allocator, and because there is no implementation difference on !RT kernels in the first place. Be explicit that such a context is not supported by the allocator. The underlying reason is that zone->lock would have to become a raw_spin_lock as well, and that has turned out to be a problem for RT (http://lkml.kernel.org/r/[email protected]). Signed-off-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Reviewed-by: Uladzislau Rezki <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
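As a hedged illustration of the rule being documented (this snippet is not part of the patch; the lock names and size are placeholders):

    spin_lock(&lock);
    p = kmalloc(size, GFP_ATOMIC);      /* supported: a spinlock'd section is an
                                         * atomic context the allocator handles */
    spin_unlock(&lock);

    raw_spin_lock(&rlock);
    p = kmalloc(size, GFP_ATOMIC);      /* NOT supported: zone->lock is a regular
                                         * spinlock, which may sleep inside a
                                         * raw_spin_lock on RT kernels */
    raw_spin_unlock(&rlock);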
2020-10-13  mm/page_alloc.c: fix freeing non-compound pages  (Matthew Wilcox (Oracle); 4 files, -0/+55)
Here is a very rare race which leaks memory:

Page P0 is allocated to the page cache. Page P1 is free.

Thread A                          Thread B                          Thread C
find_get_entry():
xas_load() returns P0
                                  Removes P0 from page cache
                                  P0 finds its buddy P1
                                                                    alloc_pages(GFP_KERNEL, 1) returns P0
                                                                    P0 has refcount 1
page_cache_get_speculative(P0)
P0 has refcount 2
                                                                    __free_pages(P0)
                                                                    P0 has refcount 1
put_page(P0)
P1 is not freed

Fix this by freeing all the pages in __free_pages() that won't be freed by the call to put_page(). It's usually not a good idea to split a page, but this is a very unlikely scenario. Fixes: e286781d5f2e ("mm: speculative page references") Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Mike Rapoport <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
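A sketch of the shape of the fix as described above (free_the_page() is the allocator's internal helper; details of the upstream change may differ):

    void __free_pages(struct page *page, unsigned int order)
    {
        if (put_page_testzero(page))
            free_the_page(page, order);
        else if (!PageHead(page))
            /* Another thread still holds a reference to the first subpage;
             * free the remaining subpages ourselves so they are not leaked. */
            while (order-- > 0)
                free_the_page(page + (1 << order), order);
    }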
2020-10-13  mm: move call to compound_head() in release_pages()  (Ralph Campbell; 1 file, -1/+1)
The function is_huge_zero_page() doesn't call compound_head() to make sure the page pointer is a head page. The call to is_huge_zero_page() in release_pages() is made before compound_head() is called, so the test would fail if release_pages() were called with a tail page of the huge_zero_page, and put_page_testzero() would then be called, releasing the page. This is unlikely to happen in normal use, or we would be seeing all sorts of process data corruption when accessing a THP zero page. Looking at other places where is_huge_zero_page() is called, all seem to pass only a head page, so the right solution is to move the call to compound_head() in release_pages() to a point before is_huge_zero_page() is called. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Yu Zhao <[email protected]> Cc: Dan Williams <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Christoph Hellwig <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
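A minimal sketch of the reordering inside the release_pages() loop (surrounding refcounting logic elided):

    page = compound_head(page);     /* moved before the check: maps tail -> head */
    if (is_huge_zero_page(page))    /* now always sees a head page */
        continue;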
2020-10-13  mmzone: clean code by removing unused macro parameter  (Mateusz Nosek; 2 files, -3/+3)
Previously, the 'for_next_zone_zonelist_nodemask' macro parameter 'zlist' was unused, so this patch removes it. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc.c: __perform_reclaim should return 'unsigned long'  (Yanfei Xu; 1 file, -3/+2)
__perform_reclaim()'s single caller expects it to return 'unsigned long', hence change its return value and a local variable to 'unsigned long'. Suggested-by: Andrew Morton <[email protected]> Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc.c: clean code by merging two functions  (Mateusz Nosek; 1 file, -8/+2)
finalise_ac() is just an epilogue for prepare_alloc_pages(), so there is no need to keep both; the contents of finalise_ac() can be merged into prepare_alloc_pages(). This makes __alloc_pages_nodemask() more readable. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc.c: fix early params garbage value accesses  (Mateusz Nosek; 1 file, -6/+6)
Previously, in __init early_init_on_alloc() and __init early_init_on_free(), the return values from kstrtobool() were not handled properly, which could cause a garbage value to be read from the variable 'bool_result'. This patch fixes the error handling. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
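A sketch of the corrected pattern for the on-alloc variant (the on-free one is symmetric; the exact upstream control flow may differ):

    static int __init early_init_on_alloc(char *buf)
    {
        bool enable;
        int ret;

        ret = kstrtobool(buf, &enable);
        if (ret)
            return ret;     /* never read 'enable' on parse failure */
        if (enable)
            static_branch_enable(&init_on_alloc);
        else
            static_branch_disable(&init_on_alloc);
        return 0;
    }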
2020-10-13  mm/page_alloc.c: micro-optimization remove unnecessary branch  (Mateusz Nosek; 1 file, -5/+3)
Previously, the flags check was split into two separate checks with two separate branches. Since the presence of either flag has the same effect on control flow, the checks can be merged and one branch avoided. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
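An illustrative before/after of the pattern (ALLOC_A and ALLOC_B are placeholder flag names, not the actual flags touched by the patch):

    /* before: two tests, two branches */
    if (alloc_flags & ALLOC_A)
        goto out;
    if (alloc_flags & ALLOC_B)
        goto out;

    /* after: one test, one branch */
    if (alloc_flags & (ALLOC_A | ALLOC_B))
        goto out;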
2020-10-13  mm/page_alloc.c: clean code by removing unnecessary initialization  (Mateusz Nosek; 1 file, -3/+1)
Previously, the variable 'tmp' was initialized but never read before being reassigned, so the initialization can be removed. [[email protected]: remove `tmp' altogether] Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm, isolation: avoid checking unmovable pages across pageblock boundary  (Li Xinhai; 1 file, -1/+2)
In has_unmovable_pages(), the page parameter is not always the first page within a pageblock (see how the page pointer is passed in from start_isolate_page_range() after the call to __first_valid_page()), so the unmovable-page check could span two pageblocks. After this patch, the check is confined to a single pageblock whether or not the page is the first one, obeying the semantics of this function. This issue was found by code inspection. Michal said "this might lead to false negatives when an unrelated block would cause an isolation failure". Signed-off-by: Li Xinhai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: David Hildenbrand <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
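A minimal sketch of confining the scan to the page's own pageblock (the bounds logic is illustrative; the upstream patch may compute this differently):

    unsigned long pfn = page_to_pfn(page);
    unsigned long end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);

    for (; pfn < end_pfn; pfn++) {
        /* check the page at pfn; never crosses into the next pageblock */
    }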
2020-10-13  mm: document semantics of ZONE_MOVABLE  (David Hildenbrand; 1 file, -0/+35)
Let's document what ZONE_MOVABLE means, how it's used, and which special cases we have regarding unmovable pages (memory offlining vs. migration / allocations). Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Mike Rapoport <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Pankaj Gupta <[email protected]> Cc: Baoquan He <[email protected]> Cc: Jason Wang <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  virtio-mem: don't special-case ZONE_MOVABLE  (David Hildenbrand; 1 file, -39/+8)
When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather unclear, which is why we special-cased ZONE_MOVABLE such that partially plugged blocks would never end up in ZONE_MOVABLE. Now that the semantics are much clearer (and will be documented in a follow-up patch including the new virtio-mem behavior), let's allow onlining partially plugged memory blocks to ZONE_MOVABLE and also consider memory blocks that were onlined to ZONE_MOVABLE when unplugging memory. While unplugged memory pages are, in general, unmovable, they can be skipped when offlining memory. virtio-mem only unplugs fairly big chunks (in the megabyte range) and tries to shrink the memory region rather than choosing memory at random. In theory, if all other pages in the movable zone were movable, virtio-mem would only shrink that zone and not create any kind of fragmentation. In the future, we might want to remember the zone again and use the information when (un)plugging memory. For now, let's keep it simple. Note: Support for defragmentation is planned, to deal with fragmentation after unplug due to memory chunks within memory blocks that could not get unplugged before (e.g., somebody pinning pages within ZONE_MOVABLE for a longer time). Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Pankaj Gupta <[email protected]> Cc: Baoquan He <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_isolation: cleanup set_migratetype_isolate()  (David Hildenbrand; 1 file, -10/+7)
Let's clean it up a bit, simplifying the exit paths. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_isolation: drop WARN_ON_ONCE() in set_migratetype_isolate()  (David Hildenbrand; 1 file, -9/+6)
Inside has_unmovable_pages(), we have a comment describing how unmovable data could end up in ZONE_MOVABLE - via "movablecore". Also, besides checking if the first page in the pageblock is reserved, we don't perform any further checks in case of ZONE_MOVABLE. In case of memory offlining, we set REPORT_FAILURE, properly dump_page() the page and handle the error gracefully. alloc_contig_pages() users currently never allocate from ZONE_MOVABLE. E.g., hugetlb uses alloc_contig_pages() for the allocation of gigantic pages only, which will never end up on the MOVABLE zone (see htlb_alloc_mask()). Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Pankaj Gupta <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_isolation: exit early when pageblock is isolated in set_migratetype_isolate()  (David Hildenbrand; 1 file, -4/+5)
Right now, if we have two isolations racing on a pageblock that's in the MOVABLE zone, we would trigger the WARN_ON_ONCE(). Let's just return directly, simplifying the error handling. The change was introduced in commit 3d680bdf60a5 ("mm/page_isolation: fix potential warning from user"). As far as I can see, we currently don't have alloc_contig_range() users that use ZONE_MOVABLE (anymore), so this is currently more a cleanup and a preparation for the future than a fix. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Acked-by: Mike Kravetz <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Qian Cai <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc: tweak comments in has_unmovable_pages()  (David Hildenbrand; 1 file, -16/+6)
Patch series "mm / virtio-mem: support ZONE_MOVABLE", v5. When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather unclear, which is why we special-cased ZONE_MOVABLE such that partially plugged blocks would never end up in ZONE_MOVABLE. Now that the semantics are much clearer (and are documented in patch #6), let's support partially plugged memory blocks in ZONE_MOVABLE, allowing partially plugged memory blocks to be online to ZONE_MOVABLE and also unplugging from such memory blocks. This avoids surprises when onlining of memory blocks suddenly fails, just because they are not completely populated by virtio-mem (yet). This is especially helpful for testing, but also paves the way for virtio-mem optimizations, allowing more memory to get reliably unplugged. Cleanup has_unmovable_pages() and set_migratetype_isolate(), providing better documentation of how ZONE_MOVABLE interacts with different kind of unmovable pages (memory offlining vs. alloc_contig_range()). This patch (of 6): Let's move the split comment regarding bootmem allocations and memory holes, especially in the context of ZONE_MOVABLE, to the PageReserved() check. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Pankaj Gupta <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: kasan: do not panic if both panic_on_warn and kasan_multishot set  (David Gow; 1 file, -1/+1)
KASAN errors will currently trigger a panic when panic_on_warn is set. This renders kasan_multishot useless, as further KASAN errors won't be reported if the kernel has already panicked. By making kasan_multishot disable this behaviour for KASAN errors, we can still have the benefits of panic_on_warn for non-KASAN warnings, yet be able to use kasan_multishot. This is particularly important when running KASAN tests, which need to trigger multiple KASAN errors: previously these would panic the system if panic_on_warn was set; now they can run (and will panic the system should non-KASAN warnings show up). Signed-off-by: David Gow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Patricia Alfonso <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  KASAN: Testing Documentation  (Patricia Alfonso; 1 file, -0/+70)
Include documentation on how to test KASAN using CONFIG_TEST_KASAN_KUNIT and CONFIG_TEST_KASAN_MODULE. Signed-off-by: Patricia Alfonso <[email protected]> Signed-off-by: David Gow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]> Acked-by: Brendan Higgins <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  KASAN: port KASAN Tests to KUnit  (Patricia Alfonso; 4 files, -438/+386)
Transfer all previous tests for KASAN to KUnit so they can be run more easily. Using kunit_tool, developers can run these tests with their other KUnit tests and see "pass" or "fail" with the appropriate KASAN report, instead of needing to parse each KASAN report to test KASAN functionality. All KASAN reports are still printed to dmesg. Stack tests do not work properly when KASAN_STACK is disabled, so those tests are guarded by "if IS_ENABLED(CONFIG_KASAN_STACK)" and only run when stack instrumentation is enabled. If KASAN_STACK is not enabled, KUnit will print a statement to let the user know the test was not run with KASAN_STACK enabled. copy_user_test and kasan_rcu_uaf cannot be run in KUnit, so there is a separate test file for those tests, which can be run as a module as before. [[email protected]: v14] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Patricia Alfonso <[email protected]> Signed-off-by: David Gow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  KUnit: KASAN Integration  (Patricia Alfonso; 5 files, -7/+96)
Integrate KASAN into the KUnit testing framework.

- Fail tests when KASAN reports an error that is not expected
- Use KUNIT_EXPECT_KASAN_FAIL to expect a KASAN error in KASAN tests
- Expected KASAN reports pass tests and are still printed when run without kunit_tool (kunit_tool still bypasses the report due to the test passing)
- KUnit struct in current task used to keep track of the current test from KASAN code

Make use of "[PATCH v3 kunit-next 1/2] kunit: generalize kunit_resource API beyond allocated resources" and "[PATCH v3 kunit-next 2/2] kunit: add support for named resources" from Alan Maguire [1]

- A named resource is added to a test when a KASAN report is expected
- This resource contains a struct for kasan_data containing booleans representing if a KASAN report is expected and if a KASAN report is found

[1] (https://lore.kernel.org/linux-kselftest/[email protected]/T/#t) Signed-off-by: Patricia Alfonso <[email protected]> Signed-off-by: David Gow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]> Acked-by: Brendan Higgins <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
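A sketch of how a converted test case uses the new macro (the test body here is illustrative of the pattern, not copied from the patch):

    static void kmalloc_oob_right(struct kunit *test)
    {
        char *ptr;
        size_t size = 123;

        ptr = kmalloc(size, GFP_KERNEL);
        KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);

        /* The expression must produce a KASAN report, or the test fails. */
        KUNIT_EXPECT_KASAN_FAIL(test, ptr[size] = 'x');

        kfree(ptr);
    }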
2020-10-13  kasan/kunit: add KUnit Struct to Current Task  (Patricia Alfonso; 1 file, -0/+4)
Patch series "KASAN-KUnit Integration", v14. This patchset contains everything needed to integrate KASAN and KUnit. KUnit will be able to: (1) Fail tests when an unexpected KASAN error occurs (2) Pass tests when an expected KASAN error occurs Convert KASAN tests to KUnit with the exception of copy_user_test because KUnit is unable to test those. Add documentation on how to run the KASAN tests with KUnit and what to expect when running these tests. This patch (of 5): In order to integrate debugging tools like KASAN into the KUnit framework, add KUnit struct to the current task to keep track of the current KUnit test. Signed-off-by: Patricia Alfonso <[email protected]> Signed-off-by: David Gow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Cc: Brendan Higgins <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Vincent Guittot <[email protected]> Cc: Shuah Khan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  docs/vm: fix 'mm_count' vs 'mm_users' counter confusion  (Alexander Gordeev; 1 file, -1/+1)
In the context of the anonymous address space lifespan description, the 'mm_users' reference counter is confused with 'mm_count'. I.e., a "zombie" mm gets released when 'mm_count' becomes zero, not 'mm_users'. Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Jonathan Corbet <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/vmalloc.c: fix the comment of find_vm_area  (Hui Su; 1 file, -2/+2)
Fix the comments of find_vm_area() and get_vm_area(). Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/20200927153034.GA199877@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/vmalloc.c: update the comment in __vmalloc_area_node()  (Hui Su; 1 file, -1/+1)
Since commit c67dc624757 ("mm/vmalloc: do not call kmemleak_free() on not yet accounted memory"), __vunmap() has been changed to __vfree(), so update the confusing comment. Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Roman Penyaev <[email protected]> Link: https://lkml.kernel.org/r/20200927155409.GA3315@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory-failure.c: remove unused macro `writeback'  (Alex Shi; 1 file, -2/+0)
Unlike the others, the macro 'writeback' is unused, so remove it to tame the gcc warning: mm/memory-failure.c:827: warning: macro "writeback" is not used [-Wunused-macros] Signed-off-by: Alex Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Naoya Horiguchi <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory-failure: do pgoff calculation before for_each_process()  (Xianting Tian; 1 file, -1/+2)
There is no need to calculate pgoff in each iteration of for_each_process(), so move the calculation to before the loop, which saves some CPU cycles. Signed-off-by: Xianting Tian <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
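A before/after sketch of the hoisting (loop body elided; the page_to_pgoff() call is an assumption about what the computation looks like):

    /* before: recomputed on every iteration */
    for_each_process(tsk) {
        pgoff_t pgoff = page_to_pgoff(page);
        /* use pgoff */
    }

    /* after: loop-invariant, computed once */
    pgoff_t pgoff = page_to_pgoff(page);
    for_each_process(tsk) {
        /* use pgoff */
    }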
2020-10-13  mm/dmapool.c: replace hard coded function name with __func__  (Andy Shevchenko; 1 file, -22/+18)
No need to hard-code the function name when __func__ can be used. While here, use proper printk specifiers for special types like dma_addr_t. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
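An illustrative instance of both changes (the message text is a placeholder; %pad is the printk specifier that takes a pointer to a dma_addr_t):

    /* before: hard-coded name, cast to print the dma address */
    dev_err(dev, "dma_pool_free mypool, bad dma %llx\n", (unsigned long long)dma);

    /* after */
    dev_err(dev, "%s %s, bad dma %pad\n", __func__, pool->name, &dma);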
2020-10-13  mm/dmapool.c: replace open-coded list_for_each_entry_safe()  (Andy Shevchenko; 1 file, -4/+2)
There is a place in the code where an open-coded version of list_for_each_entry_safe() is used. Replace it with the standard macro. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
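A sketch of the replacement (struct and member names as in mm/dmapool.c, hedged):

    struct dma_page *page, *tmp;

    /* before: open-coded traversal that deletes entries as it goes */
    while (!list_empty(&pool->page_list)) {
        page = list_entry(pool->page_list.next, struct dma_page, page_list);
        /* free 'page' and delete it from the list */
    }

    /* after: the standard safe-iteration macro */
    list_for_each_entry_safe(page, tmp, &pool->page_list, page_list) {
        /* free 'page' and delete it from the list */
    }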
2020-10-13  lib/test_hmm.c: remove unused dmirror_zero_page  (Ralph Campbell; 1 file, -14/+0)
The variable dmirror_zero_page is unused in the HMM self test driver which was probably intended to demonstrate how a driver could use migrate_vma_setup() to share a single read-only device private zero page similar to how the CPU does. However, this isn't needed for the self tests so remove it. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Jerome Glisse <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  tools/testing/selftests/vm/hmm-tests.c: use the new SKIP() macro  (Ralph Campbell; 1 file, -2/+2)
Some tests might not be able to be run if resources like huge pages are not available. Mark these tests as skipped instead of simply passing. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Shuah Khan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
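The kselftest harness pattern being adopted looks roughly like this (the condition and message are illustrative):

    if (hpage_size == 0)
        SKIP(return, "Huge page size could not be determined");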
2020-10-13  include/linux/huge_mm.h: remove mincore_huge_pmd declaration  (yuleixzhang; 1 file, -3/+0)
As mincore_huge_pmd() was dropped, remove the declaration from the header file. Signed-off-by: Yulei Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Zi Yan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: remove src/dst mm parameter in copy_page_range()  (Peter Xu; 3 files, -71/+76)
Both of the mm pointers are not needed after commit 7a4830c380f3 ("mm/fork: Pass new vma pointer into copy_page_range()"). Jason Gunthorpe also reported that the parameter ordering of copy_page_range() is odd. While working on it, reorder the parameters to be logical: (1) always put the dst_* parameters before the src_* parameters, and (2) keep parameters of the same type together. [[email protected]: further reorder some parameters and line format, per Jason] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix warnings] Link: https://lkml.kernel.org/r/20201006200138.GA6026@xz-x1 Reported-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
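A sketch of the resulting prototype under the rules above (exact formatting hedged):

    int copy_page_range(struct vm_area_struct *dst_vma,
                        struct vm_area_struct *src_vma);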
2020-10-13  mm/mmap.c: replace do_brk with do_brk_flags in comment of insert_vm_struct()  (Liao Pingfang; 1 file, -1/+1)
Replace do_brk with do_brk_flags in the comment of insert_vm_struct(), since do_brk was removed in the commit below. Fixes: bb177a732c4369 ("mm: do not bug_on on incorrect length in __mm_populate()") Signed-off-by: Liao Pingfang <[email protected]> Signed-off-by: Yi Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap.c: use helper function allow_write_access() in __remove_shared_vm_struct()  (Miaohe Lin; 1 file, -1/+1)
In commit 1da177e4c3f4 ("Linux-2.6.12-rc2"), the helper allow_write_access() was introduced along with the atomic_inc operation on the i_writecount field in __remove_shared_vm_struct(), but that function forgot to use the helper. Use it now. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: use helper function mapping_allow_writable()  (Miaohe Lin; 2 files, -2/+2)
Commit 4bb5f5d9395b ("mm: allow drivers to prevent new writable mappings") changed i_mmap_writable from unsigned int to atomic_t and added the helper function mapping_allow_writable() to atomic_inc i_mmap_writable, but forgot to use this helper in dup_mmap() and __vma_link_file(). Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Christian Kellner <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Adrian Reber <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Aleksa Sarai <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
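The substitution itself is one line per call site (a sketch; the helper wraps the atomic_inc as described above):

    /* before */
    atomic_inc(&mapping->i_mmap_writable);

    /* after */
    mapping_allow_writable(mapping);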
2020-10-13  mm/mmap: check on file instead of the rb_root_cached of its address_space  (Wei Yang; 1 file, -3/+3)
In __vma_adjust(), we check *root* to decide whether to adjust the address_space. It is more meaningful to do the check on *file* itself: we are adjusting this data because the vma is file-backed. Since we already assume the address_space is valid for a file-backed vma, let's just replace *root* with *file* here. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: not necessary to check mapping separately  (Wei Yang; 1 file, -2/+1)
*root*, of type struct rb_root_cached, is a member of *mapping*, of type struct address_space. This implies that when we have a valid *root*, it must be part of a valid *mapping*. So we can merge these two checks to make the code easier to read and save some CPU cycles. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory.c: fix spello of "function"  (Randy Dunlap; 1 file, -1/+1)
Fix typo/spello of "function". Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: leave adjust_next as virtual address instead of page frame number  (Wei Yang; 2 files, -6/+6)
Instead of converting adjust_next back and forth between bytes and a number of pages, let's just store the virtual address in adjust_next. Also, this patch fixes one typo in the comment of vma_adjust_trans_huge(). [[email protected]: changelog tweak] Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Mike Kravetz <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: simplify PageDoubleMap with PF_SECOND policy  (Matthew Wilcox (Oracle); 1 file, -30/+10)
Introduce the new page policy of PF_SECOND which lets us use the normal pageflags generation machinery to create the various DoubleMap manipulation functions. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Zi Yan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
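A hedged sketch of the new policy and its use; the idea is that the operation is redirected to the second page (the first tail page) of the compound page, where the DoubleMap bit lives. The exact macro bodies in the patch may differ:

    /* policy: require a head page, then operate on page[1] */
    #define PF_SECOND(page, enforce) ({                 \
        VM_BUG_ON_PGFLAGS(!PageHead(page), page);       \
        PF_POISONED_CHECK(&page[1]); })

    /* the normal generation machinery can then emit the accessors */
    PAGEFLAG(DoubleMap, double_map, PF_SECOND)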
2020-10-13  mm: move PageDoubleMap bit  (Matthew Wilcox (Oracle); 1 file, -1/+1)
Patch series "Fix PageDoubleMap". This is a purely theoretical problem for now as none of the filesystems which use PG_private_2 (ie PG_fscache) are being converted at this time, but it's confusing to leave it like this. This patch (of 2): PG_private_2 is defined as being PF_ANY (applicable to tail pages as well as regular & head pages). That means that the first tail page of a double-map page will appear to have Private2 set. Use the Workingset bit instead which is defined as PF_HEAD so any attempt to access the Workingset bit on a tail page will redirect to the head page's Workingset bit. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Zi Yan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: proc: smaps_rollup: do not stall write attempts on mmap_lock  (Chinwen Chang; 1 file, -1/+65)
smaps_rollup will try to grab mmap_lock and go through the whole vma list until it finishes iterating. When encountering large processes, the mmap_lock will be held for a longer time, which may block other write requests, like mmap and munmap, from progressing smoothly. There are upcoming mmap_lock optimizations like range-based locks, but the lock applied to smaps_rollup would be the coarse-grained type, which does not avoid the unpleasant contention. To solve the aforementioned issue, add a check which detects whether anyone wants to grab mmap_lock for a write attempt. Signed-off-by: Chinwen Chang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Steven Price <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Matthias Brugger <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Chinwen Chang <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Song Liu <[email protected]> Cc: Jimmy Assarsson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Daniel Kiss <[email protected]> Cc: Laurent Dufour <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
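A rough sketch of the detect-and-release step inside the vma walk (control flow simplified; names per this series):

    if (mmap_lock_is_contended(mm)) {
        mmap_read_unlock(mm);           /* let the writer in */
        if (mmap_read_lock_killable(mm))
            goto out;                   /* interrupted while re-acquiring */
        /* revalidate the walk and resume gathering where we left off */
    }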
2020-10-13  mm: smaps*: extend smap_gather_stats to support specified beginning  (Chinwen Chang; 1 file, -8/+22)
Extend smap_gather_stats to support a specified beginning address at which it should start gathering. To achieve this, add a new parameter @start assigned by the caller, and refactor for simplicity. If @start is 0, the range of @vma is used for gathering. Signed-off-by: Chinwen Chang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Steven Price <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: Daniel Kiss <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Huang Ying <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jimmy Assarsson <[email protected]> Cc: Laurent Dufour <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Matthias Brugger <[email protected]> Cc: Song Liu <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
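A sketch of the extended helper's contract (signature shape per the description above; body elided):

    static void smap_gather_stats(struct vm_area_struct *vma,
                                  struct mem_size_stats *mss,
                                  unsigned long start)
    {
        /* if start == 0, gather over the whole range of @vma;
         * otherwise begin gathering at @start */
    }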
2020-10-13  mmap locking API: add mmap_lock_is_contended()  (Chinwen Chang; 1 file, -0/+5)
Patch series "Try to release mmap_lock temporarily in smaps_rollup", v4. Recently, we have observed some janky issues caused by unpleasantly long contention on mmap_lock which is held by smaps_rollup when probing large processes. To address the problem, we let smaps_rollup detect if anyone wants to acquire mmap_lock for write attempts. If yes, just release the lock temporarily to ease the contention. smaps_rollup is a procfs interface which allows users to summarize the process's memory usage without the overhead of seq_* calls. Android uses it to sample the memory usage of various processes to balance its memory pool sizes. If no one wants to take the lock for write requests, smaps_rollup with this patch will behave like the original one. Although there are on-going mmap_lock optimizations like range-based locks, the lock applied to smaps_rollup would be the coarse one, which is hard to avoid the occurrence of aforementioned issues. So the detection and temporary release for write attempts on mmap_lock in smaps_rollup is still necessary. This patch (of 3): Add new API to query if someone wants to acquire mmap_lock for write attempts. Using this instead of rwsem_is_contended makes it more tolerant of future changes to the lock type. Signed-off-by: Chinwen Chang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Steven Price <[email protected]> Acked-by: Michel Lespinasse <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: Daniel Kiss <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Huang Ying <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jimmy Assarsson <[email protected]> Cc: Laurent Dufour <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Matthias Brugger <[email protected]> Cc: Song Liu <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: leverage vma_rb_erase_ignore() to implement vma_rb_erase()  (Wei Yang; 1 file, -9/+7)
These two functions share the same logic except ignore a different vma. Let's reuse the code. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: rename __vma_unlink_common() to __vma_unlink()  (Wei Yang; 1 file, -3/+3)
__vma_unlink_common() is the counterpart of __vma_link(), yet there is no function named __vma_unlink(). Let's rename __vma_unlink_common() to __vma_unlink() to make the code more self-explanatory and easier for the audience to understand; otherwise one might expect there to be several variants of vma_unlink(), with __vma_unlink_common() shared by them. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory.c: replace vmf->vma with variable vma  (Yanfei Xu; 1 file, -1/+1)
The code has declared a vm_area_struct named vma which is assigned the value of vmf->vma. Thus, use the variable vma directly here. Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory.c: fix typo in __do_fault() comment  (Yanfei Xu; 1 file, -1/+1)
It's "pte_alloc_one", not "pte_alloc_pne". Let's fix that. Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: account PMD tables like PTE tables  (Matthew Wilcox; 2 files, -4/+21)
We account the PTE level of the page tables to the process in order to make smarter OOM decisions and help diagnose why memory is fragmented. For these same reasons, we should account pages allocated for PMDs. With larger process address spaces and ASLR, the number of PMDs in use is higher than it used to be so the inaccuracy is starting to matter. [[email protected]: arm: __pmd_free_tlb(): call page table destructor] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Cc: Abdul Haleem <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Max Filippov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Satheesh Rajendran <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Naresh Kamboju <[email protected]> Cc: Anders Roxell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
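A hedged sketch of what accounting the PMD level like the PTE level looks like, mirroring the existing pgtable_pte_page_ctor() pattern (the stat name and helper placement follow that pattern and are not verified against the final patch):

    static inline bool pgtable_pmd_page_ctor(struct page *page)
    {
        if (!pmd_ptlock_init(page))
            return false;
        __SetPageTable(page);
        inc_zone_page_state(page, NR_PAGETABLE);    /* account like PTE tables */
        return true;
    }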
2020-10-13  selftests/vm: fix incorrect gcc invocation in some cases  (John Hubbard; 1 file, -0/+12)
Avoid accidental wrong builds, due to built-in rules working just a little bit too well--but not quite as well as required for our situation here. In other words, "make userfaultfd" (for example) is supposed to fail to build at all, because this Makefile only supports either "make" (all), or "make /full/path". However, the built-in rules, if not suppressed, will pick up CFLAGS and the initial LDLIBS (but not the target-specific LDLIBS, because those are only set for the full path target!). This causes it to get pretty far into building things despite using incorrect values such as an *occasionally* incomplete LDLIBS value. Signed-off-by: John Hubbard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Jason Gunthorpe <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>