path: root/mm
Age    Commit message    (Author, Files, Lines)
2020-10-13  mm/vmscan: fix comments for isolate_lru_page()  (Hui Su, 1 file, -1/+1)
fix comments for isolate_lru_page(): s/fundamentnal/fundamental Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/20200927173923.GA8058@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/vmscan: fix infinite loop in drop_slab_node  (Chunxin Zang, 1 file, -0/+3)
We have observed that drop_caches can take a considerable amount of time (<put data here>), especially when there are many memcgs involved, because they add additional overhead. It is quite unfortunate that the operation cannot currently be interrupted by a signal. Add a check for fatal signals into the main loop so that userspace can control early bailout.

There are two reasons the loop can run for so long:

1. We have too many memcgs: even if only one object is freed per memcg, the sum of freed objects is bigger than 10.

2. Traversing all memcgs once takes a long time by itself, so by the time one pass completes, the memcgs visited first have freed many objects again, and the freed count exceeds 10 on the next pass as well.

We can see the stuck writers through 'ps':

root:~# ps -aux | grep drop
root  357956 ... R Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches
root 1771385 ... R Aug16 21146421:17 echo 3 > /proc/sys/vm/drop_caches
root 1986319 ... R 18:56      117:27 echo 3 > /proc/sys/vm/drop_caches
root 2002148 ... R Aug24     5720:39 echo 3 > /proc/sys/vm/drop_caches
root 2564666 ... R 18:59      113:58 echo 3 > /proc/sys/vm/drop_caches
root 2639347 ... R Sep03     2383:39 echo 3 > /proc/sys/vm/drop_caches
root 3904747 ... R 03:35      993:31 echo 3 > /proc/sys/vm/drop_caches
root 4016780 ... R Aug21     7882:18 echo 3 > /proc/sys/vm/drop_caches

Use bpftrace to follow the 'freed' value in drop_slab_node:

root:~# bpftrace -e 'kprobe:drop_slab_node+70 {@ret=hist(reg("bp")); }'
Attaching 1 probe...
^B^C

@ret:
[64, 128)        1 |                                                    |
[128, 256)      28 |                                                    |
[256, 512)     107 |@                                                   |
[512, 1K)      298 |@@@                                                 |
[1K, 2K)       613 |@@@@@@@                                             |
[2K, 4K)      4435 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4K, 8K)       442 |@@@@@                                               |
[8K, 16K)      299 |@@@                                                 |
[16K, 32K)     100 |@                                                   |
[32K, 64K)     139 |@                                                   |
[64K, 128K)     56 |                                                    |
[128K, 256K)    26 |                                                    |
[256K, 512K)     2 |                                                    |

In the while loop, check whether a fatal (TASK_KILLABLE) signal is pending and, if so, break out of the loop.

Signed-off-by: Chunxin Zang <[email protected]> Signed-off-by: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Chris Down <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
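For reference, a sketch of what the bail-out looks like in drop_slab_node() (shaped after the mm/vmscan code of this era; treat it as illustrative, not the literal patch):

        static void drop_slab_node(int nid)
        {
                unsigned long freed;

                do {
                        struct mem_cgroup *memcg = NULL;

                        /* Let a fatally signalled drop_caches writer bail out
                         * instead of looping until "freed" drops below the
                         * threshold. */
                        if (fatal_signal_pending(current))
                                return;

                        freed = 0;
                        memcg = mem_cgroup_iter(NULL, NULL, NULL);
                        do {
                                freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
                        } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
                } while (freed > 10);
        }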
2020-10-13  hugetlb: add lockdep check for i_mmap_rwsem held in huge_pmd_share  (Mike Kravetz, 1 file, -4/+11)
As a debugging aid, huge_pmd_share should make sure i_mmap_rwsem is held if necessary. To clarify the 'if necessary', expand the comment block at the beginning of huge_pmd_share. No functional change. The added i_mmap_assert_locked() call is only enabled if CONFIG_LOCKDEP. Ideally, this should have been included with commit 34ae204f1851 ("hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsem"). Signed-off-by: Mike Kravetz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Davidlohr Bueso <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/hugetlb: take the free hpage during the iteration directly  (Wei Yang, 1 file, -13/+9)
Function dequeue_huge_page_node_exact() iterates the free list and returns the first valid free hpage. Instead of breaking out of the loop and then re-checking the loop variable, we can return from within the loop directly, which removes a redundant check. [[email protected]: points out a logic error] [[email protected]: v4] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Baoquan He <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
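The reshaped loop looks roughly like this (a sketch based on the hugetlb free-list fields of this era):

        static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
        {
                struct page *page;

                list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
                        if (PageHWPoison(page))
                                continue;

                        /* Found a usable hpage: take it and return right here,
                         * no break-then-revalidate dance needed. */
                        list_move(&page->lru, &h->hugepage_activelist);
                        set_page_refcounted(page);
                        h->free_huge_pages--;
                        h->free_huge_pages_node[nid]--;
                        return page;
                }

                return NULL;
        }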
2020-10-13  mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page  (Wei Yang, 1 file, -1/+1)
set_hugetlb_cgroup_[rsvd] just manipulates page-local data, which does not need to be protected by hugetlb_lock, so take it out of the locked section. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/hugetlb: a page from buddy is not on any list  (Wei Yang, 1 file, -1/+1)
A page allocated from the buddy allocator is not on any list, so just using list_add() is enough. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/hugetlb: count file_region to be added when regions_needed != NULL  (Wei Yang, 1 file, -16/+17)
There are only two cases for function add_reservation_in_range():

* count the file_regions needed and return the number in regions_needed
* do the real list operation without counting

This means it is not necessary to have two parameters to distinguish these cases; regions_needed alone can separate them. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/hugetlb: use list_splice to merge two list at once  (Wei Yang, 1 file, -5/+2)
Instead of adding the allocated file_regions one by one to region_cache, we can use list_splice() to merge the two lists at once. Since the number of entries in the list is known, we can also increase the count directly. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
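Schematically (list names are illustrative; list_splice() and list_for_each_entry_safe() are the standard <linux/list.h> helpers):

        /* Before: move the freshly allocated entries one by one. */
        list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
                list_del(&rg->link);
                list_add(&rg->link, &resv->region_cache);
                resv->region_cache_count++;
        }

        /* After: splice the whole batch in one step; the count of
         * allocated entries is already known. */
        list_splice(&allocated_regions, &resv->region_cache);
        resv->region_cache_count += to_allocate;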
2020-10-13  mm/hugetlb: remove VM_BUG_ON(!nrg) in get_file_region_entry_from_cache()  (Wei Yang, 1 file, -1/+0)
We are sure to get a valid file_region, otherwise the VM_BUG_ON(resv->region_cache_count <= 0) at the very beginning would be triggered. Let's remove the redundant one. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Baoquan He <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/hugetlb: not necessary to coalesce regions recursively  (Wei Yang, 1 file, -5/+1)
Patch series "mm/hugetlb: code refine and simplification", v4. Following are some cleanups for hugetlb. Simple testing with tools/testing/selftests/vm/map_hugetlb passes. This patch (of 7): Per my understanding, we keep the regions ordered and would always coalesce regions properly. So the task to keep this property is just to coalesce its neighbour. Let's simplify this. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/hugetlb.c: remove the unnecessary non_swap_entry()  (Baoquan He, 1 file, -2/+2)
If a swap entry tests positive for either is_[migration|hwpoison]_entry(), then its swap_type() is among SWP_MIGRATION_READ, SWP_MIGRATION_WRITE and SWP_HWPOISON. All of these types are >= MAX_SWAPFILES, which is exactly what non_swap_entry() asserts, so the non_swap_entry() check in is_hugetlb_entry_migration() and is_hugetlb_entry_hwpoisoned() is redundant. Let's remove it. Signed-off-by: Baoquan He <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
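The redundancy follows directly from the <linux/swapops.h> definitions (shown for the CONFIG_MIGRATION=y case):

        static inline int non_swap_entry(swp_entry_t entry)
        {
                return swp_type(entry) >= MAX_SWAPFILES;
        }

        static inline int is_migration_entry(swp_entry_t entry)
        {
                return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
                                swp_type(entry) == SWP_MIGRATION_WRITE);
        }

        /* SWP_MIGRATION_* and SWP_HWPOISON are all defined >= MAX_SWAPFILES,
         * so "non_swap_entry(e) && is_migration_entry(e)" is equivalent to
         * plain "is_migration_entry(e)". */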
2020-10-13  mm/hugetlb.c: make is_hugetlb_entry_hwpoisoned return bool  (Baoquan He, 1 file, -4/+4)
Patch series "mm/hugetlb: Small cleanup and improvement", v2. This patch (of 3): Just like its neighbour is_hugetlb_entry_migration() has done. Signed-off-by: Baoquan He <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc.c: fix freeing non-compound pages  (Matthew Wilcox (Oracle), 1 file, -0/+3)
Here is a very rare race which leaks memory:

Page P0 is allocated to the page cache.  Page P1 is free.

Thread A                Thread B                Thread C
find_get_entry():
xas_load() returns P0
                        Removes P0 from page cache
                        P0 finds its buddy P1
                                                alloc_pages(GFP_KERNEL, 1) returns P0
                                                P0 has refcount 1
page_cache_get_speculative(P0)
P0 has refcount 2
__free_pages(P0)
P0 has refcount 1
put_page(P0)
P1 is not freed

Fix this by freeing all the pages in __free_pages() that won't be freed by the call to put_page(). It's usually not a good idea to split a page, but this is a very unlikely scenario. Fixes: e286781d5f2e ("mm: speculative page references") Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Mike Rapoport <[email protected]> Cc: Nick Piggin <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
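The resulting __free_pages() is roughly (a sketch; free_the_page() is page_alloc.c's internal free helper):

        void __free_pages(struct page *page, unsigned int order)
        {
                if (put_page_testzero(page))
                        free_the_page(page, order);
                else if (!PageHead(page))
                        /* Someone else holds a speculative reference to the
                         * first subpage of this non-compound allocation; free
                         * the tail pages ourselves so the final put_page() of
                         * page 0 doesn't leak them. */
                        while (order-- > 0)
                                free_the_page(page + (1 << order), 1 << order);
        }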
2020-10-13  mm: move call to compound_head() in release_pages()  (Ralph Campbell, 1 file, -1/+1)
The function is_huge_zero_page() doesn't call compound_head() to make sure the page pointer is a head page. In release_pages(), is_huge_zero_page() is called before compound_head(), so the test would fail if release_pages() were called with a tail page of the huge_zero_page, and put_page_testzero() would then be called, releasing the page. This is unlikely to happen in normal use, or we would be seeing all sorts of process data corruption when accessing a THP zero page. Looking at the other callers of is_huge_zero_page(), all seem to pass only head pages, so the right solution is to move the compound_head() call in release_pages() to a point before is_huge_zero_page() is called. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Yu Zhao <[email protected]> Cc: Dan Williams <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Christoph Hellwig <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
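The reordering in the release_pages() loop body amounts to (a sketch, surrounding logic elided):

        page = compound_head(page);     /* was previously derived after the test below */
        if (is_huge_zero_page(page))
                continue;               /* the huge zero page has a static refcount */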
2020-10-13  mmzone: clean code by removing unused macro parameter  (Mateusz Nosek, 1 file, -2/+2)
Previously the 'for_next_zone_zonelist_nodemask' macro parameter 'zlist' was unused, so this patch removes it. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc.c: __perform_reclaim should return 'unsigned long'  (Yanfei Xu, 1 file, -3/+2)
__perform_reclaim()'s single caller expects it to return 'unsigned long', hence change its return value and a local variable to 'unsigned long'. Suggested-by: Andrew Morton <[email protected]> Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc.c: clean code by merging two functions  (Mateusz Nosek, 1 file, -8/+2)
finalise_ac() is just an 'epilogue' for prepare_alloc_pages(), so there is no need to keep them separate; the contents of finalise_ac() can be merged into prepare_alloc_pages(). This makes __alloc_pages_nodemask() more readable. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc.c: fix early params garbage value accesses  (Mateusz Nosek, 1 file, -6/+6)
Previously, in '__init early_init_on_alloc' and '__init early_init_on_free', the return values from 'kstrtobool' were not handled properly, so a garbage value could be read from the variable 'bool_result'. This patch fixes the error handling. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
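The fixed handler looks roughly like this (a sketch of the upstream shape; early_init_on_free is symmetric):

        static int __init early_init_on_alloc(char *buf)
        {
                int ret;
                bool bool_result;

                ret = kstrtobool(buf, &bool_result);
                if (ret)
                        return ret;     /* never consult bool_result on parse failure */
                if (bool_result && page_poisoning_enabled())
                        pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, will take precedence over init_on_alloc\n");
                if (bool_result)
                        static_branch_enable(&init_on_alloc);
                else
                        static_branch_disable(&init_on_alloc);
                return 0;
        }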
2020-10-13  mm/page_alloc.c: micro-optimization remove unnecessary branch  (Mateusz Nosek, 1 file, -5/+3)
Previously the flags check was split into two separate checks with two separate branches. Since the presence of either flag has the same effect on the control flow, the checks can be merged and one branch avoided. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
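The pattern in the abstract (FLAG_A/FLAG_B are placeholders, not the actual page allocator flags):

        /* Before: two tests, two branches, identical effect. */
        if (flags & FLAG_A)
                goto out;
        if (flags & FLAG_B)
                goto out;

        /* After: one test, one branch. */
        if (flags & (FLAG_A | FLAG_B))
                goto out;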
2020-10-13  mm/page_alloc.c: clean code by removing unnecessary initialization  (Mateusz Nosek, 1 file, -3/+1)
Previously the variable 'tmp' was initialized but never read before being reassigned, so the initialization can be removed. [[email protected]: remove `tmp' altogether] Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm, isolation: avoid checking unmovable pages across pageblock boundary  (Li Xinhai, 1 file, -1/+2)
In has_unmovable_pages(), the page parameter is not always the first page within a pageblock (see how the page pointer is passed in from start_isolate_page_range() after the call to __first_valid_page()), so the unmovable-page check could span two pageblocks. After this patch, the check is confined to a single pageblock whether or not the page is the first one in it, obeying the semantics of this function. This issue was found by code inspection. Michal said "this might lead to false negatives when an unrelated block would cause an isolation failure". Signed-off-by: Li Xinhai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: David Hildenbrand <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_isolation: cleanup set_migratetype_isolate()  (David Hildenbrand, 1 file, -10/+7)
Let's clean it up a bit, simplifying the exit paths. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_isolation: drop WARN_ON_ONCE() in set_migratetype_isolate()  (David Hildenbrand, 1 file, -9/+6)
Inside has_unmovable_pages(), we have a comment describing how unmovable data could end up in ZONE_MOVABLE - via "movablecore". Also, besides checking if the first page in the pageblock is reserved, we don't perform any further checks in case of ZONE_MOVABLE. In case of memory offlining, we set REPORT_FAILURE, properly dump_page() the page and handle the error gracefully. alloc_contig_pages() users currently never allocate from ZONE_MOVABLE. E.g., hugetlb uses alloc_contig_pages() for the allocation of gigantic pages only, which will never end up on the MOVABLE zone (see htlb_alloc_mask()). Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Pankaj Gupta <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_isolation: exit early when pageblock is isolated in set_migratetype_isolate()  (David Hildenbrand, 1 file, -4/+5)
Right now, if we have two isolations racing on a pageblock that's in the MOVABLE zone, we would trigger the WARN_ON_ONCE(). Let's just return directly, simplifying error handling. The change was introduced in commit 3d680bdf60a5 ("mm/page_isolation: fix potential warning from user"). As far as I can see, we currently don't have alloc_contig_range() users that use the ZONE_MOVABLE (anymore), so it's currently more a cleanup and a preparation for the future than a fix. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Acked-by: Mike Kravetz <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Qian Cai <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_alloc: tweak comments in has_unmovable_pages()  (David Hildenbrand, 1 file, -16/+6)
Patch series "mm / virtio-mem: support ZONE_MOVABLE", v5. When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather unclear, which is why we special-cased ZONE_MOVABLE such that partially plugged blocks would never end up in ZONE_MOVABLE. Now that the semantics are much clearer (and are documented in patch #6), let's support partially plugged memory blocks in ZONE_MOVABLE, allowing partially plugged memory blocks to be online to ZONE_MOVABLE and also unplugging from such memory blocks. This avoids surprises when onlining of memory blocks suddenly fails, just because they are not completely populated by virtio-mem (yet). This is especially helpful for testing, but also paves the way for virtio-mem optimizations, allowing more memory to get reliably unplugged. Cleanup has_unmovable_pages() and set_migratetype_isolate(), providing better documentation of how ZONE_MOVABLE interacts with different kind of unmovable pages (memory offlining vs. alloc_contig_range()). This patch (of 6): Let's move the split comment regarding bootmem allocations and memory holes, especially in the context of ZONE_MOVABLE, to the PageReserved() check. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Pankaj Gupta <[email protected]> Cc: Jason Wang <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: kasan: do not panic if both panic_on_warn and kasan_multishot set  (David Gow, 1 file, -1/+1)
KASAN errors will currently trigger a panic when panic_on_warn is set. This renders kasan_multishot useless, as further KASAN errors won't be reported if the kernel has already panicked. By making kasan_multishot disable this behaviour for KASAN errors, we can still have the benefits of panic_on_warn for non-KASAN warnings, yet be able to use kasan_multishot. This is particularly important when running KASAN tests, which need to trigger multiple KASAN errors: previously these would panic the system if panic_on_warn was set; now they can run (and will panic the system should non-KASAN warnings show up). Signed-off-by: David Gow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Patricia Alfonso <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
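The guard boils down to one condition in KASAN's end-of-report path (a sketch; kasan_flags/KASAN_BIT_MULTI_SHOT follow the mm/kasan/report.c names of this era):

        /* Only honour panic_on_warn when multi-shot reporting is off, so
         * KASAN test runs can keep producing reports. */
        if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags)) {
                panic_on_warn = 0;      /* avoid recursive panics via WARN() */
                panic("panic_on_warn set ...\n");
        }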
2020-10-13  KUnit: KASAN Integration  (Patricia Alfonso, 1 file, -0/+32)
Integrate KASAN into the KUnit testing framework.

- Fail tests when KASAN reports an error that is not expected
- Use KUNIT_EXPECT_KASAN_FAIL to expect a KASAN error in KASAN tests
- Expected KASAN reports pass tests and are still printed when run without kunit_tool (kunit_tool still bypasses the report due to the test passing)
- KUnit struct in current task used to keep track of the current test from KASAN code

Make use of "[PATCH v3 kunit-next 1/2] kunit: generalize kunit_resource API beyond allocated resources" and "[PATCH v3 kunit-next 2/2] kunit: add support for named resources" from Alan Maguire [1]

- A named resource is added to a test when a KASAN report is expected
- This resource contains a struct for kasan_data containing booleans representing whether a KASAN report is expected and whether a KASAN report is found

[1] (https://lore.kernel.org/linux-kselftest/[email protected]/T/#t)

Signed-off-by: Patricia Alfonso <[email protected]> Signed-off-by: David Gow <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]> Acked-by: Brendan Higgins <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vincent Guittot <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/vmalloc.c: fix the comment of find_vm_area  (Hui Su, 1 file, -2/+2)
Fix the comment of find_vm_area() and get_vm_area() Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/20200927153034.GA199877@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/vmalloc.c: update the comment in __vmalloc_area_node()  (Hui Su, 1 file, -1/+1)
Since commit c67dc624757 ("mm/vmalloc: do not call kmemleak_free() on not yet accounted memory"), __vunmap() has been changed to __vfree(), so update the confusing comment. Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Roman Penyaev <[email protected]> Link: https://lkml.kernel.org/r/20200927155409.GA3315@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory-failure.c: remove unused macro `writeback'  (Alex Shi, 1 file, -2/+0)
Unlike the others, we don't use the macro 'writeback', so let's remove it to silence the gcc warning: mm/memory-failure.c:827: warning: macro "writeback" is not used [-Wunused-macros] Signed-off-by: Alex Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Naoya Horiguchi <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory-failure: do pgoff calculation before for_each_process()  (Xianting Tian, 1 file, -1/+2)
There is no need to calculate pgoff on each iteration of for_each_process(), so move the calculation to before for_each_process(), which saves some CPU cycles. Signed-off-by: Xianting Tian <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
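A classic loop-invariant hoist, schematically:

        /* Before: recomputed for every task. */
        for_each_process(tsk) {
                pgoff_t pgoff = page_to_pgoff(page);
                /* ... */
        }

        /* After: computed once; the value cannot change across iterations. */
        pgoff = page_to_pgoff(page);
        for_each_process(tsk) {
                /* ... */
        }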
2020-10-13  mm/dmapool.c: replace hard coded function name with __func__  (Andy Shevchenko, 1 file, -22/+18)
No need to hard code function name when __func__ can be used. While here, replace specifiers for special types like dma_addr_t. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
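The flavour of the change (a sketch; the exact messages differ across dmapool's call sites):

        /* Before: function name hard-coded, dma_addr_t formatted by hand. */
        dev_err(pool->dev, "dma_pool_free %s, %p (bad vaddr)/%llx\n",
                pool->name, vaddr, (unsigned long long)dma);

        /* After: __func__ survives renames; %pad prints a dma_addr_t
         * portably (it takes a pointer to the value). */
        dev_err(pool->dev, "%s %s, %p (bad vaddr)/%pad\n",
                __func__, pool->name, vaddr, &dma);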
2020-10-13  mm/dmapool.c: replace open-coded list_for_each_entry_safe()  (Andy Shevchenko, 1 file, -4/+2)
There is a place in the code where an open-coded version of list_for_each_entry_safe() is used. Replace it with the standard macro. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
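For illustration, the open-coded form versus the macro (teardown details elided):

        /* Open-coded: repeatedly peel the first entry off the list. */
        while (!list_empty(&pool->page_list)) {
                struct dma_page *page = list_entry(pool->page_list.next,
                                                   struct dma_page, page_list);
                list_del(&page->page_list);
                kfree(page);
        }

        /* Standard macro: the "safe" variant keeps a lookahead pointer so
         * the current entry can be freed while iterating. */
        struct dma_page *page, *tmp;

        list_for_each_entry_safe(page, tmp, &pool->page_list, page_list) {
                list_del(&page->page_list);
                kfree(page);
        }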
2020-10-13  mm: remove src/dst mm parameter in copy_page_range()  (Peter Xu, 1 file, -68/+73)
Both of the mm pointers are not needed after commit 7a4830c380f3 ("mm/fork: Pass new vma pointer into copy_page_range()"). Jason Gunthorpe also reported that the parameter ordering of copy_page_range() is odd. While at it, reorder the parameters to be logical, by (1) always putting the dst_* fields before the src_* fields, and (2) keeping parameters of the same type together. [[email protected]: further reorder some parameters and line format, per Jason] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix warnings] Link: https://lkml.kernel.org/r/20201006200138.GA6026@xz-x1 Reported-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap.c: replace do_brk with do_brk_flags in comment of insert_vm_struct()  (Liao Pingfang, 1 file, -1/+1)
Replace do_brk with do_brk_flags in the comment of insert_vm_struct(), since do_brk was removed in the following commit. Fixes: bb177a732c4369 ("mm: do not bug_on on incorrect length in __mm_populate()") Signed-off-by: Liao Pingfang <[email protected]> Signed-off-by: Yi Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap.c: use helper function allow_write_access() in __remove_shared_vm_struct()  (Miaohe Lin, 1 file, -1/+1)
In commit 1da177e4c3f4 ("Linux-2.6.12-rc2"), the helper allow_write_access() arrived together with the atomic_inc operation on the i_writecount field in __remove_shared_vm_struct(), but that function forgot to use the helper. Use it now. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: use helper function mapping_allow_writable()  (Miaohe Lin, 1 file, -1/+1)
Commit 4bb5f5d9395b ("mm: allow drivers to prevent new writable mappings") changed i_mmap_writable from unsigned int to atomic_t and added the helper function mapping_allow_writable() to atomic_inc i_mmap_writable, but forgot to use this helper in dup_mmap() and __vma_link_file(). Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Christian Kellner <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Adrian Reber <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Aleksa Sarai <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
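The helper is a one-liner in <linux/fs.h>, and the change is just preferring it over the raw atomic op (dup_mmap() hunk sketched):

        static inline void mapping_allow_writable(struct address_space *mapping)
        {
                atomic_inc(&mapping->i_mmap_writable);
        }

        /* dup_mmap(): */
        if (tmp->vm_flags & VM_SHARED)
                mapping_allow_writable(mapping);        /* was: atomic_inc(&mapping->i_mmap_writable); */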
2020-10-13  mm/mmap: check on file instead of the rb_root_cached of its address_space  (Wei Yang, 1 file, -3/+3)
In __vma_adjust(), we check *root* to decide whether to adjust the address_space. It is more meaningful to check *file* itself: we are adjusting this data because the vma is file-backed. Since we already assume the address_space is valid for a file-backed vma, let's just replace *root* with *file* here. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: not necessary to check mapping separately  (Wei Yang, 1 file, -2/+1)
*root*, of type struct rb_root_cached, is an element of *mapping*, of type struct address_space. This implies that when we have a valid *root*, it must be part of a valid *mapping*. So we can merge these two checks into one to make the code easier to read and to save some cpu cycles. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory.c: fix spello of "function"  (Randy Dunlap, 1 file, -1/+1)
Fix typo/spello of "function". Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: leave adjust_next as virtual address instead of page frame number  (Wei Yang, 2 files, -6/+6)
Instead of converting adjust_next between bytes and a page count, let's just store the virtual address in adjust_next. Also, this patch fixes one typo in the comment of vma_adjust_trans_huge(). [[email protected]: changelog tweak] Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Mike Kravetz <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: leverage vma_rb_erase_ignore() to implement vma_rb_erase()  (Wei Yang, 1 file, -9/+7)
These two functions share the same logic except that each ignores a different vma. Let's reuse the code. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/mmap: rename __vma_unlink_common() to __vma_unlink()  (Wei Yang, 1 file, -3/+3)
__vma_unlink_common() and __vma_unlink() are counterparts. Since there is no function named __vma_unlink(), let's rename __vma_unlink_common() to __vma_unlink() to make the code more self-explanatory and easier for readers to understand. Otherwise one might expect several variants of vma_unlink(), with __vma_unlink_common() used by all of them. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory.c: replace vmf->vma with variable vma  (Yanfei Xu, 1 file, -1/+1)
The code declares a local variable vma which is assigned the value of vmf->vma, so use the variable vma directly here. Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memory.c: fix typo in __do_fault() comment  (Yanfei Xu, 1 file, -1/+1)
It's "pte_alloc_one", not "pte_alloc_pne". Let's fix that. Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/memcg: fix device private memcg accounting  (Ralph Campbell, 1 file, -1/+4)
The code in mc_handle_swap_pte() checks for non_swap_entry() and returns NULL before checking is_device_private_entry(), so device private pages are never handled. Fix this by checking for non_swap_entry() after handling device private swap PTEs. I assume the memory cgroup accounting would be off somehow when moving a process to another memory cgroup. Currently, the device private page is charged like a normal anonymous page when allocated and is uncharged when the page is freed, so I think that path is OK. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Ira Weiny <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Fixes: c733a82874a7 ("mm/memcontrol: support MEMORY_DEVICE_PRIVATE") Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: memcg/slab: uncharge during kmem_cache_free_bulk()  (Bharata B Rao, 3 files, -17/+30)
Object cgroup charging is done for all objects during allocation, but during freeing, uncharging ends up happening for only one object in the case of bulk allocation/freeing. Fix this by having a separate call to uncharge all the objects from kmem_cache_free_bulk() and by modifying memcg_slab_free_hook() to take care of bulk uncharging. Fixes: 964d4bd370d5 ("mm: memcg/slab: save obj_cgroup for non-root slab objects") Signed-off-by: Bharata B Rao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: memcontrol: reword obsolete comment of mem_cgroup_unmark_under_oom()  (Miaohe Lin, 1 file, -2/+2)
Since commit 79dfdaccd1d5 ("memcg: make oom_lock 0 and 1 based rather than counter"), mem_cgroup_unmark_under_oom() was added and the comment of mem_cgroup_oom_unlock() was moved here. But this comment makes no sense here because mem_cgroup_oom_lock() does not operate on the under_oom field. So reword the comment; this would be helpful. [Thanks Michal Hocko for rewording this comment.] Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm/page_counter: correct the obsolete func name in the comment of page_counter_try_charge()  (Miaohe Lin, 1 file, -1/+1)
Since commit bbec2e15170a ("mm: rename page_counter's count/limit into usage/max"), page_counter_limit() is renamed to page_counter_set_max(). So replace page_counter_limit with page_counter_set_max in the comment. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Johannes Weiner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13  mm: memcontrol: add the missing numa_stat interface for cgroup v2  (Muchun Song, 1 file, -60/+110)
In cgroup v1, we have a numa_stat interface. This is useful for providing visibility into the numa locality information within a memcg, since the pages are allowed to be allocated from any physical node. One of the use cases is evaluating application performance by combining this information with the application's CPU allocation. But cgroup v2 does not have such an interface, so this patch adds the missing information. Suggested-by: Shakeel Butt <[email protected]> Signed-off-by: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Cc: Zefan Li <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Randy Dunlap <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>