Age | Commit message | Author | Files | Lines
2020-10-13 | mm/vmalloc.c: fix the comment of find_vm_area | Hui Su | 1 | -2/+2
Fix the comment of find_vm_area() and get_vm_area(). Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/20200927153034.GA199877@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/vmalloc.c: update the comment in __vmalloc_area_node() | Hui Su | 1 | -1/+1
Since commit c67dc624757 ("mm/vmalloc: do not call kmemleak_free() on not yet accounted memory"), __vunmap() has been changed to __vfree(), so update the now-confusing comment. Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Roman Penyaev <[email protected]> Link: https://lkml.kernel.org/r/20200927155409.GA3315@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memory-failure.c: remove unused macro `writeback' | Alex Shi | 1 | -2/+0
Unlike the others, we don't use the macro `writeback', so let's remove it to silence the gcc warning: mm/memory-failure.c:827: warning: macro "writeback" is not used [-Wunused-macros] Signed-off-by: Alex Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Naoya Horiguchi <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memory-failure: do pgoff calculation before for_each_process() | Xianting Tian | 1 | -1/+2
There is no need to calculate pgoff in each loop of for_each_process(), so move it to the place before for_each_process(), which can save some CPU cycles. Signed-off-by: Xianting Tian <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
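For illustration, the shape of the change is roughly the following sketch (the page_to_pgoff() helper and variable names are my assumption, not quoted from the diff):

    pgoff_t pgoff;

    /* Loop-invariant: compute the page offset once, before the loop. */
    pgoff = page_to_pgoff(p);
    for_each_process(tsk) {
        /* ... use the precomputed pgoff for every task ... */
    }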
2020-10-13 | mm/dmapool.c: replace hard coded function name with __func__ | Andy Shevchenko | 1 | -22/+18
No need to hard code function name when __func__ can be used. While here, replace specifiers for special types like dma_addr_t. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
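A minimal sketch of the pattern (the surrounding dev/pool/vaddr/dma identifiers are illustrative, not the exact hunk); %pad is the printk specifier that takes a pointer to a dma_addr_t:

    /* Before: function name spelled out by hand */
    dev_err(dev, "dma_pool_free %s, %p (bad vaddr)\n", pool->name, vaddr);

    /* After: __func__ supplies the name automatically */
    dev_err(dev, "%s %s, %p (bad vaddr)\n", __func__, pool->name, vaddr);

    /* Special types get their own specifiers, e.g. %pad for dma_addr_t */
    dev_err(dev, "%s %s, %pad\n", __func__, pool->name, &dma);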
2020-10-13 | mm/dmapool.c: replace open-coded list_for_each_entry_safe() | Andy Shevchenko | 1 | -4/+2
There is a place in the code where open-coded version of list_for_each_entry_safe() is used. Replace that with the standard macro. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
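For reference, a sketch of how the standard macro reads in this context (the struct and member names reflect my reading of mm/dmapool.c, so treat them as assumptions):

    struct dma_page *page, *tmp;

    /* Safe traversal: 'page' may be freed inside the loop body. */
    list_for_each_entry_safe(page, tmp, &pool->page_list, page_list) {
        /* ... unlink and free 'page' here ... */
    }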
2020-10-13 | lib/test_hmm.c: remove unused dmirror_zero_page | Ralph Campbell | 1 | -14/+0
The variable dmirror_zero_page is unused in the HMM self test driver which was probably intended to demonstrate how a driver could use migrate_vma_setup() to share a single read-only device private zero page similar to how the CPU does. However, this isn't needed for the self tests so remove it. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Jerome Glisse <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | tools/testing/selftests/vm/hmm-tests.c: use the new SKIP() macro | Ralph Campbell | 1 | -2/+2
Some tests might not be able to be run if resources like huge pages are not available. Mark these tests as skipped instead of simply passing. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Shuah Khan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
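A hedged sketch of the idea (the condition and message are hypothetical; SKIP() here is the kselftest harness macro that takes a statement plus a format string):

    /* Inside a TEST_F(): skip, rather than pass, when the resource is absent. */
    if (!hugepages_available)
        SKIP(return, "huge pages not available, skipping test");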
2020-10-13 | include/linux/huge_mm.h: remove mincore_huge_pmd declaration | yuleixzhang | 1 | -3/+0
As mincore_huge_pmd() was dropped, remove the declaration from the header file. Signed-off-by: Yulei Zhang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Zi Yan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: remove src/dst mm parameter in copy_page_range() | Peter Xu | 3 | -71/+76
Neither of the mm pointers is needed after commit 7a4830c380f3 ("mm/fork: Pass new vma pointer into copy_page_range()"). Jason Gunthorpe also reported that the ordering of copy_page_range() is odd. While at it, reorder the parameters to be logical: (1) always put the dst_* parameters before the src_* ones, and (2) keep parameters of the same type together. [[email protected]: further reorder some parameters and line format, per Jason] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix warnings] Link: https://lkml.kernel.org/r/20201006200138.GA6026@xz-x1 Reported-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Peter Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
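Under the described reordering, the prototype presumably ends up looking roughly like this (a sketch, not a verbatim quote of the patch):

    /* dst_* before src_*; the mm pointers are derived from the vmas. */
    int copy_page_range(struct vm_area_struct *dst_vma,
                        struct vm_area_struct *src_vma);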
2020-10-13 | mm/mmap.c: replace do_brk with do_brk_flags in comment of insert_vm_struct() | Liao Pingfang | 1 | -1/+1
Replace do_brk with do_brk_flags in the comment of insert_vm_struct(), since do_brk was removed in the following commit. Fixes: bb177a732c4369 ("mm: do not bug_on on incorrect length in __mm_populate()") Signed-off-by: Liao Pingfang <[email protected]> Signed-off-by: Yi Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/mmap.c: use helper function allow_write_access() in __remove_shared_vm_struct() | Miaohe Lin | 1 | -1/+1
Commit 1da177e4c3f4 ("Linux-2.6.12-rc2") introduced the helper allow_write_access() together with the atomic_inc operation on the i_writecount field in __remove_shared_vm_struct(), but that function never used the helper. Use it now. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
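For context, the fs.h helper in question and its intended use look roughly like this (quoted from memory, so treat it as a sketch):

    static inline void allow_write_access(struct file *file)
    {
        if (file)
            atomic_inc(&file_inode(file)->i_writecount);
    }

    /* __remove_shared_vm_struct() can then say ... */
    allow_write_access(file);   /* ... instead of open-coding the atomic_inc() */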
2020-10-13 | mm: use helper function mapping_allow_writable() | Miaohe Lin | 2 | -2/+2
Commit 4bb5f5d9395b ("mm: allow drivers to prevent new writable mappings") changed i_mmap_writable from unsigned int to atomic_t and added the helper function mapping_allow_writable() to atomically increment i_mmap_writable. But it forgot to use this helper function in dup_mmap() and __vma_link_file(). Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Christian Kellner <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Adrian Reber <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Aleksa Sarai <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
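A minimal sketch of the substitution in __vma_link_file() (assuming the helper is the thin wrapper around atomic_inc() described above):

    if (vma->vm_flags & VM_SHARED)
        mapping_allow_writable(mapping);    /* was: atomic_inc(&mapping->i_mmap_writable) */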
2020-10-13 | mm/mmap: check on file instead of the rb_root_cached of its address_space | Wei Yang | 1 | -3/+3
In __vma_adjust(), we do the check on *root* to decide whether to adjust the address_space. It seems to be more meaningful to do the check on *file* itself. This means we are adjusting some data because it is a file backed vma. Since we seem to assume the address_space is valid if it is a file backed vma, let's just replace *root* with *file* here. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/mmap: not necessary to check mapping separately | Wei Yang | 1 | -2/+1
*root*, of type struct rb_root_cached, is a member of *mapping*, of type struct address_space. This implies that when we have a valid *root* it must be part of a valid *mapping*. So we can merge these two checks together to make the code easier to read and to save some CPU cycles. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memory.c: fix spello of "function" | Randy Dunlap | 1 | -1/+1
Fix typo/spello of "function". Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/mmap: leave adjust_next as virtual address instead of page frame number | Wei Yang | 2 | -6/+6
Instead of converting adjust_next between bytes and pages number, let's just store the virtual address into adjust_next. Also, this patch fixes one typo in the comment of vma_adjust_trans_huge(). [[email protected]: changelog tweak] Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Mike Kravetz <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: simplify PageDoubleMap with PF_SECOND policy | Matthew Wilcox (Oracle) | 1 | -30/+10
Introduce the new page policy of PF_SECOND which lets us use the normal pageflags generation machinery to create the various DoubleMap manipulation functions. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Zi Yan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: move PageDoubleMap bit | Matthew Wilcox (Oracle) | 1 | -1/+1
Patch series "Fix PageDoubleMap". This is a purely theoretical problem for now as none of the filesystems which use PG_private_2 (ie PG_fscache) are being converted at this time, but it's confusing to leave it like this. This patch (of 2): PG_private_2 is defined as being PF_ANY (applicable to tail pages as well as regular & head pages). That means that the first tail page of a double-map page will appear to have Private2 set. Use the Workingset bit instead which is defined as PF_HEAD so any attempt to access the Workingset bit on a tail page will redirect to the head page's Workingset bit. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Zi Yan <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: proc: smaps_rollup: do not stall write attempts on mmap_lock | Chinwen Chang | 1 | -1/+65
smaps_rollup will try to grab mmap_lock and go through the whole vma list until it finishes the iterating. When encountering large processes, the mmap_lock will be held for a longer time, which may block other write requests like mmap and munmap from progressing smoothly. There are upcoming mmap_lock optimizations like range-based locks, but the lock applied to smaps_rollup would be the coarse type, which doesn't avoid the occurrence of unpleasant contention. To solve aforementioned issue, we add a check which detects whether anyone wants to grab mmap_lock for write attempts. Signed-off-by: Chinwen Chang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Steven Price <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Matthias Brugger <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Chinwen Chang <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Song Liu <[email protected]> Cc: Jimmy Assarsson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Daniel Kiss <[email protected]> Cc: Laurent Dufour <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
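The resulting loop presumably follows a shape like the sketch below (identifiers such as hold_start, ret and out_put_mm are illustrative; resuming from the interrupted address relies on the smap_gather_stats() extension from the next patch):

    if (mmap_lock_is_contended(mm)) {
        unsigned long hold_start = vma->vm_start;   /* remember where we stopped */

        mmap_read_unlock(mm);                       /* let the writer make progress */
        if (mmap_read_lock_killable(mm)) {
            ret = -EINTR;
            goto out_put_mm;
        }
        /* re-find the vma and resume gathering from hold_start */
    }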
2020-10-13 | mm: smaps*: extend smap_gather_stats to support specified beginning | Chinwen Chang | 1 | -8/+22
Extend smap_gather_stats to support indicated beginning address at which it should start gathering. To achieve the goal, we add a new parameter @start assigned by the caller and try to refactor it for simplicity. If @start is 0, it will use the range of @vma for gathering. Signed-off-by: Chinwen Chang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Steven Price <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: Daniel Kiss <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Huang Ying <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jimmy Assarsson <[email protected]> Cc: Laurent Dufour <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Matthias Brugger <[email protected]> Cc: Song Liu <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mmap locking API: add mmap_lock_is_contended() | Chinwen Chang | 1 | -0/+5
Patch series "Try to release mmap_lock temporarily in smaps_rollup", v4. Recently, we have observed some janky issues caused by unpleasantly long contention on mmap_lock which is held by smaps_rollup when probing large processes. To address the problem, we let smaps_rollup detect if anyone wants to acquire mmap_lock for write attempts. If yes, just release the lock temporarily to ease the contention. smaps_rollup is a procfs interface which allows users to summarize the process's memory usage without the overhead of seq_* calls. Android uses it to sample the memory usage of various processes to balance its memory pool sizes. If no one wants to take the lock for write requests, smaps_rollup with this patch will behave like the original one. Although there are on-going mmap_lock optimizations like range-based locks, the lock applied to smaps_rollup would be the coarse one, which is hard to avoid the occurrence of aforementioned issues. So the detection and temporary release for write attempts on mmap_lock in smaps_rollup is still necessary. This patch (of 3): Add new API to query if someone wants to acquire mmap_lock for write attempts. Using this instead of rwsem_is_contended makes it more tolerant of future changes to the lock type. Signed-off-by: Chinwen Chang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Steven Price <[email protected]> Acked-by: Michel Lespinasse <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: Daniel Kiss <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Huang Ying <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jimmy Assarsson <[email protected]> Cc: Laurent Dufour <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Matthias Brugger <[email protected]> Cc: Song Liu <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/mmap: leverage vma_rb_erase_ignore() to implement vma_rb_erase() | Wei Yang | 1 | -9/+7
These two functions share the same logic except ignore a different vma. Let's reuse the code. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/mmap: rename __vma_unlink_common() to __vma_unlink() | Wei Yang | 1 | -3/+3
__vma_unlink_common() and __vma_unlink() are counterparts. Since there is no function named __vma_unlink(), let's rename __vma_unlink_common() to __vma_unlink() to make the code more self-explanatory and easy for audience to understand. Otherwise we may expect there are several variants of vma_unlink() and __vma_unlink_common() is used by them. Signed-off-by: Wei Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memory.c: replace vmf->vma with variable vma | Yanfei Xu | 1 | -1/+1
The code has already declared a local variable vma which is assigned the value of vmf->vma. Thus, use the variable vma directly here. Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memory.c: fix typo in __do_fault() comment | Yanfei Xu | 1 | -1/+1
It's "pte_alloc_one", not "pte_alloc_pne". Let's fix that. Signed-off-by: Yanfei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: account PMD tables like PTE tables | Matthew Wilcox | 2 | -4/+21
We account the PTE level of the page tables to the process in order to make smarter OOM decisions and help diagnose why memory is fragmented. For these same reasons, we should account pages allocated for PMDs. With larger process address spaces and ASLR, the number of PMDs in use is higher than it used to be so the inaccuracy is starting to matter. [[email protected]: arm: __pmd_free_tlb(): call page table destructor] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Cc: Abdul Haleem <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Max Filippov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Satheesh Rajendran <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Naresh Kamboju <[email protected]> Cc: Anders Roxell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | selftests/vm: fix incorrect gcc invocation in some cases | John Hubbard | 1 | -0/+12
Avoid accidental wrong builds, due to built-in rules working just a little bit too well--but not quite as well as required for our situation here. In other words, "make userfaultfd" (for example) is supposed to fail to build at all, because this Makefile only supports either "make" (all), or "make /full/path". However, the built-in rules, if not suppressed, will pick up CFLAGS and the initial LDLIBS (but not the target-specific LDLIBS, because those are only set for the full path target!). This causes it to get pretty far into building things despite using incorrect values such as an *occasionally* incomplete LDLIBS value. Signed-off-by: John Hubbard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Jason Gunthorpe <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | selftests/vm: fix false build success on the second and later attempts | John Hubbard | 1 | -0/+5
Patch series "selftests/vm: fix some minor aggravating factors in the Makefile". This fixes a couple of minor aggravating factors that I ran across while trying to do some changes in selftests/vm. These are simple things, but like most things with GNU Make, it's rarely obvious what's wrong until you understand *the entire Makefile and all of its includes*. So while there is, of course, joy in learning those details, I thought I'd fix these little things, so as to allow others to skip out on the Joy if they so choose. :) First of all, if you have an item (let's choose userfaultfd for an example) that fails to build, you might do this: $ make -j32 # ...you observe a failed item in the threaded output # OK, let's get a closer look $ make # ...but now the build quietly "succeeds". That's what Patch 0001 fixes. Second, if you instead attempt this approach for your closer look (a casual mistake, as it's not supported): $ make userfaultfd # ...userfaultfd fails to link, due to incomplete LDLIBS That's what Patch 0002 fixes. This patch (of 2): If one or more of these selftest fail to build, then after the first failure, subsequent invocations of "make" will make it appear that there are no build failures, after all. That's because the failed build products remain, with up-to-date timestamps, thus tricking Make (and you!) into believing that there's nothing else to build. Fix this by telling Make to delete targets that didn't completely succeed. Signed-off-by: John Hubbard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Jason Gunthorpe <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memcg: fix device private memcg accounting | Ralph Campbell | 1 | -1/+4
The code in mc_handle_swap_pte() checks for non_swap_entry() and returns NULL before checking is_device_private_entry(), so device private pages are never handled. Fix this by checking for non_swap_entry() after handling device private swap PTEs. I assume the memory cgroup accounting would be off somehow when moving a process to another memory cgroup. Currently, the device private page is charged like a normal anonymous page when allocated and is uncharged when the page is freed so I think that path is OK. Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Ira Weiny <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Fixes: c733a82874a7 ("mm/memcontrol: support MEMORY_DEVICE_PRIVATE") Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: memcg/slab: uncharge during kmem_cache_free_bulk() | Bharata B Rao | 3 | -17/+30
Object cgroup charging is done for all the objects during allocation, but during freeing, uncharging ends up happening for only one object in the case of bulk allocation/freeing. Fix this by having a separate call to uncharge all the objects from kmem_cache_free_bulk() and by modifying memcg_slab_free_hook() to take care of bulk uncharging. Fixes: 964d4bd370d5 ("mm: memcg/slab: save obj_cgroup for non-root slab objects") Signed-off-by: Bharata B Rao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: memcontrol: reword obsolete comment of mem_cgroup_unmark_under_oom() | Miaohe Lin | 1 | -2/+2
Since commit 79dfdaccd1d5 ("memcg: make oom_lock 0 and 1 based rather than counter"), mem_cgroup_unmark_under_oom() was added and the comment of mem_cgroup_oom_unlock() was moved here. But this comment makes no sense here because mem_cgroup_oom_lock() does not operate on the under_oom field. So reword the comment, as this would be helpful. [Thanks Michal Hocko for rewording this comment.] Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/page_counter: correct the obsolete func name in the comment of page_counter_try_charge() | Miaohe Lin | 1 | -1/+1
Since commit bbec2e15170a ("mm: rename page_counter's count/limit into usage/max"), page_counter_limit() has been renamed to page_counter_set_max(). So replace page_counter_limit with page_counter_set_max in the comment. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Johannes Weiner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: memcontrol: add the missing numa_stat interface for cgroup v2 | Muchun Song | 2 | -80/+159
In cgroup v1, we have a numa_stat interface. This is useful for providing visibility into the NUMA locality information within a memcg, since the pages are allowed to be allocated from any physical node. One of the use cases is evaluating application performance by combining this information with the application's CPU allocation. But cgroup v2 does not have such an interface. So this patch adds the missing information. Suggested-by: Shakeel Butt <[email protected]> Signed-off-by: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Cc: Zefan Li <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Randy Dunlap <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memcg: unify swap and memsw page counters | Waiman Long | 2 | -8/+8
The swap page counter is v2 only while memsw is v1 only. As v1 and v2 controllers cannot be active at the same time, there is no point to keep both swap and memsw page counters in mem_cgroup. The previous patch has made sure that memsw page counter is updated and accessed only when in v1 code paths. So it is now safe to alias the v1 memsw page counter to v2 swap page counter. This saves 14 long's in the size of mem_cgroup. This is a saving of 112 bytes for 64-bit archs. While at it, also document which page counters are used in v1 and/or v2. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Chris Down <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Yafang Shao <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memcg: simplify mem_cgroup_get_max() | Waiman Long | 1 | -11/+13
mem_cgroup_get_max() used to get memory+swap max from both the v1 memsw and v2 memory+swap page counters & return the maximum of these 2 values. This is redundant and it is more efficient to just get either the v1 or the v2 values depending on which one is currently in use. [[email protected]: v4] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Chris Down <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Yafang Shao <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/memcg: clean up obsolete enum charge_type | Waiman Long | 1 | -8/+0
Patch series "mm/memcg: Miscellaneous cleanups and streamlining", v2. This patch (of 3): Since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") and commit 00501b531c47 ("mm: memcontrol: rewrite charge API") in v3.17, the enum charge_type was no longer used anywhere. However, the enum itself was not removed at that time. Remove the obsolete enum charge_type now. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Chris Down <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Yafang Shao <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: memcontrol: correct the comment of mem_cgroup_iter() | Miaohe Lin | 1 | -3/+3
Since commit bbec2e15170a ("mm: rename page_counter's count/limit into usage/max"), the arg @reclaim has no priority field anymore. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: memcg/slab: fix racy access to page->mem_cgroup in mem_cgroup_from_obj() | Roman Gushchin | 1 | -0/+11
mem_cgroup_from_obj() checks the lowest bit of the page->mem_cgroup pointer to determine if the page has an attached obj_cgroup vector instead of a regular memcg pointer. If it's not set, it simply returns the page->mem_cgroup value as a struct mem_cgroup pointer. Commit 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations") changed the moment when this bit is set: previously it was set on the allocation of the slab page, but now it can be set well after, when the first accounted object is allocated on this page. It opened a race: if page->mem_cgroup is set concurrently after the first page_has_obj_cgroups(page) check, a pointer to the obj_cgroups array can be returned as a memory cgroup pointer. A simple check of the page->mem_cgroup pointer for NULL before the page_has_obj_cgroups() check fixes the race. Indeed, if the pointer is not NULL, it's either a simple mem_cgroup pointer or a pointer to an obj_cgroup vector. The pointer can be asynchronously changed from NULL to (obj_cgroup_vec | 0x1UL), but can't be changed from a valid memcg pointer to an objcg vector or back. If the object passed to mem_cgroup_from_obj() is a slab object and page->mem_cgroup is NULL, it means that the object is not accounted, so the function must return NULL. I've discovered the race by looking at the code; so far I haven't seen it in the wild. Fixes: 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations") Signed-off-by: Roman Gushchin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
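In other words, the fix amounts to an early NULL test before the bit check, roughly (a sketch of the described logic, not the exact hunk):

    page = virt_to_head_page(p);

    /*
     * page->mem_cgroup can go from NULL to (obj_cgroup_vec | 0x1UL)
     * concurrently, but never from a valid memcg pointer to an objcg
     * vector or back, so a NULL here means "not accounted".
     */
    if (!page->mem_cgroup)
        return NULL;

    if (page_has_obj_cgroups(page)) {
        /* ... look up the obj_cgroup for this particular object ... */
    }

    return page->mem_cgroup;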
2020-10-13 | mm: memcontrol: use the preferred form for passing the size of a structure type | Gustavo A. R. Silva | 1 | -1/+1
Use the preferred form for passing the size of a structure type. The alternative form where the structure type is spelled out hurts readability and introduces an opportunity for a bug when the object type is changed but the corresponding object identifier to which the sizeof operator is applied is not. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Link: https://lkml.kernel.org/r/773e013ff2f07fe2a0b47153f14dea054c0c04f1.1596214831.git.gustavoars@kernel.org Signed-off-by: Linus Torvalds <[email protected]>
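The preferred form in kernel style, shown with a hypothetical structure (not the actual memcontrol.c hunk):

    struct foo *p;

    /* Preferred: the size tracks the object even if its type later changes. */
    p = kzalloc(sizeof(*p), GFP_KERNEL);

    /* Discouraged: repeats the type and can silently go stale. */
    p = kzalloc(sizeof(struct foo), GFP_KERNEL);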
2020-10-13 | mm: memcontrol: use flex_array_size() helper in memcpy() | Gustavo A. R. Silva | 1 | -4/+3
Make use of the flex_array_size() helper to calculate the size of a flexible array member within an enclosing structure. This helper offers defense-in-depth against potential integer overflows, while at the same time makes it explicitly clear that we are dealing with a flexible array member. Also, remove unnecessary braces. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Link: https://lkml.kernel.org/r/ddd60dae2d9aea1ccdd2be66634815c93696125e.1596214831.git.gustavoars@kernel.org Signed-off-by: Linus Torvalds <[email protected]>
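flex_array_size() comes from <linux/overflow.h>; a sketch with a made-up structure (the real one is the memcg thresholds array):

    struct thresholds {
        unsigned int nr;
        struct threshold entries[];     /* flexible array member */
    };

    /* nr * sizeof(entries[0]) with the multiplication checked for overflow. */
    memcpy(new->entries, old->entries, flex_array_size(old, entries, old->nr));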
2020-10-13 | mm/memremap.c: convert devmap static branch to {inc,dec} | Ira Weiny | 1 | -5/+2
While reviewing Protection Key Supervisor support it was pointed out that using a counter to track static branch enable was an anti-pattern which was better solved using the provided static_branch_{inc,dec} functions.[1] Fix up devmap_managed_key to work the same way. Also this should be safer because there is a very small (very unlikely) race when multiple callers try to enable at the same time. [1] https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: William Kucharski <[email protected]> Cc: Dan Williams <[email protected]> Cc: Vishal Verma <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
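The resulting pattern is the standard reference-counted static key, roughly:

    DEFINE_STATIC_KEY_FALSE(devmap_managed_key);

    /* enable path: each caller takes a reference on the key */
    static_branch_inc(&devmap_managed_key);

    /* teardown path: drop the reference; the branch flips off at zero */
    static_branch_dec(&devmap_managed_key);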
2020-10-13 | mm/swapfile.c: fix potential memory leak in sys_swapon | Miaohe Lin | 1 | -1/+3
If we failed to drain inode, we would forget to free the swap address space allocated by init_swap_address_space() above. Fixes: dc617f29dbe5 ("vfs: don't allow writes to swap files") Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/swapfile.c: remove unnecessary goto out in _swap_info_get() | Miaohe Lin | 1 | -1/+0
It's unnecessary to goto the out label when the out label is just below. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/swap.c: fix incomplete comment in lru_cache_add_inactive_or_unevictable() | Miaohe Lin | 1 | -3/+1
Since commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs"), unevictable pages do not go directly back onto the zone's unevictable list. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Shakeel Butt <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/page_io.c: remove useless out label in __swap_writepage() | Miaohe Lin | 1 | -5/+3
The out label is only used in one place and returns ret directly without any cleanup such as resource release or lock dropping. So remove this jump label and do some cleanup. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/swap_slots.c: remove always zero and unused return value of enable_swap_slots_cache() | Miaohe Lin | 2 | -3/+2
enable_swap_slots_cache() always returns zero and its return value is just ignored by the caller. So make enable_swap_slots_cache() void. Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm/swap.c: fix confusing comment in release_pages() | Miaohe Lin | 1 | -1/+1
Since commit 07d802699528 ("mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages"), we have renamed the func put_devmap_managed_page() to page_is_devmap_managed(). Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: John Hubbard <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: remove superfluous __ClearPageActive() | Yu Zhao | 2 | -4/+0
To activate a page, mark_page_accessed() always holds a reference on it. It either gets a new reference when adding a page to lru_pvecs.activate_page or reuses an existing one it previously got when it added a page to lru_pvecs.lru_add. So it doesn't call SetPageActive() on a page that doesn't have any reference left. Therefore, the race is impossible these days (I didn't bother to dig into its history). For other paths, namely reclaim and migration, a reference count is always held while calling SetPageActive() on a page. SetPageSlabPfmemalloc() also uses SetPageActive(), but it's irrelevant to LRU pages. Signed-off-by: Yu Zhao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Yang Shi <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Qian Cai <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13 | mm: remove activate_page() from unuse_pte() | Yu Zhao | 3 | -8/+2
We don't initially add anon pages to active lruvec after commit b518154e59aa ("mm/vmscan: protect the workingset on anonymous LRU"). Remove activate_page() from unuse_pte(), which seems to be missed by the commit. And make the function static while we are at it. Before the commit, we called lru_cache_add_active_or_unevictable() to add new ksm pages to active lruvec. Therefore, activate_page() wasn't necessary for them in the first place. Signed-off-by: Yu Zhao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Yang Shi <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Huang Ying <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Qian Cai <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Joonsoo Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>