diff options
author | Suren Baghdasaryan <[email protected]> | 2022-01-14 14:06:22 -0800 |
---|---|---|
committer | Linus Torvalds <[email protected]> | 2022-01-15 16:30:28 +0200 |
commit | ba535c1caf3ee78aa7719e9e4b07a0dc1d153b9e (patch) | |
tree | bb04a8cac4f1c29d09b32a53d151f08d433af93e | |
parent | cc6dcfee72509868271d42919a3c1081b6b0dc7e (diff) |
mm/oom_kill: allow process_mrelease to run under mmap_lock protection
With exit_mmap holding mmap_write_lock during free_pgtables call,
process_mrelease does not need to elevate mm->mm_users in order to
prevent exit_mmap from destrying pagetables while __oom_reap_task_mm is
walking the VMA tree. The change prevents process_mrelease from calling
the last mmput, which can lead to waiting for IO completion in exit_aio.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Suren Baghdasaryan <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Florian Weimer <[email protected]>
Cc: Jan Engelhardt <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Tim Murray <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
-rw-r--r-- | mm/oom_kill.c | 27 |
1 files changed, 15 insertions, 12 deletions
diff --git a/mm/oom_kill.c b/mm/oom_kill.c index e52ce0b1465d..3390316c8a32 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -1170,15 +1170,15 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags) goto put_task; } - if (mmget_not_zero(p->mm)) { - mm = p->mm; - if (task_will_free_mem(p)) - reap = true; - else { - /* Error only if the work has not been done already */ - if (!test_bit(MMF_OOM_SKIP, &mm->flags)) - ret = -EINVAL; - } + mm = p->mm; + mmgrab(mm); + + if (task_will_free_mem(p)) + reap = true; + else { + /* Error only if the work has not been done already */ + if (!test_bit(MMF_OOM_SKIP, &mm->flags)) + ret = -EINVAL; } task_unlock(p); @@ -1189,13 +1189,16 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags) ret = -EINTR; goto drop_mm; } - if (!__oom_reap_task_mm(mm)) + /* + * Check MMF_OOM_SKIP again under mmap_read_lock protection to ensure + * possible change in exit_mmap is seen + */ + if (!test_bit(MMF_OOM_SKIP, &mm->flags) && !__oom_reap_task_mm(mm)) ret = -EAGAIN; mmap_read_unlock(mm); drop_mm: - if (mm) - mmput(mm); + mmdrop(mm); put_task: put_task_struct(task); return ret; |