Age | Commit message (Collapse) | Author | Files | Lines |
|
There is a general understanding that GFP_ATOMIC/GFP_NOWAIT are to be used
from atomic contexts. E.g. from within a spin lock or from the IRQ
context. This is correct but there are some atomic contexts where the
above doesn't hold. One of them would be an NMI context. Page allocator
has never supported that and the general fear of this context didn't let
anybody to actually even try to use the allocator there. Good, but let's
be more specific about that.
Another such a context, and that is where people seem to be more daring,
is raw_spin_lock. Mostly because it simply resembles regular spin lock
which is supported by the allocator and there is not any implementation
difference with !RT kernels in the first place. Be explicit that such a
context is not supported by the allocator. The underlying reason is that
zone->lock would have to become raw_spin_lock as well and that has turned
out to be a problem for RT
(http://lkml.kernel.org/r/[email protected]).
Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Reviewed-by: Uladzislau Rezki <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Here is a very rare race which leaks memory:
Page P0 is allocated to the page cache. Page P1 is free.
Thread A Thread B Thread C
find_get_entry():
xas_load() returns P0
Removes P0 from page cache
P0 finds its buddy P1
alloc_pages(GFP_KERNEL, 1) returns P0
P0 has refcount 1
page_cache_get_speculative(P0)
P0 has refcount 2
__free_pages(P0)
P0 has refcount 1
put_page(P0)
P1 is not freed
Fix this by freeing all the pages in __free_pages() that won't be freed
by the call to put_page(). It's usually not a good idea to split a page,
but this is a very unlikely scenario.
Fixes: e286781d5f2e ("mm: speculative page references")
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mike Rapoport <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The function is_huge_zero_page() doesn't call compound_head() to make sure
the page pointer is a head page. The call to is_huge_zero_page() in
release_pages() is made before compound_head() is called so the test would
fail if release_pages() was called with a tail page of the huge_zero_page
and put_page_testzero() would be called releasing the page.
This is unlikely to be happening in normal use or we would be seeing all
sorts of process data corruption when accessing a THP zero page.
Looking at other places where is_huge_zero_page() is called, all seem to
only pass a head page so I think the right solution is to move the call
to compound_head() in release_pages() to a point before calling
is_huge_zero_page().
Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Yu Zhao <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Previously 'for_next_zone_zonelist_nodemask' macro parameter 'zlist' was
unused so this patch removes it.
Signed-off-by: Mateusz Nosek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
__perform_reclaim()'s single caller expects it to return 'unsigned long',
hence change its return value and a local variable to 'unsigned long'.
Suggested-by: Andrew Morton <[email protected]>
Signed-off-by: Yanfei Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
finalise_ac() is just 'epilogue' for 'prepare_alloc_pages'. Therefore
there is no need to keep them both so 'finalise_ac' content can be merged
into prepare_alloc_pages() code. It would make __alloc_pages_nodemask()
cleaner when it comes to readability.
Signed-off-by: Mateusz Nosek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Previously in '__init early_init_on_alloc' and '__init early_init_on_free'
the return values from 'kstrtobool' were not handled properly. That
caused potential garbage value read from variable 'bool_result'.
Introduced patch fixes error handling.
Signed-off-by: Mateusz Nosek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Previously flags check was separated into two separated checks with two
separated branches. In case of presence of any of two mentioned flags,
the same effect on flow occurs. Therefore checks can be merged and one
branch can be avoided.
Signed-off-by: Mateusz Nosek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Previously variable 'tmp' was initialized, but was not read later before
reassigning. So the initialization can be removed.
[[email protected]: remove `tmp' altogether]
Signed-off-by: Mateusz Nosek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In has_unmovable_pages(), the page parameter would not always be the first
page within a pageblock (see how the page pointer is passed in from
start_isolate_page_range() after call __first_valid_page()), so that would
cause checking unmovable pages span two pageblocks.
After this patch, the checking is enforced within one pageblock no matter
the page is first one or not, and obey the semantics of this function.
This issue is found by code inspection.
Michal said "this might lead to false negatives when an unrelated block
would cause an isolation failure".
Signed-off-by: Li Xinhai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: David Hildenbrand <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Let's document what ZONE_MOVABLE means, how it's used, and which special
cases we have regarding unmovable pages (memory offlining vs. migration /
allocations).
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mike Rapoport <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Qian Cai <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather
unclear, which is why we special-cased ZONE_MOVABLE such that partially
plugged blocks would never end up in ZONE_MOVABLE.
Now that the semantics are much clearer (and will be documented in a
follow-up patch including the new virtio-mem behavior), let's allow to
online partially plugged memory blocks to ZONE_MOVABLE and also consider
memory blocks that were onlined to ZONE_MOVABLE when unplugging memory.
While unplugged memory pages are, in general, unmovable, they can be
skipped when offlining memory.
virtio-mem only unplugs fairly big chunks (in the megabyte range) and
rather tries to shrink the memory region than randomly choosing memory.
In theory, if all other pages in the movable zone would be movable,
virtio-mem would only shrink that zone and not create any kind of
fragmentation.
In the future, we might want to remember the zone again and use the
information when (un)plugging memory. For now, let's keep it simple.
Note: Support for defragmentation is planned, to deal with fragmentation
after unplug due to memory chunks within memory blocks that could not get
unplugged before (e.g., somebody pinning pages within ZONE_MOVABLE for a
longer time).
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Qian Cai <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Let's clean it up a bit, simplifying the exit paths.
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Qian Cai <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Inside has_unmovable_pages(), we have a comment describing how unmovable
data could end up in ZONE_MOVABLE - via "movablecore". Also, besides
checking if the first page in the pageblock is reserved, we don't perform
any further checks in case of ZONE_MOVABLE.
In case of memory offlining, we set REPORT_FAILURE, properly dump_page()
the page and handle the error gracefully. alloc_contig_pages() users
currently never allocate from ZONE_MOVABLE. E.g., hugetlb uses
alloc_contig_pages() for the allocation of gigantic pages only, which will
never end up on the MOVABLE zone (see htlb_alloc_mask()).
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Qian Cai <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
set_migratetype_isolate()
Right now, if we have two isolations racing on a pageblock that's in the
MOVABLE zone, we would trigger the WARN_ON_ONCE(). Let's just return
directly, simplifying error handling.
The change was introduced in commit 3d680bdf60a5 ("mm/page_isolation: fix
potential warning from user"). As far as I can see, we currently don't
have alloc_contig_range() users that use the ZONE_MOVABLE (anymore), so
it's currently more a cleanup and a preparation for the future than a fix.
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Acked-by: Mike Kravetz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm / virtio-mem: support ZONE_MOVABLE", v5.
When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather
unclear, which is why we special-cased ZONE_MOVABLE such that partially
plugged blocks would never end up in ZONE_MOVABLE.
Now that the semantics are much clearer (and are documented in patch #6),
let's support partially plugged memory blocks in ZONE_MOVABLE, allowing
partially plugged memory blocks to be online to ZONE_MOVABLE and also
unplugging from such memory blocks. This avoids surprises when onlining
of memory blocks suddenly fails, just because they are not completely
populated by virtio-mem (yet).
This is especially helpful for testing, but also paves the way for
virtio-mem optimizations, allowing more memory to get reliably unplugged.
Cleanup has_unmovable_pages() and set_migratetype_isolate(), providing
better documentation of how ZONE_MOVABLE interacts with different kind of
unmovable pages (memory offlining vs. alloc_contig_range()).
This patch (of 6):
Let's move the split comment regarding bootmem allocations and memory
holes, especially in the context of ZONE_MOVABLE, to the PageReserved()
check.
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Baoquan He <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Qian Cai <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
KASAN errors will currently trigger a panic when panic_on_warn is set.
This renders kasan_multishot useless, as further KASAN errors won't be
reported if the kernel has already paniced. By making kasan_multishot
disable this behaviour for KASAN errors, we can still have the benefits of
panic_on_warn for non-KASAN warnings, yet be able to use kasan_multishot.
This is particularly important when running KASAN tests, which need to
trigger multiple KASAN errors: previously these would panic the system if
panic_on_warn was set, now they can run (and will panic the system should
non-KASAN warnings show up).
Signed-off-by: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Andrey Konovalov <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Reviewed-by: Brendan Higgins <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Patricia Alfonso <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Include documentation on how to test KASAN using CONFIG_TEST_KASAN_KUNIT
and CONFIG_TEST_KASAN_MODULE.
Signed-off-by: Patricia Alfonso <[email protected]>
Signed-off-by: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Andrey Konovalov <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Acked-by: Brendan Higgins <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Transfer all previous tests for KASAN to KUnit so they can be run more
easily. Using kunit_tool, developers can run these tests with their other
KUnit tests and see "pass" or "fail" with the appropriate KASAN report
instead of needing to parse each KASAN report to test KASAN
functionalities. All KASAN reports are still printed to dmesg.
Stack tests do not work properly when KASAN_STACK is enabled so those
tests use a check for "if IS_ENABLED(CONFIG_KASAN_STACK)" so they only run
if stack instrumentation is enabled. If KASAN_STACK is not enabled, KUnit
will print a statement to let the user know this test was not run with
KASAN_STACK enabled.
copy_user_test and kasan_rcu_uaf cannot be run in KUnit so there is a
separate test file for those tests, which can be run as before as a
module.
[[email protected]: v14]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Patricia Alfonso <[email protected]>
Signed-off-by: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Andrey Konovalov <[email protected]>
Reviewed-by: Brendan Higgins <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Integrate KASAN into KUnit testing framework.
- Fail tests when KASAN reports an error that is not expected
- Use KUNIT_EXPECT_KASAN_FAIL to expect a KASAN error in KASAN
tests
- Expected KASAN reports pass tests and are still printed when run
without kunit_tool (kunit_tool still bypasses the report due to the
test passing)
- KUnit struct in current task used to keep track of the current
test from KASAN code
Make use of "[PATCH v3 kunit-next 1/2] kunit: generalize kunit_resource
API beyond allocated resources" and "[PATCH v3 kunit-next 2/2] kunit: add
support for named resources" from Alan Maguire [1]
- A named resource is added to a test when a KASAN report is
expected
- This resource contains a struct for kasan_data containing
booleans representing if a KASAN report is expected and if a
KASAN report is found
[1] (https://lore.kernel.org/linux-kselftest/[email protected]/T/#t)
Signed-off-by: Patricia Alfonso <[email protected]>
Signed-off-by: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Andrey Konovalov <[email protected]>
Reviewed-by: Andrey Konovalov <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Acked-by: Brendan Higgins <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vincent Guittot <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "KASAN-KUnit Integration", v14.
This patchset contains everything needed to integrate KASAN and KUnit.
KUnit will be able to:
(1) Fail tests when an unexpected KASAN error occurs
(2) Pass tests when an expected KASAN error occurs
Convert KASAN tests to KUnit with the exception of copy_user_test because
KUnit is unable to test those.
Add documentation on how to run the KASAN tests with KUnit and what to
expect when running these tests.
This patch (of 5):
In order to integrate debugging tools like KASAN into the KUnit framework,
add KUnit struct to the current task to keep track of the current KUnit
test.
Signed-off-by: Patricia Alfonso <[email protected]>
Signed-off-by: David Gow <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Andrey Konovalov <[email protected]>
Reviewed-by: Brendan Higgins <[email protected]>
Cc: Brendan Higgins <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Shuah Khan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In the context of the anonymous address space lifespan description the
'mm_users' reference counter is confused with 'mm_count'. I.e a "zombie"
mm gets released when "mm_count" becomes zero, not "mm_users".
Signed-off-by: Alexander Gordeev <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix the comment of find_vm_area() and get_vm_area()
Signed-off-by: Hui Su <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/20200927153034.GA199877@rlk
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Since c67dc624757 ("mm/vmalloc: do not call kmemleak_free() on not yet
accounted memory"), the __vunmap() have been changed to __vfree(), so
update the confusing comment().
Signed-off-by: Hui Su <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Roman Penyaev <[email protected]>
Link: https://lkml.kernel.org/r/20200927155409.GA3315@rlk
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Unlike others we don't use the marco writeback. so let's remove it to
tame gcc warning:
mm/memory-failure.c:827: warning: macro "writeback" is not used
[-Wunused-macros]
Signed-off-by: Alex Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is no need to calculate pgoff in each loop of for_each_process(), so
move it to the place before for_each_process(), which can save some CPU
cycles.
Signed-off-by: Xianting Tian <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
No need to hard code function name when __func__ can be used.
While here, replace specifiers for special types like dma_addr_t.
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is a place in the code where open-coded version of
list_for_each_entry_safe() is used. Replace that with the standard macro.
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The variable dmirror_zero_page is unused in the HMM self test driver which
was probably intended to demonstrate how a driver could use
migrate_vma_setup() to share a single read-only device private zero page
similar to how the CPU does. However, this isn't needed for the self
tests so remove it.
Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Jerome Glisse <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Some tests might not be able to be run if resources like huge pages are
not available. Mark these tests as skipped instead of simply passing.
Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Shuah Khan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
As mincore_huge_pmd() was dropped, remove the declaration from the header
file.
Signed-off-by: Yulei Zhang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Both of the mm pointers are not needed after commit 7a4830c380f3
("mm/fork: Pass new vma pointer into copy_page_range()").
Jason Gunthorpe also reported that the ordering of copy_page_range() is
odd. Since working at it, reorder the parameters to be logical, by (1)
always put the dst_* fields to be before src_* fields, and (2) keep the
same type of parameters together.
[[email protected]: further reorder some parameters and line format, per Jason]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: fix warnings]
Link: https://lkml.kernel.org/r/20201006200138.GA6026@xz-x1
Reported-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replace do_brk with do_brk_flags in comment of insert_vm_struct(), since
do_brk was removed in following commit.
Fixes: bb177a732c4369 ("mm: do not bug_on on incorrect length in __mm_populate()")
Signed-off-by: Liao Pingfang <[email protected]>
Signed-off-by: Yi Wang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
__remove_shared_vm_struct()
In commit 1da177e4c3f4 ("Linux-2.6.12-rc2"), the helper allow_write_access
came with the atomic_inc operation of the i_writecount field in the func
__remove_shared_vm_struct(). But it forgot to use this helper function.
Signed-off-by: Miaohe Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit 4bb5f5d9395b ("mm: allow drivers to prevent new writable mappings")
changed i_mmap_writable from unsigned int to atomic_t and add the helper
function mapping_allow_writable() to atomic_inc i_mmap_writable. But it
forgot to use this helper function in dup_mmap() and __vma_link_file().
Signed-off-by: Miaohe Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Christian Kellner <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Adrian Reber <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Aleksa Sarai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In __vma_adjust(), we do the check on *root* to decide whether to adjust
the address_space. It seems to be more meaningful to do the check on
*file* itself. This means we are adjusting some data because it is a file
backed vma.
Since we seem to assume the address_space is valid if it is a file backed
vma, let's just replace *root* with *file* here.
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
*root* with type of struct rb_root_cached is an element of *mapping*
with type of struct address_space. This implies when we have a valid
*root* it must be a part of valid *mapping*.
So we can merge these two checks together to make the code more easy to
read and to save some cpu cycles.
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix typo/spello of "function".
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Instead of converting adjust_next between bytes and pages number, let's
just store the virtual address into adjust_next.
Also, this patch fixes one typo in the comment of vma_adjust_trans_huge().
[[email protected]: changelog tweak]
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Mike Kravetz <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Introduce the new page policy of PF_SECOND which lets us use the normal
pageflags generation machinery to create the various DoubleMap
manipulation functions.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "Fix PageDoubleMap".
This is a purely theoretical problem for now as none of the filesystems
which use PG_private_2 (ie PG_fscache) are being converted at this time,
but it's confusing to leave it like this.
This patch (of 2):
PG_private_2 is defined as being PF_ANY (applicable to tail pages as well
as regular & head pages). That means that the first tail page of a
double-map page will appear to have Private2 set. Use the Workingset bit
instead which is defined as PF_HEAD so any attempt to access the
Workingset bit on a tail page will redirect to the head page's Workingset
bit.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
smaps_rollup will try to grab mmap_lock and go through the whole vma list
until it finishes the iterating. When encountering large processes, the
mmap_lock will be held for a longer time, which may block other write
requests like mmap and munmap from progressing smoothly.
There are upcoming mmap_lock optimizations like range-based locks, but the
lock applied to smaps_rollup would be the coarse type, which doesn't avoid
the occurrence of unpleasant contention.
To solve aforementioned issue, we add a check which detects whether anyone
wants to grab mmap_lock for write attempts.
Signed-off-by: Chinwen Chang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Daniel Jordan <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Chinwen Chang <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Jimmy Assarsson <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Daniel Kiss <[email protected]>
Cc: Laurent Dufour <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Extend smap_gather_stats to support indicated beginning address at which
it should start gathering. To achieve the goal, we add a new parameter
@start assigned by the caller and try to refactor it for simplicity.
If @start is 0, it will use the range of @vma for gathering.
Signed-off-by: Chinwen Chang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Daniel Jordan <[email protected]>
Cc: Daniel Kiss <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jimmy Assarsson <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "Try to release mmap_lock temporarily in smaps_rollup", v4.
Recently, we have observed some janky issues caused by unpleasantly long
contention on mmap_lock which is held by smaps_rollup when probing large
processes. To address the problem, we let smaps_rollup detect if anyone
wants to acquire mmap_lock for write attempts. If yes, just release the
lock temporarily to ease the contention.
smaps_rollup is a procfs interface which allows users to summarize the
process's memory usage without the overhead of seq_* calls. Android uses
it to sample the memory usage of various processes to balance its memory
pool sizes. If no one wants to take the lock for write requests,
smaps_rollup with this patch will behave like the original one.
Although there are on-going mmap_lock optimizations like range-based
locks, the lock applied to smaps_rollup would be the coarse one, which is
hard to avoid the occurrence of aforementioned issues. So the detection
and temporary release for write attempts on mmap_lock in smaps_rollup is
still necessary.
This patch (of 3):
Add new API to query if someone wants to acquire mmap_lock for write
attempts.
Using this instead of rwsem_is_contended makes it more tolerant of future
changes to the lock type.
Signed-off-by: Chinwen Chang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Acked-by: Michel Lespinasse <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Daniel Jordan <[email protected]>
Cc: Daniel Kiss <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jimmy Assarsson <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
These two functions share the same logic except ignore a different vma.
Let's reuse the code.
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
__vma_unlink_common() and __vma_unlink() are counterparts. Since there is
no function named __vma_unlink(), let's rename __vma_unlink_common() to
__vma_unlink() to make the code more self-explanatory and easy for
audience to understand.
Otherwise we may expect there are several variants of vma_unlink() and
__vma_unlink_common() is used by them.
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The code has declared a vma_struct named vma which is assigned a value of
vmf->vma. Thus, use variable vma directly here.
Signed-off-by: Yanfei Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
It's "pte_alloc_one", not "pte_alloc_pne". Let's fix that.
Signed-off-by: Yanfei Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We account the PTE level of the page tables to the process in order to
make smarter OOM decisions and help diagnose why memory is fragmented.
For these same reasons, we should account pages allocated for PMDs. With
larger process address spaces and ASLR, the number of PMDs in use is
higher than it used to be so the inaccuracy is starting to matter.
[[email protected]: arm: __pmd_free_tlb(): call page table destructor]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Cc: Abdul Haleem <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Satheesh Rajendran <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Naresh Kamboju <[email protected]>
Cc: Anders Roxell <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Avoid accidental wrong builds, due to built-in rules working just a little
bit too well--but not quite as well as required for our situation here.
In other words, "make userfaultfd" (for example) is supposed to fail to
build at all, because this Makefile only supports either "make" (all), or
"make /full/path". However, the built-in rules, if not suppressed, will
pick up CFLAGS and the initial LDLIBS (but not the target-specific LDLIBS,
because those are only set for the full path target!). This causes it to
get pretty far into building things despite using incorrect values such as
an *occasionally* incomplete LDLIBS value.
Signed-off-by: John Hubbard <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|