These might not be issues yet, but they make the script more fragile.
Fixing them also sets a better example for future readers, who might
copy/paste or otherwise re-use snippets from our script.
- Use "read -r", since we never want read to interpret '\' characters
as escape sequences.
- Quote variables, to handle spaces properly.
- Use $() instead of the older and harder-to-nest ``.
- Drop superfluous "$" prefixes inside arithmetic $(()).
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Axel Rasmussen <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Previously, each test printed its own header, dealt with its own return
code, and so on. By moving this standard boilerplate into a function, we
can delete more than 300 lines from the script.
This also makes adding future tests easier, and it removes various
inconsistencies that already exist:
- Some tests correctly deal with ksft_skip, but others don't.
- Some tests just print the executable name, others print arguments, and
yet others print some comment in the header.
- Most tests print a header with two separator lines, but the HMM smoke
test and the memfd_secret test print only one.
- We had a redundant "exit" at the end; with all the boilerplate, it's an
easy oversight.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Axel Rasmussen <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
This introduces three tests:
1) Sanity-check basic soft-dirty semantics: allocate an area, clean it,
dirty it, and check whether the SD bit is flipped.
2) Check VMA reuse: validate the VM_SOFTDIRTY usage.
3) Check soft-dirty on huge pages.
This was motivated by Will Deacon's fix commit 912efa17e512 ("mm: proc:
Invalidate TLB after clearing soft-dirty page state"). I was tracking the
same issue that he fixed, and this test would have caught it.
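For illustration, here is a minimal standalone sketch of the mechanics
behind test 1), using the documented clear_refs/pagemap interfaces. This
is a hypothetical example, not the selftest's code, and error handling is
omitted:
  /* Clear soft-dirty via /proc/self/clear_refs, then read bit 55 of the
   * page's /proc/self/pagemap entry (the soft-dirty bit). */
  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <unistd.h>

  static int soft_dirty(void *addr)
  {
          uint64_t entry = 0;
          int fd = open("/proc/self/pagemap", O_RDONLY);

          pread(fd, &entry, sizeof(entry),
                (uintptr_t)addr / getpagesize() * sizeof(entry));
          close(fd);
          return !!(entry & (1ULL << 55));
  }

  int main(void)
  {
          char *area = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          int fd = open("/proc/self/clear_refs", O_WRONLY);

          area[0] = 1;                    /* fault the page in */
          write(fd, "4", 1);              /* "4" clears soft-dirty bits */
          printf("after clear: %d\n", soft_dirty(area)); /* expect 0 */
          area[0] = 2;                    /* dirty it again */
          printf("after write: %d\n", soft_dirty(area)); /* expect 1 */
          close(fd);
          return 0;
  }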
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Co-developed-by: Muhammad Usama Anjum <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Move common functions into a new file while keeping the code as similar
as possible. These functions can be used by the new tests, which helps
avoid code duplication.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Gabriel Krisman Bertazi <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Print three possible reasons /sys/kernel/debug/gup_test cannot be opened
to help users of this test diagnose failures.
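A hedged sketch of the kind of diagnostics this adds (the selftest's exact
wording may differ):
  /* Explain the likely causes when gup_test cannot be opened. */
  #include <fcntl.h>
  #include <stdio.h>

  int main(void)
  {
          int fd = open("/sys/kernel/debug/gup_test", O_RDWR);

          if (fd < 0) {
                  perror("open");
                  fprintf(stderr, "Is CONFIG_GUP_TEST enabled, is debugfs "
                                  "mounted at /sys/kernel/debug, and are you "
                                  "running as root?\n");
                  return 1;
          }
          return 0;
  }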
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sidhartha Kumar <[email protected]>
Reviewed-by: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
The only user (DAX) of the range and pmdpp parameters of
follow_invalidate_pte() is gone, so it is safe to remove them and make the
function static to simplify the code. This effectively reverts the
following commits:
097963959594 ("mm: add follow_pte_pmd()")
a4d1a8852513 ("dax: update to new mmu_notifier semantic")
There is only one caller of follow_invalidate_pte() left, so just fold it
into follow_pte() and remove it.
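After the fold, callers use follow_pte() directly; a hedged kernel-side
fragment of the pattern (not standalone code):
  pte_t *ptep;
  spinlock_t *ptl;

  if (!follow_pte(vma->vm_mm, address, &ptep, &ptl)) {
          pte_t pte = *ptep;              /* inspect the entry ... */
          pte_unmap_unlock(ptep, ptl);    /* always unlock afterwards */
  }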
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Cc: Xiyu Yang <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Currently dax_mapping_entry_mkclean() fails to clean and write-protect the
pte entries within a DAX PMD entry during an msync()/fsync() operation.
This can result in data loss in the following sequence:
1) process A mmap writes to a DAX PMD, dirtying the PMD radix tree entry
and making the pmd entry dirty and writeable.
2) process B mmaps the same file with @offset (e.g. 4K) and @length
(e.g. 4K) and writes to it, dirtying the PMD radix tree entry (already
done in 1)) and making the pte entry dirty and writeable.
3) fsync, flushing out the PMD data and cleaning the radix tree entry. We
currently fail to mark the pte entry as clean and write-protected
since the vma of process B is not covered in dax_entry_mkclean().
4) process B writes to the pte. This causes no page faults since the pte
entry is dirty and writeable. The radix tree entry remains clean.
5) fsync, which fails to flush the dirty PMD data because the radix tree
entry was clean.
6) crash - dirty data that should have been fsync'd as part of 5) could
still have been in the processor cache, and is lost.
Use pfn_mkclean_range() to clean the pfns to fix this issue.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush")
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Cc: Xiyu Yang <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
page_vma_mapped_walk() currently cannot be used to check whether a huge
devmap page is mapped into a vma. Add support for walking huge devmap
pages so that DAX can use it in the next patch.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Cc: Xiyu Yang <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
page_mkclean_one() is supposed to be used with a pfn that has an
associated struct page, but not all pfns (e.g. DAX) have one. Introduce a
new function, pfn_mkclean_range(), that cleans the PTEs (including PMDs)
mapped to a range of pfns that have no struct page associated with them.
This helper will be used by the DAX code in the next patch to make pfns
clean.
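As I read the series, the new helper has roughly this shape (a hedged
sketch, not necessarily the exact declaration):
  /* Clean and write-protect the PTEs (including PMDs) that map the given
   * pfn range within @vma; meant for pfns lacking a struct page. */
  int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages,
                        pgoff_t pgoff, struct vm_area_struct *vma);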
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Cc: Xiyu Yang <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
flush_cache_page() only removes a PAGE_SIZE-sized range from the cache, so
for a THP it covers only the head page rather than the full huge page.
Replace it with flush_cache_range() to fix this. This is just a
documentation issue with respect to properly documenting the expected
usage of cache flushing before modifying the pmd. In practice this is not
a problem, because DAX is not available on architectures with virtually
indexed caches per:
commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
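Sketched for a PMD-sized mapping at @address (hedged fragment; both calls
are the real arch cache-flush hooks):
  flush_cache_page(vma, address, pfn);    /* before: one base page only */
  flush_cache_range(vma, address, address + HPAGE_PMD_SIZE); /* after: whole THP */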
Link: https://lkml.kernel.org/r/[email protected]
Fixes: f729c8c9b24f ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Cc: Xiyu Yang <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Patch series "Fix some bugs related to ramp and dax", v7.
Patch 1-2 fix a cache flush bug, because subsequent patches depend on
those on those changes, there are placed in this series. Patch 3-4 are
preparation for fixing a dax bug in patch 5. Patch 6 is code cleanup
since the previous patch removes the usage of follow_invalidate_pte().
This patch (of 6):
The flush_cache_page() only remove a PAGE_SIZE sized range from the cache.
However, it does not cover the full pages in a THP except a head page.
Replace it with flush_cache_range() to fix this issue. At least, no
problems were found due to this. Maybe because the architectures that
have virtual indexed caches is less.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Xiyu Yang <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
We can't assume that pte_offset_map_lock() will return the same orig_pte
value, so it is necessary to reacquire orig_pte; otherwise
pte_unmap_unlock() will unmap the stale pte.
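The shape of the fix, sketched (hedged; names follow the
madvise_free_pte_range() style):
  /* Reacquire both pointers when remapping the PTE table ... */
  orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
  /* ... walk and process the ptes ... */
  pte_unmap_unlock(orig_pte, ptl);        /* now unmaps the live mapping */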
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD")
Fixes: 854e9ed09ded ("mm: support madvise(MADV_FREE)")
Signed-off-by: Miaohe Lin <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
At the time demote-on-reclaim was introduced, it was tied to
CONFIG_HOTPLUG_CPU + CONFIG_MIGRATION, but that is not really accurate.
The only two things we need to depend on are CONFIG_NUMA +
CONFIG_MIGRATION, so clean this up. Furthermore, we only register the
hotplug memory notifier when the system has CONFIG_MEMORY_HOTPLUG.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Oscar Salvador <[email protected]>
Suggested-by: "Huang, Ying" <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Abhishek Goel <[email protected]>
Cc: Baolin Wang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
There is no need to validate the hugetlb page's refcount before trying to
freeze it to the expected refcount; we can just rely on page_ref_freeze()
to do the validation, simplifying the code.
Moreover, we are always holding the page lock when migrating the hugetlb
page mapping, which means nothing else can remove it from the page cache,
so we can remove the xas_load() validation under the i_pages lock.
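Sketch of the resulting pattern (hedged fragment):
  /* page_ref_freeze() fails unless the refcount is exactly expected_count,
   * so no separate refcount check is needed beforehand. */
  if (!page_ref_freeze(page, expected_count))
          return -EAGAIN;
  /* ... replace the old page in the mapping, then unfreeze ... */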
Link: https://lkml.kernel.org/r/eb2fbbeaef2b1714097b9dec457426d682ee0635.1649676424.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <[email protected]>
Acked-by: Mike Kravetz <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
When follow_page() peeks at a page, the page could be migrated and then
offlined while it is still being used by do_pages_stat_array(). Use
FOLL_GET to hold a page reference to fix this potential race.
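Sketch of the fixed lookup (hedged fragment; FOLL_DUMP is assumed from the
surrounding code):
  page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
  if (!IS_ERR_OR_NULL(page)) {
          err = page_to_nid(page);
          put_page(page);                 /* drop the reference FOLL_GET took */
  }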
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Acked-by: "Huang, Ying" <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
If we fail to set up the hotplug state callbacks for mm/demotion:online in
some corner cases, node_demotion will be left uninitialized, and an
invalid node might be returned from next_demotion_node() when doing
reclaim-based migration. Use kcalloc() to allocate node_demotion to fix
the issue.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: ac16ec835314 ("mm: migrate: support multiple target nodes demotion")
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
In the -ENOMEM case, there might be some subpages of fail-to-migrate THPs
left on the thp_split_pages list. We should move them back to the
migration list so that they can be put back on the right list by the
caller; otherwise their page refcounts are leaked. Also adjust nr_failed
and nr_thp_failed accordingly to make the vm events accounting more
accurate.
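A hedged fragment of the -ENOMEM path in migrate_pages() after the fix:
  case -ENOMEM:
          /* ... */
          /*
           * Move any remaining subpages of split THPs back to the caller's
           * list so their refcounts are put correctly.
           */
          list_splice_init(&thp_split_pages, from);
          nr_thp_failed += thp_retry;
          goto out;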
Link: https://lkml.kernel.org/r/[email protected]
Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Reviewed-by: Baolin Wang <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Cc: Alistair Popple <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Remove the duplicated code in migrate_pages() to simplify it. Minor
readability improvement; no functional change intended.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Zi Yan <[email protected]>
Reviewed-by: Baolin Wang <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Muchun Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Avoid unneeded initialization of next_pass and this_pass, as they are
always set before use; this saves some cpu cycles when there are plenty of
nodes in the system.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Reviewed-by: Baolin Wang <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Use the helper macro min() to set chunk_nr, simplifying the code.
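Before/after, sketched (hedged; the cast assumes nr_pages is unsigned
long):
  chunk_nr = DO_PAGES_STAT_CHUNK_NR;
  if (chunk_nr > nr_pages)
          chunk_nr = nr_pages;
  /* becomes */
  chunk_nr = min(nr_pages, (unsigned long)DO_PAGES_STAT_CHUNK_NR);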
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Use the helper function vma_lookup() to look up the needed vma,
simplifying the code.
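Before/after, sketched (hedged fragment):
  vma = find_vma(mm, addr);
  if (!vma || addr < vma->vm_start)
          goto out;
  /* becomes */
  vma = vma_lookup(mm, addr);     /* NULL unless addr falls inside a vma */
  if (!vma)
          goto out;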
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
We can use page_is_file_lru() directly to account the isolated pages,
simplifying the code a bit.
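The accounting idiom this enables (hedged fragment):
  mod_node_page_state(page_pgdat(page),
                      NR_ISOLATED_ANON + page_is_file_lru(page),
                      thp_nr_pages(page));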
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Patch series "A few cleanup and fixup patches for migration", v2.
This series contains a few patches to remove unneeded variables, jump
label and use helper to simplify the code. Also we fix some bugs such as
page refcounts leak , invalid node access and so on. More details can be
found in the respective changelogs.
This patch (of 11):
When mapping_locked is true, TTU_RMAP_LOCKED is always set to ttu. We can
check ttu instead so mapping_locked can be removed. And ttu is either 0
or TTU_RMAP_LOCKED now. Change '|=' to '=' to reflect this.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Alistair Popple <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Add some basic migration tests, in particular tests that stress both the
pte and pmd migration entry wait paths.
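To illustrate the idea, a hypothetical single-threaded sketch (assuming a
machine with NUMA nodes 0 and 1 and libnuma installed; the real test uses
threads so that accesses race with migration):
  /* Bounce a page between two NUMA nodes with move_pages(2) while touching
   * it; build with -lnuma. Not the selftest's actual code. */
  #include <numaif.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(void)
  {
          long psz = sysconf(_SC_PAGESIZE);
          void *p = mmap(NULL, psz, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          int status, node = 0;

          memset(p, 1, psz);
          for (int i = 0; i < 1000; i++) {
                  node = !node;
                  if (move_pages(0, 1, &p, &node, &status, MPOL_MF_MOVE))
                          perror("move_pages");
                  /* a concurrent reader here would wait on the pte/pmd
                   * migration entry in the real test */
                  *(volatile char *)p;
          }
          return 0;
  }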
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Alistair Popple <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Since commit e5947d23edd8 ("mm: mempolicy: don't have to split pmd for
huge zero page"), THP is never split in queue_pages_pmd(), so 2 is never
returned now. Remove the unnecessary ret != 2 check and clean up the
relevant comment. Minor improvement in readability.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
The compaction sysfs file is created via compaction_register_node() in
register_node(), but we forgot to remove it in unregister_node(), so the
compaction sysfs file is leaked. Use compaction_unregister_node() to fix
this issue.
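The shape of the fix (hedged; unrelated teardown elided):
  void unregister_node(struct node *node)
  {
          compaction_unregister_node(node); /* pairs with register_node() */
          /* ... existing teardown ... */
          device_unregister(&node->dev);
  }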
Link: https://lkml.kernel.org/r/[email protected]
Fixes: ed4a6d7f0676 ("mm: compaction: add /sys trigger for per-node memory compaction")
Signed-off-by: Miaohe Lin <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Use the helper isolation_suitable() to check whether the page is suitable
to isolate, simplifying the code. Minor readability improvement.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
The only caller, z3fold_free(), never calls free_handle() in the
PAGE_HEADLESS case. Remove this unneeded check.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
do_compact_page() already does list_del_init(&zhdr->buddy) for us. Remove
this extra one to save some cpu cycles.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
z3fold always does atomic64_dec(&pool->pages_nr) when
__release_z3fold_page() is called, so we can move the decrement of
pool->pages_nr into __release_z3fold_page() to simplify the code.
This also reduces the size of z3fold.o by ~1K.
Without this patch:
text data bss dec hex filename
15444 1376 8 16828 41bc mm/z3fold.o
With this patch:
text data bss dec hex filename
15044 1248 8 16300 3fac mm/z3fold.o
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Cc: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
The local variable l holds the address of unbuddied[i], which won't change
after we take the pool lock. Remove it to avoid confusion.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Page->page_type and PagePrivate are not used in z3fold. We should remove
these confusing unneeded operations. The z3fold do these here is due to
referring to zsmalloc's migration code which does need these operations.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Use put_z3fold_header() to pair with get_z3fold_header(). Also fix the
wrong comments. Minor readability improvement.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Highmem pages have been supported since commit f1549cb5ab2b ("mm/z3fold.c:
allow __GFP_HIGHMEM in z3fold_alloc"). Remove the stale comment.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Patch series "A few cleanup patches for z3fold", v2.
This series contains a few patches to simplify the code, remove unneeded
code, fix obsolete comment and so on. More details can be found in the
respective changelogs.
This patch (of 8):
z3fold_mount is only called during init. So we should declare it with
__init.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Vitaly Wool <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
pte_page() always returns a valid page, so remove the redundant page
validation, as we did in many other places.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Xianting Tian <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Miaohe Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Since commit 791b48b64232 ("mm: vmscan: scan until it finds eligible
pages"), splicing any skipped pages to the tail of the LRU list won't put
the system at risk of premature OOM but will waste lots of cpu cycles.
Correct the comment accordingly.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Cc: Minchan Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Since commit 6d6435811c19 ("remove bdi_congested() and wb_congested() and
related functions"), there is no congested backing device check anymore.
Correct the comment accordingly.
[[email protected]: tweak grammar]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Since commit 1431d4d11abb ("mm: base LRU balancing on an explicit cost
model"), the relative value of each set of LRU lists is based on a cost
model instead of the rotated/scanned ratio. Clean up the relevant
comment.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
lruvec_lru_size() is only used in get_scan_count(), so the only possible
zone_idx is sc->reclaim_idx. Since sc->reclaim_idx is guaranteed to be a
valid zone index, we can remove the extra check in the zone iteration.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wei Yang <[email protected]>
Cc: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
wakeup_kswapd() only wakes kswapd when the zone is managed. Its two
callers, however, work from a node perspective:
* wake_all_kswapds
* numamigrate_isolate_page
If we pick a !managed zone, this is not what we expect.
This patch makes sure we pick a managed zone for wakeup_kswapd(). It also
uses managed_zone() in migrate_balanced_pgdat() to get the proper zone.
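Sketch of the wake_all_kswapds()-style change (hedged fragment; the
iterator macro and helpers are the real ones):
  for_each_zone_zonelist_nodemask(zone, z, zonelist, highest_zoneidx,
                                  ac->nodemask) {
          if (!managed_zone(zone))
                  continue;       /* don't hand kswapd an unmanaged zone */
          if (last_pgdat != zone->zone_pgdat) {
                  wakeup_kswapd(zone, gfp_mask, order, highest_zoneidx);
                  last_pgdat = zone->zone_pgdat;
          }
  }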
[[email protected]: adjust the usage in migrate_balanced_pgdat()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wei Yang <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Oscar Salvador <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
As mentioned in commit 6aa303defb74 ("mm, vmscan: only allocate and
reclaim from zones with pages managed by the buddy allocator"), reclaim
only affects managed_zones.
Adjust the code and comments accordingly.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wei Yang <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
The feature of minimizing the overhead of struct page associated with each
HugeTLB page aims to free its vmemmap pages (used as struct page) to save
memory, which is ~14GB/16GB per 1TB of HugeTLB pages (2MB/1GB type). In
short, when a HugeTLB page is allocated or freed, the vmemmap array
representing the range associated with the page needs to be remapped.
When a page is allocated, vmemmap pages are freed after remapping. When a
page is freed, previously discarded vmemmap pages must be allocated before
remapping. More implementation details can be found here [1].
The infrastructure for freeing vmemmap pages associated with each HugeTLB
page is already there, so we can easily enable HUGETLB_PAGE_FREE_VMEMMAP
for arm64; the only thing to be fixed is flush_dcache_page().
flush_dcache_page() needs to be adapted to operate on the head page's
flags, since the tail vmemmap pages are mapped read-only after the feature
is enabled (the clear operation is not permitted).
There were some discussions about this in the thread [2], but no
conclusion was reached. I have copied the concerns raised by Anshuman
here and explain why they are superfluous. It is safe to enable this on
arm64 just as on x86_64.
1st concern:
'''
But what happens when a hot remove section's vmemmap area (which is
being teared down) is nearby another vmemmap area which is either created
or being destroyed for HugeTLB alloc/free purpose. As you mentioned
HugeTLB pages inside the hot remove section might be safe. But what about
other HugeTLB areas whose vmemmap area shares page table entries with
vmemmap entries for a section being hot removed ? Massive HugeTLB alloc
/use/free test cycle using memory just adjacent to a memory hotplug area,
which is always added and removed periodically, should be able to expose
this problem.
'''
Answer: At the time memory is removed, all HugeTLB pages have either been
migrated away or dissolved, so there is no race between memory hot remove
and free_huge_page_vmemmap(). Therefore, HugeTLB pages inside the hot
remove section are safe. As for the question "what about other HugeTLB
areas whose vmemmap area shares page table entries with vmemmap entries
for a section being hot removed?", the premise does not hold. The minimal
granularity of hotplug memory is 128MB (on arm64 with a 4K base page), so
any HugeTLB page smaller than 128MB lies within a single section; then
there are no PTE page tables shared between HugeTLB pages in this section
and those in other sections, and a HugeTLB page cannot cross two sections.
In this case, the section cannot be freed. Any HugeTLB page bigger than
128MB (the section size) has vmemmap pages that are an integer multiple of
2MB (PMD-mapped). As long as:
1) HugeTLBs are naturally aligned, power-of-two sizes
2) The HugeTLB size >= the section size
3) The HugeTLB size >= the vmemmap leaf mapping size
then a HugeTLB page will not share any leaf page table entries with
*anything else*, though it will share intermediate entries. In this case,
at the time memory is removed, all HugeTLB pages have either been migrated
away or dissolved, so there is again no race between memory hot remove and
free_huge_page_vmemmap().
2nd concern:
'''
differently, not sure if ptdump would require any synchronization.
Dumping a wrong value is probably okay but crashing because a page table
entry is being freed after ptdump acquired the pointer is bad. On arm64,
ptdump() is protected against hotremove via [get|put]_online_mems().
'''
Answer: ptdump should be fine, since vmemmap_remap_free() only exchanges
PTEs or splits a PMD entry (which means allocating a PTE page table).
Neither operation frees any page tables (PTE), so ptdump cannot run into a
UAF on any page tables. The worst case is just dumping a wrong value.
[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/
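For illustration, a hedged sketch of the arm64-side adaptation (my reading
of the fix, not necessarily the exact hunk):
  void flush_dcache_page(struct page *page)
  {
          /*
           * Only the head page's flags may be touched: with
           * HUGETLB_PAGE_FREE_VMEMMAP the tail struct pages are mapped
           * read-only, so clearing a bit there would fault.
           */
          if (PageHuge(page))
                  page = compound_head(page);

          if (test_bit(PG_dcache_clean, &page->flags))
                  clear_bit(PG_dcache_clean, &page->flags);
  }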
[[email protected]: restructure the code comment inside flush_dcache_page()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Barry Song <[email protected]>
Tested-by: Barry Song <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Bodeddula Balasubramaniam <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: James Morse <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Cc: Fam Zheng <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
The feature of minimizing the overhead of struct page associated with each
HugeTLB page is implemented on x86_64; however, the infrastructure is
already there, so we can easily enable it for other architectures.
Introduce ARCH_WANT_HUGETLB_PAGE_FREE_VMEMMAP so that other architectures
can enable the feature simply by selecting this config.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Reviewed-by: Barry Song <[email protected]>
Tested-by: Barry Song <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Cc: Bodeddula Balasubramaniam <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Fam Zheng <[email protected]>
Cc: James Morse <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
In preparation for limiting the scope of the list iterator to the list
traversal loop, use a dedicated pointer to iterate through the list [1].
Previously, hugetlb_resv_map_add() expected a file_region struct, but if
the list iterator in add_reservation_in_range() did not exit early, the
variable passed in was not actually a valid structure.
In such a case, 'rg' is computed from the head element of the list and
represents an out-of-bounds pointer. This remains safe *iff* only the
link member is used (as hugetlb_resv_map_add() does).
To avoid the type confusion altogether and limit the list iterator to the
loop, only a list_head pointer is kept to pass to hugetlb_resv_map_add().
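Sketch of the pattern (hedged; 't' and the field names follow struct
file_region usage):
  struct list_head *rg = head;    /* falls back to head if no early exit */
  struct file_region *iter;

  list_for_each_entry(iter, head, link) {
          if (iter->from >= t) {
                  rg = &iter->link;       /* only the position escapes */
                  break;
          }
  }
  /* hugetlb_resv_map_add() now takes rg (a list_head), avoiding the
   * type confusion entirely. */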
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jakob Koschel <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: "Brian Johannesmeyer" <[email protected]>
Cc: Cristiano Giuffrida <[email protected]>
Cc: "Bos, H.J." <[email protected]>
Cc: Jakob Koschel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
We know that HPageFreed pages have a page refcount of 0, so
get_page_unless_zero() always fails on them and returns 0. So explicitly
separate the branches based on page state, for a minor optimization and
better readability.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Suggested-by: Mike Kravetz <[email protected]>
Suggested-by: Miaohe Lin <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
If me_huge_page() meets a truncated but not yet freed hugepage, the
hugepage won't be dissolved even if we hold the last refcount, because it
has a NULL page_mapping() yet is not an anonymous hugepage either. Thus
we lose the last chance to dissolve it into the buddy allocator to save
its healthy subpages. Remove the PageAnon check to handle these hugepages
too.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Patch series "A few fixup and cleanup patches for memory failure", v2.
This series contains a patch to clean up the HWPoisonHandlable and another
one to dissolve truncated hugetlb page. More details can be found in the
respective changelogs.
This patch (of 2):
The local variable movable can be removed by returning true directly. Also
fix typo 'mirgate'. No functional change intended.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Revert commit 888af2701db7 ("mm/memory-failure.c: fix race with changing
page compound again"): we now fetch the page refcount under hugetlb_lock
in try_memory_failure_hugetlb(), so the race check is no longer necessary.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Suggested-by: Miaohe Lin <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
In the already-hwpoisoned case, memory_failure() is supposed to release
the page refcount taken for error handling before returning. But
currently the refcount is not released when called with
MF_COUNT_INCREASED, which leaves the page refcount inconsistent. This
should be rare and non-critical, but it can be inconvenient in testing
(unpoison doesn't work).
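A hedged fragment of the fix in memory_failure()'s already-poisoned
branch:
  if (TestSetPageHWPoison(p)) {
          /* ... */
          if (flags & MF_COUNT_INCREASED)
                  put_page(p);    /* release the caller's reference */
          res = -EHWPOISON;
          goto unlock_mutex;
  }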
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Suggested-by: Miaohe Lin <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>