| Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Along with the usual shower of singleton patches, notable patch series
in this pull request are:
- "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds
consistency to the APIs and behaviour of these two core allocation
functions. This also simplifies/enables Rustification.
- "Some cleanups for shmem" from Baolin Wang. No functional changes -
mode code reuse, better function naming, logic simplifications.
- "mm: some small page fault cleanups" from Josef Bacik. No
functional changes - code cleanups only.
- "Various memory tiering fixes" from Zi Yan. A small fix and a
little cleanup.
- "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and
simplifications and .text shrinkage.
- "Kernel stack usage histogram" from Pasha Tatashin and Shakeel
Butt. This is a feature, it adds new feilds to /proc/vmstat such as
$ grep kstack /proc/vmstat
kstack_1k 3
kstack_2k 188
kstack_4k 11391
kstack_8k 243
kstack_16k 0
which tells us that 11391 processes used 4k of stack while none at
all used 16k. Useful for some system tuning things, but
partivularly useful for "the dynamic kernel stack project".
- "kmemleak: support for percpu memory leak detect" from Pavel
Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory.
- "mm: memcg: page counters optimizations" from Roman Gushchin. "3
independent small optimizations of page counters".
- "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from
David Hildenbrand. Improves PTE/PMD splitlock detection, makes
powerpc/8xx work correctly by design rather than by accident.
- "mm: remove arch_make_page_accessible()" from David Hildenbrand.
Some folio conversions which make arch_make_page_accessible()
unneeded.
- "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David
Finkel. Cleans up and fixes our handling of the resetting of the
cgroup/process peak-memory-use detector.
- "Make core VMA operations internal and testable" from Lorenzo
Stoakes. Rationalizaion and encapsulation of the VMA manipulation
APIs. With a view to better enable testing of the VMA functions,
even from a userspace-only harness.
- "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix
issues in the zswap global shrinker, resulting in improved
performance.
- "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill
in some missing info in /proc/zoneinfo.
- "mm: replace follow_page() by folio_walk" from David Hildenbrand.
Code cleanups and rationalizations (conversion to folio_walk())
resulting in the removal of follow_page().
- "improving dynamic zswap shrinker protection scheme" from Nhat
Pham. Some tuning to improve zswap's dynamic shrinker. Significant
reductions in swapin and improvements in performance are shown.
- "mm: Fix several issues with unaccepted memory" from Kirill
Shutemov. Improvements to the new unaccepted memory feature,
- "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on
DAX PUDs. This was missing, although nobody seems to have notied
yet.
- "Introduce a store type enum for the Maple tree" from Sidhartha
Kumar. Cleanups and modest performance improvements for the maple
tree library code.
- "memcg: further decouple v1 code from v2" from Shakeel Butt. Move
more cgroup v1 remnants away from the v2 memcg code.
- "memcg: initiate deprecation of v1 features" from Shakeel Butt.
Adds various warnings telling users that memcg v1 features are
deprecated.
- "mm: swap: mTHP swap allocator base on swap cluster order" from
Chris Li. Greatly improves the success rate of the mTHP swap
allocation.
- "mm: introduce numa_memblks" from Mike Rapoport. Moves various
disparate per-arch implementations of numa_memblk code into generic
code.
- "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly
improves the performance of munmap() of swap-filled ptes.
- "support large folio swap-out and swap-in for shmem" from Baolin
Wang. With this series we no longer split shmem large folios into
simgle-page folios when swapping out shmem.
- "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice
performance improvements and code reductions for gigantic folios.
- "support shmem mTHP collapse" from Baolin Wang. Adds support for
khugepaged's collapsing of shmem mTHP folios.
- "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect()
performance regression due to the addition of mseal().
- "Increase the number of bits available in page_type" from Matthew
Wilcox. Increases the number of bits available in page_type!
- "Simplify the page flags a little" from Matthew Wilcox. Many legacy
page flags are now folio flags, so the page-based flags and their
accessors/mutators can be removed.
- "mm: store zero pages to be swapped out in a bitmap" from Usama
Arif. An optimization which permits us to avoid writing/reading
zero-filled zswap pages to backing store.
- "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race
window which occurs when a MAP_FIXED operqtion is occurring during
an unrelated vma tree walk.
- "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of
the vma_merge() functionality, making ot cleaner, more testable and
better tested.
- "misc fixups for DAMON {self,kunit} tests" from SeongJae Park.
Minor fixups of DAMON selftests and kunit tests.
- "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang.
Code cleanups and folio conversions.
- "Shmem mTHP controls and stats improvements" from Ryan Roberts.
Cleanups for shmem controls and stats.
- "mm: count the number of anonymous THPs per size" from Barry Song.
Expose additional anon THP stats to userspace for improved tuning.
- "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more
folio conversions and removal of now-unused page-based APIs.
- "replace per-quota region priorities histogram buffer with
per-context one" from SeongJae Park. DAMON histogram
rationalization.
- "Docs/damon: update GitHub repo URLs and maintainer-profile" from
SeongJae Park. DAMON documentation updates.
- "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and
improve related doc and warn" from Jason Wang: fixes usage of page
allocator __GFP_NOFAIL and GFP_ATOMIC flags.
- "mm: split underused THPs" from Yu Zhao. Improve THP=always policy.
This was overprovisioning THPs in sparsely accessed memory areas.
- "zram: introduce custom comp backends API" frm Sergey Senozhatsky.
Add support for zram run-time compression algorithm tuning.
- "mm: Care about shadow stack guard gap when getting an unmapped
area" from Mark Brown. Fix up the various arch_get_unmapped_area()
implementations to better respect guard areas.
- "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability
of mem_cgroup_iter() and various code cleanups.
- "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge
pfnmap support.
- "resource: Fix region_intersects() vs add_memory_driver_managed()"
from Huang Ying. Fix a bug in region_intersects() for systems with
CXL memory.
- "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches
a couple more code paths to correctly recover from the encountering
of poisoned memry.
- "mm: enable large folios swap-in support" from Barry Song. Support
the swapin of mTHP memory into appropriately-sized folios, rather
than into single-page folios"
* tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits)
zram: free secondary algorithms names
uprobes: turn xol_area->pages[2] into xol_area->page
uprobes: introduce the global struct vm_special_mapping xol_mapping
Revert "uprobes: use vm_special_mapping close() functionality"
mm: support large folios swap-in for sync io devices
mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios
mm: fix swap_read_folio_zeromap() for large folios with partial zeromap
mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries
set_memory: add __must_check to generic stubs
mm/vma: return the exact errno in vms_gather_munmap_vmas()
memcg: cleanup with !CONFIG_MEMCG_V1
mm/show_mem.c: report alloc tags in human readable units
mm: support poison recovery from copy_present_page()
mm: support poison recovery from do_cow_fault()
resource, kunit: add test case for region_intersects()
resource: make alloc_free_mem_region() works for iomem_resource
mm: z3fold: deprecate CONFIG_Z3FOLD
vfio/pci: implement huge_fault support
mm/arm64: support large pfn mappings
mm/x86: support large pfn mappings
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 memory management updates from Thomas Gleixner:
- Make LAM enablement safe vs. kernel threads using a process mm
temporarily as switching back to the process would not update CR3 and
therefore not enable LAM causing faults in user space when using
tagged pointers. Cure it by synchronizing LAM enablement via IPIs to
all CPUs which use the related mm.
- Cure a LAM harmless inconsistency between CR3 and the state during
context switch. It's both confusing and prone to lead to real bugs
- Handle alt stack handling for threads which run with a non-zero
protection key. The non-zero key prevents the kernel to access the
alternate stack. Cure it by temporarily enabling all protection keys
for the alternate stack setup/restore operations.
- Provide a EFI config table identity mapping for kexec kernel to
prevent kexec fails because the new kernel cannot access the config
table array
- Use GB pages only when a full GB is mapped in the identity map as
otherwise the CPU can speculate into reserved areas after the end of
memory which causes malfunction on UV systems.
- Remove the noisy and pointless SRAT table dump during boot
- Use is_ioremap_addr() for iounmap() address range checks instead of
high_memory. is_ioremap_addr() is more precise.
* tag 'x86-mm-2024-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioremap: Improve iounmap() address range checks
x86/mm: Remove duplicate check from build_cr3()
x86/mm: Remove unused NX related declarations
x86/mm: Remove unused CR3_HW_ASID_BITS
x86/mm: Don't print out SRAT table information
x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
x86/kexec: Add EFI config table identity mapping for kexec kernel
selftests/mm: Add new testcases for pkeys
x86/pkeys: Restore altstack access in sigreturn()
x86/pkeys: Update PKRU to enable all pkeys before XSAVE
x86/pkeys: Add helper functions to update PKRU on the sigframe
x86/pkeys: Add PKRU as a parameter in signal handling functions
x86/mm: Cleanup prctl_enable_tagged_addr() nr_bits error checking
x86/mm: Fix LAM inconsistency during context switch
x86/mm: Use IPIs to synchronize LAM enablement
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The highlights are support for Arm's "Permission Overlay Extension"
using memory protection keys, support for running as a protected guest
on Android as well as perf support for a bunch of new interconnect
PMUs.
Summary:
ACPI:
- Enable PMCG erratum workaround for HiSilicon HIP10 and 11
platforms.
- Ensure arm64-specific IORT header is covered by MAINTAINERS.
CPU Errata:
- Enable workaround for hardware access/dirty issue on Ampere-1A
cores.
Memory management:
- Define PHYSMEM_END to fix a crash in the amdgpu driver.
- Avoid tripping over invalid kernel mappings on the kexec() path.
- Userspace support for the Permission Overlay Extension (POE) using
protection keys.
Perf and PMUs:
- Add support for the "fixed instruction counter" extension in the
CPU PMU architecture.
- Extend and fix the event encodings for Apple's M1 CPU PMU.
- Allow LSM hooks to decide on SPE permissions for physical
profiling.
- Add support for the CMN S3 and NI-700 PMUs.
Confidential Computing:
- Add support for booting an arm64 kernel as a protected guest under
Android's "Protected KVM" (pKVM) hypervisor.
Selftests:
- Fix vector length issues in the SVE/SME sigreturn tests
- Fix build warning in the ptrace tests.
Timers:
- Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
non-determinism arising from the architected counter.
Miscellaneous:
- Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
don't succeed.
- Minor fixes and cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
perf: arm-ni: Fix an NULL vs IS_ERR() bug
arm64: hibernate: Fix warning for cast from restricted gfp_t
arm64: esr: Define ESR_ELx_EC_* constants as UL
arm64: pkeys: remove redundant WARN
perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
MAINTAINERS: List Arm interconnect PMUs as supported
perf: Add driver for Arm NI-700 interconnect PMU
dt-bindings/perf: Add Arm NI-700 PMU
perf/arm-cmn: Improve format attr printing
perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
arm64/mm: use lm_alias() with addresses passed to memblock_free()
mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
arm64: Expose the end of the linear map in PHYSMEM_END
arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
arm64/mm: Delete __init region from memblock.reserved
perf/arm-cmn: Support CMN S3
dt-bindings: perf: arm-cmn: Add CMN S3
perf/arm-cmn: Refactor DTC PMU register access
perf/arm-cmn: Make cycle counts less surprising
perf/arm-cmn: Improve build-time assertion
...
|
|
The encoding of the pkey register differs on arm64, than on x86/ppc. On those
platforms, a bit in the register is used to disable permissions, for arm64, a
bit enabled in the register indicates that the permission is allowed.
This drops two asserts of the form:
assert(read_pkey_reg() <= orig_pkey_reg);
Because on arm64 this doesn't hold, due to the encoding.
The pkey must be reset to both access allow and write allow in the signal
handler. pkey_access_allow() works currently for PowerPC as the
PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE have overlapping bits set.
Access to the uc_mcontext is abstracted, as arm64 has a different structure.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
IA64 has gone with commit cf8e8658100d ("arch: Remove Itanium (IA-64)
architecture"), so remove unnecessary ia64 special mm code and comment in
selftests too.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jinjiang Tu <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Nanyong Sun <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
[1] mentions that memfd_secret is only supported on arm64, riscv, x86 and
x86_64 for now. It doesn't support other architectures. I found the
build error on arm and decided to send the fix as it was creating noise on
KernelCI:
memfd_secret.c: In function 'memfd_secret':
memfd_secret.c:42:24: error: '__NR_memfd_secret' undeclared (first use in this function);
did you mean 'memfd_secret'?
42 | return syscall(__NR_memfd_secret, flags);
| ^~~~~~~~~~~~~~~~~
| memfd_secret
Hence I'm adding condition that memfd_secret should only be compiled on
supported architectures.
Also check in run_vmtests script if memfd_secret binary is present before
executing it.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 76fe17ef588a ("secretmem: test: add basic selftest for memfd_secret(2)")
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Reviewed-by: Shuah Khan <[email protected]>
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Mike Rapoport (Microsoft) <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
commit 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
changed the env variable for the architecture from MACHINE to ARCH.
This is preventing 3 required TEST_GEN_FILES from being included when
cross compiling s390x and errors when trying to run the test suite. This
is due to the ARCH variable already being set and the arch folder name
being s390.
Add "s390" to the filtered list to cover this case and have the 3 files
included in the build.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
Signed-off-by: Nico Pache <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add a few new tests to exercise the signal handler flow, especially with
PKEY 0 disabled:
- Verify that the SIGSEGV handler is invoked when pkey 0 is disabled.
- Verify that a thread which disables PKEY 0 segfaults with PKUERR when
accessing the stack.
- Verify that the SIGSEGV handler that uses an alternate signal stack is
correctly invoked when the thread disabled PKEY 0
- Verify that the PKRU value set by the application is correctly restored
upon return from signal handling.
- Verify that sigreturn() is able to restore the altstack even if the
thread had PKEY 0 disabled
[ Aruna: Adapted to upstream ]
[ tglx: Made it actually compile. Restored protection_keys compile. Added
useful info to the changelog instead of bare function names. ]
Signed-off-by: Keith Lucas <[email protected]>
Signed-off-by: Aruna Ramakrishna <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"This adds getrandom() support to the vDSO.
First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
lets the kernel zero out pages anytime under memory pressure, which
enables allocating memory that never gets swapped to disk but also
doesn't count as being mlocked.
Then, the vDSO implementation of getrandom() is introduced in a
generic manner and hooked into random.c.
Next, this is implemented on x86. (Also, though it's not ready for
this pull, somebody has begun an arm64 implementation already)
Finally, two vDSO selftests are added.
There are also two housekeeping cleanup commits"
* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
MAINTAINERS: add random.h headers to RNG subsection
random: note that RNDGETPOOL was removed in 2.6.9-rc2
selftests/vDSO: add tests for vgetrandom
x86: vdso: Wire up getrandom() vDSO implementation
random: introduce generic vDSO getrandom() implementation
mm: add MAP_DROPPABLE for designating always lazily freeable mappings
|
|
The vDSO getrandom() implementation works with a buffer allocated with a
new system call that has certain requirements:
- It shouldn't be written to core dumps.
* Easy: VM_DONTDUMP.
- It should be zeroed on fork.
* Easy: VM_WIPEONFORK.
- It shouldn't be written to swap.
* Uh-oh: mlock is rlimited.
* Uh-oh: mlock isn't inherited by forks.
- It shouldn't reserve actual memory, but it also shouldn't crash when
page faulting in memory if none is available
* Uh-oh: VM_NORESERVE means segfaults.
It turns out that the vDSO getrandom() function has three really nice
characteristics that we can exploit to solve this problem:
1) Due to being wiped during fork(), the vDSO code is already robust to
having the contents of the pages it reads zeroed out midway through
the function's execution.
2) In the absolute worst case of whatever contingency we're coding for,
we have the option to fallback to the getrandom() syscall, and
everything is fine.
3) The buffers the function uses are only ever useful for a maximum of
60 seconds -- a sort of cache, rather than a long term allocation.
These characteristics mean that we can introduce VM_DROPPABLE, which
has the following semantics:
a) It never is written out to swap.
b) Under memory pressure, mm can just drop the pages (so that they're
zero when read back again).
c) It is inherited by fork.
d) It doesn't count against the mlock budget, since nothing is locked.
e) If there's not enough memory to service a page fault, it's not fatal,
and no signal is sent.
This way, allocations used by vDSO getrandom() can use:
VM_DROPPABLE | VM_DONTDUMP | VM_WIPEONFORK | VM_NORESERVE
And there will be no problem with OOMing, crashing on overcommitment,
using memory when not in use, not wiping on fork(), coredumps, or
writing out to swap.
In order to let vDSO getrandom() use this, expose these via mmap(2) as
MAP_DROPPABLE.
Note that this involves removing the MADV_FREE special case from
sort_folio(), which according to Yu Zhao is unnecessary and will simply
result in an extra call to shrink_folio_list() in the worst case. The
chunk removed reenables the swapbacked flag, which we don't want for
VM_DROPPABLE, and we can't conditionalize it here because there isn't a
vma reference available.
Finally, the provided self test ensures that this is working as desired.
Cc: [email protected]
Acked-by: David Hildenbrand <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
|
|
Add regression and new tests when hugepage has correctable memory errors,
and how userspace wants to deal with it:
* if enable_soft_offline=1, mapped hugepage is soft offlined
* if enable_soft_offline=0, mapped hugepage is intact
Free hugepages case is not explicitly covered by the tests.
Hugepage having corrected memory errors is emulated with
MADV_SOFT_OFFLINE.
[[email protected]: v7]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jiaqi Yan <[email protected]>
Acked-by: Miaohe Lin <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Frank van der Linden <[email protected]>
Cc: Jane Chu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Lance Yang <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Clean up and move some copy-pasted items into a new mseal_helpers.h.
1. The test macros can be made safer and simpler, by observing that
they are invariably called when about to return. This means that the
macros do not need an intrusive label to goto; they can simply return.
2. PKEY* items. We cannot, unfortunately use pkey-helpers.h. The
best we can do is to factor out these few items into mseal_helpers.h.
3. These tests still need their own definition of u64, so also move
that to the header file.
4. Be sure to include the new mseal_helpers.h in the Makefile
dependencies.
[[email protected]: include the new mseal_helpers.h in Makefile dependencies]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: John Hubbard <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Muhammad Usama Anjum <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Commit 1b151e2435fc ("block: Remove special-casing of compound pages")
caused a change in behaviour when releasing the pages if the buffer does
not start at the beginning of the page. This was because the calculation
of the number of pages to release was incorrect. This was fixed by commit
38b43539d64b ("block: Fix page refcounts for unaligned buffers in
__bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading to a
memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether the
offsets are aligned or unaligned w.r.t page boundary.
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]#./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
[[email protected]: address review comments from Muhammad]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: add this test to run_vmtests.sh]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 38b43539d64b ("block: Fix page refcounts for unaligned buffers in __bio_release_pages()")
Signed-off-by: Donet Tom <[email protected]>
Signed-off-by: Ritesh Harjani (IBM) <[email protected]>
Co-developed-by: Ritesh Harjani (IBM) <[email protected]>
Reviewed-by: Muhammad Usama Anjum <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Ritesh Harjani (IBM) <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Tony Battersby <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Sealing read-only of elf mapping so it can't be changed by mprotect.
[[email protected]: style change]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: fix linker error for inline function]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: fix compile warning]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: fix arm build]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jeff Xu <[email protected]>
Signed-off-by: Amer Al Shanawany <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Jorge Lucangeli Obes <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Muhammad Usama Anjum <[email protected]>
Cc: Pedro Falcato <[email protected]>
Cc: Stephen Röttger <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Amer Al Shanawany <[email protected]>
Cc: Javier Carrasco <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
selftest for memory sealing change in mmap() and mseal().
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jeff Xu <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Jeff Xu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Jorge Lucangeli Obes <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Muhammad Usama Anjum <[email protected]>
Cc: Pedro Falcato <[email protected]>
Cc: Stephen Röttger <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Amer Al Shanawany <[email protected]>
Cc: Javier Carrasco <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
|
|
In commit 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
the logic to detect the machine architecture in the Makefile was changed
to use ARCH, and only fallback to uname -m if ARCH is unset. However the
tests of ARCH were not updated to account for the fact that ARCH is
"powerpc" for powerpc builds, not "ppc64".
Fix it by changing the checks to look for "powerpc", and change the
uname -m logic to convert "ppc64.*" into "powerpc".
With that fixed the following tests now build for powerpc again:
* protection_keys
* va_high_addr_switch
* virtual_address_range
* write_to_hugetlbfs
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 0518dbe97fe6 ("selftests/mm: fix cross compilation with LLVM")
Signed-off-by: Michael Ellerman <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: <[email protected]> [6.4+]
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "Fix selftests/mm build without requiring "make headers"".
As mentioned in each patch, this implements the solution that we discussed
in December 2023, in [1]. This turned out to be very clean and easy. It
should also be quite easy to maintain.
This should also make Peter Zijlstra happy, because it directly addresses
the root cause of his "NAK NAK NAK" reply [2]. :)
[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/
This patch (of 2):
Use tools/include/uapi/ files instead. These are obtained by taking a
snapshot: run "make headers" at the top level, then copy the desired
header file into the appropriate subdir in tools/uapi/.
This was discussed and solved in [1].
However, even before copying any additional files there, there are already
quite a few in tools/include/uapi already. And these will immediately fix
a number of selftests/mm build failures.
So this patch:
a) Adds TOOLS_INCLUDES to selftests/lib.mk, so that all selftests can
immediately and easily include the snapshotted header files.
b) Uses $(TOOLS_INCLUDES) in the selftests/mm build. On today's Arch
Linux, this already fixes all build errors except for a few
userfaultfd.h (those will be addressed in a subsequent patch).
[1] https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: John Hubbard <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Muhammad Usama Anjum <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min
heap optimizations".
- Kuan-Wei Chiu has also sped up the library sorting code in the series
"lib/sort: Optimize the number of swaps and comparisons".
- Alexey Gladkov has added the ability for code running within an IPC
namespace to alter its IPC and MQ limits. The series is "Allow to
change ipc/mq sysctls inside ipc namespace".
- Geert Uytterhoeven has contributed some dhrystone maintenance work in
the series "lib: dhry: miscellaneous cleanups".
- Ryusuke Konishi continues nilfs2 maintenance work in the series
"nilfs2: eliminate kmap and kmap_atomic calls"
"nilfs2: fix kernel bug at submit_bh_wbc()"
- Nathan Chancellor has updated our build tools requirements in the
series "Bump the minimum supported version of LLVM to 13.0.1".
- Muhammad Usama Anjum continues with the selftests maintenance work in
the series "selftests/mm: Improve run_vmtests.sh".
- Oleg Nesterov has done some maintenance work against the signal code
in the series "get_signal: minor cleanups and fix".
Plus the usual shower of singleton patches in various parts of the tree.
Please see the individual changelogs for details.
* tag 'mm-nonmm-stable-2024-03-14-09-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
nilfs2: prevent kernel bug at submit_bh_wbc()
nilfs2: fix failure to detect DAT corruption in btree and direct mappings
ocfs2: enable ocfs2_listxattr for special files
ocfs2: remove SLAB_MEM_SPREAD flag usage
assoc_array: fix the return value in assoc_array_insert_mid_shortcut()
buildid: use kmap_local_page()
watchdog/core: remove sysctl handlers from public header
nilfs2: use div64_ul() instead of do_div()
mul_u64_u64_div_u64: increase precision by conditionally swapping a and b
kexec: copy only happens before uchunk goes to zero
get_signal: don't initialize ksig->info if SIGNAL_GROUP_EXIT/group_exec_task
get_signal: hide_si_addr_tag_bits: fix the usage of uninitialized ksig
get_signal: don't abuse ksig->info.si_signo and ksig->sig
const_structs.checkpatch: add device_type
Normalise "name (ad@dr)" MODULE_AUTHORs to "name <ad@dr>"
dyndbg: replace kstrdup() + strchr() with kstrdup_and_replace()
list: leverage list_is_head() for list_entry_is_head()
nilfs2: MAINTAINERS: drop unreachable project mirror site
smp: make __smp_processor_id() 0-argument macro
fat: fix uninitialized field in nostale filehandles
...
|
|
Add missing tests to run_vmtests.sh. The mm kselftests are run through
run_vmtests.sh. If a test isn't present in this script, it'll not run
with run_tests or `make -C tools/testing/selftests/mm run_tests`.
[[email protected]: use correct flag in the code]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Cc: Ryan Roberts <[email protected]>
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
This test stresses the race between of madvise(DONTNEED), a page fault
and a parallel huge page mmap, which should fail due to lack of
available page available for mapping.
This test case must run on a system with one and only one huge page
available.
# echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
During setup, the test allocates the only available page, and starts
three threads:
- thread 1:
* madvise(MADV_DONTNEED) on the allocated huge page
- thread 2:
* Write to the allocated huge page
- thread 3:
* Tries to allocated (steal) an extra huge page (which is not
available)
thread 3 should never succeed in the allocation, since the only huge
page was never unmapped, and should be reserved.
Touching the old page after thread3 allocation will raise a SIGBUS.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Breno Leitao <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vegard Nossum <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The khugepaged test has a useful framework for save/restore/pop/push of
all thp settings via the sysfs interface. This will be useful to
explicitly control multi-size THP settings in other tests, so let's move
it out of khugepaged and into its own thp_settings.[c|h] utility.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ryan Roberts <[email protected]>
Tested-by: Alistair Popple <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Tested-by: Kefeng Wang <[email protected]>
Tested-by: John Hubbard <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Barry Song <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Itaru Kitayama <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Luis Chamberlain <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Yin Fengwei <[email protected]>
Cc: Yu Zhao <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Commit 05f1edac8009 ("selftests/mm: run all tests from run_vmtests.sh")
fixed the inconsistency caused by tests being defined as TEST_GEN_PROGS.
This issue was leading to tests not being executed via run_vmtests.sh and
furthermore some tests running twice due to the kselftests wrapper also
executing them.
Fix the definition of two tests (soft-dirty and pagemap_ioctl) that are
still incorrectly defined.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nico Pache <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Joel Savitz <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Create a selftest that exercises the race between page faults and
madvise(MADV_DONTNEED) in the same huge page. Do it by running two
threads that touches the huge page and madvise(MADV_DONTNEED) at the same
time.
In case of a SIGBUS coming at pagefault, the test should fail, since we
hit the bug.
The test doesn't have a signal handler, and if it fails, it fails like
the following
----------------------------------
running ./hugetlb_fault_after_madv
----------------------------------
./run_vmtests.sh: line 186: 595563 Bus error (core dumped) "$@"
[FAIL]
This selftest goes together with the fix of the bug[1] itself.
[1] https://lore.kernel.org/all/[email protected]/#r
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Tested-by: Rik van Riel <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add pagemap ioctl tests. Add several different types of tests to judge
the correction of the interface.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Cc: Alex Sierra <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Gustavo A. R. Silva <[email protected]>
Cc: "Liam R. Howlett" <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Miroslaw <[email protected]>
Cc: Michał Mirosław <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Paul Gofman <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Yun Zhou <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
It is very unclear to me how one is supposed to run all the mm selftests
consistently and get clear results.
Most of the test programs are launched by both run_vmtests.sh and
run_kselftest.sh:
hugepage-mmap
hugepage-shm
map_hugetlb
hugepage-mremap
hugepage-vmemmap
hugetlb-madvise
map_fixed_noreplace
gup_test
gup_longterm
uffd-unit-tests
uffd-stress
compaction_test
on-fault-limit
map_populate
mlock-random-test
mlock2-tests
mrelease_test
mremap_test
thuge-gen
virtual_address_range
va_high_addr_switch
mremap_dontunmap
hmm-tests
madv_populate
memfd_secret
ksm_tests
ksm_functional_tests
soft-dirty
cow
However, of this set, when launched by run_vmtests.sh, some of the
programs are invoked multiple times with different arguments. When
invoked by run_kselftest.sh, they are invoked without arguments (and as
a consequence, some fail immediately).
Some test programs are only launched by run_vmtests.sh:
test_vmalloc.sh
And some test programs and only launched by run_kselftest.sh:
khugepaged
migration
mkdirty
transhuge-stress
split_huge_page_test
mdwe_test
write_to_hugetlbfs
Furthermore, run_vmtests.sh is invoked by run_kselftest.sh, so in this
case all the test programs invoked by both scripts are run twice!
Needless to say, this is a bit of a mess. In the absence of fully
understanding the history here, it looks to me like the best solution is
to launch ALL test programs from run_vmtests.sh, and ONLY invoke
run_vmtests.sh from run_kselftest.sh. This way, we get full control over
the parameters, each program is only invoked the intended number of
times, and regardless of which script is used, the same tests get run in
the same way.
The only drawback is that if using run_kselftest.sh, it's top-level tap
result reporting reports only a single test and it fails if any of the
contained tests fail. I don't see this as a big deal though since we
still see all the nested reporting from multiple layers. The other issue
with this is that all of run_vmtests.sh must execute within a single
kselftest timeout period, so let's increase that to something more
suitable.
In the Makefile, TEST_GEN_PROGS will compile and install the tests and
will add them to the list of tests that run_kselftest.sh will run.
TEST_GEN_FILES will compile and install the tests but will not add them
to the test list. So let's move all the programs from TEST_GEN_PROGS to
TEST_GEN_FILES so that they are built but not executed by
run_kselftest.sh. Note that run_vmtests.sh is added to TEST_PROGS, which
means it ends up in the test list. (the lack of "_GEN" means it won't be
compiled, but simply copied).
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ryan Roberts <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Acked-by: Peter Xu <[email protected]>
Cc: Florent Revest <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
arm64 does not support the soft-dirty PTE bit. However, the `soft-dirty`
test suite is currently run unconditionally and therefore generates
spurious test failures on arm64. There are also some tests in
`madv_populate` which assume it is supported.
For `soft-dirty` lets disable the whole suite for arm64; it is no longer
built and run_vmtests.sh will skip it if its not present.
For `madv_populate`, we need a runtime mechanism so that the remaining
tests continue to be run. Unfortunately, the only way to determine if the
soft-dirty dirty bit is supported is to write to a page, then see if the
bit is set in /proc/self/pagemap. But the tests that we want to
conditionally execute are testing precicesly this. So if we introduced
this feature check, we could accedentally turn a real failure (on a system
that claims to support soft-dirty) into a skip. So instead, do the check
based on architecture; for arm64, we report that soft-dirty is not
supported.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ryan Roberts <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Florent Revest <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add tests for the improvement made to read operation on HWPOISON
hugetlb page with different read granularities. For each chunk size,
three read scenarios are tested:
1. Simple regression test on read without HWPOISON.
2. Sequential read page by page should succeed until encounters the 1st
raw HWPOISON subpage.
3. After skip a raw HWPOISON subpage by lseek, read()s always succeed.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jiaqi Yan <[email protected]>
Acked-by: Mike Kravetz <[email protected]>
Reviewed-by: Naoya Horiguchi <[email protected]>
Cc: James Houghton <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
|
|
It is wrong to include unprocessed user header files directly. They are
processed to "<source_tree>/usr/include" by running "make headers" and
they are included in selftests by kselftest makefiles automatically with
help of KHDR_INCLUDES variable. These headers should always bulilt first
before building kselftests.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 07115fcc15b4 ("selftests/mm: add new selftests for KSM")
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Stefan Roesch <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Currently the MM selftests attempt to work out the target architecture by
using CROSS_COMPILE or otherwise querying the host machine, storing the
target architecture in a variable called MACHINE rather than the usual
ARCH though as far as I can tell (including for x86_64) the value is the
same as we would use for architecture.
When cross compiling with LLVM we don't need a CROSS_COMPILE as LLVM can
support many target architectures in a single build so this logic does not
work, CROSS_COMPILE is not set and we end up selecting tests for the host
rather than target architecture. Fix this by using the more standard ARCH
to describe the architecture, taking it from the environment if specified.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Tom Rix <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Let's add a new test for checking whether GUP long-term page pinning works
as expected (R/O vs. R/W, MAP_PRIVATE vs. MAP_SHARED, GUP vs.
GUP-fast). Note that COW handling with long-term R/O pinning in private
mappings, and pinning of anonymous memory in general, is tested by the COW
selftest. This test, therefore, focuses on page pinning in file mappings.
The most interesting case is probably the "local tmpfile" case, as that
will likely end up on a "real" filesystem such as ext4 or xfs, not on a
virtual one like tmpfs or hugetlb where any long-term page pinning is
always expected to succeed.
For now, only add tests that use the "/sys/kernel/debug/gup_test"
interface. We'll add tests based on liburing separately next.
[[email protected]: update .gitignore for gup_longterm, per Peter]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Lorenzo Stoakes <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page()
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
rather than exclusive, and has fixed a bunch of errors which were
caused by its unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics
flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim
accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
shmem: restrict noswap option to initial user namespace
mm/khugepaged: fix conflicting mods to collapse_file()
sparse: remove unnecessary 0 values from rc
mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
maple_tree: fix allocation in mas_sparse_area()
mm: do not increment pgfault stats when page fault handler retries
zsmalloc: allow only one active pool compaction context
selftests/mm: add new selftests for KSM
mm: add new KSM process and sysfs knobs
mm: add new api to enable ksm per process
mm: shrinkers: fix debugfs file permissions
mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
migrate_pages_batch: fix statistics for longterm pin retry
userfaultfd: use helper function range_in_vma()
lib/show_mem.c: use for_each_populated_zone() simplify code
mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
fs/buffer: convert create_page_buffers to folio_create_buffers
fs/buffer: add folio_create_empty_buffers helper
...
|
|
This adds three new tests to the selftests for KSM. These tests use the
new prctl API's to enable and disable KSM.
1) add new prctl flags to prctl header file in tools dir
This adds the new prctl flags to the include file prct.h in the
tools directory. This makes sure they are available for testing.
2) add KSM prctl merge test to ksm_tests
This adds the -t option to the ksm_tests program. The -t flag
allows to specify if it should use madvise or prctl ksm merging.
3) add two functions for debugging merge outcome for ksm_tests
This adds two functions to report the metrics in /proc/self/ksm_stat
and /sys/kernel/debug/mm/ksm. The debug output is enabled with the
-d option.
4) add KSM prctl test to ksm_functional_tests
This adds a test to the ksm_functional_test that verifies that the
prctl system call to enable / disable KSM works.
5) add KSM fork test to ksm_functional_test
Add fork test to verify that the MMF_VM_MERGE_ANY flag is inherited
by the child process.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Stefan Roesch <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Bagas Sanjaya <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The macro and facility can be reused in other tests too. Make it general.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Mika Penttilä <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Nadav Amit <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
As the initial selftest only took into consideration PowperPC and x86
architectures, on adding support for arm64, a platform independent naming
convention is chosen.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Chaitanya S Prakash <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Mark Rutland <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In many ways it's weird and unwanted to keep all the tests in the same
userfaultfd.c at least when still in the current way.
For example, it doesn't make much sense to run the stress test for each
method we can create an userfaultfd handle (either via syscall or /dev/
node). It's a waste of time running this twice for the whole stress as
the stress paths are the same, only the open path is different.
It's also just weird to need to manually specify different types of memory
to run all unit tests for the userfaultfd interface. We should be able to
just run a single program and that should go through all functional uffd
tests without running the stress test at all. The stress test was more
for torturing and finding race conditions. We don't want to wait for
stress to finish just to regress test a functional test.
When we start to pile up more things on top of the same file and same
functions, things start to go a bit chaos and the code is just harder to
maintain too with tons of global variables.
This patch creates a new test uffd-unit-tests to keep userfaultfd unit
tests in the future, currently empty.
Meanwhile rename the old userfaultfd.c test to uffd-stress.c.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Axel Rasmussen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Zach O'Keefe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Move common utility functions into uffd-common.[ch] files from the
original userfaultfd.c. This prepares for a split of userfaultfd.c into
two tests: one to only cover the old but powerful stress test, the other
one covers all the functional tests.
This movement is kind of a brute-force effort for now, with light
touch-ups but nothing should really change. There's chances to optimize
more, but let's leave that for later.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Axel Rasmussen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Zach O'Keefe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
We do have plenty of files that want to link against vm_util.c. Just make
it simple by linking it always.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Zach O'Keefe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
TEST_GEN_PROGS and TEST_GEN_FILES are used randomly in the mm/Makefile to
specify programs that need to build. Logically all these binaries should
all fall into TEST_GEN_PROGS.
Replace those TEST_GEN_FILES with TEST_GEN_PROGS, so that we can reference
all the tests easily later.
[[email protected]: tools/testing/selftests/mm/Makefile: don't wipe out TEST_GEN_PROGS]
Link: https://lkml.kernel.org/r/ZDxrvZh/cw357D8P@x1n
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Zach O'Keefe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
There're two util headers under mm/ kselftest. Merge one with another.
It turns out util.h is the easy one to move.
When merging, drop PAGE_SIZE / PAGE_SHIFT because they're unnecessary
wrappers to page_size() / page_shift(), meanwhile rename them to psize()
and pshift() so as to not conflict with some existing definitions in some
test files that includes vm_util.h.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Axel Rasmussen <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Zach O'Keefe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
write permissions
Let's add some tests that trigger (pte|pmd)_mkdirty on VMAs without write
permissions. If an architecture implementation is wrong, we might
accidentally set the PTE/PMD writable and allow for write access in a VMA
without write permissions.
The tests include reproducers for the two issues recently discovered
and worked-around in core-MM for now:
(1) commit 624a2c94f5b7 ("Partly revert "mm/thp: carry over dirty
bit when thp splits on pmd"")
(2) commit 96a9c287e25d ("mm/migrate: fix wrongly apply write bit
after mkdirty on sparc64")
In addition, some other tests that reveal further issues.
All tests pass under x86_64:
./mkdirty
# [INFO] detected THP size: 2048 KiB
TAP version 13
1..6
# [INFO] PTRACE write access
ok 1 SIGSEGV generated, page not modified
# [INFO] PTRACE write access to THP
ok 2 SIGSEGV generated, page not modified
# [INFO] Page migration
ok 3 SIGSEGV generated, page not modified
# [INFO] Page migration of THP
ok 4 SIGSEGV generated, page not modified
# [INFO] PTE-mapping a THP
ok 5 SIGSEGV generated, page not modified
# [INFO] UFFDIO_COPY
ok 6 SIGSEGV generated, page not modified
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
But some fail on sparc64:
./mkdirty
# [INFO] detected THP size: 8192 KiB
TAP version 13
1..6
# [INFO] PTRACE write access
not ok 1 SIGSEGV generated, page not modified
# [INFO] PTRACE write access to THP
not ok 2 SIGSEGV generated, page not modified
# [INFO] Page migration
ok 3 SIGSEGV generated, page not modified
# [INFO] Page migration of THP
ok 4 SIGSEGV generated, page not modified
# [INFO] PTE-mapping a THP
ok 5 SIGSEGV generated, page not modified
# [INFO] UFFDIO_COPY
not ok 6 SIGSEGV generated, page not modified
Bail out! 3 out of 6 tests failed
# Totals: pass:3 fail:3 xfail:0 xpass:0 skip:0 error:0
Reverting both above commits makes all tests fail on sparc64:
./mkdirty
# [INFO] detected THP size: 8192 KiB
TAP version 13
1..6
# [INFO] PTRACE write access
not ok 1 SIGSEGV generated, page not modified
# [INFO] PTRACE write access to THP
not ok 2 SIGSEGV generated, page not modified
# [INFO] Page migration
not ok 3 SIGSEGV generated, page not modified
# [INFO] Page migration of THP
not ok 4 SIGSEGV generated, page not modified
# [INFO] PTE-mapping a THP
not ok 5 SIGSEGV generated, page not modified
# [INFO] UFFDIO_COPY
not ok 6 SIGSEGV generated, page not modified
Bail out! 6 out of 6 tests failed
# Totals: pass:0 fail:6 xfail:0 xpass:0 skip:0 error:0
The tests are useful to detect other problematic archs, to verify new
arch fixes, and to stop such issues from reappearing in the future.
For now, we don't add any hugetlb tests.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Yu Zhao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
A previous patch removed most of the sh5 (sh64) support from the
kernel tree. Now remove the last stragglers.
Fixes: 37744feebc08 ("sh: remove sh5 support")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: [email protected]
Acked-by: John Paul Adrian Glaubitz <[email protected]>
Reviewed-by: John Paul Adrian Glaubitz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: John Paul Adrian Glaubitz <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X
bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()")
which does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter".
These filters provide users with finer-grained control over DAMOS's
actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series
"mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with
his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings.
The previous BPF-based approach had shortcomings. See "mm: In-kernel
support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a
per-node basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage
during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in
ths series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier
functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's
series "mm, arch: add generic implementation of pfn_valid() for
FLATMEM" and "fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest
of the kernel in the series "Simplify the external interface for
GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the
series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
include/linux/migrate.h: remove unneeded externs
mm/memory_hotplug: cleanup return value handing in do_migrate_range()
mm/uffd: fix comment in handling pte markers
mm: change to return bool for isolate_movable_page()
mm: hugetlb: change to return bool for isolate_hugetlb()
mm: change to return bool for isolate_lru_page()
mm: change to return bool for folio_isolate_lru()
objtool: add UACCESS exceptions for __tsan_volatile_read/write
kmsan: disable ftrace in kmsan core code
kasan: mark addr_has_metadata __always_inline
mm: memcontrol: rename memcg_kmem_enabled()
sh: initialize max_mapnr
m68k/nommu: add missing definition of ARCH_PFN_OFFSET
mm: percpu: fix incorrect size in pcpu_obj_full_size()
maple_tree: reduce stack usage with gcc-9 and earlier
mm: page_alloc: call panic() when memoryless node allocation fails
mm: multi-gen LRU: avoid futile retries
migrate_pages: move THP/hugetlb migration support check to simplify code
migrate_pages: batch flushing TLB
migrate_pages: share more code between _unmap and _move
...
|
|
Add some tests to cover the new PR_SET_MDWE prctl.
Link: https://lkml.kernel.org/r/[email protected]
Co-developed-by: Joey Gouly <[email protected]>
Signed-off-by: Joey Gouly <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jeremy Linton <[email protected]>
Cc: Lennart Poettering <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: nd <[email protected]>
Cc: Szabolcs Nagy <[email protected]>
Cc: Topi Miettinen <[email protected]>
Cc: Zbigniew Jędrzejewski-Szmek <[email protected]>
Cc: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Selftests vm builds break when doing cross-compilation. The Makefile
MACHINE variable incorrectly picks upp the host machine architecture.
If the CROSS_COMPILE variable is set, dig out the target host architecture
from CROSS_COMPILE, instead of calling uname.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Björn Töpel <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Rename selftets/vm to selftests/mm for being more consistent with the
code, documentation, and tools directories, and won't be confused with
virtual machines.
[[email protected]: convert missing vm->mm changes]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: SeongJae Park <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|