2024-09-09mm: make arch_get_unmapped_area() take vm_flags by defaultMark Brown18-81/+51
Patch series "mm: Care about shadow stack guard gap when getting an unmapped area", v2. As covered in the commit log for c44357c2e76b ("x86/mm: care about shadow stack guard gap during placement") our current mmap() implementation does not take care to ensure that a new mapping isn't placed with existing mappings inside it's own guard gaps. This is particularly important for shadow stacks since if two shadow stacks end up getting placed adjacent to each other then they can overflow into each other which weakens the protection offered by the feature. On x86 there is a custom arch_get_unmapped_area() which was updated by the above commit to cover this case by specifying a start_gap for allocations with VM_SHADOW_STACK. Both arm64 and RISC-V have equivalent features and use the generic implementation of arch_get_unmapped_area() so let's make the equivalent change there so they also don't get shadow stack pages placed without guard pages. The arm64 and RISC-V shadow stack implementations are currently on the list: https://lore.kernel.org/r/20240829-arm64-gcs-v12-0-42fec94743 https://lore.kernel.org/lkml/[email protected]/ Given the addition of the use of vm_flags in the generic implementation we also simplify the set of possibilities that have to be dealt with in the core code by making arch_get_unmapped_area() take vm_flags as standard. This is a bit invasive since the prototype change touches quite a few architectures but since the parameter is ignored the change is straightforward, the simplification for the generic code seems worth it. This patch (of 3): When we introduced arch_get_unmapped_area_vmflags() in 961148704acd ("mm: introduce arch_get_unmapped_area_vmflags()") we did so as part of properly supporting guard pages for shadow stacks on x86_64, which uses a custom arch_get_unmapped_area(). Equivalent features are also present on both arm64 and RISC-V, both of which use the generic implementation of arch_get_unmapped_area() and will require equivalent modification there. Rather than continue to deal with having two versions of the functions let's bite the bullet and have all implementations of arch_get_unmapped_area() take vm_flags as a parameter. The new parameter is currently ignored by all implementations other than x86. The only caller that doesn't have a vm_flags available is mm_get_unmapped_area(), as for the x86 implementation and the wrapper used on other architectures this is modified to supply no flags. No functional changes. Link: https://lkml.kernel.org/r/20240904-mm-generic-shadow-stack-guard-v2-0-a46b8b6dc0ed@kernel.org Link: https://lkml.kernel.org/r/20240904-mm-generic-shadow-stack-guard-v2-1-a46b8b6dc0ed@kernel.org Signed-off-by: Mark Brown <[email protected]> Acked-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Acked-by: Helge Deller <[email protected]> [parisc] Cc: Alexander Gordeev <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David S. Miller <[email protected]> Cc: "Edgecombe, Rick P" <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: "H. 
Peter Anvin" <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: James Bottomley <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm/damon/tests/vaddr-kunit: init maple tree without MT_FLAGS_LOCK_EXTERNSeongJae Park1-1/+1
damon_test_three_regions_in_vmas() initializes a maple tree with MM_MT_FLAGS. The flags contain MT_FLAGS_LOCK_EXTERN, which means the mt_lock of the maple tree will not be used, and therefore the maple tree initialization code skips initialization of the mt_lock. However, __link_vmas(), which adds VMAs for the test to the maple tree, uses the mt_lock. In other words, an uninitialized spinlock is used. The problem becomes clear when spinlock debugging is turned on, since it reports a spinlock bad magic bug. Fix the issue by excluding MT_FLAGS_LOCK_EXTERN from the maple tree initialization flags. Note that we don't use empty flags, to keep it further similar to the usage of the mm maple tree and to be prepared for possible future changes, as suggested by Liam. Link: https://lkml.kernel.org/r/[email protected] Fixes: d0cf3dd47f0d ("damon: convert __damon_va_three_regions to use the VMA iterator") Signed-off-by: SeongJae Park <[email protected]> Reported-by: Guenter Roeck <[email protected]> Closes: https://lore.kernel.org/[email protected] Suggested-by: Liam R. Howlett <[email protected]> Tested-by: Guenter Roeck <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
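A minimal sketch of the fix as described, assuming the test previously passed MM_MT_FLAGS (the flag names are from the maple tree API):

	/* before: MT_FLAGS_LOCK_EXTERN leaves mt_lock uninitialized */
	mt_init_flags(&mm.mm_mt, MM_MT_FLAGS);

	/* after: keep the mm-like flags, but let the maple tree own its lock */
	mt_init_flags(&mm.mm_mt, MT_FLAGS_ALLOC_RANGE | MT_FLAGS_USE_RCU);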
2024-09-09mm: Kconfig: fixup zsmalloc configurationSergey Senozhatsky1-1/+1
zsmalloc is not exclusive to zswap. Commit b3fbd58fcbb1 ("mm: Kconfig: simplify zswap configuration") made CONFIG_ZSMALLOC only visible when CONFIG_ZSWAP is selected, which makes it impossible to configure zsmalloc-specific features (stats, chain-size, etc.) in menuconfig on systems that use ZRAM but don't have ZSWAP enabled. Make zsmalloc depend on both ZRAM and ZSWAP. Link: https://lkml.kernel.org/r/[email protected] Fixes: b3fbd58fcbb1 ("mm: Kconfig: simplify zswap configuration") Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09filemap: fix the last_index of mm_filemap_get_pagesTakaya Saeki1-1/+1
In commit b6273b55d885 ("filemap: add trace events for get_pages, map_pages, and fault"), mm_filemap_get_pages was added to trace page cache access. However, it tracks an extra page beyond the end of the accessed range. This patch fixes it by replacing last_index with last_index - 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: b6273b55d885 ("filemap: add trace events for get_pages, map_pages, and fault") Signed-off-by: Takaya Saeki <[email protected]> Cc: Junichi Uekawa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
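The described one-line fix would look roughly like this (a sketch; the surrounding context is assumed):

	/* last_index is exclusive, so trace the last page actually accessed */
	trace_mm_filemap_get_pages(mapping, index, last_index - 1);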
2024-09-09mm,tmpfs: consider end of file write in shmem_is_hugeRik van Riel7-39/+42
Take the end of a file write into consideration when deciding whether or not to use huge pages for tmpfs files when the tmpfs filesystem is mounted with huge=within_size. This allows large writes that append to the end of a file to automatically use large pages. Tested by doing 4MB sequential writes without fallocate to a 16GB tmpfs file with fio. The numbers without THP or with huge=always stay the same, but the performance with huge=within_size now matches that of huge=always.

huge         before     after
4kB pages    1560 MB/s  1560 MB/s
within_size  1560 MB/s  4720 MB/s
always       4720 MB/s  4720 MB/s

[[email protected]: coding-style cleanups] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Rik van Riel <[email protected]> Reviewed-by: Baolin Wang <[email protected]> Tested-by: Baolin Wang <[email protected]> Cc: Darrick J. Wong <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
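A sketch of what the huge=within_size decision looks like once the write end is considered; write_end follows the changelog, the surrounding code is assumed:

	case SHMEM_HUGE_WITHIN_SIZE:
		index = round_up(index + 1, HPAGE_PMD_NR);
		/* consider the end of the write, not just the current i_size */
		i_size = max(write_end, i_size_read(inode));
		i_size = round_up(i_size, PAGE_SIZE);
		if (i_size >> PAGE_SHIFT >= index)
			return true;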
2024-09-09zram: support priority parameter in recompressionSergey Senozhatsky2-1/+16
The recompress device attribute supports the alg=NAME parameter so that we can specify one particular algorithm we want to perform recompression with. However, with algorithm params we can now have several instances of exactly the same secondary algorithm, each with its own params tuning (e.g. priority 1 configured to use a more aggressive level, and priority 2 configured to use a pre-trained dictionary). Support the priority=NUM parameter so that we can correctly determine which secondary algorithm we want to use. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09Documentation/zram: add documentation for algorithm parametersSergey Senozhatsky1-8/+39
Add a brief description of compression algorithms' parameters: compression level and pre-trained dictionary. [[email protected]: trivial fixup] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: add dictionary support to zstd backendSergey Senozhatsky1-26/+124
This adds support for pre-trained zstd dictionaries [1]. The dictionary is set up in params once (per-comp) and loaded into cctx and dctx by reference, so we don't allocate extra memory.

TEST
====

*** zstd
/sys/block/zram0/mm_stat
1750654976 504565092 514203648 0 514203648 1 0 34204 34204

*** zstd dict=/etc/zstd-dict-amd64
/sys/block/zram0/mm_stat
1750638592 465851259 475373568 0 475373568 1 0 34185 34185

*** zstd level=8 dict=/etc/zstd-dict-amd64
/sys/block/zram0/mm_stat
1750642688 430765171 439955456 0 439955456 1 0 34185 34185

[1] https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md#dictionary-builder

Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: add dictionary support to lz4hcSergey Senozhatsky1-7/+68
Support the pre-trained dictionary param. Just like lz4, lz4hc doesn't mandate a specific format of the dictionary, and zstd --train can be used to train a dictionary for lz4, according to [1].

TEST
====

*** lz4hc
/sys/block/zram0/mm_stat
1750638592 608954620 621031424 0 621031424 1 0 34288 34288

*** lz4hc dict=/etc/lz4-dict-amd64
/sys/block/zram0/mm_stat
1750671360 505068582 514994176 0 514994176 1 0 34278 34278

[1] https://github.com/lz4/lz4/issues/557

Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
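A hedged sketch of how the dictionary could be wired into the lz4hc streaming API (the LZ4_resetStreamHC export later in this series suggests this shape; the ctx/params plumbing is assumed):

	LZ4_resetStreamHC(ctx->stream, params->level);
	if (params->dict)
		LZ4_loadDictHC(ctx->stream, params->dict, params->dict_sz);
	ret = LZ4_compress_HC_continue(ctx->stream, src, dst, PAGE_SIZE,
				       dst_len);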
2024-09-09zram: add dictionary support to lz4Sergey Senozhatsky1-7/+67
Support the pre-trained dictionary param. lz4 doesn't mandate a specific format of the dictionary, and even zstd --train can be used to train a dictionary for lz4, according to [1].

TEST
====

*** lz4
/sys/block/zram0/mm_stat
1750654976 664188565 676864000 0 676864000 1 0 34288 34288

*** lz4 dict=/etc/lz4-dict-amd64
/sys/block/zram0/mm_stat
1750638592 619891141 632053760 0 632053760 1 0 34278 34278

*** lz4 level=5 dict=/etc/lz4-dict-amd64
/sys/block/zram0/mm_stat
1750638592 727174243 740810752 0 740810752 1 0 34437 34437

[1] https://github.com/lz4/lz4/issues/557

Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: move immutable comp params away from per-CPU contextSergey Senozhatsky9-98/+168
Immutable params never change once comp has been allocated and set up, so we don't need to store multiple copies of them in each per-CPU backend context. Move them to per-comp zcomp_params and pass that to backend callbacks for request execution. Basically, this means sharing parameters between different contexts. Also introduce two new backend callbacks: setup_params() and release_params(). First, we need to validate params in a driver-specific way; second, a driver may want to allocate its own specific representation of the params, which is needed to execute requests. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
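A minimal sketch of the extended ops table; only setup_params() and release_params() are named in the changelog, the remaining hooks are assumed for context:

	struct zcomp_ops {
		int (*setup_params)(struct zcomp_params *params);
		void (*release_params)(struct zcomp_params *params);
		/* assumed: per-request hooks taking the shared params */
		int (*compress)(struct zcomp_params *params,
				struct zcomp_ctx *ctx, struct zcomp_req *req);
		int (*decompress)(struct zcomp_params *params,
				  struct zcomp_ctx *ctx, struct zcomp_req *req);
	};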
2024-09-09zram: introduce zcomp_ctx structureSergey Senozhatsky9-135/+149
Keep run-time driver data (scratch buffers, etc.) in a zcomp_ctx structure. This structure is allocated per-CPU because drivers (backends) need to modify its content during request execution. We will split mutable and immutable driver data; this is a preparation patch. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: introduce zcomp_req structureSergey Senozhatsky9-72/+77
Encapsulate compression/decompression data in a zcomp_req structure. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
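A sketch of such an encapsulation (field names assumed; the point is passing one request object instead of loose src/dst arguments):

	struct zcomp_req {
		const unsigned char *src;
		unsigned char *dst;
		size_t src_len;
		size_t dst_len;
	};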
2024-09-09zram: add support for dict comp configSergey Senozhatsky1-9/+36
Handle the dict=path algorithm param so that we can read a pre-trained compression algorithm dictionary, which we then pass to the backend configuration. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: introduce algorithm_params device attributeSergey Senozhatsky3-0/+76
This attribute is used to set up compression algorithms' parameters, so we can tweak algorithms' characteristics. At this point only 'level' is supported (to be extended in the future). Each call sets up parameters for one particular algorithm, which should be specified either by the algorithm's priority or algo name. This is expected to be called after the corresponding algorithm is selected via comp_algorithm or recomp_algorithm.

echo "priority=0 level=1" > /sys/block/zram0/algorithm_params

or

echo "algo=zstd level=1" > /sys/block/zram0/algorithm_params

Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: recalculate zstd compression params onceSergey Senozhatsky1-2/+3
zstd compression params depend on the compression level, but are constant for a given instance of the zstd compression backend. Calculate them once (during ctx creation). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: introduce zcomp_params structureSergey Senozhatsky11-24/+67
We will store per-algorithm parameters there (compression level, dictionary, dictionary size, etc.). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
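A sketch of the structure as described; the fields follow the changelog, the driver-private pointer is an assumption:

	struct zcomp_params {
		void *dict;
		size_t dict_sz;
		s32 level;
		void *drv_data;	/* assumed: backend-specific representation */
	};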
2024-09-09zram: check that backends array has at least one backendSergey Senozhatsky2-6/+21
Make sure that the backends array contains at least one backend apart from the sentinel NULL value. We also select LZO_BACKEND if no backends were selected. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: add 842 compression backend supportSergey Senozhatsky5-0/+94
Add s/w 842 compression support. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: add zlib compression backend supportSergey Senozhatsky5-0/+158
Add s/w zlib (inflate/deflate) compression. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: pass estimated src size hint to zstdSergey Senozhatsky1-2/+2
zram works with PAGE_SIZE buffers, so we always know the exact size of the source buffer and hence can pass estimated_src_size to zstd_get_params(). On x86_64, for example, this hint reduces the size of the work memory buffer from 1303520 bytes down to 90080 bytes. Given that compression streams are per-CPU, that's quite a memory saving. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
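A sketch of the hint in use, relying on the in-kernel zstd wrapper API (the ctx field name is illustrative):

	/* zram always compresses PAGE_SIZE buffers, so the estimate is exact */
	ctx->cprm = zstd_get_params(level, PAGE_SIZE);
	wksp_size = zstd_cctx_workspace_bound(&ctx->cprm.cParams);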
2024-09-09zram: add zstd compression backend supportSergey Senozhatsky5-0/+123
Add s/w zstd compression. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: add lz4hc compression backend supportSergey Senozhatsky5-2/+100
Add s/w lz4hc compression support. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: add lz4 compression backend supportSergey Senozhatsky5-0/+98
Add s/w lz4 compression support. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: add lzo and lzorle compression backends supportSergey Senozhatsky7-0/+140
Add s/w lzo/lzorle compression support. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09zram: introduce custom comp backends APISergey Senozhatsky5-160/+78
Moving to a custom backends implementation gives us the ability to have our own minimalistic and extendable API, and makes algorithm tuning possible. The list of compression backends is empty at this point; we will add backends in the followup patches. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nick Terrell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09lib: zstd: fix null-deref in ZSTD_createCDict_advanced2()Sergey Senozhatsky1-0/+2
ZSTD_createCDict_advanced2() must ensure that ZSTD_createCDict_advanced_internal() has successfully allocated cdict. customMalloc() may be called under low-memory conditions and may be unable to allocate workspace for cdict. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
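The fix amounts to a NULL check after the internal allocation (a sketch; the argument list is abbreviated):

	cdict = ZSTD_createCDict_advanced_internal(/* ... */);
	/* customMalloc() may fail under memory pressure; don't use a NULL cdict */
	if (!cdict)
		return NULL;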
2024-09-09lib: lz4hc: export LZ4_resetStreamHC symbolSergey Senozhatsky1-0/+1
This symbol is needed to enable lz4hc dictionary support. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09lib: zstd: export API needed for dictionary supportSergey Senozhatsky3-0/+252
Patch series "zram: introduce custom comp backends API", v7. This series introduces support for run-time compression algorithms tuning, so users, for instance, can adjust compression/acceleration levels and provide pre-trained compression/decompression dictionaries which certain algorithms support. At this point we stop supporting (old/deprecated) comp API. We may add new acomp API support in the future, but before that zram needs to undergo some major rework (we are not ready for async compression). Some benchmarks for reference (look at column #2) *** init zstd /sys/block/zram0/mm_stat 1750659072 504622188 514355200 0 514355200 1 0 34204 34204 *** init zstd dict=/home/ss/zstd-dict-amd64 /sys/block/zram0/mm_stat 1750650880 465908890 475398144 0 475398144 1 0 34185 34185 *** init zstd level=8 dict=/home/ss/zstd-dict-amd64 /sys/block/zram0/mm_stat 1750654976 430803319 439873536 0 439873536 1 0 34185 34185 *** init lz4 /sys/block/zram0/mm_stat 1750646784 664266564 677060608 0 677060608 1 0 34288 34288 *** init lz4 dict=/home/ss/lz4-dict-amd64 /sys/block/zram0/mm_stat 1750650880 619990300 632102912 0 632102912 1 0 34278 34278 *** init lz4hc /sys/block/zram0/mm_stat 1750630400 609023822 621232128 0 621232128 1 0 34288 34288 *** init lz4hc dict=/home/ss/lz4-dict-amd64 /sys/block/zram0/mm_stat 1750659072 505133172 515231744 0 515231744 1 0 34278 34278 Recompress init zram zstd (prio=0), zstd level=5 (prio 1), zstd with dict (prio 2) *** zstd /sys/block/zram0/mm_stat 1750982656 504630584 514269184 0 514269184 1 0 34204 34204 *** idle recompress priority=1 (zstd level=5) /sys/block/zram0/mm_stat 1750982656 488645601 525438976 0 514269184 1 0 34204 34204 *** idle recompress priority=2 (zstd dict) /sys/block/zram0/mm_stat 1750982656 460869640 517914624 0 514269184 1 0 34185 34204 This patch (of 24): We need to export a number of API functions that enable advanced zstd usage - C/D dictionaries, dictionaries sharing between contexts, etc. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09maple_tree: fix comment typo on ma_flag of allocation treeWei Yang1-3/+3
The maple tree flag of allocation tree is MT_FLAGS_ALLOC_RANGE. Just correct it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm: fix folio_alloc_noprof()Kent Overstreet1-1/+1
folio_alloc_noprof) wasn't calling the _noprof version, causing allocations to be accounted here instead of to the caller Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kent Overstreet <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09maple_tree: cleanup function descriptionsWei Yang1-58/+47
This patch tries to clean up some function descriptions:

* function name mismatches
* parameter name mismatches
* parameters not all ending with ':'
* pointer parameters not prefixed with '*'

There are still some missing parameter descriptions; I didn't add them since I am not sure of the exact meaning. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm: page_alloc: simplify page del and expandHuan Yang1-10/+25
When a page is deleted from the buddy list and needs expanding, free_pages is accounted in the zone's migratetype. The current way is to subtract the page count of the current order when deleting, and then add pages back while expanding. This is unnecessary: when migrating within the same type, we can directly record the difference between the high-order page and what expand added back, and then subtract that once. This patch merges the two steps, so free_pages is accounted only after del and expand are done. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Huan Yang <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09selftests/mm: relax test to fail after 100 migration failuresDev Jain1-6/+11
It was recently observed at [1] that during the folio unmapping stage of migration, when the PTEs are cleared, a racing thread faulting on that folio may increase the refcount of the folio, sleep on the folio lock (the migration path has the lock), and migration ultimately fails when asserting the actual refcount against the expected. As a result, the migration selftest fails on shared-anon mappings. The above enforces the fact that migration is a best-effort service; therefore, it is wrong to fail the test for just a single failure. Hence, fail the test after 100 consecutive failures (where 100 is still a subjective choice). Note that this has no effect on the execution time of the test since that is controlled by a timeout. [1] https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dev Jain <[email protected]> Suggested-by: David Hildenbrand <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Tested-by: Ryan Roberts <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Baolin Wang <[email protected]> Cc: Barry Song <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Gavin Shan <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Lance Yang <[email protected]> Cc: Mark Brown <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
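A sketch of the relaxed check; the counter, threshold, and helper names are illustrative, not from the patch:

	#define MAX_MIGRATION_FAILURES 100

	if (migrate(ptr, n1, n2) == 0)
		fails = 0;		/* only consecutive failures count */
	else if (++fails >= MAX_MIGRATION_FAILURES)
		fail_test("migration failed %d times in a row", fails);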
2024-09-09mm/vmalloc.c: make use of the helper macro LIST_HEAD()Hongbo Li1-8/+3
list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). Here we can simplify the code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hongbo Li <[email protected]> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
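The transformation in question (variable name illustrative):

	/* before: separate definition and runtime initialization */
	struct list_head local_list;
	INIT_LIST_HEAD(&local_list);

	/* after: definition and initialization in one step */
	LIST_HEAD(local_list);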
2024-09-09mm: add sysfs entry to disable splitting underused THPsUsama Arif2-0/+36
If disabled, THPs faulted in or collapsed will not be added to _deferred_list, and therefore won't be considered for splitting under memory pressure if underused. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Usama Arif <[email protected]> Cc: Alexander Zhu <[email protected]> Cc: Barry Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Domenico Cerasuolo <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kairui Song <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nico Pache <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Yu Zhao <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm: split underused THPsUsama Arif6-3/+69
This is an attempt to mitigate the issue of running out of memory when THP is always enabled. During runtime whenever a THP is being faulted in (__do_huge_pmd_anonymous_page) or collapsed by khugepaged (collapse_huge_page), the THP is added to _deferred_list. Whenever memory reclaim happens in linux, the kernel runs the deferred_split shrinker which goes through the _deferred_list. If the folio was partially mapped, the shrinker attempts to split it. If the folio is not partially mapped, the shrinker checks if the THP was underused, i.e. how many of the base 4K pages of the entire THP were zero-filled. If this number goes above a certain threshold (decided by /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none), the shrinker will attempt to split that THP. Then at remap time, the pages that were zero-filled are mapped to the shared zeropage, hence saving memory. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Usama Arif <[email protected]> Suggested-by: Rik van Riel <[email protected]> Co-authored-by: Johannes Weiner <[email protected]> Cc: Alexander Zhu <[email protected]> Cc: Barry Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Domenico Cerasuolo <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kairui Song <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nico Pache <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Yu Zhao <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
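A hedged sketch of the underused check described above; the helper and threshold names are assumed:

	static bool thp_underused(struct folio *folio)
	{
		int num_zero_pages = 0;
		void *kaddr;
		long i;

		for (i = 0; i < folio_nr_pages(folio); i++) {
			kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
			/* a subpage is "unused" if it contains only zeroes */
			if (!memchr_inv(kaddr, 0, PAGE_SIZE))
				num_zero_pages++;
			kunmap_local(kaddr);
		}
		/* split once more than max_ptes_none of the subpages are zero */
		return num_zero_pages > khugepaged_max_ptes_none;
	}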
2024-09-09mm: introduce a pageflag for partially mapped foliosUsama Arif8-21/+56
Currently folio->_deferred_list is used to keep track of partially_mapped folios that are going to be split under memory pressure. In the next patch, all THPs that are faulted in and collapsed by khugepaged are also going to be tracked using _deferred_list. This patch introduces a pageflag to be able to distinguish between partially mapped folios and others in the deferred_list at split time in deferred_split_scan. It's needed as __folio_remove_rmap decrements _mapcount, _large_mapcount and _entire_mapcount, hence it won't be possible to distinguish between partially mapped folios and others in deferred_split_scan. Even though it introduces an extra flag to track if the folio is partially mapped, there is no functional change intended with this patch and the flag is not useful in this patch itself; it will become useful in the next patch when _deferred_list has non-partially-mapped folios. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Usama Arif <[email protected]> Cc: Alexander Zhu <[email protected]> Cc: Barry Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Domenico Cerasuolo <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kairui Song <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nico Pache <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Yu Zhao <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm: selftest to verify zero-filled pages are mapped to zeropageAlexander Zhu3-0/+94
When a THP is split, any subpage that is zero-filled will be mapped to the shared zeropage, hence saving memory. Add selftest to verify this by allocating zero-filled THP and comparing RssAnon before and after split. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alexander Zhu <[email protected]> Signed-off-by: Usama Arif <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Barry Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Domenico Cerasuolo <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kairui Song <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nico Pache <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Yu Zhao <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm: remap unused subpages to shared zeropage when splitting isolated thpYu Zhao4-16/+75
Patch series "mm: split underused THPs", v5. The current upstream default policy for THP is always. However, Meta uses madvise in production as the current THP=always policy vastly overprovisions THPs in sparsely accessed memory areas, resulting in excessive memory pressure and premature OOM killing. Using madvise + relying on khugepaged has certain drawbacks over THP=always. Using madvise hints mean THPs aren't "transparent" and require userspace changes. Waiting for khugepaged to scan memory and collapse pages into THP can be slow and unpredictable in terms of performance (i.e. you dont know when the collapse will happen), while production environments require predictable performance. If there is enough memory available, its better for both performance and predictability to have a THP from fault time, i.e. THP=always rather than wait for khugepaged to collapse it, and deal with sparsely populated THPs when the system is running out of memory. This patch series is an attempt to mitigate the issue of running out of memory when THP is always enabled. During runtime whenever a THP is being faulted in or collapsed by khugepaged, the THP is added to a list. Whenever memory reclaim happens, the kernel runs the deferred_split shrinker which goes through the list and checks if the THP was underused, i.e. how many of the base 4K pages of the entire THP were zero-filled. If this number goes above a certain threshold, the shrinker will attempt to split that THP. Then at remap time, the pages that were zero-filled are mapped to the shared zeropage, hence saving memory. This method avoids the downside of wasting memory in areas where THP is sparsely filled when THP is always enabled, while still providing the upside THPs like reduced TLB misses without having to use madvise. Meta production workloads that were CPU bound (>99% CPU utilzation) were tested with THP shrinker. The results after 2 hours are as follows: | THP=madvise | THP=always | THP=always | | | + shrinker series | | | + max_ptes_none=409 ----------------------------------------------------------------------------- Performance improvement | - | +1.8% | +1.7% (over THP=madvise) | | | ----------------------------------------------------------------------------- Memory usage | 54.6G | 58.8G (+7.7%) | 55.9G (+2.4%) ----------------------------------------------------------------------------- max_ptes_none=409 means that any THP that has more than 409 out of 512 (80%) zero filled filled pages will be split. To test out the patches, the below commands without the shrinker will invoke OOM killer immediately and kill stress, but will not fail with the shrinker: echo 450 > /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none mkdir /sys/fs/cgroup/test echo $$ > /sys/fs/cgroup/test/cgroup.procs echo 20M > /sys/fs/cgroup/test/memory.max echo 0 > /sys/fs/cgroup/test/memory.swap.max # allocate twice memory.max for each stress worker and touch 40/512 of # each THP, i.e. vm-stride 50K. # With the shrinker, max_ptes_none of 470 and below won't invoke OOM # killer. # Without the shrinker, OOM killer is invoked immediately irrespective # of max_ptes_none value and kills stress. stress --vm 1 --vm-bytes 40M --vm-stride 50K This patch (of 5): Here being unused means containing only zeros and inaccessible to userspace. When splitting an isolated thp under reclaim or migration, the unused subpages can be mapped to the shared zeropage, hence saving memory. This is particularly helpful when the internal fragmentation of a thp is high, i.e. 
it has many untouched subpages. This is also a prerequisite for the THP low-utilization shrinker which will be introduced in later patches, where underutilized THPs are split, and the zero-filled pages are freed, saving memory. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yu Zhao <[email protected]> Signed-off-by: Usama Arif <[email protected]> Tested-by: Shuang Zhai <[email protected]> Cc: Alexander Zhu <[email protected]> Cc: Barry Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Domenico Cerasuolo <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kairui Song <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nico Pache <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Shuang Zhai <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
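A hedged sketch of the remap step: instead of restoring the original PTE for a zero-filled subpage, point it at the shared zeropage (the page_vma_mapped_walk plumbing is assumed):

	/* map the zero-filled subpage to the shared zeropage, read-only */
	newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
				       pvmw->vma->vm_page_prot));
	set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);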
2024-09-09mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and mannerBarry Song2-26/+27
Three points for this change:

1. We should consolidate all warnings in one place. Currently, the order > 1 warning is in the hotpath, while others are in less likely scenarios. Moving all warnings to the slowpath will reduce the overhead for order > 1 and increase the visibility of other warnings.

2. We currently have two warnings for order: one for order > 1 in the hotpath and another for order > costly_order in the laziest path. I suggest standardizing on order > 1 since it's been in use for a long time.

3. We don't need to check for __GFP_NOWARN in this case. __GFP_NOWARN is meant to suppress allocation failure reports, but here we're dealing with bug detection, not allocation failures. So replace WARN_ON_ONCE_GFP with WARN_ON_ONCE.

[[email protected]: also update the doc for __GFP_NOFAIL with order > 1] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Barry Song <[email protected]> Suggested-by: Vlastimil Babka <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: David Rientjes <[email protected]> Cc: "Eugenio Pérez" <[email protected]> Cc: Hailong.Liu <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Cc: Jason Wang <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Xie Yongji <[email protected]> Cc: Xuan Zhuo <[email protected]> Cc: Yafang Shao <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
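In short, a sketch of the replacement described in point 3, inside the __GFP_NOFAIL handling:

	/* before, hotpath: honours __GFP_NOWARN via the _GFP variant */
	WARN_ON_ONCE_GFP(order > 1, gfp_mask);

	/* after, slowpath: bug detection shouldn't be suppressible */
	WARN_ON_ONCE(order > 1);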
2024-09-09mm: document __GFP_NOFAIL must be blockableBarry Song1-1/+4
Non-blocking allocation with __GFP_NOFAIL is not supported and may still result in NULL pointers (if we don't return NULL, we end up in a busy loop within non-sleepable contexts):

static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
		       struct alloc_context *ac)
{
	...
	/*
	 * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
	 * we always retry
	 */
	if (gfp_mask & __GFP_NOFAIL) {
		/*
		 * All existing users of the __GFP_NOFAIL are blockable, so warn
		 * of any new users that actually require GFP_NOWAIT
		 */
		if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
			goto fail;
		...
	}
	...
fail:
	warn_alloc(gfp_mask, ac->nodemask,
		   "page allocation failure: order:%u", order);
got_pg:
	return page;
}

Highlight this in the documentation of __GFP_NOFAIL so that non-mm subsystems can reject any illegal usage of __GFP_NOFAIL with GFP_ATOMIC, GFP_NOWAIT, etc. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Barry Song <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Cc: "Eugenio Pérez" <[email protected]> Cc: Hailong.Liu <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Cc: Jason Wang <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Xuan Zhuo <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Xie Yongji <[email protected]> Cc: Yafang Shao <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09vduse: avoid using __GFP_NOFAILJason Wang2-8/+12
Patch series "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn", v4. __GFP_NOFAIL carries the semantics of never failing, so its callers do not check the return value: %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller cannot handle allocation failures. The allocation could block indefinitely but will never return with failure. Testing for failure is pointless. However, __GFP_NOFAIL can sometimes fail if it exceeds size limits or is used with GFP_ATOMIC/GFP_NOWAIT in a non-sleepable context. This patchset handles illegal using __GFP_NOFAIL together with GFP_ATOMIC lacking __GFP_DIRECT_RECLAIM(without this, we can't do anything to reclaim memory to satisfy the nofail requirement) and improve related document and warnings. The proper size limits for __GFP_NOFAIL will be handled separately after more discussions. This patch (of 3): mm doesn't support non-blockable __GFP_NOFAIL allocation. Because persisting in providing __GFP_NOFAIL services for non-block users who cannot perform direct memory reclaim may only result in an endless busy loop. Therefore, in such cases, the current mm-core may directly return a NULL pointer: static inline struct page * __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order, struct alloc_context *ac) { ... if (gfp_mask & __GFP_NOFAIL) { /* * All existing users of the __GFP_NOFAIL are blockable, so warn * of any new users that actually require GFP_NOWAIT */ if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask)) goto fail; ... } ... fail: warn_alloc(gfp_mask, ac->nodemask, "page allocation failure: order:%u", order); got_pg: return page; } Unfortuantely, vpda does that nofail allocation under non-sleepable lock. A possible way to fix that is to move the pages allocation out of the lock into the caller, but having to allocate a huge number of pages and auxiliary page array seems to be problematic as well per Tetsuon: " You should implement proper error handling instead of using __GFP_NOFAIL if count can become large." So I chose another way, which does not release kernel bounce pages when user tries to register userspace bounce pages. Then we can avoid allocating in paths where failure is not expected.(e.g in the release). We pay this for more memory usage as we don't release kernel bounce pages but further optimizations could be done on top. 
[[email protected]: Refine the changelog] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 6c77ed22880d ("vduse: Support using userspace pages as bounce buffer") Signed-off-by: Barry Song <[email protected]> Reviewed-by: Xie Yongji <[email protected]> Tested-by: Xie Yongji <[email protected]> Signed-off-by: Jason Wang <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Rientjes <[email protected]> Cc: Hailong.Liu <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Yafang Shao <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: "Eugenio Pérez" <[email protected]> Cc: Kees Cook <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Xuan Zhuo <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm/hugetlb: sort out global lock annotationsMateusz Guzik1-3/+3
The mutex array pointer shares a cacheline with the spinlock:

ffffffff84187480 B hugetlb_fault_mutex_table
ffffffff84187488 B hugetlb_lock

This is because the former is annotated with a macro forcing cacheline alignment. I suspect it was meant to be the variant which, on top of that, makes sure the object does not share its cacheline with anyone. Since the array pointer itself is de facto read-only, such an annotation does not make sense there anyway. Instead, mark it __ro_after_init along with the size var. Do, however, move the spinlock out of the way. [[email protected]: move section directives to the end of the definitions, per convention] [[email protected]: DEFINE_SPINLOCK doesn't permit section modifiers at end-of-definition] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mateusz Guzik <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
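A hedged sketch of the before/after annotations; hugetlb_fault_mutex_table appears in the nm output above, the size variable's name is assumed:

	/* before: forces alignment, but the pointer still shares a cacheline */
	struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;

	/* after: de facto read-only data, marked as such */
	struct mutex *hugetlb_fault_mutex_table __ro_after_init;
	static unsigned int num_fault_mutexes __ro_after_init;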
2024-09-09mm: shmem: extend shmem_unused_huge_shrink() to all sizesHugh Dickins1-25/+20
Although shmem_get_folio_gfp() is correctly putting inodes on the shrinklist according to the folio size, shmem_unused_huge_shrink() was still dealing with that shrinklist in terms of HPAGE_PMD_SIZE. Generalize that; and to handle the mixture of sizes more sensibly, have shmem_alloc_and_add_folio() give it the number of pages to be freed (approximate: no need to minimize that with an exact calculation) instead of a number of inodes to split. [[email protected]: comment tweak, per David] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hugh Dickins <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Baolin Wang <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09mm: shmem: fix minor off-by-one in shrinkable calculationHugh Dickins1-1/+1
There has been a long-standing and very minor off-by-one, where shmem_get_folio_gfp() decides if a large folio extends beyond i_size far enough to leave a page or more for freeing later under pressure. This is not something needed for stable: but it will be proportionately more significant as support for smaller large folios is added, and is best fixed before duplicating the check in other places. Link: https://lkml.kernel.org/r/[email protected] Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure") Signed-off-by: Hugh Dickins <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Baolin Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09maple_tree: dump error message based on formatWei Yang1-2/+8
Just do what mt_dump_range64() does: dump the error message based on the format. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Cc: Liam R. Howlett <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09maple_tree: arange64 node is not a leaf nodeWei Yang1-5/+1
mt_dump_arange64() only applies to an entry whose type is maple_arange_64, in which mte_is_leaf() must return false. Since mte_is_leaf() here is always false, we can remove this condition check. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09Docs/damon/maintainer-profile: document Google calendar for bi-weekly meetupsSeongJae Park1-2/+4
We added a public Google calendar for easy sharing of DAMON bi-weekly meetups [1]. Add it to the official document for better visibility. [1] https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Alex Shi <[email protected]> Cc: Hu Haowen <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Yanteng Si <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-09Docs/damon/maintainer-profile: add links in placeSeongJae Park1-40/+44
maintainer-profile.rst for DAMON separates the links from the target definitions. That is not really necessary and only hurts readability; at minimum, the definitions would need a section title (say, "References"). Just add the links in place in the doc. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Alex Shi <[email protected]> Cc: Hu Haowen <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Yanteng Si <[email protected]> Signed-off-by: Andrew Morton <[email protected]>