aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/scripts/python/flamegraph.py
diff options
context:
space:
mode:
authorYu Zhao <[email protected]>2022-09-18 02:00:04 -0600
committerAndrew Morton <[email protected]>2022-09-26 19:46:09 -0700
commit018ee47f14893d500131dfca2ff9f3ff8ebd4ed2 (patch)
tree6af13c564726319335e84454ca640736913ff068 /tools/perf/scripts/python/flamegraph.py
parentac35a490237446b71e3b4b782b1596967edd0aa8 (diff)
mm: multi-gen LRU: exploit locality in rmap
Searching the rmap for PTEs mapping each page on an LRU list (to test and clear the accessed bit) can be expensive because pages from different VMAs (PA space) are not cache friendly to the rmap (VA space). For workloads mostly using mapped pages, searching the rmap can incur the highest CPU cost in the reclaim path. This patch exploits spatial locality to reduce the trips into the rmap. When shrink_page_list() walks the rmap and finds a young PTE, a new function lru_gen_look_around() scans at most BITS_PER_LONG-1 adjacent PTEs. On finding another young PTE, it clears the accessed bit and updates the gen counter of the page mapped by this PTE to (max_seq%MAX_NR_GENS)+1. Server benchmark results: Single workload: fio (buffered I/O): no change Single workload: memcached (anon): +[3, 5]% Ops/sec KB/sec patch1-6: 1106168.46 43025.04 patch1-7: 1147696.57 44640.29 Configurations: no change Client benchmark results: kswapd profiles: patch1-6 39.03% lzo1x_1_do_compress (real work) 18.47% page_vma_mapped_walk (overhead) 6.74% _raw_spin_unlock_irq 3.97% do_raw_spin_lock 2.49% ptep_clear_flush 2.48% anon_vma_interval_tree_iter_first 1.92% folio_referenced_one 1.88% __zram_bvec_write 1.48% memmove 1.31% vma_interval_tree_iter_next patch1-7 48.16% lzo1x_1_do_compress (real work) 8.20% page_vma_mapped_walk (overhead) 7.06% _raw_spin_unlock_irq 2.92% ptep_clear_flush 2.53% __zram_bvec_write 2.11% do_raw_spin_lock 2.02% memmove 1.93% lru_gen_look_around 1.56% free_unref_page_list 1.40% memset Configurations: no change Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yu Zhao <[email protected]> Acked-by: Barry Song <[email protected]> Acked-by: Brian Geffon <[email protected]> Acked-by: Jan Alexander Steffens (heftig) <[email protected]> Acked-by: Oleksandr Natalenko <[email protected]> Acked-by: Steven Barrett <[email protected]> Acked-by: Suleiman Souhlal <[email protected]> Tested-by: Daniel Byrne <[email protected]> Tested-by: Donald Carr <[email protected]> Tested-by: Holger Hoffstätte <[email protected]> Tested-by: Konstantin Kharlamov <[email protected]> Tested-by: Shuang Zhai <[email protected]> Tested-by: Sofia Trinh <[email protected]> Tested-by: Vaibhav Jain <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Michael Larabel <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Qi Zheng <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
Diffstat (limited to 'tools/perf/scripts/python/flamegraph.py')
0 files changed, 0 insertions, 0 deletions