aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/scripts/python/bin
diff options
context:
space:
mode:
authorJohannes Weiner <[email protected]>2024-05-14 16:26:41 -0400
committerAndrew Morton <[email protected]>2024-07-03 19:29:53 -0700
commitb82b530740b960eb949813043c4187c254039ac1 (patch)
tree38c92fc95e7f691d06486cd73fba94043520eb96 /tools/perf/scripts/python/bin
parent7f83bf14603ef41a44dc907594d749a283e22c37 (diff)
mm: vmscan: restore incremental cgroup iteration
Currently, reclaim always walks the entire cgroup tree in order to ensure fairness between groups. While overreclaim is limited in shrink_lruvec(), many of our systems have a sizable number of active groups, and an even bigger number of idle cgroups with cache left behind by previous jobs; the mere act of walking all these cgroups can impose significant latency on direct reclaimers. In the past, we've used a save-and-restore iterator that enabled incremental tree walks over multiple reclaim invocations. This ensured fairness, while keeping the work of individual reclaimers small. However, in edge cases with a lot of reclaim concurrency, individual reclaimers would sometimes not see enough of the cgroup tree to make forward progress and (prematurely) declare OOM. Consequently we switched to comprehensive walks in 1ba6fc9af35b ("mm: vmscan: do not share cgroup iteration between reclaimers"). To address the latency problem without bringing back the premature OOM issue, reinstate the shared iteration, but with a restart condition to do the full walk in the OOM case - similar to what we do for memory.low enforcement and active page protection. In the worst case, we do one more full tree walk before declaring OOM. But the vast majority of direct reclaim scans can then finish much quicker, while fairness across the tree is maintained: - Before this patch, we observed that direct reclaim always takes more than 100us and most direct reclaim time is spent in reclaim cycles lasting between 1ms and 1 second. Almost 40% of direct reclaim time was spent on reclaim cycles exceeding 100ms. - With this patch, almost all page reclaim cycles last less than 10ms, and a good amount of direct page reclaim finishes in under 100us. No page reclaim cycles lasting over 100ms were observed anymore. The shared iterator state is maintaned inside the target cgroup, so fair and incremental walks are performed during both global reclaim and cgroup limit reclaim of complex subtrees. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Signed-off-by: Rik van Riel <[email protected]> Reported-by: Rik van Riel <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Reviewed-by: Roman Gushchin <[email protected]> Cc: Facebook Kernel Team <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
Diffstat (limited to 'tools/perf/scripts/python/bin')
0 files changed, 0 insertions, 0 deletions