aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/scripts/python/bin/stackcollapse-report
diff options
context:
space:
mode:
authorUladzislau Rezki (Sony) <[email protected]>2022-12-22 20:00:20 +0100
committerAndrew Morton <[email protected]>2023-01-18 17:12:48 -0800
commitedd898181e2f6f0969c08e1dfe2b7cdf902b9b33 (patch)
treea36aec58f87324ebec3f3fd3217750778ba4b50a /tools/perf/scripts/python/bin/stackcollapse-report
parentb5054174ac7c7d8fae15deee7ddc0e20fd604f30 (diff)
mm: vmalloc: avoid calling __find_vmap_area() twice in __vunmap()
Currently the __vunmap() path calls __find_vmap_area() twice. Once on entry to check that the area exists, then inside the remove_vm_area() function which also performs a new search for the VA. In order to improvie it from a performance point of view we split remove_vm_area() into two new parts: - find_unlink_vmap_area() that does a search and unlink from tree; - __remove_vm_area() that removes without searching. In this case there is no any functional change for remove_vm_area() whereas vm_remove_mappings(), where a second search happens, switches to the __remove_vm_area() variant where the already detached VA is passed as a parameter, so there is no need to find it again. Performance wise, i use test_vmalloc.sh with 32 threads doing alloc free on a 64-CPUs-x86_64-box: perf without this patch: - 31.41% 0.50% vmalloc_test/10 [kernel.vmlinux] [k] __vunmap - 30.92% __vunmap - 17.67% _raw_spin_lock native_queued_spin_lock_slowpath - 12.33% remove_vm_area - 11.79% free_vmap_area_noflush - 11.18% _raw_spin_lock native_queued_spin_lock_slowpath 0.76% free_unref_page perf with this patch: - 11.35% 0.13% vmalloc_test/14 [kernel.vmlinux] [k] __vunmap - 11.23% __vunmap - 8.28% find_unlink_vmap_area - 7.95% _raw_spin_lock 7.44% native_queued_spin_lock_slowpath - 1.93% free_vmap_area_noflush - 0.56% _raw_spin_lock 0.53% native_queued_spin_lock_slowpath 0.60% __vunmap_range_noflush __vunmap() consumes around ~20% less CPU cycles on this test. Also, switch from find_vmap_area() to find_unlink_vmap_area() to prevent a double access to the vmap_area_lock: one for finding area, second time is for unlinking from a tree. [[email protected]: switch to find_unlink_vmap_area() in vm_unmap_ram()] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Uladzislau Rezki (Sony) <[email protected]> Reported-by: Roman Gushchin <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Cc: Baoquan He <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Oleksiy Avramchenko <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
Diffstat (limited to 'tools/perf/scripts/python/bin/stackcollapse-report')
0 files changed, 0 insertions, 0 deletions