author Mateusz Guzik <[email protected]> 2023-08-23 07:06:09 +0200
committer Dennis Zhou <[email protected]> 2023-08-25 08:10:35 -0700
commit 14ef95be6f5558fb9e43aaf06ef9a1d6e0cae6c8
tree 7abdf1224e08569f9a4dbd499192deeaba8729ef
parent c439d5e8a0deb7310b5bb4e5f2fe47c40ff5297f
kernel/fork: group allocation/free of per-cpu counters for mm struct
A trivial execve scalability test which tries to be very friendly
(statically linked binaries, all separate) is predominantly bottlenecked
by back-to-back per-cpu counter allocations which serialize on global
locks.

Ease the pain by allocating and freeing them in one go.

Bench can be found here:
http://apollo.backplane.com/DFlyMisc/doexec.c

$ cc -static -O2 -o static-doexec doexec.c
$ ./static-doexec $(nproc)

Even at a very modest scale of 26 cores (ops/s):
before: 133543.63
after:  186061.81 (+39%)

While with the patch these allocations remain a significant problem,
the primary bottleneck shifts to page release handling.

Signed-off-by: Mateusz Guzik <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[Dennis: reflowed 1 line]
Signed-off-by: Dennis Zhou <[email protected]>
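
The idea of "allocating and freeing them in one go" can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code touched by this commit: `struct counter`, `counters_init_many()`, `counters_destroy_many()`, `lock_cost()` and `NR_COUNTERS` are hypothetical names, and a pthread mutex stands in for the global locks mentioned above. It counts how often the global lock is taken when a group of counters is set up and torn down one by one versus as a single batch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_COUNTERS 4	/* hypothetical size of the per-object counter group */

struct counter {
	long *slots;	/* one slot per simulated CPU */
};

/* Stand-in for the global locks the allocations serialize on. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long lock_acquisitions;	/* times the global lock was taken */
static unsigned long live_counters;	/* global state guarded by the lock */

static void lock_global(void)
{
	pthread_mutex_lock(&global_lock);
	lock_acquisitions++;
}

static void unlock_global(void)
{
	pthread_mutex_unlock(&global_lock);
}

/* Set up n counters with one backing allocation and one lock round trip. */
static int counters_init_many(struct counter *c, size_t n, size_t ncpus)
{
	long *slots = calloc(n * ncpus, sizeof(*slots));
	size_t i;

	if (!slots)
		return -1;
	for (i = 0; i < n; i++)
		c[i].slots = slots + i * ncpus;

	lock_global();
	live_counters += n;
	unlock_global();
	return 0;
}

/* Tear down n counters that came from a single counters_init_many() call. */
static void counters_destroy_many(struct counter *c, size_t n)
{
	size_t i;

	lock_global();
	live_counters -= n;
	unlock_global();

	free(c[0].slots);	/* the group shares one backing allocation */
	for (i = 0; i < n; i++)
		c[i].slots = NULL;
}

/* Return how many times the global lock is taken for one setup/teardown. */
static unsigned long lock_cost(int grouped, size_t ncpus)
{
	struct counter c[NR_COUNTERS];
	size_t i;

	lock_acquisitions = 0;
	if (grouped) {
		if (counters_init_many(c, NR_COUNTERS, ncpus))
			return 0;
		counters_destroy_many(c, NR_COUNTERS);
		return lock_acquisitions;
	}

	/* Back-to-back variant: every counter pays for the lock on its own. */
	for (i = 0; i < NR_COUNTERS; i++) {
		if (counters_init_many(&c[i], 1, ncpus)) {
			while (i--)
				counters_destroy_many(&c[i], 1);
			return 0;
		}
	}
	for (i = 0; i < NR_COUNTERS; i++)
		counters_destroy_many(&c[i], 1);
	return lock_acquisitions;
}
```

With NR_COUNTERS set to 4 and a non-zero ncpus, lock_cost(0, 8) takes the lock 8 times and lock_cost(1, 8) takes it twice. The model says nothing about real contention or about the ops/s figures quoted above; it only shows why batching cuts the number of trips through a shared lock.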
Diffstat (limited to 'scripts/clang-tools/gen_compile_commands.py')
0 files changed, 0 insertions, 0 deletions