aboutsummaryrefslogtreecommitdiff
path: root/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
diff options
context:
space:
mode:
authorDaniel Borkmann <daniel@iogearbox.net>2023-07-12 23:45:23 +0200
committerDaniel Borkmann <daniel@iogearbox.net>2023-07-12 23:45:27 +0200
commit968a3b922ca1e468b2c53426760d16a9298702c1 (patch)
tree0f87eddd06a2878f6c0e69e3d8c137e6eb8386c1 /tools/testing/selftests/bpf/prog_tests/tracing_struct.c
parentc21de5fc5ffde70b21e4f370aebad79a7d7bdc0d (diff)
parent4ed8b5bcfada6687baa478bf8febe891d9107118 (diff)
Merge branch 'bpf-mem-cache-free-rcu'
Alexei Starovoitov says: ==================== v3->v4: - extra patch 14 from Hou to check for object leaks. - fixed the race/leak in free_by_rcu_ttrace. Extra hunk in patch 8. - added Acks and fixed typos. v2->v3: - dropped _tail optimization for free_by_rcu_ttrace - new patch 5 to refactor inc/dec of c->active - change 'draining' logic in patch 7 - add rcu_barrier in patch 12 - __llist_add-> llist_add(waiting_for_gp_ttrace) in patch 9 to fix race - David's Ack in patch 13 and explanation that migrate_disable cannot be removed just yet. v1->v2: - Fixed race condition spotted by Hou. Patch 7. v1: Introduce bpf_mem_cache_free_rcu() that is similar to kfree_rcu except the objects will go through an additional RCU tasks trace grace period before being freed into slab. Patches 1-9 - a bunch of prep work Patch 10 - a patch from Paul that exports rcu_request_urgent_qs_task(). Patch 12 - the main bpf_mem_cache_free_rcu patch. Patch 13 - use it in bpf_cpumask. bpf_local_storage, bpf_obj_drop, qp-trie will be other users eventually. With additional hack patch to htab that replaces bpf_mem_cache_free with bpf_mem_cache_free_rcu the following are benchmark results: - map_perf_test 4 8 16348 1000000 drops from 800k to 600k. Waiting for RCU GP makes objects cache cold. - bench htab-mem -a -p 8 20% drop in performance and big increase in memory. From 3 Mbyte to 50 Mbyte. As expected. - bench htab-mem -a -p 16 --use-case add_del_on_diff_cpu Same performance and better memory consumption. Before these patches this bench would OOM (with or without 'reuse after GP') Patch 8 addresses the issue. At the end the performance drop and additional memory consumption due to _rcu() were expected and came out to be within reasonable margin. Without Paul's patch 10 the memory consumption in 'bench htab-mem' is in Gbytes which wouldn't be acceptable. Patch 8 is a heuristic to address 'alloc on one cpu, free on another' issue. It works well in practice. One can probably construct an artificial benchmark to make heuristic ineffective, but we have to trade off performance, code complexity, and memory consumption. The life cycle of objects: alloc: dequeue free_llist free: enqeueu free_llist free_llist above high watermark -> free_by_rcu_ttrace free_rcu: enqueue free_by_rcu -> waiting_for_gp after RCU GP waiting_for_gp -> free_by_rcu_ttrace free_by_rcu_ttrace -> waiting_for_gp_ttrace -> slab ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Diffstat (limited to 'tools/testing/selftests/bpf/prog_tests/tracing_struct.c')
0 files changed, 0 insertions, 0 deletions