aboutsummaryrefslogtreecommitdiff
path: root/tools/testing/selftests/bpf
AgeCommit message (Collapse)AuthorFilesLines
2022-07-08selftests/bpf: Fix xdp_synproxy build failure if CONFIG_NF_CONNTRACK=m/nMaxim Mikityanskiy1-7/+17
When CONFIG_NF_CONNTRACK=m, struct bpf_ct_opts and enum member BPF_F_CURRENT_NETNS are not exposed. This commit allows building the xdp_synproxy selftest in such cases. Note that nf_conntrack must be loaded before running the test if it's compiled as a module. This commit also allows this selftest to be successfully compiled when CONFIG_NF_CONNTRACK is disabled. One unused local variable of type struct bpf_ct_opts is also removed. Fixes: fb5cd0ce70d4 ("selftests/bpf: Add selftests for raw syncookie helpers") Reported-by: Yauheni Kaliuta <[email protected]> Signed-off-by: Maxim Mikityanskiy <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-08bpf: Check attach_func_proto more carefully in check_return_codeStanislav Fomichev3-6/+32
Syzkaller reports the following crash: RIP: 0010:check_return_code kernel/bpf/verifier.c:10575 [inline] RIP: 0010:do_check kernel/bpf/verifier.c:12346 [inline] RIP: 0010:do_check_common+0xb3d2/0xd250 kernel/bpf/verifier.c:14610 With the following reproducer: bpf$PROG_LOAD_XDP(0x5, &(0x7f00000004c0)={0xd, 0x3, &(0x7f0000000000)=ANY=[@ANYBLOB="1800000000000019000000000000000095"], &(0x7f0000000300)='GPL\x00', 0x0, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, 0x2b, 0xffffffffffffffff, 0x8, 0x0, 0x0, 0x10, 0x0}, 0x80) Because we don't enforce expected_attach_type for XDP programs, we end up in hitting 'if (prog->expected_attach_type == BPF_LSM_CGROUP' part in check_return_code and follow up with testing `prog->aux->attach_func_proto->type`, but `prog->aux->attach_func_proto` is NULL. Add explicit prog_type check for the "Note, BPF_LSM_CGROUP that attach ..." condition. Also, don't skip return code check for LSM/STRUCT_OPS. The above actually brings an issue with existing selftest which tries to return EPERM from void inet_csk_clone. Fix the test (and move called_socket_clone to make sure it's not incremented in case of an error) and add a new one to explicitly verify this condition. Fixes: 69fd337a975c ("bpf: per-cgroup lsm flavor") Reported-by: [email protected] Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-08selftests/bpf: Add test involving restrict type qualifierDaniel Müller3-2/+13
This change adds a type based test involving the restrict type qualifier to the BPF selftests. On the btfgen path, this will verify that bpftool correctly handles the corresponding RESTRICT BTF kind. Signed-off-by: Daniel Müller <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Quentin Monnet <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-08selftests, xsk: Rename AF_XDP testing appMaciej Fijalkowski6-13/+13
Recently, xsk part of libbpf was moved to selftests/bpf directory and lives on its own because there is an AF_XDP testing application that needs it called xdpxceiver. That name makes it a bit hard to indicate who maintains it as there are other XDP samples in there, whereas this one is strictly about AF_XDP. Do s/xdpxceiver/xskxceiver so that it will be easier to figure out who maintains it. A follow-up patch will correct MAINTAINERS file. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-08bpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIsJoanne Koong2-7/+7
Commit 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write") added the bpf_dynptr_write() and bpf_dynptr_read() APIs. However, it will be needed for some dynptr types to pass in flags as well (e.g. when writing to a skb, the user may like to invalidate the hash or recompute the checksum). This patch adds a "u64 flags" arg to the bpf_dynptr_read() and bpf_dynptr_write() APIs before their UAPI signature freezes where we then cannot change them anymore with a 5.19.x released kernel. Fixes: 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write") Signed-off-by: Joanne Koong <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-07-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-0/+43
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-07selftests/bpf: Add benchmark for local_storage RCU Tasks Trace usageDave Marchevsky6-1/+416
This benchmark measures grace period latency and kthread cpu usage of RCU Tasks Trace when many processes are creating/deleting BPF local_storage. Intent here is to quantify improvement on these metrics after Paul's recent RCU Tasks patches [0]. Specifically, fork 15k tasks which call a bpf prog that creates/destroys task local_storage and sleep in a loop, resulting in many call_rcu_tasks_trace calls. To determine grace period latency, trace time elapsed between rcu_tasks_trace_pregp_step and rcu_tasks_trace_postgp; for cpu usage look at rcu_task_trace_kthread's stime in /proc/PID/stat. On my virtualized test environment (Skylake, 8 cpus) benchmark results demonstrate significant improvement: BEFORE Paul's patches: SUMMARY tasks_trace grace period latency avg 22298.551 us stddev 1302.165 us SUMMARY ticks per tasks_trace grace period avg 2.291 stddev 0.324 AFTER Paul's patches: SUMMARY tasks_trace grace period latency avg 16969.197 us stddev 2525.053 us SUMMARY ticks per tasks_trace grace period avg 1.146 stddev 0.178 Note that since these patches are not in bpf-next benchmarking was done by cherry-picking this patch onto rcu tree. [0] https://lore.kernel.org/rcu/20220620225402.GA3842369@paulmck-ThinkPad-P17-Gen-1/ Signed-off-by: Dave Marchevsky <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-06selftests/bpf: Fix few more compiler warningsAndrii Nakryiko3-4/+4
When compiling with -O2, GCC detects few problems with selftests/bpf, so fix all of them. Two are real issues (uninitialized err and nums out-of-bounds access), but two other uninitialized variables warnings are due to GCC not being able to prove that variables are indeed initialized under conditions under which they are used. Fix all 4 cases, though. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-06selftests/bpf: Fix bogus uninitialized variable warningAndrii Nakryiko1-1/+1
When compiling selftests/bpf in optimized mode (-O2), GCC erroneously complains about uninitialized token variable: In file included from network_helpers.c:22: network_helpers.c: In function ‘open_netns’: test_progs.h:355:22: error: ‘token’ may be used uninitialized [-Werror=maybe-uninitialized] 355 | int ___err = libbpf_get_error(___res); \ | ^~~~~~~~~~~~~~~~~~~~~~~~ network_helpers.c:440:14: note: in expansion of macro ‘ASSERT_OK_PTR’ 440 | if (!ASSERT_OK_PTR(token, "malloc token")) | ^~~~~~~~~~~~~ In file included from /data/users/andriin/linux/tools/testing/selftests/bpf/tools/include/bpf/libbpf.h:21, from bpf_util.h:9, from network_helpers.c:20: /data/users/andriin/linux/tools/testing/selftests/bpf/tools/include/bpf/libbpf_legacy.h:113:17: note: by argument 1 of type ‘const void *’ to ‘libbpf_get_error’ declared here 113 | LIBBPF_API long libbpf_get_error(const void *ptr); | ^~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors make: *** [Makefile:522: /data/users/andriin/linux/tools/testing/selftests/bpf/network_helpers.o] Error 1 This is completely bogus becuase libbpf_get_error() doesn't dereference pointer, but the only easy way to silence this is to allocate initialized memory with calloc(). Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-05selftests/bpf: Add type match test against kernel's task_structDaniel Müller3-0/+21
This change extends the existing core_reloc/kernel test to include a type match check of a local task_struct against the kernel's definition -- which we assume to succeed. Signed-off-by: Daniel Müller <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-05selftests/bpf: Add nested type to type based testsDaniel Müller3-20/+58
This change extends the type based tests with another struct type (in addition to a_struct) to check relocations against: a_complex_struct. This type is nested more deeply to provide additional coverage of certain paths in the type match logic. Signed-off-by: Daniel Müller <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-05selftests/bpf: Add test checking more characteristicsDaniel Müller3-0/+91
This change adds another type-based self-test that specifically aims to test some more characteristics of the TYPE_MATCH logic. Specifically, it covers a few more potential differences between types, such as different orders, enum variant values, and integer signedness. Signed-off-by: Daniel Müller <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-05selftests/bpf: Add type-match checks to type-based testsDaniel Müller3-4/+73
Now that we have type-match logic in both libbpf and the kernel, this change adjusts the existing BPF self tests to check this functionality. Specifically, we extend the existing type-based tests to check the previously introduced bpf_core_type_matches macro. Signed-off-by: Daniel Müller <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-01Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski2-0/+43
Daniel Borkmann says: ==================== pull-request: bpf 2022-07-02 We've added 7 non-merge commits during the last 14 day(s) which contain a total of 6 files changed, 193 insertions(+), 86 deletions(-). The main changes are: 1) Fix clearing of page contiguity when unmapping XSK pool, from Ivan Malov. 2) Two verifier fixes around bounds data propagation, from Daniel Borkmann. 3) Fix fprobe sample module's parameter descriptions, from Masami Hiramatsu. 4) General BPF maintainer entry revamp to better scale patch reviews. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, selftests: Add verifier test case for jmp32's jeq/jne bpf, selftests: Add verifier test case for imm=0,umin=0,umax=1 scalar bpf: Fix insufficient bounds propagation from adjust_scalar_min_max_vals bpf: Fix incorrect verifier simulation around jmp32's jeq/jne xsk: Clear page contiguity bit when unmapping pool bpf, docs: Better scale maintenance of BPF subsystem fprobe, samples: Add module parameter descriptions ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-07-01bpf, selftests: Add verifier test case for jmp32's jeq/jneDaniel Borkmann1-0/+21
Add a test case to trigger the verifier's incorrect conclusion in the case of jmp32's jeq/jne. Also here, make use of dead code elimination, so that we can see the verifier bailing out on unfixed kernels. Before: # ./test_verifier 724 #724/p jeq32/jne32: bounds checking FAIL Failed to load prog 'Permission denied'! R4 !read_ok verification time 8 usec stack depth 0 processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0 Summary: 0 PASSED, 0 SKIPPED, 1 FAILED After: # ./test_verifier 724 #724/p jeq32/jne32: bounds checking OK Summary: 1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-01bpf, selftests: Add verifier test case for imm=0,umin=0,umax=1 scalarDaniel Borkmann1-0/+22
Add a test case to trigger the constant scalar issue which leaves the register in scalar(imm=0,umin=0,umax=1,var_off=(0x0; 0x0)) state. Make use of dead code elimination, so that we can see the verifier bailing out on unfixed kernels. For the condition, we use jle given it checks on umax bound. Before: # ./test_verifier 743 #743/p jump & dead code elimination FAIL Failed to load prog 'Permission denied'! R4 !read_ok verification time 11 usec stack depth 0 processed 13 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1 Summary: 0 PASSED, 0 SKIPPED, 1 FAILED After: # ./test_verifier 743 #743/p jump & dead code elimination OK Summary: 1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-07-01selftests/bpf: Skip lsm_cgroup when we don't have trampolinesStanislav Fomichev1-0/+8
With arch_prepare_bpf_trampoline removed on x86: [...] #98/1 lsm_cgroup/functional:SKIP #98 lsm_cgroup:SKIP Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED Fixes: dca85aac8895 ("selftests/bpf: lsm_cgroup functional test") Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Hao Luo <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-9/+75
drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c 9c5de246c1db ("net: sparx5: mdb add/del handle non-sparx5 devices") fbb89d02e33a ("net: sparx5: Allow mdb entries to both CPU and ports") Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-30selftests/xsk: Destroy BPF resources only when ctx refcount drops to 0Maciej Fijalkowski1-5/+4
Currently, xsk_socket__delete frees BPF resources regardless of ctx refcount. Xdpxceiver has a test to verify whether underlying BPF resources would not be wiped out after closing XSK socket that was bound to interface with other active sockets. From library's xsk part perspective it also means that the internal xsk context is shared and its refcount is bumped accordingly. After a switch to loading XDP prog based on previously opened XSK socket, mentioned xdpxceiver test fails with: not ok 16 [xdpxceiver.c:swap_xsk_resources:1334]: ERROR: 9/"Bad file descriptor which means that in swap_xsk_resources(), xsk_socket__delete() released xskmap which in turn caused a failure of xsk_socket__update_xskmap(). To fix this, when deleting socket, decrement ctx refcount before releasing BPF resources and do so only when refcount dropped to 0 which means there are no more active sockets for this ctx so BPF resources can be freed safely. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-30selftests/xsk: Verify correctness of XDP prog attach pointMaciej Fijalkowski1-0/+17
To prevent the case we had previously where for TEST_MODE_SKB, XDP prog was attached in native mode, call bpf_xdp_query() after loading prog and make sure that attach_mode is as expected. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-30selftests/xsk: Introduce XDP prog load based on existing AF_XDP socketMaciej Fijalkowski3-1/+7
Currently, xsk_setup_xdp_prog() uses anonymous xsk_socket struct which means that during xsk_create_bpf_link() call, xsk->config.xdp_flags is always 0. This in turn means that from xdpxceiver it is impossible to use xdpgeneric attachment, so since commit 3b22523bca02 ("selftests, xsk: Fix bpf_res cleanup test") we were not testing SKB mode at all. To fix this, introduce a function, called xsk_setup_xdp_prog_xsk(), that will load XDP prog based on the existing xsk_socket, so that xsk context's refcount is correctly bumped and flags from application side are respected. Use this from xdpxceiver side so we get coverage of generic and native XDP program attach points. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-30selftests/xsk: Avoid bpf_link probe for existing xskMaciej Fijalkowski1-1/+1
Currently bpf_link probe is done for each call of xsk_socket__create(). For cases where xsk context was previously created and current socket creation uses it, has_bpf_link will be overwritten, where it has already been initialized. Optimize this by moving the query to the xsk_create_ctx() so that when xsk_get_ctx() finds a ctx then no further bpf_link probes are needed. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-30bpftool: Use feature list in bash completionQuentin Monnet1-17/+3
Now that bpftool is able to produce a list of known program, map, attach types, let's use as much of this as we can in the bash completion file, so that we don't have to expand the list each time a new type is added to the kernel. Also update the relevant test script to remove some checks that are no longer needed. Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Daniel Müller <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-29selftests/bpf: lsm_cgroup functional testStanislav Fomichev3-0/+474
Functional test that exercises the following: 1. apply default sk_priority policy 2. permit TX-only AF_PACKET socket 3. cgroup attach/detach/replace 4. reusing trampoline shim Signed-off-by: Stanislav Fomichev <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-29tools/bpf: Sync btf_ids.h to toolsStanislav Fomichev1-1/+1
Has been slowly getting out of sync, let's update it. resolve_btfids usage has been updated to match the header changes. Also bring new parts of tools/include/uapi/linux/bpf.h. Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-28selftests/bpf: remove last tests with legacy BPF map definitionsAndrii Nakryiko4-79/+0
Libbpf 1.0 stops support legacy-style BPF map definitions. Selftests has been migrated away from using legacy BPF map definitions except for two selftests, to make sure that legacy functionality still worked in pre-1.0 libbpf. Now it's time to let those tests go as libbpf 1.0 is imminent. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-28libbpf: move xsk.{c,h} into selftests/bpfAndrii Nakryiko4-1/+1582
Remove deprecated xsk APIs from libbpf. But given we have selftests relying on this, move those files (with minimal adjustments to make them compilable) under selftests/bpf. We also remove all the removed APIs from libbpf.map, while overall keeping version inheritance chain, as most APIs are backwards compatible so there is no need to reassign them as LIBBPF_1.0.0 versions. Cc: Magnus Karlsson <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-24selftests/bpf: Test sockmap update when socket has ULPJakub Sitnicki1-9/+75
Cover the scenario when we cannot insert a socket into the sockmap, because it has it is using ULP. Failed insert should not have any effect on the ULP state. This is a regression test. Signed-off-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-24selftest/bpf: Test for use-after-free bug fix in inline_bpf_loopEduard Zingerman2-0/+50
This test verifies that bpf_loop() inlining works as expected when address of `env->prog` is updated. This address is updated upon BPF program reallocation. Reallocation is handled by bpf_prog_realloc(), which reuses old memory if page boundary is not crossed. The value of `len` in the test is chosen to cross this boundary on bpf_loop() patching. Verify that the use-after-free bug in inline_bpf_loop() reported by Dan Carpenter is fixed. Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski5-51/+151
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-23selftests/bpf: Fix rare segfault in sock_fields prog testJörn-Thorben Hinz1-1/+0
test_sock_fields__detach() got called with a null pointer here when one of the CHECKs or ASSERTs up to the test_sock_fields__open_and_load() call resulted in a jump to the "done" label. A skeletons *__detach() is not safe to call with a null pointer, though. This led to a segfault. Go the easy route and only call test_sock_fields__destroy() which is null-pointer safe and includes detaching. Came across this while looking[1] to introduce the usage of bpf_tcp_helpers.h (included in progs/test_sock_fields.c) together with vmlinux.h. [1] https://lore.kernel.org/bpf/629bc069dd807d7ac646f836e9dca28bbc1108e2.camel@mailbox.tu-berlin.de/ Fixes: 8f50f16ff39d ("selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads") Signed-off-by: Jörn-Thorben Hinz <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Reviewed-by: Jakub Sitnicki <[email protected]> Reviewed-by: Martin KaFai Lau <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-23selftests/bpf: Test a BPF CC implementing the unsupported get_info()Jörn-Thorben Hinz2-0/+41
Test whether a TCP CC implemented in BPF providing get_info() is rejected correctly. get_info() is unsupported in a BPF CC. The check for required functions in a BPF CC has moved, this test ensures unsupported functions are still rejected correctly. Signed-off-by: Jörn-Thorben Hinz <[email protected]> Reviewed-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-23selftests/bpf: Test an incomplete BPF CCJörn-Thorben Hinz2-0/+57
Test whether a TCP CC implemented in BPF providing neither cong_avoid() nor cong_control() is correctly rejected. This check solely depends on tcp_register_congestion_control() now, which is invoked during bpf_map__attach_struct_ops(). Signed-off-by: Jörn-Thorben Hinz <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-23selftests/bpf: Test a BPF CC writing sk_pacing_*Jörn-Thorben Hinz2-0/+79
Test whether a TCP CC implemented in BPF is allowed to write sk_pacing_rate and sk_pacing_status in struct sock. This is needed when cong_control() is implemented and used. Signed-off-by: Jörn-Thorben Hinz <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-22selftests/bpf: Add benchmark for local_storage getDave Marchevsky7-1/+494
Add a benchmarks to demonstrate the performance cliff for local_storage get as the number of local_storage maps increases beyond current local_storage implementation's cache size. "sequential get" and "interleaved get" benchmarks are added, both of which do many bpf_task_storage_get calls on sets of task local_storage maps of various counts, while considering a single specific map to be 'important' and counting task_storage_gets to the important map separately in addition to normal 'hits' count of all gets. Goal here is to mimic scenario where a particular program using one map - the important one - is running on a system where many other local_storage maps exist and are accessed often. While "sequential get" benchmark does bpf_task_storage_get for map 0, 1, ..., {9, 99, 999} in order, "interleaved" benchmark interleaves 4 bpf_task_storage_gets for the important map for every 10 map gets. This is meant to highlight performance differences when important map is accessed far more frequently than non-important maps. A "hashmap control" benchmark is also included for easy comparison of standard bpf hashmap lookup vs local_storage get. The benchmark is similar to "sequential get", but creates and uses BPF_MAP_TYPE_HASH instead of local storage. Only one inner map is created - a hashmap meant to hold tid -> data mapping for all tasks. Size of the hashmap is hardcoded to my system's PID_MAX_LIMIT (4,194,304). The number of these keys which are actually fetched as part of the benchmark is configurable. Addition of this benchmark is inspired by conversation with Alexei in a previous patchset's thread [0], which highlighted the need for such a benchmark to motivate and validate improvements to local_storage implementation. My approach in that series focused on improving performance for explicitly-marked 'important' maps and was rejected with feedback to make more generally-applicable improvements while avoiding explicitly marking maps as important. Thus the benchmark reports both general and important-map-focused metrics, so effect of future work on both is clear. Regarding the benchmark results. On a powerful system (Skylake, 20 cores, 256gb ram): Hashmap Control =============== num keys: 10 hashmap (control) sequential get: hits throughput: 20.900 ± 0.334 M ops/s, hits latency: 47.847 ns/op, important_hits throughput: 20.900 ± 0.334 M ops/s num keys: 1000 hashmap (control) sequential get: hits throughput: 13.758 ± 0.219 M ops/s, hits latency: 72.683 ns/op, important_hits throughput: 13.758 ± 0.219 M ops/s num keys: 10000 hashmap (control) sequential get: hits throughput: 6.995 ± 0.034 M ops/s, hits latency: 142.959 ns/op, important_hits throughput: 6.995 ± 0.034 M ops/s num keys: 100000 hashmap (control) sequential get: hits throughput: 4.452 ± 0.371 M ops/s, hits latency: 224.635 ns/op, important_hits throughput: 4.452 ± 0.371 M ops/s num keys: 4194304 hashmap (control) sequential get: hits throughput: 3.043 ± 0.033 M ops/s, hits latency: 328.587 ns/op, important_hits throughput: 3.043 ± 0.033 M ops/s Local Storage ============= num_maps: 1 local_storage cache sequential get: hits throughput: 47.298 ± 0.180 M ops/s, hits latency: 21.142 ns/op, important_hits throughput: 47.298 ± 0.180 M ops/s local_storage cache interleaved get: hits throughput: 55.277 ± 0.888 M ops/s, hits latency: 18.091 ns/op, important_hits throughput: 55.277 ± 0.888 M ops/s num_maps: 10 local_storage cache sequential get: hits throughput: 40.240 ± 0.802 M ops/s, hits latency: 24.851 ns/op, important_hits throughput: 4.024 ± 0.080 M ops/s local_storage cache interleaved get: hits throughput: 48.701 ± 0.722 M ops/s, hits latency: 20.533 ns/op, important_hits throughput: 17.393 ± 0.258 M ops/s num_maps: 16 local_storage cache sequential get: hits throughput: 44.515 ± 0.708 M ops/s, hits latency: 22.464 ns/op, important_hits throughput: 2.782 ± 0.044 M ops/s local_storage cache interleaved get: hits throughput: 49.553 ± 2.260 M ops/s, hits latency: 20.181 ns/op, important_hits throughput: 15.767 ± 0.719 M ops/s num_maps: 17 local_storage cache sequential get: hits throughput: 38.778 ± 0.302 M ops/s, hits latency: 25.788 ns/op, important_hits throughput: 2.284 ± 0.018 M ops/s local_storage cache interleaved get: hits throughput: 43.848 ± 1.023 M ops/s, hits latency: 22.806 ns/op, important_hits throughput: 13.349 ± 0.311 M ops/s num_maps: 24 local_storage cache sequential get: hits throughput: 19.317 ± 0.568 M ops/s, hits latency: 51.769 ns/op, important_hits throughput: 0.806 ± 0.024 M ops/s local_storage cache interleaved get: hits throughput: 24.397 ± 0.272 M ops/s, hits latency: 40.989 ns/op, important_hits throughput: 6.863 ± 0.077 M ops/s num_maps: 32 local_storage cache sequential get: hits throughput: 13.333 ± 0.135 M ops/s, hits latency: 75.000 ns/op, important_hits throughput: 0.417 ± 0.004 M ops/s local_storage cache interleaved get: hits throughput: 16.898 ± 0.383 M ops/s, hits latency: 59.178 ns/op, important_hits throughput: 4.717 ± 0.107 M ops/s num_maps: 100 local_storage cache sequential get: hits throughput: 6.360 ± 0.107 M ops/s, hits latency: 157.233 ns/op, important_hits throughput: 0.064 ± 0.001 M ops/s local_storage cache interleaved get: hits throughput: 7.303 ± 0.362 M ops/s, hits latency: 136.930 ns/op, important_hits throughput: 1.907 ± 0.094 M ops/s num_maps: 1000 local_storage cache sequential get: hits throughput: 0.452 ± 0.010 M ops/s, hits latency: 2214.022 ns/op, important_hits throughput: 0.000 ± 0.000 M ops/s local_storage cache interleaved get: hits throughput: 0.542 ± 0.007 M ops/s, hits latency: 1843.341 ns/op, important_hits throughput: 0.136 ± 0.002 M ops/s Looking at the "sequential get" results, it's clear that as the number of task local_storage maps grows beyond the current cache size (16), there's a significant reduction in hits throughput. Note that current local_storage implementation assigns a cache_idx to maps as they are created. Since "sequential get" is creating maps 0..n in order and then doing bpf_task_storage_get calls in the same order, the benchmark is effectively ensuring that a map will not be in cache when the program tries to access it. For "interleaved get" results, important-map hits throughput is greatly increased as the important map is more likely to be in cache by virtue of being accessed far more frequently. Throughput still reduces as # maps increases, though. To get a sense of the overhead of the benchmark program, I commented out bpf_task_storage_get/bpf_map_lookup_elem in local_storage_bench.c and ran the benchmark on the same host as the 'real' run. Results: Hashmap Control =============== num keys: 10 hashmap (control) sequential get: hits throughput: 54.288 ± 0.655 M ops/s, hits latency: 18.420 ns/op, important_hits throughput: 54.288 ± 0.655 M ops/s num keys: 1000 hashmap (control) sequential get: hits throughput: 52.913 ± 0.519 M ops/s, hits latency: 18.899 ns/op, important_hits throughput: 52.913 ± 0.519 M ops/s num keys: 10000 hashmap (control) sequential get: hits throughput: 53.480 ± 1.235 M ops/s, hits latency: 18.699 ns/op, important_hits throughput: 53.480 ± 1.235 M ops/s num keys: 100000 hashmap (control) sequential get: hits throughput: 54.982 ± 1.902 M ops/s, hits latency: 18.188 ns/op, important_hits throughput: 54.982 ± 1.902 M ops/s num keys: 4194304 hashmap (control) sequential get: hits throughput: 50.858 ± 0.707 M ops/s, hits latency: 19.662 ns/op, important_hits throughput: 50.858 ± 0.707 M ops/s Local Storage ============= num_maps: 1 local_storage cache sequential get: hits throughput: 110.990 ± 4.828 M ops/s, hits latency: 9.010 ns/op, important_hits throughput: 110.990 ± 4.828 M ops/s local_storage cache interleaved get: hits throughput: 161.057 ± 4.090 M ops/s, hits latency: 6.209 ns/op, important_hits throughput: 161.057 ± 4.090 M ops/s num_maps: 10 local_storage cache sequential get: hits throughput: 112.930 ± 1.079 M ops/s, hits latency: 8.855 ns/op, important_hits throughput: 11.293 ± 0.108 M ops/s local_storage cache interleaved get: hits throughput: 115.841 ± 2.088 M ops/s, hits latency: 8.633 ns/op, important_hits throughput: 41.372 ± 0.746 M ops/s num_maps: 16 local_storage cache sequential get: hits throughput: 115.653 ± 0.416 M ops/s, hits latency: 8.647 ns/op, important_hits throughput: 7.228 ± 0.026 M ops/s local_storage cache interleaved get: hits throughput: 138.717 ± 1.649 M ops/s, hits latency: 7.209 ns/op, important_hits throughput: 44.137 ± 0.525 M ops/s num_maps: 17 local_storage cache sequential get: hits throughput: 112.020 ± 1.649 M ops/s, hits latency: 8.927 ns/op, important_hits throughput: 6.598 ± 0.097 M ops/s local_storage cache interleaved get: hits throughput: 128.089 ± 1.960 M ops/s, hits latency: 7.807 ns/op, important_hits throughput: 38.995 ± 0.597 M ops/s num_maps: 24 local_storage cache sequential get: hits throughput: 92.447 ± 5.170 M ops/s, hits latency: 10.817 ns/op, important_hits throughput: 3.855 ± 0.216 M ops/s local_storage cache interleaved get: hits throughput: 128.844 ± 2.808 M ops/s, hits latency: 7.761 ns/op, important_hits throughput: 36.245 ± 0.790 M ops/s num_maps: 32 local_storage cache sequential get: hits throughput: 102.042 ± 1.462 M ops/s, hits latency: 9.800 ns/op, important_hits throughput: 3.194 ± 0.046 M ops/s local_storage cache interleaved get: hits throughput: 126.577 ± 1.818 M ops/s, hits latency: 7.900 ns/op, important_hits throughput: 35.332 ± 0.507 M ops/s num_maps: 100 local_storage cache sequential get: hits throughput: 111.327 ± 1.401 M ops/s, hits latency: 8.983 ns/op, important_hits throughput: 1.113 ± 0.014 M ops/s local_storage cache interleaved get: hits throughput: 131.327 ± 1.339 M ops/s, hits latency: 7.615 ns/op, important_hits throughput: 34.302 ± 0.350 M ops/s num_maps: 1000 local_storage cache sequential get: hits throughput: 101.978 ± 0.563 M ops/s, hits latency: 9.806 ns/op, important_hits throughput: 0.102 ± 0.001 M ops/s local_storage cache interleaved get: hits throughput: 141.084 ± 1.098 M ops/s, hits latency: 7.088 ns/op, important_hits throughput: 35.430 ± 0.276 M ops/s Adjusting for overhead, latency numbers for "hashmap control" and "sequential get" are: hashmap_control_1k: ~53.8ns hashmap_control_10k: ~124.2ns hashmap_control_100k: ~206.5ns sequential_get_1: ~12.1ns sequential_get_10: ~16.0ns sequential_get_16: ~13.8ns sequential_get_17: ~16.8ns sequential_get_24: ~40.9ns sequential_get_32: ~65.2ns sequential_get_100: ~148.2ns sequential_get_1000: ~2204ns Clearly demonstrating a cliff. In the discussion for v1 of this patch, Alexei noted that local_storage was 2.5x faster than a large hashmap when initially implemented [1]. The benchmark results show that local_storage is 5-10x faster: a long-running BPF application putting some pid-specific info into a hashmap for each pid it sees will probably see on the order of 10-100k pids. Bench numbers for hashmaps of this size are ~10x slower than sequential_get_16, but as the number of local_storage maps grows far past local_storage cache size the performance advantage shrinks and eventually reverses. When running the benchmarks it may be necessary to bump 'open files' ulimit for a successful run. [0]: https://lore.kernel.org/all/[email protected] [1]: https://lore.kernel.org/bpf/20220511173305.ftldpn23m4ski3d3@MBP-98dd607d3435.dhcp.thefacebook.com/ Signed-off-by: Dave Marchevsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-20selftests/bpf: BPF test_prog selftests for bpf_loop inliningEduard Zingerman2-0/+176
Two new test BPF programs for test_prog selftests checking bpf_loop behavior. Both are corner cases for bpf_loop inlinig transformation: - check that bpf_loop behaves correctly when callback function is not a compile time constant - check that local function variables are not affected by allocating additional stack storage for registers spilled by loop inlining Signed-off-by: Eduard Zingerman <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-20selftests/bpf: BPF test_verifier selftests for bpf_loop inliningEduard Zingerman1-0/+252
A number of test cases for BPF selftests test_verifier to check how bpf_loop inline transformation rewrites the BPF program. The following cases are covered: - happy path - no-rewrite when flags is non-zero - no-rewrite when callback is non-constant - subprogno in insn_aux is updated correctly when dead sub-programs are removed - check that correct stack offsets are assigned for spilling of R6-R8 registers Signed-off-by: Eduard Zingerman <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-20selftests/bpf: allow BTF specs and func infos in test_verifier testsEduard Zingerman3-18/+79
The BTF and func_info specification for test_verifier tests follows the same notation as in prog_tests/btf.c tests. E.g.: ... .func_info = { { 0, 6 }, { 8, 7 } }, .func_info_cnt = 2, .btf_strings = "\0int\0", .btf_types = { BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), BTF_PTR_ENC(1), }, ... The BTF specification is loaded only when specified. Signed-off-by: Eduard Zingerman <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-20selftests/bpf: specify expected instructions in test_verifier testsEduard Zingerman1-0/+234
Allows to specify expected and unexpected instruction sequences in test_verifier test cases. The instructions are requested from kernel after BPF program loading, thus allowing to check some of the transformations applied by BPF verifier. - `expected_insn` field specifies a sequence of instructions expected to be found in the program; - `unexpected_insn` field specifies a sequence of instructions that are not expected to be found in the program; - `INSN_OFF_MASK` and `INSN_IMM_MASK` values could be used to mask `off` and `imm` fields. - `SKIP_INSNS` could be used to specify that some instructions in the (un)expected pattern are not important (behavior similar to usage of `\t` in `errstr` field). The intended usage is as follows: { "inline simple bpf_loop call", .insns = { /* main */ BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6), ... BPF_EXIT_INSN(), /* callback */ BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), BPF_EXIT_INSN(), }, .expected_insns = { BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), SKIP_INSNS(), BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, 8, 1) }, .unexpected_insns = { BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, INSN_OFF_MASK, INSN_IMM_MASK), }, .prog_type = BPF_PROG_TYPE_TRACEPOINT, .result = ACCEPT, .runs = 0, }, Here it is expected that move of 1 to register 1 would remain in place and helper function call instruction would be replaced by a relative call instruction. Signed-off-by: Eduard Zingerman <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-20selftests/bpf: Enable config options needed for xdp_synproxy testMaxim Mikityanskiy1-0/+6
This commit adds the kernel config options needed to run the recently added xdp_synproxy test. Users without these options will hit errors like this: test_synproxy:FAIL:iptables -t raw -I PREROUTING -i tmp1 -p tcp -m tcp --syn --dport 8080 -j CT --notrack unexpected error: 256 (errno 22) Suggested-by: Alexei Starovoitov <[email protected]> Signed-off-by: Maxim Mikityanskiy <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-17Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski29-202/+2575
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-06-17 We've added 72 non-merge commits during the last 15 day(s) which contain a total of 92 files changed, 4582 insertions(+), 834 deletions(-). The main changes are: 1) Add 64 bit enum value support to BTF, from Yonghong Song. 2) Implement support for sleepable BPF uprobe programs, from Delyan Kratunov. 3) Add new BPF helpers to issue and check TCP SYN cookies without binding to a socket especially useful in synproxy scenarios, from Maxim Mikityanskiy. 4) Fix libbpf's internal USDT address translation logic for shared libraries as well as uprobe's symbol file offset calculation, from Andrii Nakryiko. 5) Extend libbpf to provide an API for textual representation of the various map/prog/attach/link types and use it in bpftool, from Daniel Müller. 6) Provide BTF line info for RV64 and RV32 JITs, and fix a put_user bug in the core seen in 32 bit when storing BPF function addresses, from Pu Lehui. 7) Fix libbpf's BTF pointer size guessing by adding a list of various aliases for 'long' types, from Douglas Raillard. 8) Fix bpftool to readd setting rlimit since probing for memcg-based accounting has been unreliable and caused a regression on COS, from Quentin Monnet. 9) Fix UAF in BPF cgroup's effective program computation triggered upon BPF link detachment, from Tadeusz Struk. 10) Fix bpftool build bootstrapping during cross compilation which was pointing to the wrong AR process, from Shahab Vahedi. 11) Fix logic bug in libbpf's is_pow_of_2 implementation, from Yuze Chi. 12) BPF hash map optimization to avoid grabbing spinlocks of all CPUs when there is no free element. Also add a benchmark as reproducer, from Feng Zhou. 13) Fix bpftool's codegen to bail out when there's no BTF, from Michael Mullin. 14) Various minor cleanup and improvements all over the place. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits) bpf: Fix bpf_skc_lookup comment wrt. return type bpf: Fix non-static bpf_func_proto struct definitions selftests/bpf: Don't force lld on non-x86 architectures selftests/bpf: Add selftests for raw syncookie helpers in TC mode bpf: Allow the new syncookie helpers to work with SKBs selftests/bpf: Add selftests for raw syncookie helpers bpf: Add helpers to issue and check SYN cookies in XDP bpf: Allow helpers to accept pointers with a fixed size bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie selftests/bpf: add tests for sleepable (uk)probes libbpf: add support for sleepable uprobe programs bpf: allow sleepable uprobe programs to attach bpf: implement sleepable uprobes by chaining gps bpf: move bpf_prog to bpf.h libbpf: Fix internal USDT address translation logic for shared libraries samples/bpf: Check detach prog exist or not in xdp_fwd selftests/bpf: Avoid skipping certain subtests selftests/bpf: Fix test_varlen verification failure with latest llvm bpftool: Do not check return value from libbpf_set_strict_mode() Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK" ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-17selftests/bpf: Don't force lld on non-x86 architecturesAndrii Nakryiko1-2/+9
LLVM's lld linker doesn't have a universal architecture support (e.g., it definitely doesn't work on s390x), so be safe and force lld for urandom_read and liburandom_read.so only on x86 architectures. This should fix s390x CI runs. Fixes: 3e6fe5ce4d48 ("libbpf: Fix internal USDT address translation logic for shared libraries") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-16selftests/bpf: Add selftests for raw syncookie helpers in TC modeMaxim Mikityanskiy3-69/+224
This commit extends selftests for the new BPF helpers bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6} to also test the TC BPF functionality added in the previous commit. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16selftests/bpf: Add selftests for raw syncookie helpersMaxim Mikityanskiy5-1/+1330
This commit adds selftests for the new BPF helpers: bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6}. xdp_synproxy_kern.c is a BPF program that generates SYN cookies on allowed TCP ports and sends SYNACKs to clients, accelerating synproxy iptables module. xdp_synproxy.c is a userspace control application that allows to configure the following options in runtime: list of allowed ports, MSS, window scale, TTL. A selftest is added to prog_tests that leverages the above programs to test the functionality of the new helpers. Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16selftest/bpf: Fix kprobe_multi bench testJiri Olsa1-0/+3
With [1] the available_filter_functions file contains records starting with __ftrace_invalid_address___ and marking disabled entries. We need to filter them out for the bench test to pass only resolvable symbols to kernel. [1] commit b39181f7c690 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function") Fixes: b39181f7c690 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function") Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16selftests/bpf: Shuffle cookies symbols in kprobe multi testJiri Olsa2-51/+51
There's a kernel bug that causes cookies to be misplaced and the reason we did not catch this with this test is that we provide bpf_fentry_test* functions already sorted by name. Shuffling function bpf_fentry_test2 deeper in the list and keeping the current cookie values as before will trigger the bug. The kernel fix is coming in following changes. Acked-by: Song Liu <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16selftests/bpf: add tests for sleepable (uk)probesDelyan Kratunov2-1/+108
Add tests that ensure sleepable uprobe programs work correctly. Add tests that ensure sleepable kprobe programs cannot attach. Also add tests that attach both sleepable and non-sleepable uprobe programs to the same location (i.e. same bpf_prog_array). Acked-by: Andrii Nakryiko <[email protected]> Signed-off-by: Delyan Kratunov <[email protected]> Link: https://lore.kernel.org/r/c744e5bb7a5c0703f05444dc41f2522ba3579a48.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-17libbpf: Fix internal USDT address translation logic for shared librariesAndrii Nakryiko1-5/+9
Perform the same virtual address to file offset translation that libbpf is doing for executable ELF binaries also for shared libraries. Currently libbpf is making a simplifying and sometimes wrong assumption that for shared libraries relative virtual addresses inside ELF are always equal to file offsets. Unfortunately, this is not always the case with LLVM's lld linker, which now by default generates quite more complicated ELF segments layout. E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from readelf output listing ELF segments (a.k.a. program headers): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R 0x8 LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R 0x1000 LOAD 0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000 LOAD 0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW 0x1000 LOAD 0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW 0x1000 Compare that to what is generated by GNU ld (or LLVM lld's with extra -znoseparate-code argument which disables this cleverness in the name of file size reduction): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R 0x1000 LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000 LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R 0x1000 LOAD 0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW 0x1000 You can see from the first example above that for executable (Flg == "R E") PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns. And it does in the second case (GNU ld output). This is important because all the addresses, including USDT specs, operate in a virtual address space, while kernel is expecting file offsets when performing uprobe attach. So such mismatches have to be properly taken care of and compensated by libbpf, which is what this patch is fixing. Also patch clarifies few function and variable names, as well as updates comments to reflect this important distinction (virtaddr vs file offset) and to ephasize that shared libraries are not all that different from executables in this regard. This patch also changes selftests/bpf Makefile to force urand_read and liburand_read.so to be built with Clang and LLVM's lld (and explicitly request this ELF file size optimization through -znoseparate-code linker parameter) to validate libbpf logic and ensure regressions don't happen in the future. I've bundled these selftests changes together with libbpf changes to keep the above description tied with both libbpf and selftests changes. Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-16selftests/bpf: Test tail call counting with bpf2bpf and data on stackJakub Sitnicki2-0/+97
Cover the case when tail call count needs to be passed from BPF function to BPF function, and the caller has data on stack. Specifically when the size of data allocated on BPF stack is not a multiple on 8. Signed-off-by: Jakub Sitnicki <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-14selftests/bpf: Avoid skipping certain subtestsYonghong Song1-1/+6
Commit 704c91e59fe0 ('selftests/bpf: Test "bpftool gen min_core_btf"') added a test test_core_btfgen to test core relocation with btf generated with 'bpftool gen min_core_btf'. Currently, among 76 subtests, 25 are skipped. ... #46/69 core_reloc_btfgen/enumval:OK #46/70 core_reloc_btfgen/enumval___diff:OK #46/71 core_reloc_btfgen/enumval___val3_missing:OK #46/72 core_reloc_btfgen/enumval___err_missing:SKIP #46/73 core_reloc_btfgen/enum64val:OK #46/74 core_reloc_btfgen/enum64val___diff:OK #46/75 core_reloc_btfgen/enum64val___val3_missing:OK #46/76 core_reloc_btfgen/enum64val___err_missing:SKIP ... #46 core_reloc_btfgen:SKIP Summary: 1/51 PASSED, 25 SKIPPED, 0 FAILED Alexei found that in the above core_reloc_btfgen/enum64val___err_missing should not be skipped. Currently, the core_reloc tests have some negative tests. In Commit 704c91e59fe0, for core_reloc_btfgen, all negative tests are skipped with the following condition if (!test_case->btf_src_file || test_case->fails) { test__skip(); continue; } This is too conservative. Negative tests do not fail mkstemp() and run_btfgen() should not be skipped. There are a few negative tests indeed failing run_btfgen() and this patch added 'run_btfgen_fails' to mark these tests so that they can be skipped for btfgen tests. With this, we have ... #46/69 core_reloc_btfgen/enumval:OK #46/70 core_reloc_btfgen/enumval___diff:OK #46/71 core_reloc_btfgen/enumval___val3_missing:OK #46/72 core_reloc_btfgen/enumval___err_missing:OK #46/73 core_reloc_btfgen/enum64val:OK #46/74 core_reloc_btfgen/enum64val___diff:OK #46/75 core_reloc_btfgen/enum64val___val3_missing:OK #46/76 core_reloc_btfgen/enum64val___err_missing:OK ... Summary: 1/62 PASSED, 14 SKIPPED, 0 FAILED Totally 14 subtests are skipped instead of 25. Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>