path: root/tools
Age | Commit message | Author | Files | Lines
2024-07-11 | selftests/resctrl: Remove mongrp from MBA test | Ilpo Järvinen | 1 | -1/+0
Nothing during MBA test uses mongrp even if it has been defined ever since the introduction of the MBA test in the commit 01fee6b4d1f9 ("selftests/resctrl: Add MBA test"). Remove the mongrp from MBA test. Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Convert ctrlgrp & mongrp to pointers | Ilpo Järvinen | 2 | -11/+9
The struct resctrl_val_param has control and monitor groups as char arrays but they are not supposed to be mutated within resctrl_val(). Convert the ctrlgrp and mongrp char array within resctrl_val_param to plain const char pointers and adjust the strlen() based checks to check NULL instead. Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
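The conversion described in this entry can be illustrated with a minimal sketch. The struct and helper names (`val_param_sketch`, `has_mongrp`) are hypothetical stand-ins, not the selftest's actual definitions; the only point is that an unset group is represented by NULL rather than by an empty string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct resctrl_val_param. */
struct val_param_sketch {
	const char *ctrlgrp;	/* NULL when no control group is requested */
	const char *mongrp;	/* NULL when no monitor group is requested */
};

/* Returns 1 when a monitor group was requested, 0 otherwise. */
static int has_mongrp(const struct val_param_sketch *p)
{
	assert(p != NULL);
	/* With char arrays, this kind of check would have relied on strlen(). */
	return p->mongrp != NULL;
}
```

Because the members are `const char *`, callers can point them at string literals and the callee cannot modify them by accident.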
2024-07-11 | selftests/resctrl: Make some strings passed to resctrlfs functions const | Ilpo Järvinen | 2 | -6/+8
Control group, monitor group and resctrl_val are not mutated and should not be mutated within resctrlfs.c functions. Mark this by using const char * for the arguments. Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Simplify bandwidth report type handling | Ilpo Järvinen | 5 | -25/+20
bw_report is only needed for selecting the correct value from the values IMC measured. It is a member in the resctrl_val_param struct and is always set to "reads". The value is then checked in resctrl_val() using validate_bw_report_request() that besides validating the input, assumes it can mutate the string which is questionable programming practice. Simplify handling bw_report: - Convert validate_bw_report_request() into get_bw_report_type() that inputs and returns const char *. Use NULL to indicate error. - Validate the report types inside measure_mem_bw(), not in resctrl_val(). - Pass bw_report to measure_mem_bw() from ->measure() hook because resctrl_val() no longer needs bw_report for anything. Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
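A lookup that takes and returns `const char *` and signals failure with NULL, as this entry describes, might look roughly like the sketch below. The function name `bw_report_type_sketch` and the set of accepted strings are assumptions made for illustration; the log itself only confirms that "reads" is used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the canonical report type string, or NULL if the input is unknown. */
static const char *bw_report_type_sketch(const char *bw_report)
{
	/* Assumed set of accepted names; only "reads" is confirmed by the log. */
	static const char *const known[] = { "reads", "writes", "total" };
	size_t i;

	assert(bw_report != NULL);
	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		if (strcmp(bw_report, known[i]) == 0)
			return known[i];
	}

	return NULL;	/* unknown report type, input left untouched */
}
```

The input string is never written to, which is what allows the parameter to be `const`.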
2024-07-11 | selftests/resctrl: Add ->init() callback into resctrl_val_param | Ilpo Järvinen | 5 | -63/+60
The struct resctrl_val_param is there to customize behavior inside resctrl_val() which is currently not used to full extent and there are number of strcmp()s for test name in resctrl_val done by resctrl_val(). Create ->init() hook into the struct resctrl_val_param to cleanly do per test initialization. Remove also unused branches to setup paths and the related #defines for CMT test. While touching kerneldoc, make the adjacent line consistent with the newly added form (callback vs call back). Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Add ->measure() callback to resctrl_val_param | Ilpo Järvinen | 5 | -15/+35
The measurement done in resctrl_val() varies depending on test type. The decision for how to measure is decided based on the string compare to test name which is quite inflexible. Add ->measure() callback into the struct resctrl_val_param to allow each test to provide necessary code as a function which simplifies what resctrl_val() has to do. Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
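The hook-based design introduced by the ->init() and ->measure() entries can be sketched as follows. All names and signatures here (`test_param_sketch`, `run_test_sketch`, the demo callbacks) are hypothetical; the real struct resctrl_val_param has different members. The sketch only shows how per-test function pointers replace string comparisons on the test name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical per-test parameter block with two hooks. */
struct test_param_sketch {
	const char *name;
	/* Optional: per-test setup, returns 0 on success. */
	int (*init)(const struct test_param_sketch *p, int domain_id);
	/* Required: per-test measurement, returns 0 on success. */
	int (*measure)(const struct test_param_sketch *p, pid_t bm_pid);
};

/* Generic runner: no strcmp() on the test name, only calls through hooks. */
static int run_test_sketch(const struct test_param_sketch *p,
			   int domain_id, pid_t bm_pid)
{
	int ret;

	assert(p != NULL && p->measure != NULL);
	if (p->init) {
		ret = p->init(p, domain_id);
		if (ret)
			return ret;
	}

	return p->measure(p, bm_pid);
}

/* Demo hooks standing in for a concrete test's callbacks. */
static int demo_init(const struct test_param_sketch *p, int domain_id)
{
	(void)p;
	return domain_id < 0 ? -1 : 0;
}

static int demo_measure(const struct test_param_sketch *p, pid_t bm_pid)
{
	(void)p;
	(void)bm_pid;
	return 0;
}
```

A test would fill in the struct once and hand it to the runner, so adding a new test does not require touching the runner itself.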
2024-07-11 | selftests/resctrl: Simplify mem bandwidth file code for MBA & MBM tests | Ilpo Järvinen | 2 | -42/+4
initialize_mem_bw_resctrl() and set_mbm_path() contain complicated set of conditions, each yielding different file to be opened to measure memory bandwidth through resctrl FS. In practice, only two of them are used. For MBA test, ctrlgrp is always provided, and for MBM test both ctrlgrp and mongrp are set. The file used differ between MBA/MBM test, however, MBM test unnecessarily create monitor group because resctrl FS already provides monitoring interface underneath any ctrlgrp too, which is what the MBA selftest uses. Consolidate memory bandwidth file used to the one used by the MBA selftest. Remove all unused branches opening other files to simplify the code. Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Rename measure_vals() to measure_mem_bw_vals() & document | Ilpo Järvinen | 1 | -3/+8
measure_vals() is awfully generic name so rename it to measure_mem_bw() to describe better what it does and document the function parameters. Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Cleanup bm_pid and ppid usage & limit scope | Ilpo Järvinen | 2 | -17/+15
'bm_pid' and 'ppid' are global variables. As they are used by different processes and in signal handler, they cannot be entirely converted into local variables. The scope of those variables can still be reduced into resctrl_val.c only. As PARENT_EXIT() macro is using 'ppid', make it a function in resctrl_val.c and pass ppid to it as an argument because it is easier to understand than using the global variable directly. Pass 'bm_pid' into measure_vals() instead of relying on the global variable which helps to make the call signatures of measure_vals() and measure_llc_resctrl() more similar to each other. Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Use correct type for pids | Ilpo Järvinen | 4 | -12/+12
A few functions receive PIDs through int arguments. PIDs variables should be of type pid_t, not int. Convert pid arguments from int to pid_t. Before printing PID, match the type to %d by casting to int which is enough for Linux (standard would allow using a longer integer type but generalizing for that would complicate the code unnecessarily, the selftest code does not need to be portable). Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
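The printing convention described in this entry (keep the variable as `pid_t`, cast to `int` only at the point of formatting with `%d`) can be shown in a few lines. The helper name `format_pid` is made up for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Formats a PID into buf; returns the number of characters snprintf() wanted. */
static int format_pid(char *buf, size_t len, pid_t pid)
{
	assert(buf != NULL && len > 0);
	/* pid_t fits in int on Linux, so the cast matches the %d conversion. */
	return snprintf(buf, len, "%d", (int)pid);
}
```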
2024-07-11 | selftests/resctrl: Consolidate get_domain_id() into resctrl_val() | Ilpo Järvinen | 1 | -20/+13
Both initialize_mem_bw_resctrl() and initialize_llc_occu_resctrl() that are called from resctrl_val() need to determine domain ID to construct resctrl fs related paths. Both functions do it by taking CPU ID which neither needs for any other purpose than determining the domain ID. Consolidate determining the domain ID into resctrl_val() and pass the domain ID instead of CPU ID to initialize_mem_bw_resctrl() and initialize_llc_occu_resctrl(). Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Make "bandwidth" consistent in comments & prints | Ilpo Järvinen | 2 | -8/+8
Resctrl selftests refer to "bandwidth" currently in two other forms in the code ("B/W" and "band width"). Use "bandwidth" consistently everywhere. While at it, fix also one "over flow" -> "overflow" on a line that is touched by the change. Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Calculate resctrl FS derived mem bw over sleep(1) only | Ilpo Järvinen | 1 | -50/+91
For MBM/MBA tests, measure_vals() calls get_mem_bw_imc() that performs the measurement over a duration of sleep(1) call. The memory bandwidth numbers from IMC are derived over this duration. The resctrl FS derived memory bandwidth, however, is calculated inside measure_vals() and only takes delta between the previous value and the current one which besides the actual test, also samples inter-test noise. Rework the logic in measure_vals() and get_mem_bw_imc() such that the resctrl FS memory bandwidth section covers much shorter duration closely matching that of the IMC perf counters to improve measurement accuracy. For the second read after rewind() to return a fresh value, also newline has to be consumed by the fscanf(). Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/resctrl: Fix closing IMC fds on error and open-code R+W instead of loops | Ilpo Järvinen | 1 | -18/+36
The imc perf fd close() calls are missing from all error paths. In addition, get_mem_bw_imc() handles fds in a for loop but close() is based on two fixed indexes READ and WRITE. Open code inner for loops to READ+WRITE entries for clarity and add a function to close() IMC fds properly in all cases. Fixes: 7f4d257e3a2a ("selftests/resctrl: Add callback to start a benchmark") Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Tested-by: Babu Moger <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
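The pattern this fix describes, a single helper that closes both descriptors and is reached from every error path, can be sketched like this. The sketch uses plain open() on file paths as a stand-in for the perf event descriptors the selftest actually opens, and every name in it (`counter_fds`, `open_counter_fds`, `close_counter_fds`) is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical pair of descriptors, one for reads and one for writes. */
struct counter_fds {
	int read_fd;
	int write_fd;
};

/* Closes whichever descriptors are open and marks them as closed. */
static void close_counter_fds(struct counter_fds *c)
{
	assert(c != NULL);
	if (c->read_fd >= 0) {
		close(c->read_fd);
		c->read_fd = -1;
	}
	if (c->write_fd >= 0) {
		close(c->write_fd);
		c->write_fd = -1;
	}
}

/* Opens both descriptors; on any failure nothing is left open. */
static int open_counter_fds(struct counter_fds *c,
			    const char *read_path, const char *write_path)
{
	assert(c != NULL);
	c->read_fd = -1;
	c->write_fd = -1;

	c->read_fd = open(read_path, O_RDONLY);
	if (c->read_fd < 0)
		goto err;

	c->write_fd = open(write_path, O_RDONLY);
	if (c->write_fd < 0)
		goto err;

	return 0;

err:
	close_counter_fds(c);
	return -1;
}
```

Initialising both members to -1 before the first open() is what makes the shared cleanup helper safe to call from any point.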
2024-07-11 | selftests/sched: fix code format issues | aigourensheng | 1 | -5/+5
There are extra spaces in the middle of #define. It is recommended to delete the spaces to make the code look more comfortable. Signed-off-by: aigourensheng <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | selftests/lib.mk: silence some clang warnings that gcc already ignores | John Hubbard | 1 | -0/+8
gcc defaults to silence (off) for the following warnings, but clang defaults to the opposite. The warnings are not useful for the kernel itself, which is why they have remained disabled in gcc for the main kernel build. And it is only due to including kernel data structures in the selftests, that we get the warnings from clang. -Waddress-of-packed-member -Wgnu-variable-sized-type-not-at-end In other words, the warnings are not unique to the selftests: there is nothing that the selftests' code does that triggers these warnings, other than the act of including the kernel's data structures. Therefore, silence them for the clang builds as well. This eliminates warnings for the net/ and user_events/ kselftest subsystems, in these files: ./net/af_unix/scm_rights.c ./net/timestamping.c ./net/ipsec.c ./user_events/perf_test.c Cc: Nathan Chancellor <[email protected]> Signed-off-by: John Hubbard <[email protected]> Acked-by: Nathan Chancellor <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-07-11 | Merge tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Linus Torvalds | 5 | -4/+246
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - regressions:
 - core: fix rc7's __skb_datagram_iter() regression
Current release - new code bugs:
 - eth: bnxt: fix crashes when reducing ring count with active RSS contexts
Previous releases - regressions:
 - sched: fix UAF when resolving a clash
 - skmsg: skip zero length skb in sk_msg_recvmsg2
 - sunrpc: fix kernel free on connection failure in xs_tcp_setup_socket
 - tcp: avoid too many retransmit packets
 - tcp: fix incorrect undo caused by DSACK of TLP retransmit
 - udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
 - eth: ks8851: fix deadlock with the SPI chip variant
 - eth: i40e: fix XDP program unloading while removing the driver
Previous releases - always broken:
 - bpf:
    - fix too early release of tcx_entry
    - fail bpf_timer_cancel when callback is being cancelled
 - bpf: fix order of args in call to bpf_map_kvcalloc
 - netfilter: nf_tables: prefer nft_chain_validate
 - ppp: reject claimed-as-LCP but actually malformed packets
 - wireguard: avoid unaligned 64-bit memory accesses"
* tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
  net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket
  net/sched: Fix UAF when resolving a clash
  net: ks8851: Fix potential TX stall after interface reopen
  udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
  netfilter: nf_tables: prefer nft_chain_validate
  netfilter: nfnetlink_queue: drop bogus WARN_ON
  ethtool: netlink: do not return SQI value if link is down
  ppp: reject claimed-as-LCP but actually malformed packets
  selftests/bpf: Add timer lockup selftest
  net: ethernet: mtk-star-emac: set mac_managed_pm when probing
  e1000e: fix force smbus during suspend flow
  tcp: avoid too many retransmit packets
  bpf: Defer work in bpf_timer_cancel_and_free
  bpf: Fail bpf_timer_cancel when callback is being cancelled
  bpf: fix order of args in call to bpf_map_kvcalloc
  net: ethernet: lantiq_etop: fix double free in detach
  i40e: Fix XDP program unloading while removing the driver
  net: fix rc7's __skb_datagram_iter()
  net: ks8851: Fix deadlock with the SPI chip variant
  octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability()
  ...
2024-07-11 | selftests/bpf: Add timer lockup selftest | Kumar Kartikeya Dwivedi | 2 | -0/+178
Add a selftest that tries to trigger a situation where two timer callbacks are attempting to cancel each other's timer. By running them continuously, we hit a condition where both run in parallel and cancel each other. Without the fix in the previous patch, this would cause a lockup as hrtimer_cancel on either side will wait for forward progress from the callback. Ensure that this situation leads to an EDEADLK error. Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-07-10 | selftests/bpf: Close obj in error path in xdp_adjust_tail | Geliang Tang | 1 | -1/+1
If bpf_object__load() fails in test_xdp_adjust_frags_tail_grow(), "obj" opened before this should be closed. So use "goto out" to close it instead of using "return" here. Fixes: 110221081aac ("bpf: selftests: update xdp_adjust_tail selftest to include xdp frags") Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/f282a1ed2d0e3fb38cceefec8e81cabb69cab260.1720615848.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-10 | selftests/bpf: Null checks for links in bpf_tcp_ca | Geliang Tang | 1 | -4/+12
When running the bpf_tcp_ca selftests (./test_progs -t bpf_tcp_ca) on a Loongarch platform, some "Segmentation fault" errors occur:
'''
test_dctcp:PASS:bpf_dctcp__open_and_load 0 nsec
test_dctcp:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/1 bpf_tcp_ca/dctcp:FAIL
test_cubic:PASS:bpf_cubic__open_and_load 0 nsec
test_cubic:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/2 bpf_tcp_ca/cubic:FAIL
test_dctcp_fallback:PASS:dctcp_skel 0 nsec
test_dctcp_fallback:PASS:bpf_dctcp__load 0 nsec
test_dctcp_fallback:FAIL:dctcp link unexpected error: -524
#29/4 bpf_tcp_ca/dctcp_fallback:FAIL
test_write_sk_pacing:PASS:open_and_load 0 nsec
test_write_sk_pacing:FAIL:attach_struct_ops unexpected error: -524
#29/6 bpf_tcp_ca/write_sk_pacing:FAIL
test_update_ca:PASS:open 0 nsec
test_update_ca:FAIL:attach_struct_ops unexpected error: -524
settcpca:FAIL:setsockopt unexpected setsockopt: \
    actual -1 == expected -1
(network_helpers.c:99: errno: No such file or directory) \
    Failed to call post_socket_cb
start_test:FAIL:start_server_str unexpected start_server_str: \
    actual -1 == expected -1
test_update_ca:FAIL:ca1_ca1_cnt unexpected ca1_ca1_cnt: \
    actual 0 <= expected 0
#29/9 bpf_tcp_ca/update_ca:FAIL
#29 bpf_tcp_ca:FAIL
Caught signal #11!
Stack trace:
./test_progs(crash_handler+0x28)[0x5555567ed91c]
linux-vdso.so.1(__vdso_rt_sigreturn+0x0)[0x7ffffee408b0]
./test_progs(bpf_link__update_map+0x80)[0x555556824a78]
./test_progs(+0x94d68)[0x5555564c4d68]
./test_progs(test_bpf_tcp_ca+0xe8)[0x5555564c6a88]
./test_progs(+0x3bde54)[0x5555567ede54]
./test_progs(main+0x61c)[0x5555567efd54]
/usr/lib64/libc.so.6(+0x22208)[0x7ffff2aaa208]
/usr/lib64/libc.so.6(__libc_start_main+0xac)[0x7ffff2aaa30c]
./test_progs(_start+0x48)[0x55555646bca8]
Segmentation fault
'''
This is because BPF trampoline is not implemented on Loongarch yet, so the "link" returned by bpf_map__attach_struct_ops() is NULL. test_progs crashes when this NULL link is passed to bpf_link__update_map().
This patch adds NULL checks for all links in bpf_tcp_ca to fix these errors. If "link" is NULL, goto the newly added label "out" to destroy the skel. v2: - use "goto out" instead of "return" as Eduard suggested. Fixes: 06da9f3bd641 ("selftests/bpf: Test switching TCP Congestion Control algorithms.") Signed-off-by: Geliang Tang <[email protected]> Reviewed-by: Alan Maguire <[email protected]> Link: https://lore.kernel.org/r/b4c841492bd4ed97964e4e61e92827ce51bf1dc9.1720615848.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-10 | selftests/bpf: Use connect_fd_to_fd in sk_lookup | Geliang Tang | 1 | -9/+3
This patch uses public helper connect_fd_to_fd() exported in network_helpers.h instead of using getsockname() + connect() in run_lookup_prog() in prog_tests/sk_lookup.c. This can simplify the code. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/7077c277cde5a1864cdc244727162fb75c8bb9c5.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-10 | selftests/bpf: Use start_server_addr in sk_lookup | Geliang Tang | 1 | -8/+2
This patch uses public helper start_server_addr() in udp_recv_send() in prog_tests/sk_lookup.c to simplify the code. And use ASSERT_OK_FD() to check fd returned by start_server_addr(). Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/f11cabfef4a2170ecb66a1e8e2e72116d8f621b3.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-10 | selftests/bpf: Use start_server_str in sk_lookup | Geliang Tang | 1 | -24/+34
This patch uses public helper start_server_str() to simplify make_server() in prog_tests/sk_lookup.c. Add a callback setsockopts() to do all sockopts, set it to post_socket_cb pointer of struct network_helper_opts. And add a new struct cb_opts to save the data needed to pass to the callback. Then pass this network_helper_opts to start_server_str(). Also use ASSERT_OK_FD() to check fd returned by start_server_str(). Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/5981539f5591d2c4998c962ef2bf45f34c940548.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-10 | selftests/bpf: Close fd in error path in drop_on_reuseport | Geliang Tang | 1 | -1/+1
In the error path when update_lookup_map() fails in drop_on_reuseport in prog_tests/sk_lookup.c, "server1", the fd of server 1, should be closed. This patch fixes this by using "goto close_srv1" label instead of "detach" to close "server1" in this case. Fixes: 0ab5539f8584 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point") Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/86aed33b4b0ea3f04497c757845cff7e8e621a2d.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-10 | selftests/bpf: Add ASSERT_OK_FD macro | Geliang Tang | 1 | -0/+9
Add a new dedicated ASSERT macro ASSERT_OK_FD to test whether a socket FD is valid or not. It can be used to replace macros ASSERT_GT(fd, 0, ""), ASSERT_NEQ(fd, -1, "") or statements (fd < 0), (fd != -1). Suggested-by: Martin KaFai Lau <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/ded75be86ac630a3a5099739431854c1ec33f0ea.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-10 | selftests/bpf: Add backlog for network_helper_opts | Geliang Tang | 2 | -1/+11
Some callers expect __start_server() helper to pass their own "backlog" value to listen() instead of the default of 1. So this patch adds struct member "backlog" for network_helper_opts to allow callers to set "backlog" value via start_server_str() helper. listen(fd, 0 /* backlog */) can be used to enforce syncookie. Meaning backlog 0 is a legit value. Using 0 as a default and changing it to 1 here is fine. It makes the test program easier to write for the common case. Enforcing syncookie mode by using backlog 0 is a niche use case but it should at least have a way for the caller to do that. Thus, -ve backlog value is used here for the syncookie use case. Please see the comment in network_helpers.h for the details. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/1660229659b66eaad07aa2126e9c9fe217eba0dd.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
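The backlog convention this entry describes (0 means "use the default of 1", a negative value means "really pass 0 to listen()", anything positive is passed through) can be written down as a small mapping function. The name `effective_backlog` is an assumption for this sketch; the actual helper code in network_helpers.c may be structured differently.

```c
#include <assert.h>

/* Maps the caller-supplied option to the value that would reach listen(). */
static int effective_backlog(int requested)
{
	if (requested == 0)
		return 1;	/* option left unset: fall back to the default */
	if (requested < 0)
		return 0;	/* negative: caller explicitly wants backlog 0 */
	return requested;	/* positive: use as given */
}
```

Keeping 0 as "unset" means a zero-initialised options struct behaves as before, while the syncookie use case still has a way to request a backlog of 0.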
2024-07-10 | selftests/bpf: fix compilation failure when CONFIG_NF_FLOW_TABLE=m | Alan Maguire | 1 | -3/+7
In many cases, kernel netfilter functionality is built as modules. If CONFIG_NF_FLOW_TABLE=m in particular, progs/xdp_flowtable.c (and hence selftests) will fail to compile, so add a ___local version of "struct flow_ports". Fixes: c77e572d3a8c ("selftests/bpf: Add selftest for bpf_xdp_flow_lookup kfunc") Signed-off-by: Alan Maguire <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-07-10 | clone3: drop __ARCH_WANT_SYS_CLONE3 macro | Arnd Bergmann | 3 | -6/+0
When clone3() was introduced, it was not obvious how each architecture deals with setting up the stack and keeping the register contents in a fork()-like system call, so this was left for the architecture maintainers to implement, with __ARCH_WANT_SYS_CLONE3 defined by those that already implement it. Five years later, we still have a few architectures left that are missing clone3(), and the macro keeps getting in the way as it's fundamentally different from all the other __ARCH_WANT_SYS_* macros that are meant to provide backwards-compatibility with applications using older syscalls that are no longer provided by default. Address this by reversing the polarity of the macro, adding an __ARCH_BROKEN_SYS_CLONE3 macro to all architectures that don't already provide the syscall, and remove __ARCH_WANT_SYS_CLONE3 from all the other ones. Acked-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
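The polarity reversal can be pictured with a small preprocessor sketch. Only the macro name `__ARCH_BROKEN_SYS_CLONE3` comes from the commit message; the surrounding header fragment and the `CLONE3_AVAILABLE_SKETCH` helper are hypothetical and exist only to show the opt-out shape.

```c
#include <assert.h>

/*
 * Hypothetical architecture header fragment. An architecture that still
 * lacks clone3() would define the opt-out macro below; every other
 * architecture defines nothing at all.
 */
/* #define __ARCH_BROKEN_SYS_CLONE3 */

#ifndef __ARCH_BROKEN_SYS_CLONE3
#define CLONE3_AVAILABLE_SKETCH 1	/* default: syscall is provided */
#else
#define CLONE3_AVAILABLE_SKETCH 0	/* opted out by the architecture */
#endif

/* Reports whether this (sketched) configuration provides clone3(). */
static int clone3_available_sketch(void)
{
	return CLONE3_AVAILABLE_SKETCH;
}
```

With opt-out polarity, a new architecture gets the syscall by default and only the remaining exceptions need to carry a macro.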
2024-07-09 | bpf: relax zero fixed offset constraint on KF_TRUSTED_ARGS/KF_RCU | Matt Bobrowski | 3 | -9/+9
Currently, BPF kfuncs which accept trusted pointer arguments i.e. those flagged as KF_TRUSTED_ARGS, KF_RCU, or KF_RELEASE, all require an original/unmodified trusted pointer argument to be supplied to them. By original/unmodified, it means that the backing register holding the trusted pointer argument that is to be supplied to the BPF kfunc must have its fixed offset set to zero, or else the BPF verifier will outright reject the BPF program load. However, this zero fixed offset constraint that is currently enforced by the BPF verifier onto BPF kfuncs specifically flagged to accept KF_TRUSTED_ARGS or KF_RCU trusted pointer arguments is rather unnecessary, and can limit their usability in practice. Specifically, it completely eliminates the possibility of constructing a derived trusted pointer from an original trusted pointer. To put it simply, a derived pointer is a pointer which points to one of the nested member fields of the object being pointed to by the original trusted pointer. This patch relaxes the zero fixed offset constraint that is enforced upon BPF kfuncs which specifically accept KF_TRUSTED_ARGS, or KF_RCU arguments. Although, the zero fixed offset constraint technically also applies to BPF kfuncs accepting KF_RELEASE arguments, relaxing this constraint for such BPF kfuncs has subtle and unwanted side-effects. This was discovered by experimenting a little further with an initial version of this patch series [0]. The primary issue with relaxing the zero fixed offset constraint on BPF kfuncs accepting KF_RELEASE arguments is that it would open up the opportunity for BPF programs to supply both trusted pointers and derived trusted pointers to them. For KF_RELEASE BPF kfuncs specifically, this could be problematic as resources associated with the backing pointer could be released by the backing BPF kfunc and cause instabilities for the rest of the kernel. 
With this new fixed offset semantic in-place for BPF kfuncs accepting KF_TRUSTED_ARGS and KF_RCU arguments, we now have more flexibility when it comes to the BPF kfuncs that we're able to introduce moving forward. Early discussions covering the possibility of relaxing the zero fixed offset constraint can be found using the link below. This will provide more context on where all this has stemmed from [1]. Notably, pre-existing tests have been updated such that they provide coverage for the updated zero fixed offset functionality. Specifically, the nested offset test was converted from a negative to positive test as it was already designed to assert zero fixed offset semantics of a KF_TRUSTED_ARGS BPF kfunc. [0] https://lore.kernel.org/bpf/[email protected]/ [1] https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Matt Bobrowski <[email protected]> Acked-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
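The notion of a "derived pointer" with a non-zero fixed offset can be illustrated in ordinary user-space C. This is not verifier or BPF code; the struct names and the helper `fixed_offset_of` are invented for the sketch, which only shows that a pointer to a nested member is the original pointer plus a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object with a nested member. */
struct inner_sketch {
	int x;
};

struct outer_sketch {
	long id;
	struct inner_sketch nested;	/* lives at a non-zero offset */
};

/* Byte distance from the original pointer to a pointer derived from it. */
static size_t fixed_offset_of(const void *base, const void *derived)
{
	assert((const char *)derived >= (const char *)base);
	return (size_t)((const char *)derived - (const char *)base);
}
```

For an object `o` of type `struct outer_sketch`, `fixed_offset_of(&o, &o)` is zero (the original pointer), while `fixed_offset_of(&o, &o.nested)` equals `offsetof(struct outer_sketch, nested)`, which is the kind of offset the relaxed constraint now tolerates for KF_TRUSTED_ARGS and KF_RCU arguments.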
2024-07-09 | libbpf: improve old BPF skeleton handling for map auto-attach | Andrii Nakryiko | 1 | -12/+14
Improve how we handle old BPF skeletons when it comes to BPF map auto-attachment. Emit one warn-level message per each struct_ops map that could have been auto-attached, if user provided recent enough BPF skeleton version. Don't spam log if there are no relevant struct_ops maps, though. This should help users realize that they probably need to regenerate BPF skeleton header with more recent bpftool/libbpf-cargo (or whatever other means of BPF skeleton generation). Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-07-09 | libbpf: fix BPF skeleton forward/backward compat handling | Andrii Nakryiko | 1 | -20/+27
BPF skeleton was designed from day one to be extensible. Generated BPF skeleton code specifies actual sizes of map/prog/variable skeletons for that reason and libbpf is supposed to work with newer/older versions correctly. Unfortunately, it was missed that we implicitly embed hard-coded most up-to-date (according to libbpf's version of libbpf.h header used to compile BPF skeleton header) sizes of those structs, which can differ from the actual sizes at runtime when libbpf is used as a shared library. We have a few places where we just index array of maps/progs/vars, which implicitly uses these potentially invalid sizes of structs. This patch aims to fix this problem going forward. Once this lands, we'll backport these changes in GitHub repo to create patched releases for older libbpfs. Acked-by: Eduard Zingerman <[email protected]> Reviewed-by: Alan Maguire <[email protected]> Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support") Fixes: 430025e5dca5 ("libbpf: Add subskeleton scaffolding") Fixes: 08ac454e258e ("libbpf: Auto-attach struct_ops BPF maps in BPF skeleton") Co-developed-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
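The class of bug described in this entry comes from indexing an array with the compile-time `sizeof` of a struct whose size may differ at run time. A minimal sketch of the safer approach, stepping through the array with an element size recorded by whoever produced it, is shown below. The two struct layouts and the function `map_name_at` are hypothetical and much simpler than libbpf's real skeleton structs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical older, smaller layout of a per-map skeleton entry. */
struct map_skel_v1_sketch {
	const char *name;
	void **map;
};

/* Hypothetical newer layout that appended one field. */
struct map_skel_v2_sketch {
	const char *name;
	void **map;
	void **link;
};

/*
 * Returns the name of the i-th entry. The stride is the element size the
 * producer of the array reported, not sizeof() of whatever layout this
 * code happened to be compiled against.
 */
static const char *map_name_at(const void *maps, size_t elem_sz, size_t i)
{
	const char *elem = (const char *)maps + i * elem_sz;

	assert(maps != NULL && elem_sz >= sizeof(const char *));
	/* `name` is the first member in both layouts, so it sits at offset 0. */
	return *(const char *const *)(const void *)elem;
}
```

The same function then works for arrays of either layout, as long as the caller passes the size that matches the array it actually holds.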
2024-07-09 | bpftool: improve skeleton backwards compat with old buggy libbpfs | Andrii Nakryiko | 1 | -14/+32
Old versions of libbpf don't handle varying sizes of bpf_map_skeleton struct correctly. As such, BPF skeleton generated by newest bpftool might not be compatible with older libbpf (though only when libbpf is used as a shared library), even though it, by design, should. Going forward libbpf will be fixed, plus we'll release bug fixed versions of relevant old libbpfs, but meanwhile try to mitigate from bpftool side by conservatively assuming older and smaller definition of bpf_map_skeleton, if possible. Meaning, if there are no struct_ops maps. If there are struct_ops, then presumably user would like to have auto-attaching logic and struct_ops map link placeholders, so use the full bpf_map_skeleton definition in that case. Acked-by: Quentin Monnet <[email protected]> Co-developed-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Mykyta Yatsenko <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-07-09 | selftests: drv-net: rss_ctx: test flow rehashing without impacting traffic | Jakub Kicinski | 1 | -1/+31
Some workloads may want to rehash the flows in response to an imbalance. Most effective way to do that is changing the RSS key. Check that changing the key does not cause link flaps or traffic disruption. Disrupting traffic for key update is not incorrect, but makes the key update unusable for rehashing under load. Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-09 | selftests: drv-net: rss_ctx: check behavior of indirection table resizing | Jakub Kicinski | 1 | -1/+36
Some devices dynamically increase and decrease the size of the RSS indirection table based on the number of enabled queues. When that happens driver must maintain the balance of entries (preferably duplicating the smaller table). Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
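One way a driver could keep the balance of entries when the table grows is to repeat the smaller table, which is the approach the commit message calls preferable. The sketch below is a plain C illustration of that idea under stated assumptions, not driver code and not the selftest itself (which is a separate test script); the function name `grow_indir_table` is made up.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fills new_tbl by repeating old_tbl. When new_len is a multiple of
 * old_len, every queue keeps exactly the same share of entries.
 */
static void grow_indir_table(const unsigned int *old_tbl, size_t old_len,
			     unsigned int *new_tbl, size_t new_len)
{
	size_t i;

	assert(old_tbl != NULL && new_tbl != NULL);
	assert(old_len > 0 && new_len >= old_len);
	for (i = 0; i < new_len; i++)
		new_tbl[i] = old_tbl[i % old_len];
}
```

Doubling a four-entry table this way yields two back-to-back copies of the original, so a queue that owned half of the old entries still owns half of the new ones.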
2024-07-09 | selftests: drv-net: rss_ctx: test queue changes vs user RSS config | Jakub Kicinski | 1 | -1/+80
By default main RSS table should change to include all queues. When user sets a specific RSS config the driver should preserve it, even when queue count changes. Driver should refuse to deactivate queues used in the user-set RSS config. For additional contexts driver should still refuse to deactivate queues in use. Whether the contexts should get resized like context 0 when queue count increases is a bit unclear. I anticipate most drivers today don't do that. Since main use case for additional contexts is to set the indir table - it doesn't seem worthwhile to care about behavior of the default table too much. Don't test that. Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-09 | selftests: drv-net: rss_ctx: factor out send traffic and check | Jakub Kicinski | 1 | -19/+39
Wrap up sending traffic and checking in which queues it landed in a helper. The method used for testing is to send a lot of iperf traffic and check which queues received the most packets. Those should be the queues where we expect iperf to land - either because we installed a filter for the port iperf uses, or we didn't and expect it to use context 0. Contexts get disjoint queue sets, but the main context (AKA context 0) may receive some background traffic (noise). Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-09selftests: drv-net: rss_ctx: fix cleanup in the basic testJakub Kicinski1-4/+4
The basic test may fail without resetting the RSS indir table. Use the .exec() method to run cleanup early, since we re-test with traffic to check that returning to the default state works. While at it, reformat the doc a tiny bit. Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-09selftests: forwarding: Make vxlan-bridge-1d pass on debug kernelsIdo Schimmel1-4/+4
The ageing time used by the test is too short for debug kernels and results in entries being aged out prematurely [1]. Fix by increasing the ageing time.

The same change was done for the VLAN-aware version of the test in commit dfbab74044be ("selftests: forwarding: Make vxlan-bridge-1q pass on debug kernels").

[1]
# ./vxlan_bridge_1d.sh
[...]
# TEST: VXLAN: flood before learning [ OK ]
# TEST: VXLAN: show learned FDB entry [ OK ]
# TEST: VXLAN: learned FDB entry [FAIL]
# veth3: Expected to capture 0 packets, got 4.
# RTNETLINK answers: No such file or directory
# TEST: VXLAN: deletion of learned FDB entry [ OK ]
# TEST: VXLAN: Ageing of learned FDB entry [FAIL]
# veth3: Expected to capture 0 packets, got 2.
[...]

Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-09Merge tag 'linux_kselftest-fixes-6.10' of ↵Linus Torvalds7-31/+46
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan
"Fixes to clang build failures to timerns, vDSO tests and fixes to vDSO makefile"

* tag 'linux_kselftest-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/vDSO: remove duplicate compiler invocations from Makefile
selftests/vDSO: remove partially duplicated "all:" target in Makefile
selftests/vDSO: fix clang build errors and warnings
selftest/timerns: fix clang build failures for abs() calls
2024-07-09Merge tag 'for-netdev' of ↵Paolo Abeni89-596/+3268
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-07-08

The following pull-request contains BPF updates for your *net-next* tree.

We've added 102 non-merge commits during the last 28 day(s) which contain a total of 127 files changed, 4606 insertions(+), 980 deletions(-).

The main changes are:

1) Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman.
2) Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs, from Daniel Xu.
3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement support for BPF exceptions, from Ilya Leoshkevich.
4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui.
5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option for skbs and add coverage along with it, from Vadim Fedorenko.
6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan.
7) Extend the BPF verifier to track the delta between linked registers in order to better deal with recent LLVM code optimizations, from Alexei Starovoitov.
8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should have been a pointer to the map value, from Benjamin Tissoires.
9) Extend BPF selftests to add regular expression support for test output matching and adjust some of the selftest when compiled under gcc, from Cupertino Miranda.
10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always iterates exactly once anyway, from Dan Carpenter.
11) Add the capability to offload the netfilter flowtable in XDP layer through kfuncs, from Florian Westphal & Lorenzo Bianconi.
12) Various cleanups in networking helpers in BPF selftests to shave off a few lines of open-coded functions on client/server handling, from Geliang Tang.
13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so that x86 JIT does not need to implement detection, from Leon Hwang.
14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an out-of-bounds memory access for dynpointers, from Matt Bobrowski.
15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as it might lead to problems on 32-bit archs, from Jiri Olsa.
16) Enhance traffic validation and dynamic batch size support in xsk selftests, from Tushar Vyavahare.

bpf-next-for-netdev

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits)
selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep
selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
bpf: helpers: fix bpf_wq_set_callback_impl signature
libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
selftests/bpf: Remove exceptions tests from DENYLIST.s390x
s390/bpf: Implement exceptions
s390/bpf: Change seen_reg to a mask
bpf: Remove unnecessary loop in task_file_seq_get_next()
riscv, bpf: Optimize stack usage of trampoline
bpf, devmap: Add .map_alloc_check
selftests/bpf: Remove arena tests from DENYLIST.s390x
selftests/bpf: Add UAF tests for arena atomics
selftests/bpf: Introduce __arena_global
s390/bpf: Support arena atomics
s390/bpf: Enable arena
s390/bpf: Support address space cast instruction
s390/bpf: Support BPF_PROBE_MEM32
s390/bpf: Land on the next JITed instruction after exception
s390/bpf: Introduce pre- and post- probe functions
s390/bpf: Get rid of get_probe_mem_regno()
...
====================

Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-07-08Merge tag 'perf-tools-fixes-for-v6.10-2024-07-08' of ↵Linus Torvalds2-16/+39
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
"Fix performance issue for v6.10

These address the performance issues reported by Matt, Namhyung and Linus. Recently perf changed the processing of the comm string and DSO using sorted arrays but this caused it to sort the array whenever adding a new entry.

This caused a performance issue and the fix is to enhance the sorting by finding the insertion point in the sorted array and to shift righthand side using memmove()"

* tag 'perf-tools-fixes-for-v6.10-2024-07-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf dsos: When adding a dso into sorted dsos maintain the sort order
perf comm str: Avoid sort during insert
2024-07-08selftests/bpf: Extend tcx tests to cover late tcx_entry releaseDaniel Borkmann2-0/+64
Add a test case which replaces an active ingress qdisc while keeping the miniq intact during the transition period to the new clsact qdisc.

# ./vmtest.sh -- ./test_progs -t tc_link
[...]
./test_progs -t tc_link
[ 3.412871] bpf_testmod: loading out-of-tree module taints kernel.
[ 3.413343] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
#332 tc_links_after:OK
#333 tc_links_append:OK
#334 tc_links_basic:OK
#335 tc_links_before:OK
#336 tc_links_chain_classic:OK
#337 tc_links_chain_mixed:OK
#338 tc_links_dev_chain0:OK
#339 tc_links_dev_cleanup:OK
#340 tc_links_dev_mixed:OK
#341 tc_links_ingress:OK
#342 tc_links_invalid:OK
#343 tc_links_prepend:OK
#344 tc_links_replace:OK
#345 tc_links_revision:OK
Summary: 14/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <[email protected]> Cc: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-07-08selftests: net: ksft: interrupt cleanly on KeyboardInterruptJakub Kicinski1-1/+8
It's very useful to be able to interrupt the tests during development. Detect KeyboardInterrupt, run the cleanups and exit. Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
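The general pattern can be sketched as follows. This is not the ksft code; `run_cases` and the cleanup list are hypothetical stand-ins that only show the idea of catching `KeyboardInterrupt`, still running the registered cleanups, and then stopping.

```python
def run_cases(cases, cleanups):
    """Run test callables in order; stop early but cleanly on Ctrl-C.

    cleanups is a list that cases append zero-argument callables to; they
    are run in reverse order after every case, interrupted or not.
    """
    results = []
    for case in cases:
        try:
            case()
            results.append((case.__name__, "ok"))
        except KeyboardInterrupt:
            results.append((case.__name__, "interrupted"))
            break  # skip the remaining cases
        except Exception as exc:
            results.append((case.__name__, f"fail: {exc}"))
        finally:
            # Runs even on the interrupt path, before the break takes effect.
            while cleanups:
                cleanups.pop()()
    return results


log = []
cleanups = []


def case_a():
    cleanups.append(lambda: log.append("cleanup a"))


def case_b():
    cleanups.append(lambda: log.append("cleanup b"))
    raise KeyboardInterrupt  # stands in for the developer pressing Ctrl-C


def case_c():
    log.append("c ran")


results = run_cases([case_a, case_b, case_c], cleanups)
```

In this run `case_c` is never started, but the cleanup registered by the interrupted `case_b` still executes, which is the property that makes interrupting during development safe.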
2024-07-08selftests/bpf: DENYLIST.aarch64: Remove fexit_sleepPuranjay Mohan1-1/+0
The fexit_sleep test now runs successfully on the BPF CI, so remove it from the deny list. ftrace direct calls were blocking tracing programs on arm64, but this has been resolved by now. For more details see also the discussion in [*]. Signed-off-by: Puranjay Mohan <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] [*]
2024-07-08selftests/bpf: amend for wrong bpf_wq_set_callback_impl signatureBenjamin Tissoires3-8/+17
See the previous patch: the API was wrong, we were provided the pointer to the value, not the actual struct bpf_wq *. Signed-off-by: Benjamin Tissoires <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-07-08libbpf: Add NULL checks to bpf_object__{prev_map,next_map}Andreas Ziegler1-2/+2
In the current state, an erroneous call to bpf_object__find_map_by_name(NULL, ...) leads to a segmentation fault through the following call chain:

bpf_object__find_map_by_name(obj = NULL, ...)
-> bpf_object__for_each_map(pos, obj = NULL)
-> bpf_object__next_map((obj = NULL), NULL)
-> return (obj = NULL)->maps

While calling bpf_object__find_map_by_name with obj = NULL is obviously incorrect, this should not lead to a segmentation fault but rather be handled gracefully.

As __bpf_map__iter already handles this situation correctly, we can delegate the check for the regular case there and only add a check in case the prev or next parameter is NULL.

Signed-off-by: Andreas Ziegler <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-07-08Merge tag 'linux-cpupower-6.11-rc1-2' of ↵Rafael J. Wysocki2-9/+6
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shuah/linux into pm-tools

Merge more cpupower utility changes for 6.11-rc1 from Shuah Khan:

"This cpupower second update for Linux 6.11-rc1 consists of
-- fix to install cpupower library in standard library install location - /usr/lib
-- disable direct build of cpupower bench as it can only be built from the cpupower main makefile."

* tag 'linux-cpupower-6.11-rc1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
cpupower: fix lib default installation path
cpupower: Disable direct build of the 'bench' subproject
2024-07-08selftests/bpf: Remove exceptions tests from DENYLIST.s390xIlya Leoshkevich1-1/+0
Now that the s390x JIT supports exceptions, remove the respective tests from the denylist. Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-07-07perf dsos: When adding a dso into sorted dsos maintain the sort orderIan Rogers1-5/+21
dsos__add would add at the end of the dso array possibly requiring a later find to re-sort the array. Patterns of find then add were becoming O(n*log n) due to the sorts. Change the add routine to be O(n) rather than O(1) but to maintain the sorted-ness of the dsos array so that later finds don't need the O(n*log n) sort. Fixes: 3f4ac23a9908 ("perf dsos: Switch backing storage to array from rbtree/list") Reported-by: Namhyung Kim <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Steinar Gunderson <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Matt Fleming <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
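The trade-off can be illustrated with a small sketch. It is not the perf code (which is C and works on an array of struct pointers); it uses Python's `bisect` module on a list of names, and `add_sorted` and `find_sorted` are hypothetical helpers.

```python
import bisect


def add_sorted(names, name):
    """Insert name into the already-sorted list names, keeping it sorted.

    The binary search finds the slot in O(log n); list.insert() then shifts
    the tail right by one element, which is O(n) and comparable in spirit
    to a memmove() on a C array.
    """
    pos = bisect.bisect_right(names, name)
    names.insert(pos, name)
    return pos


def find_sorted(names, name):
    """Binary-search lookup that never needs to re-sort. Returns -1 if absent."""
    pos = bisect.bisect_left(names, name)
    return pos if pos < len(names) and names[pos] == name else -1


dsos = []
for dso_name in ["libc.so", "[vdso]", "perf", "ld.so"]:
    add_sorted(dsos, dso_name)
```

Because every insert leaves the list ordered, an interleaved pattern of find then add pays O(n) per add instead of triggering an O(n*log n) sort on the next find.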
2024-07-07perf comm str: Avoid sort during insertIan Rogers1-11/+18
The array is sorted, so just move the elements and insert in order. Fixes: 13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'") Reported-by: Matt Fleming <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Matt Fleming <[email protected]> Cc: Steinar Gunderson <[email protected]> Cc: Athira Rajeev <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>