aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2024-02-16selftests: bonding: make sure new active is not nullHangbin Liu1-2/+2
One of Jakub's tests[1] shows that there may be period all ports are down and no active slave. This makes the new_active_slave null and the test fails. Add a check to make sure the new active is not null. [ 189.051966] br0: port 2(s1) entered disabled state [ 189.317881] bond0: (slave eth1): link status definitely down, disabling slave [ 189.318487] bond0: (slave eth2): making interface the new active one [ 190.435430] br0: port 4(s2) entered disabled state [ 190.773786] bond0: (slave eth0): link status definitely down, disabling slave [ 190.774204] bond0: (slave eth2): link status definitely down, disabling slave [ 190.774715] bond0: now running without any active interface! [ 190.877760] bond0: (slave eth0): link status definitely up [ 190.878098] bond0: (slave eth0): making interface the new active one [ 190.878495] bond0: active interface up! [ 191.802872] br0: port 4(s2) entered blocking state [ 191.803157] br0: port 4(s2) entered forwarding state [ 191.813756] bond0: (slave eth2): link status definitely up [ 192.847095] br0: port 2(s1) entered blocking state [ 192.847396] br0: port 2(s1) entered forwarding state [ 192.853740] bond0: (slave eth1): link status definitely up # TEST: prio (active-backup ns_ip6_target primary_reselect 1) [FAIL] # Current active slave is null but not eth0 [1] https://netdev-3.bots.linux.dev/vmksft-bonding/results/464481/1-bond-options-sh/stdout Fixes: 45bf79bc56c4 ("selftests: bonding: reduce garp_test/arp_validate test time") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-15evm: Move to LSM infrastructureRoberto Sassu1-0/+3
As for IMA, move hardcoded EVM function calls from various places in the kernel to the LSM infrastructure, by introducing a new LSM named 'evm' (last and always enabled like 'ima'). The order in the Makefile ensures that 'evm' hooks are executed after 'ima' ones. Make EVM functions as static (except for evm_inode_init_security(), which is exported), and register them as hook implementations in init_evm_lsm(). Also move the inline functions evm_inode_remove_acl(), evm_inode_post_remove_acl(), and evm_inode_post_set_acl() from the public evm.h header to evm_main.c. Unlike before (see commit to move IMA to the LSM infrastructure), evm_inode_post_setattr(), evm_inode_post_set_acl(), evm_inode_post_remove_acl(), and evm_inode_post_removexattr() are not executed for private inodes. Finally, add the LSM_ID_EVM case in lsm_list_modules_test.c Signed-off-by: Roberto Sassu <[email protected]> Reviewed-by: Casey Schaufler <[email protected]> Acked-by: Christian Brauner <[email protected]> Reviewed-by: Stefan Berger <[email protected]> Reviewed-by: Mimi Zohar <[email protected]> Acked-by: Mimi Zohar <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2024-02-15ima: Move to LSM infrastructureRoberto Sassu1-0/+3
Move hardcoded IMA function calls (not appraisal-specific functions) from various places in the kernel to the LSM infrastructure, by introducing a new LSM named 'ima' (at the end of the LSM list and always enabled like 'integrity'). Having IMA before EVM in the Makefile is sufficient to preserve the relative order of the new 'ima' LSM in respect to the upcoming 'evm' LSM, and thus the order of IMA and EVM function calls as when they were hardcoded. Make moved functions as static (except ima_post_key_create_or_update(), which is not in ima_main.c), and register them as implementation of the respective hooks in the new function init_ima_lsm(). Select CONFIG_SECURITY_PATH, to ensure that the path-based LSM hook path_post_mknod is always available and ima_post_path_mknod() is always executed to mark files as new, as before the move. A slight difference is that IMA and EVM functions registered for the inode_post_setattr, inode_post_removexattr, path_post_mknod, inode_post_create_tmpfile, inode_post_set_acl and inode_post_remove_acl won't be executed for private inodes. Since those inodes are supposed to be fs-internal, they should not be of interest to IMA or EVM. The S_PRIVATE flag is used for anonymous inodes, hugetlbfs, reiserfs xattrs, XFS scrub and kernel-internal tmpfs files. Conditionally register ima_post_key_create_or_update() if CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS is enabled. Also, conditionally register ima_kernel_module_request() if CONFIG_INTEGRITY_ASYMMETRIC_KEYS is enabled. Finally, add the LSM_ID_IMA case in lsm_list_modules_test.c. Signed-off-by: Roberto Sassu <[email protected]> Acked-by: Chuck Lever <[email protected]> Acked-by: Casey Schaufler <[email protected]> Acked-by: Christian Brauner <[email protected]> Reviewed-by: Stefan Berger <[email protected]> Reviewed-by: Mimi Zohar <[email protected]> Acked-by: Mimi Zohar <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2024-02-15selftest/bpf: Test the read of vsyscall page under x86-64Hou Tao2-0/+102
Under x86-64, when using bpf_probe_read_kernel{_str}() or bpf_probe_read{_str}() to read vsyscall page, the read may trigger oops, so add one test case to ensure that the problem is fixed. Beside those four bpf helpers mentioned above, testing the read of vsyscall page by using bpf_probe_read_user{_str} and bpf_copy_from_user{_task}() as well. The test case passes the address of vsyscall page to these six helpers and checks whether the returned values are expected: 1) For bpf_probe_read_kernel{_str}()/bpf_probe_read{_str}(), the expected return value is -ERANGE as shown below: bpf_probe_read_kernel_common copy_from_kernel_nofault // false, return -ERANGE copy_from_kernel_nofault_allowed 2) For bpf_probe_read_user{_str}(), the expected return value is -EFAULT as show below: bpf_probe_read_user_common copy_from_user_nofault // false, return -EFAULT __access_ok 3) For bpf_copy_from_user(), the expected return value is -EFAULT: // return -EFAULT bpf_copy_from_user copy_from_user _copy_from_user // return false access_ok 4) For bpf_copy_from_user_task(), the expected return value is -EFAULT: // return -EFAULT bpf_copy_from_user_task access_process_vm // return 0 vma_lookup() // return 0 expand_stack() The occurrence of oops depends on the availability of CPU SMAP [1] feature and there are three possible configurations of vsyscall page in the boot cmd-line: vsyscall={xonly|none|emulate}, so there are a total of six possible combinations. Under all these combinations, the test case runs successfully. [1]: https://en.wikipedia.org/wiki/Supervisor_Mode_Access_Prevention Acked-by: Yonghong Song <[email protected]> Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski29-84/+274
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/dev.c 9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()") 723de3ebef03 ("net: free altname using an RCU callback") net/unix/garbage.c 11498715f266 ("af_unix: Remove io_uring code for GC.") 25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.") drivers/net/ethernet/renesas/ravb_main.c ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path" ) c2da9408579d ("ravb: Add Rx checksum offload support for GbEth") net/mptcp/protocol.c bdd70eb68913 ("mptcp: drop the push_pending field") 28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields") Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-15perf build: Cleanup perf register configurationLeo Yan1-21/+0
The target is to allow the tool to always enable the perf register feature for native parsing and cross parsing, and current code doesn't depend on the macro 'HAVE_PERF_REGS_SUPPORT'. This patch remove the variable 'NO_PERF_REGS' and the defined macro 'HAVE_PERF_REGS_SUPPORT' from the Makefile. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Guo Ren <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kan Liang <[email protected]> Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-15perf parse-regs: Introduce a weak function arch__sample_reg_masks()Leo Yan13-17/+68
Every architecture can provide a register list for sampling. If an architecture doesn't support register sampling, it won't define the data structure 'sample_reg_masks'. Consequently, any code using this structure must be protected by the macro 'HAVE_PERF_REGS_SUPPORT'. This patch defines a weak function, arch__sample_reg_masks(), which will be replaced by an architecture-defined function for returning the architecture's register list. With this refactoring, the function always exists, the condition checking for 'HAVE_PERF_REGS_SUPPORT' is not needed anymore, so remove it. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Guo Ren <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kan Liang <[email protected]> Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-15perf parse-regs: Always build perf register functionsLeo Yan11-71/+0
Currently, the macro HAVE_PERF_REGS_SUPPORT is used as a switch to turn on or turn off the code of perf registers. If any architecture cannot support perf register, it disables the perf register parsing, for both the native parsing and cross parsing for other architectures. To support both the native parsing and cross parsing, the tool should always build the perf regs functions. Thus, this patch removes HAVE_PERF_REGS_SUPPORT from the perf regs files. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Guo Ren <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kan Liang <[email protected]> Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-15perf build: Remove unused CONFIG_PERF_REGSLeo Yan1-4/+0
CONFIG_PERF_REGS is not used, remove it. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Guo Ren <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kan Liang <[email protected]> Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-15bpf: Fix test verif_scale_strobemeta_subprogs failure due to llvm19Yonghong Song2-4/+8
With latest llvm19, I hit the following selftest failures with $ ./test_progs -j libbpf: prog 'on_event': BPF program load failed: Permission denied libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG -- combined stack size of 4 calls is 544. Too large verification time 1344153 usec stack depth 24+440+0+32 processed 51008 insns (limit 1000000) max_states_per_insn 19 total_states 1467 peak_states 303 mark_read 146 -- END PROG LOAD LOG -- libbpf: prog 'on_event': failed to load: -13 libbpf: failed to load object 'strobemeta_subprogs.bpf.o' scale_test:FAIL:expect_success unexpected error: -13 (errno 13) #498 verif_scale_strobemeta_subprogs:FAIL The verifier complains too big of the combined stack size (544 bytes) which exceeds the maximum stack limit 512. This is a regression from llvm19 ([1]). In the above error log, the original stack depth is 24+440+0+32. To satisfy interpreter's need, in verifier the stack depth is adjusted to 32+448+32+32=544 which exceeds 512, hence the error. The same adjusted stack size is also used for jit case. But the jitted codes could use smaller stack size. $ egrep -r stack_depth | grep round_up arm64/net/bpf_jit_comp.c: ctx->stack_size = round_up(prog->aux->stack_depth, 16); loongarch/net/bpf_jit.c: bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16); powerpc/net/bpf_jit_comp.c: cgctx.stack_size = round_up(fp->aux->stack_depth, 16); riscv/net/bpf_jit_comp32.c: round_up(ctx->prog->aux->stack_depth, STACK_ALIGN); riscv/net/bpf_jit_comp64.c: bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16); s390/net/bpf_jit_comp.c: u32 stack_depth = round_up(fp->aux->stack_depth, 8); sparc/net/bpf_jit_comp_64.c: stack_needed += round_up(stack_depth, 16); x86/net/bpf_jit_comp.c: EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8)); x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8); x86/net/bpf_jit_comp.c: round_up(stack_depth, 8)); x86/net/bpf_jit_comp.c: int tcc_off = -4 - round_up(stack_depth, 8); x86/net/bpf_jit_comp.c: EMIT3_off32(0x48, 0x81, 0xC4, round_up(stack_depth, 8)); In the above, STACK_ALIGN in riscv/net/bpf_jit_comp32.c is defined as 16. So stack is aligned in either 8 or 16, x86/s390 having 8-byte stack alignment and the rest having 16-byte alignment. This patch calculates total stack depth based on 16-byte alignment if jit is requested. For the above failing case, the new stack size will be 32+448+0+32=512 and no verification failure. llvm19 regression will be discussed separately in llvm upstream. The verifier change caused three test failures as these tests compared messages with stack size. More specifically, - test_global_funcs/global_func1: fail with interpreter mode and success with jit mode. Adjusted stack sizes so both jit and interpreter modes will fail. - async_stack_depth/{pseudo_call_check, async_call_root_check}: since jit and interpreter will calculate different stack sizes, the failure msg is adjusted to omit those specific stack size numbers. [1] https://lore.kernel.org/bpf/[email protected]/ Suggested-by: Alexei Starovoitov <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-02-15Merge tag 'net-6.8-rc5' of ↵Linus Torvalds13-41/+163
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from can, wireless and netfilter. Current release - regressions: - af_unix: fix task hung while purging oob_skb in GC - pds_core: do not try to run health-thread in VF path Current release - new code bugs: - sched: act_mirred: don't zero blockid when net device is being deleted Previous releases - regressions: - netfilter: - nat: restore default DNAT behavior - nf_tables: fix bidirectional offload, broken when unidirectional offload support was added - openvswitch: limit the number of recursions from action sets - eth: i40e: do not allow untrusted VF to remove administratively set MAC address Previous releases - always broken: - tls: fix races and bugs in use of async crypto - mptcp: prevent data races on some of the main socket fields, fix races in fastopen handling - dpll: fix possible deadlock during netlink dump operation - dsa: lan966x: fix crash when adding interface under a lag when some of the ports are disabled - can: j1939: prevent deadlock by changing j1939_socks_lock to rwlock Misc: - a handful of fixes and reliability improvements for selftests - fix sysfs documentation missing net/ in paths - finish the work of squashing the missing MODULE_DESCRIPTION() warnings in networking" * tag 'net-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits) net: fill in MODULE_DESCRIPTION()s for missing arcnet net: fill in MODULE_DESCRIPTION()s for mdio_devres net: fill in MODULE_DESCRIPTION()s for ppp net: fill in MODULE_DESCRIPTION()s for fddik/skfp net: fill in MODULE_DESCRIPTION()s for plip net: fill in MODULE_DESCRIPTION()s for ieee802154/fakelb net: fill in MODULE_DESCRIPTION()s for xen-netback net: ravb: Count packets instead of descriptors in GbEth RX path pppoe: Fix memory leak in pppoe_sendmsg() net: sctp: fix skb leak in sctp_inq_free() net: bcmasp: Handle RX buffer allocation failure net-timestamp: make sk_tskey more predictable in error path selftests: tls: increase the wait in poll_partial_rec_async ice: Add check for lport extraction to LAG init netfilter: nf_tables: fix bidirectional offload regression netfilter: nat: restore default DNAT behavior netfilter: nft_set_pipapo: fix missing : in kdoc igc: Remove temporary workaround igb: Fix string truncation warnings in igb_set_fw_version can: netlink: Fix TDCO calculation using the old data bittiming ...
2024-02-15Merge tag 'devicetree-fixes-for-6.8-1' of ↵Linus Torvalds1-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Improve devlink dependency parsing for DT graphs - Fix devlink handling of io-channels dependencies - Fix PCI addressing in marvell,prestera example - A few schema fixes for property constraints - Improve performance of DT unprobed devices kselftest - Fix regression in DT_SCHEMA_FILES handling - Fix compile error in unittest for !OF_DYNAMIC * tag 'devicetree-fixes-for-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: ufs: samsung,exynos-ufs: Add size constraints on "samsung,sysreg" of: property: Add in-ports/out-ports support to of_graph_get_port_parent() of: property: Improve finding the supplier of a remote-endpoint property of: property: Improve finding the consumer of a remote-endpoint property net: marvell,prestera: Fix example PCI bus addressing of: unittest: Fix compile in the non-dynamic case of: property: fix typo in io-channels dt-bindings: tpm: Drop type from "resets" dt-bindings: display: nxp,tda998x: Fix 'audio-ports' constraints dt-bindings: xilinx: replace Piyush Mehta maintainership kselftest: dt: Stop relying on dirname to improve performance dt-bindings: don't anchor DT_SCHEMA_FILES to bindings directory
2024-02-14selftests: tls: increase the wait in poll_partial_rec_asyncJakub Kicinski1-2/+2
Test runners on debug kernels occasionally fail with: # # RUN tls_err.13_aes_gcm.poll_partial_rec_async ... # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1) # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0) # # poll_partial_rec_async: Test failed at step #17 # # FAIL tls_err.13_aes_gcm.poll_partial_rec_async # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async # # FAILED: 698 / 699 tests passed. This points to the second poll() in the test which is expected to wait for the sender to send the rest of the data. Apparently under some conditions that doesn't happen within 5ms, bump the timeout to 20ms. Fixes: 23fcb62bc19c ("selftests: tls: add tests for poll behavior") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-14Merge tag 'landlock-6.8-rc5' of ↵Linus Torvalds3-13/+59
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock test fixes from Mickaël Salaün: "Fix build issues for tests, and improve test compatibility" * tag 'landlock-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Fix capability for net_test selftests/landlock: Fix fs_test build with old libc selftests/landlock: Fix net_test build with old libc
2024-02-14libbpf: Make remark about zero-initializing bpf_*_info structsMatt Bobrowski1-5/+17
In some situations, if you fail to zero-initialize the bpf_{prog,map,btf,link}_info structs supplied to the set of LIBBPF helpers bpf_{prog,map,btf,link}_get_info_by_fd(), you can expect the helper to return an error. This can possibly leave people in a situation where they're scratching their heads for an unnnecessary amount of time. Make an explicit remark about the requirement of zero-initializing the supplied bpf_{prog,map,btf,link}_info structs for the respective LIBBPF helpers. Internally, LIBBPF helpers bpf_{prog,map,btf,link}_get_info_by_fd() call into bpf_obj_get_info_by_fd() where the bpf(2) BPF_OBJ_GET_INFO_BY_FD command is used. This specific command is effectively backed by restrictions enforced by the bpf_check_uarg_tail_zero() helper. This function ensures that if the size of the supplied bpf_{prog,map,btf,link}_info structs are larger than what the kernel can handle, trailing bits are zeroed. This can be a problem when compiling against UAPI headers that don't necessarily match the sizes of the same underlying types known to the kernel. Signed-off-by: Matt Bobrowski <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-02-14Merge tag 'kvm-x86-selftests-6.8-rcN' of https://github.com/kvm-x86/linux ↵Paolo Bonzini58-239/+236
into HEAD KVM selftests fixes/cleanups (and one KVM x86 cleanup) for 6.8: - Remove redundant newlines from error messages. - Delete an unused variable in the AMX test (which causes build failures when compiling with -Werror). - Fail instead of skipping tests if open(), e.g. of /dev/kvm, fails with an error code other than ENOENT (a Hyper-V selftest bug resulted in an EMFILE, and the test eventually got skipped). - Fix TSC related bugs in several Hyper-V selftests. - Fix a bug in the dirty ring logging test where a sem_post() could be left pending across multiple runs, resulting in incorrect synchronization between the main thread and the vCPU worker thread. - Relax the dirty log split test's assertions on 4KiB mappings to fix false positives due to the number of mappings for memslot 0 (used for code and data that is NOT being dirty logged) changing, e.g. due to NUMA balancing. - Have KVM's gtod_is_based_on_tsc() return "bool" instead of an "int" (the function generates boolean values, and all callers treat the return value as a bool).
2024-02-14Merge branch 'x86/bugs' into x86/core, to pick up pending changes before ↵Ingo Molnar3-8/+8
dependent patches Merge in pending alternatives patching infrastructure changes, before applying more patches. Signed-off-by: Ingo Molnar <[email protected]>
2024-02-13bpf: emit source code file name and line number in verifier logAndrii Nakryiko1-2/+2
As BPF applications grow in size and complexity and are separated into multiple .bpf.c files that are statically linked together, it becomes harder and harder to match verifier's BPF assembly level output to original C code. While often annotated C source code is unique enough to be able to identify the file it belongs to, quite often this is actually problematic as parts of source code can be quite generic. Long story short, it is very useful to see source code file name and line number information along with the original C code. Verifier already knows this information, we just need to output it. This patch extends verifier log with file name and line number information, emitted next to original (presumably C) source code, annotating BPF assembly output, like so: ; <original C code> @ <filename>.bpf.c:<line> If file name has directory names in it, they are stripped away. This should be fine in practice as file names tend to be pretty unique with C code anyways, and keeping log size smaller is always good. In practice this might look something like below, where some code is coming from application files, while others are from libbpf's usdt.bpf.h header file: ; if (STROBEMETA_READ( @ strobemeta_probe.bpf.c:534 5592: (79) r1 = *(u64 *)(r10 -56) ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 5593: (7b) *(u64 *)(r10 -56) = r1 ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 5594: (79) r3 = *(u64 *)(r10 -8) ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm ... 170: (71) r1 = *(u8 *)(r8 +15) ; frame1: R1_w=scalar(...) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208) 171: (67) r1 <<= 56 ; frame1: R1_w=scalar(...) 172: (c7) r1 s>>= 56 ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127) ; val <<= arg_spec->arg_bitshift; @ usdt.bpf.h:183 173: (67) r1 <<= 32 ; frame1: R1_w=scalar(...) 174: (77) r1 >>= 32 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 175: (79) r2 = *(u64 *)(r10 -8) ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm 176: (6f) r2 <<= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar() 177: (7b) *(u64 *)(r10 -8) = r2 ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61) ; if (arg_spec->arg_signed) @ usdt.bpf.h:184 178: (bf) r3 = r2 ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61) 179: (7f) r3 >>= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar() ; if (arg_spec->arg_signed) @ usdt.bpf.h:184 180: (71) r4 = *(u8 *)(r8 +14) 181: safe log_fixup tests needed a minor adjustment as verifier log output increased a bit and that test is quite sensitive to such changes. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-02-13selftests/bpf: add anonymous user struct as global subprog arg testAndrii Nakryiko1-0/+29
Add tests validating that kernel handles pointer to anonymous struct argument as PTR_TO_MEM case, not as PTR_TO_CTX case. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-02-13bpf: don't infer PTR_TO_CTX for programs with unnamed context typeAndrii Nakryiko1-0/+19
For program types that don't have named context type name (e.g., BPF iterator programs or tracepoint programs), ctx_tname will be a non-NULL empty string. For such programs it shouldn't be possible to have PTR_TO_CTX argument for global subprogs based on type name alone. arg:ctx tag is the only way to have PTR_TO_CTX passed into global subprog for such program types. Fix this loophole, which currently would assume PTR_TO_CTX whenever user uses a pointer to anonymous struct as an argument to their global subprogs. This happens in practice with the following (quite common, in practice) approach: typedef struct { /* anonymous */ int x; } my_type_t; int my_subprog(my_type_t *arg) { ... } User's intent is to have PTR_TO_MEM argument for `arg`, but verifier will complain about expecting PTR_TO_CTX. This fix also closes unintended s390x-specific KPROBE handling of PTR_TO_CTX case. Selftest change is necessary to accommodate this. Fixes: 91cc1a99740e ("bpf: Annotate context types") Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-02-13selftests/bpf: Test PTR_MAYBE_NULL arguments of struct_ops operators.Kui-Feng Lee5-1/+115
Test if the verifier verifies nullable pointer arguments correctly for BPF struct_ops programs. "test_maybe_null" in struct bpf_testmod_ops is the operator defined for the test cases here. A BPF program should check a pointer for NULL beforehand to access the value pointed by the nullable pointer arguments, or the verifier should reject the programs. The test here includes two parts; the programs checking pointers properly and the programs not checking pointers beforehand. The test checks if the verifier accepts the programs checking properly and rejects the programs not checking at all. Signed-off-by: Kui-Feng Lee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-02-13perf metric: Don't remove scale from countsIan Rogers1-6/+1
Counts were switched from the scaled saved value form to the aggregated count to avoid double accounting. When this happened the removing of scaling for a count should have been removed, however, it wasn't and this wasn't observed as it normally doesn't matter because a counter's scale is 1. A problem was observed with RAPL events that are scaled. Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather than saved_value") Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: James Clark <[email protected]> Cc: Kaige Ye <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-13perf stat: Avoid metric-only segvIan Rogers1-1/+1
Cycles is recognized as part of a hard coded metric in stat-shadow.c, it may call print_metric_only with a NULL fmt string leading to a segfault. Handle the NULL fmt explicitly. Fixes: 088519f318be ("perf stat: Move the display functions to stat-display.c") Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: James Clark <[email protected]> Cc: Kaige Ye <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-13perf expr: Fix "has_event" function for metric style eventsIan Rogers1-1/+19
Events in metrics cannot use '/' as a separator, it would be recognized as a divide, so they use '@'. The '@' is recognized in the metricgroups code and changed to '/', do the same in the has_event function so that the parsing is only tried without the @s. Fixes: 4a4a9bf9075f ("perf expr: Add has_event function") Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: James Clark <[email protected]> Cc: Kaige Ye <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-13perf expr: Allow NaN to be a valid numberIan Rogers1-0/+9
Currently only floating point numbers can be parsed, add a special case for NaN. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: James Clark <[email protected]> Cc: Kaige Ye <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-13selftests/resctrl: Get domain id from cache idIlpo Järvinen3-12/+21
Domain id is acquired differently depending on CPU. AMD tests use id from L3 cache, whereas CPUs from other vendors base the id on topology package id. In order to support L2 CAT test, this has to be generalized. The driver side code seems to get the domain ids from cache ids so the approach used by the AMD branch seems to match the kernel-side code. It will also work with L2 domain IDs as long as the cache level is generalized. Using the topology id was always fragile due to mismatch with the kernel-side way to acquire the domain id. It got incorrect domain id, e.g., when Cluster-on-Die (CoD) is enabled for CPU (but CoD is not well suited for resctrl in the first place so it has not been a big issue if tests don't work correctly with it). Taking all the above into account, generalize acquiring the domain id by taking it from the cache id and do not hard-code the cache level. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Rename resource ID to domain IDIlpo Järvinen3-25/+25
Kernel-side calls the instances of a resource domains. Change the resource_id naming in the selftest code to domain_id to match the kernel side better. Suggested-by: Maciej Wieczór-Retman <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Add helper to convert L2/3 to integerIlpo Järvinen1-8/+20
"L2"/"L3" conversion to integer is embedded into get_cache_size() which prevents reuse. Create a helper for the cache string to integer conversion to make it reusable. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Pass write_schemata() resource instead of test nameIlpo Järvinen7-38/+38
write_schemata() takes the test name as an argument and determines the relevant resource based on the test name. Such mapping from name to resource does not really belong to resctrlfs.c that should provide only generic, test-independent functions. Pass the resource stored in the test information structure to write_schemata() instead of the test name. The new API is also more flexible as it enables to use write_schemata() for more than one resource within a test. While touching the sprintf(), move the unnecessary %c that is always '=' directly into the format string. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Introduce generalized test frameworkIlpo Järvinen7-121/+148
Each test currently has a "run test" function in per test file and another resctrl_tests.c. The functions in resctrl_tests.c are almost identical. Generalize the one in resctrl_tests.c such that it can be shared between all of the tests. It makes adding new tests easier and removes the per test if () forests. Also add comment to CPU vendor IDs that they must be defined as bits for a bitmask. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Create struct for input parametersIlpo Järvinen7-71/+95
resctrl_tests reads a set of parameters and passes them individually for each tests which causes variations in the call signature between the tests. Add struct input_params to hold all input parameters. It can be easily passed to every test without varying the call signature. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Restore the CPU affinity after CAT testIlpo Järvinen4-9/+42
CAT test does not reset the CPU affinity after the benchmark. This is relatively harmless as is because CAT test is the last benchmark to run, however, more tests may be added later. Store the CPU affinity the first time taskset_benchmark() is run and add taskset_restore() which the test can call to reset the CPU mask to its original value. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Rewrite Cache Allocation Technology (CAT) testIlpo Järvinen4-199/+139
CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT. Rewrite the CAT test to allocate a buffer sized to half of LLC. Then perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly. The new CAT test is using only a single process because it relies on measured effect against another run of itself rather than another process adding noise. The rest of the system is set to use the CBM bits not used by the CAT test to keep the test isolated. Replace count_bits() with count_contiguous_bits() to get the first bit position in order to be able to calculate masks based on it. This change has been tested with a number of systems from different generations. Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Read in less obvious order to defeat prefetch optimizationsIlpo Järvinen1-8/+30
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results. Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo. Using a prime multiplier with modulo ensures the entire buffer is eventually read. 23 is small enough that the reads are spread out but wrapping does not occur very frequently (wrapping too often can trigger L2 hits more frequently which causes noise to the test because getting the data from LLC is not required). It was discovered that not all primes work equally well and some can cause wildly unstable results (e.g., in an earlier version of this patch, the reads were done in reversed order and 59 was used as the prime resulting in unacceptably high and unstable results in MBA and MBM test on some architectures). Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B499@TYAPR01MB6330.jpnprd01.prod.outlook.com/ Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Replace file write with volatile variableIlpo Järvinen3-21/+16
The fill_buf code prevents compiler optimizating the entire read loop away by writing the final value of the variable into a file. While it achieves the goal, writing into a file requires significant amount of work within the innermost test loop and also error handling. A simpler approach is to take advantage of volatile. Writing through a pointer to a volatile variable is enough to prevent compiler from optimizing the write away, and therefore compiler cannot remove the read loop either. Add a volatile 'value_sink' into resctrl_tests.c and make fill_buf to write into it. As a result, the error handling in fill_buf.c can be simplified. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Open perf fd before start & add error handlingIlpo Järvinen3-11/+23
Perf fd (pe_fd) is opened, reset, and enabled during every test the CAT selftest runs. Also, ioctl(pe_fd, ...) calls are not error checked even if ioctl() could return an error. Open perf fd only once before the tests and only reset and enable the counter within the test loop. Add error checking to pe_fd ioctl() calls. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Move cat_val() to cat_test.c and rename to cat_test()Ilpo Järvinen3-87/+90
The main CAT test function is called cat_val() and resides in cache.c which is illogical. Rename the function to cat_test() and move it into cat_test.c. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Convert perf related globals to localsIlpo Järvinen1-32/+40
Perf related variables pea_llc_miss, pe_read, and pe_fd are globals in cache.c. Convert them to locals for better scoping and make pea_llc_miss simpler by renaming it to pea. Make close(pe_fd) handling easier to understand by doing it inside cat_val(). Make also sizeof()s use safer way to determine the right struct. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Improve perf initIlpo Järvinen1-10/+6
struct perf_event_attr initialization is spread into perf_event_initialize() and perf_event_attr_initialize() and setting ->config is hardcoded by the deepest level. perf_event_attr init belongs to perf_event_attr_initialize() so move it entirely there. Rename the other function perf_event_initialized_read_format(). Call each init function directly from the test as they will take different parameters (especially true after the perf related global variables are moved to local variables). Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Consolidate naming of perf event related thingsIlpo Järvinen1-21/+21
Naming for perf event related functions, types, and variables is inconsistent. Make struct read_format and all functions related to perf events start with "perf_". Adjust variable names towards the same direction but use shorter names for variables where appropriate (pe prefix). Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Remove nested calls in perf event handlingIlpo Järvinen1-50/+14
Perf event handling has functions that are the sole caller of another perf event handling related function: - reset_enable_llc_perf() calls perf_event_open_llc_miss() - perf_event_measure() calls get_llc_perf() Remove the extra layer of calls to make the code easier to follow by moving the code into the calling function. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Remove unnecessary __u64 -> unsigned long conversionIlpo Järvinen3-19/+13
Perf counters are __u64 but the code converts them to unsigned long before printing them out. Remove unnecessary type conversion and retain the perf originating value as __u64. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Split show_cache_info() to test specific and generic partsIlpo Järvinen4-44/+66
show_cache_info() calculates results and provides generic cache information. This makes it hard to alter pass/fail conditions. Separate the test specific checks into CAT and CMT test files and leave only the generic information part into show_cache_info(). Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Split measure_cache_vals()Ilpo Järvinen3-27/+41
measure_cache_vals() does a different thing depending on the test case that called it: - For CAT, it measures LLC misses through perf. - For CMT, it measures LLC occupancy through resctrl. Split these two functionalities into own functions the CAT and CMT tests can call directly. Replace passing the struct resctrl_val_param parameter with the filename because it's more generic and all those functions need out of resctrl_val. Co-developed-by: Fenghua Yu <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Exclude shareable bits from schemata in CAT testIlpo Järvinen3-4/+99
CAT test doesn't take shareable bits into account, i.e., the test might be sharing cache with some devices (e.g., graphics). Introduce get_mask_no_shareable() and use it to provision an environment for CAT test where the allocated LLC is isolated better. Excluding shareable_bits may create hole(s) into the cbm_mask, thus add a new helper count_contiguous_bits() to find the longest contiguous set of CBM bits. create_bit_mask() is needed by an upcoming CAT test rewrite so make it available in resctrl.h right away. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Create cache_portion_size() helperIlpo Järvinen3-9/+33
CAT and CMT tests calculate size of the cache portion for the n-bits cache allocation on their own. Add cache_portion_size() helper that calculates size of the cache portion for the given number of bits and use it to replace the existing span calculations. This also prepares for the new CAT test that will need to determine the size of the cache portion also during results processing. Rename also 'cache_size' local variables to 'cache_total_size' to prevent misinterpretations. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Mark get_cache_size() cache_type constIlpo Järvinen2-2/+2
get_cache_size() does not modify cache_type so it could be const. Mark cache_type const so that const char * can be passed to it. This prevents warnings once many of the test parameters are marked const. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Refactor get_cbm_mask() and rename to get_full_cbm()Ilpo Järvinen4-24/+41
Callers of get_cbm_mask() are required to pass a string into which the capacity bitmask (CBM) is read. Neither CAT nor CMT tests need the bitmask as string but just convert it into an unsigned long value. Another limitation is that the bit mask reader can only read .../cbm_mask files. Generalize the bit mask reading function into get_bit_mask() such that it can be used to handle other files besides the .../cbm_mask and handles the unsigned long conversion within get_bit_mask() using fscanf(). Change get_cbm_mask() to use get_bit_mask() and rename it to get_full_cbm() to better indicate what the function does. Return error from get_full_cbm() if the bitmask is zero for some reason because it makes the code more robust as the selftests naturally assume the bitmask has some bits. Also mark cache_type const while at it and remove useless comments that are related to processing of CBM bits. Co-developed-by: Fenghua Yu <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Refactor fill_buf functionsIlpo Järvinen2-43/+18
There are unnecessary nested calls in fill_buf.c: - run_fill_buf() calls fill_cache() - alloc_buffer() calls malloc_and_init_memory() Simplify the code flow and remove those unnecessary call levels by moving the called code inside the calling function and remove the duplicated error print. Resolve the difference in run_fill_buf() and fill_cache() parameter name into 'buf_size' which is more descriptive than 'span'. Also, while moving the allocation related code, rename 'p' into 'buf' to be consistent in naming the variables. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Split fill_buf to allow tests finer-grained controlIlpo Järvinen1-6/+15
MBM, MBA and CMT test cases call run_fill_buf() that in turn calls fill_cache() to alloc and loop indefinitely around the buffer. This binds buffer allocation and running the benchmark into a single bundle so that a selftest cannot allocate a buffer once and reuse it. CAT test doesn't want to loop around the buffer continuously and after rewrite it needs the ability to allocate the buffer separately. Split buffer allocation out of fill_cache() into alloc_buffer(). This change is part of preparation for the new CAT test that allocates a buffer and does multiple passes over the same buffer (but not in an infinite loop). Co-developed-by: Fenghua Yu <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>