aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2024-06-20KVM: selftests: arm64: Test writes to CTR_EL0Sebastian Ott1-0/+16
Test that CTR_EL0 is modifiable from userspace, that changes are visible to guests, and that they are preserved across a vCPU reset. Signed-off-by: Sebastian Ott <[email protected]> Reviewed-by: Eric Auger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-20tools/perf: Handle perftool-testsuite_probe testcases fail when kernel ↵Athira Rajeev1-3/+28
debuginfo is not present Running "perftool-testsuite_probe" fails as below: ./perf test -v "perftool-testsuite_probe" 83: perftool-testsuite_probe : FAILED There are three fails: 1. Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)" -- [ FAIL ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf probe -l (output regexp parsing) 2. Regexp not found: "probe:vfs_mknod" Regexp not found: "probe:vfs_create" Regexp not found: "probe:vfs_rmdir" Regexp not found: "probe:vfs_link" Regexp not found: "probe:vfs_write" -- [ FAIL ] -- perf_probe :: test_adding_kernel :: wildcard adding support (command exitcode + output regexp parsing) 3. Regexp not found: "Failed to find" Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64" Regexp not found: "in this function|at this address" Line did not match any pattern: "The /boot/vmlinux file has no debug information." Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package." These three tests depends on kernel debug info. 1. Fail 1 expects file name along with probe which needs debuginfo 2. Fail 2 : perf probe -nf --max-probes=512 -a 'vfs_* $params' Debuginfo-analysis is not supported. Error: Failed to add events. 3. Fail 3 : perf probe 'vfs_read somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64' Debuginfo-analysis is not supported. Error: Failed to add events. There is already helper function skip_if_no_debuginfo in lib/probe_vfs_getname.sh which does perf probe and returns "2" if debug info is not present. Use the skip_if_no_debuginfo function and skip only the three tests which needs debuginfo based on the result. With the patch: 83: perftool-testsuite_probe: --- start --- test child forked, pid 3927 -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: -a -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: --add -- [ PASS ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf list Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)" -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped -- [ PASS ] -- perf_probe :: test_adding_kernel :: using added probe -- [ PASS ] -- perf_probe :: test_adding_kernel :: deleting added probe -- [ PASS ] -- perf_probe :: test_adding_kernel :: listing removed probe (should NOT be listed) -- [ PASS ] -- perf_probe :: test_adding_kernel :: dry run :: adding probe -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: first probe adding -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (without force) -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (with force) -- [ PASS ] -- perf_probe :: test_adding_kernel :: using doubled probe -- [ PASS ] -- perf_probe :: test_adding_kernel :: removing multiple probes Regexp not found: "probe:vfs_mknod" Regexp not found: "probe:vfs_create" Regexp not found: "probe:vfs_rmdir" Regexp not found: "probe:vfs_link" Regexp not found: "probe:vfs_write" -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped Regexp not found: "Failed to find" Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64" Regexp not found: "in this function|at this address" Line did not match any pattern: "The /boot/vmlinux file has no debug information." Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package." -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped -- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: add -- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: record -- [ PASS ] -- perf_probe :: test_adding_kernel :: function argument probing :: script ## [ PASS ] ## perf_probe :: test_adding_kernel SUMMARY ---- end(0) ---- 83: perftool-testsuite_probe : Ok Only the three specific tests are skipped and remaining ran successfully. Signed-off-by: Athira Rajeev <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-20cpupower: Change the var type of the 'monitor' subcommand display modeRoman Storozhenko1-1/+1
There is a type 'enum operation_mode_e' contains the display modes of the 'monitor' subcommand. This type isn't used though, instead the variable 'mode' is of a simple 'int' type. Change 'mode' variable type from 'int' to 'enum operation_mode_e' in order to improve compiler type checking. Built and tested this with different monitor cmdline params. Everything works as expected, that is nothing changed and no regressions encountered. Signed-off-by: Roman Storozhenko <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-06-20cpupower: Remove absent 'v' parameter from monitor man pageRoman Storozhenko1-5/+0
Remove not supported '-v' parameter from the cpupower's 'monitor' command description. There is a '-v' parameter described in cpupower's 'monitor' command man page. It isn't supported at the moment, and perhaps has never been supported. When I run the monitor with this parameter I get the following: $ sudo LD_LIBRARY_PATH=lib64/ bin/cpupower monitor -v monitor: invalid option -- 'v' invalid or unknown argument $ sudo LD_LIBRARY_PATH=lib64/ bin/cpupower monitor -V monitor: invalid option -- 'V' invalid or unknown argument Signed-off-by: Roman Storozhenko <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-06-20selftests: virtio_net: add forgotten config optionsJiri Pirko1-1/+7
One may use tools/testing/selftests/drivers/net/virtio_net/config for example for vng build command like this one: $ vng -v -b -f tools/testing/selftests/drivers/net/virtio_net/config In that case, the needed kernel config options are not turned on. Add the missed kernel config options. Reported-by: Jakub Kicinski <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Reported-by: Matthieu Baerts <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Fixes: ccfaed04db5e ("selftests: virtio_net: add initial tests") Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Xuan Zhuo <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-19binfmt_elf: Honor PT_LOAD alignment for static PIEKees Cook1-1/+1
The p_align values in PT_LOAD were ignored for static PIE executables (i.e. ET_DYN without PT_INTERP). This is because there is no way to request a non-fixed mmap region with a specific alignment. ET_DYN with PT_INTERP uses a separate base address (ELF_ET_DYN_BASE) and binfmt_elf performs the ASLR itself, which means it can also apply alignment. For the mmap region, the address selection happens deep within the vm_mmap() implementation (when the requested address is 0). The earlier attempt to implement this: commit 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE") commit 925346c129da ("fs/binfmt_elf: fix PT_LOAD p_align values for loaders") did not take into account the different base address origins, and were eventually reverted: aeb7923733d1 ("revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"") In order to get the correct alignment from an mmap base, binfmt_elf must perform a 0-address load first, then tear down the mapping and perform alignment on the resulting address. Since this is slightly more overhead, only do this when it is needed (i.e. the alignment is not the default ELF alignment). This does, however, have the benefit of being able to use MAP_FIXED_NOREPLACE, to avoid potential collisions. With this fixed, enable the static PIE self tests again. Reported-by: H.J. Lu <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215275 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
2024-06-19selftests/exec: Build both static and non-static load_address testsKees Cook2-20/+66
After commit 4d1cd3b2c5c1 ("tools/testing/selftests/exec: fix link error"), the load address alignment tests tried to build statically. This was silently ignored in some cases. However, after attempting to further fix the build by switching to "-static-pie", the test started failing. This appears to be due to non-PT_INTERP ET_DYN execs ("static PIE") not doing alignment correctly, which remains unfixed[1]. See commit aeb7923733d1 ("revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"") for more details. Provide rules to build both static and non-static PIE binaries, improve debug reporting, and perform several test steps instead of a single all-or-nothing test. However, do not actually enable static-pie tests; alignment specification is only supported for ET_DYN with PT_INTERP ("regular PIE"). Link: https://bugzilla.kernel.org/show_bug.cgi?id=215275 [1] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
2024-06-19selftest/cgroup: Update test_cpuset_prs.sh to match changesWaiman Long1-15/+40
Unlike the list of isolated CPUs, it is not easy to programamatically determine what sched domains are being created by the scheduler just by examinng the data in various kernfs filesystems. The easiest way to get this information is by enabling /sys/kernel/debug/sched/verbose file to make those information displayed in the console. This is also what the test_cpuset_prs.sh script is doing when the -v flag is given. It is rather hard to fetch the data from the console and compare it to the expected result. An easier way is to dump the expected sched-domain information out to the console so that they can be visually compared with the actual sched domain data. However, this have to be done manually by visual inspection and so will only be done once in a while. Moreover the preceding cpuset commits also change the cpuset behavior requiring corresponding chanages in some test cases as well as new test cases to test the newly added functionality. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2024-06-19selftest/cgroup: Fix test_cpuset_prs.sh problems reported by test robotWaiman Long1-6/+14
The test robot reported two different problems when running the test_cpuset_prs.sh test. # ./test_cpuset_prs.sh: line 106: echo: write error: Input/output error # : # Effective cpus changed to 0-1,4-7 after test 4! The write error is caused by writing to /dev/console. It looks like some systems may not have /dev/console configured or in a writeable state. Fix this by checking the existence of /dev/console before attempting to write it. After the completion of each test run, the script will check if the cpuset state is reset back to the original state. That usually takes a while to happen. The test script inserts some artificial delay to make sure that the reset has completed. The current setting is about 80ms. That may not be enough in some cases especially if the test system is slow. Double it to 160ms to minimize the chance of this type of failure. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-lkp/[email protected] Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2024-06-19selftests: add selftest for the SRv6 End.DX6 behavior with netfilterJianguo Wu3-0/+342
this selftest is designed for evaluating the SRv6 End.DX6 behavior used with netfilter(rpfilter), in this example, for implementing IPv6 L3 VPN use cases. Signed-off-by: Jianguo Wu <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-06-19selftests: add selftest for the SRv6 End.DX4 behavior with netfilterJianguo Wu3-0/+337
this selftest is designed for evaluating the SRv6 End.DX4 behavior used with netfilter(rpfilter), in this example, for implementing IPv4 L3 VPN use cases. Signed-off-by: Jianguo Wu <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-06-19selftests: openvswitch: Set value to nla flags.Adrian Moreno1-1/+1
Netlink flags, although they don't have payload at the netlink level, are represented as having "True" as value in pyroute2. Without it, trying to add a flow with a flag-type action (e.g: pop_vlan) fails with the following traceback: Traceback (most recent call last): File "[...]/ovs-dpctl.py", line 2498, in <module> sys.exit(main(sys.argv)) ^^^^^^^^^^^^^^ File "[...]/ovs-dpctl.py", line 2487, in main ovsflow.add_flow(rep["dpifindex"], flow) File "[...]/ovs-dpctl.py", line 2136, in add_flow reply = self.nlm_request( ^^^^^^^^^^^^^^^^^ File "[...]/pyroute2/netlink/nlsocket.py", line 822, in nlm_request return tuple(self._genlm_request(*argv, **kwarg)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "[...]/pyroute2/netlink/generic/__init__.py", line 126, in nlm_request return tuple(super().nlm_request(*argv, **kwarg)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "[...]/pyroute2/netlink/nlsocket.py", line 1124, in nlm_request self.put(msg, msg_type, msg_flags, msg_seq=msg_seq) File "[...]/pyroute2/netlink/nlsocket.py", line 389, in put self.sendto_gate(msg, addr) File "[...]/pyroute2/netlink/nlsocket.py", line 1056, in sendto_gate msg.encode() File "[...]/pyroute2/netlink/__init__.py", line 1245, in encode offset = self.encode_nlas(offset) ^^^^^^^^^^^^^^^^^^^^^^^^ File "[...]/pyroute2/netlink/__init__.py", line 1560, in encode_nlas nla_instance.setvalue(cell[1]) File "[...]/pyroute2/netlink/__init__.py", line 1265, in setvalue nlv.setvalue(nla_tuple[1]) ~~~~~~~~~^^^ IndexError: list index out of range Signed-off-by: Adrian Moreno <[email protected]> Acked-by: Aaron Conole <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-06-18Merge tag 'linux_kselftest-fixes-6.10-rc5' of ↵Linus Torvalds4-8/+35
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: - filesystems: warn_unused_result warnings - seccomp: format-zero-length warnings - fchmodat2: clang build warnings due to-static-libasan - openat2: clang build warnings due to static-libasan, LOCAL_HDRS * tag 'linux_kselftest-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/fchmodat2: fix clang build failure due to -static-libasan selftests/openat2: fix clang build failures: -static-libasan, LOCAL_HDRS selftests: seccomp: fix format-zero-length warnings selftests: filesystems: fix warn_unused_result build warnings
2024-06-18selftests: openvswitch: Use bash as interpreterSimon Horman1-1/+1
openvswitch.sh makes use of substitutions of the form ${ns:0:1}, to obtain the first character of $ns. Empirically, this is works with bash but not dash. When run with dash these evaluate to an empty string and printing an error to stdout. # dash -c 'ns=client; echo "${ns:0:1}"' 2>error # cat error dash: 1: Bad substitution # bash -c 'ns=client; echo "${ns:0:1}"' 2>error c # cat error This leads to tests that neither pass nor fail. F.e. TEST: arp_ping [START] adding sandbox 'test_arp_ping' Adding DP/Bridge IF: sbx:test_arp_ping dp:arpping {, , } create namespaces ./openvswitch.sh: 282: eval: Bad substitution TEST: ct_connect_v4 [START] adding sandbox 'test_ct_connect_v4' Adding DP/Bridge IF: sbx:test_ct_connect_v4 dp:ct4 {, , } ./openvswitch.sh: 322: eval: Bad substitution create namespaces Resolve this by making openvswitch.sh a bash script. Fixes: 918423fda910 ("selftests: openvswitch: add an initial flow programming case") Signed-off-by: Simon Horman <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-18cpupower: Improve cpupower build process descriptionRoman Storozhenko1-10/+150
Enhance cpupower build process description with the information on building and installing the utility to the user defined directories as well as with the information on the way of running the utility from the custom defined installation directory. Signed-off-by: Roman Storozhenko <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-06-18cpupower: Add 'help' target to the main MakefileRoman Storozhenko1-1/+36
Make "cpupower" building process more user friendly by adding 'help' target to the main makefile. This target describes various build and cleaning options available to the user. Signed-off-by: Roman Storozhenko <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-06-18cpupower: Replace a dead reference link with working onesRoman Storozhenko1-3/+5
Replace a dead reference link to a turbo boost technology description with a reference to a root page of the technology on the Intel site, and add another one, describing power management technology, which includes short description of the turbo boost. Signed-off-by: Roman Storozhenko <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-06-18KVM: selftests: Test vCPU boot IDs above 2^32 and MAX_VCPU_IDMathias Krause1-0/+16
The KVM_SET_BOOT_CPU_ID ioctl missed to reject invalid vCPU IDs. Verify this no longer works and gets rejected with an appropriate error code. Signed-off-by: Mathias Krause <[email protected]> Link: https://lore.kernel.org/r/[email protected] [sean: add test for MAX_VCPU_ID+1, always do negative test] Signed-off-by: Sean Christopherson <[email protected]>
2024-06-18KVM: selftests: Test max vCPU IDs corner casesMathias Krause1-2/+20
The KVM_CREATE_VCPU ioctl ABI had an implicit integer truncation bug, allowing 2^32 aliases for a vCPU ID by setting the upper 32 bits of a 64 bit ioctl() argument. It also allowed excluding a once set boot CPU ID. Verify this no longer works and gets rejected with an error. Signed-off-by: Mathias Krause <[email protected]> Link: https://lore.kernel.org/r/[email protected] [sean: tweak assert message+comment for 63:32!=0 testcase] Signed-off-by: Sean Christopherson <[email protected]>
2024-06-18kselftest/alsa: Fix validation of writes to volatile controlsMark Brown1-16/+29
When validating writes to controls we check that whatever value we wrote actually appears in the control. For volatile controls we cannot assume that this will be the case, the value may be changed at any time including between our write and read. Handle this by moving the check for volatile controls that we currently do for events to a separate block and just verifying that whatever value we read is valid for the control. Signed-off-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Link: https://lore.kernel.org/[email protected]
2024-06-17selftests: mptcp: userspace_pm: fixed subtest namesMatthieu Baerts (NGI0)1-18/+28
It is important to have fixed (sub)test names in TAP, because these names are used to identify them. If they are not fixed, tracking cannot be done. Some subtests from the userspace_pm selftest were using random numbers in their names: the client and server address IDs from $RANDOM, and the client port number randomly picked by the kernel when creating the connection. These values have been replaced by 'client' and 'server' words: that's even more helpful than showing random numbers. Note that the addresses IDs are incremented and decremented in the test: +1 or -1 are then displayed in these cases. Not to loose info that can be useful for debugging in case of issues, these random numbers are now displayed at the beginning of the test. Fixes: f589234e1af0 ("selftests: mptcp: userspace_pm: format subtests results in TAP") Cc: [email protected] Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/20240614-upstream-net-20240614-selftests-mptcp-uspace-pm-fixed-test-names-v1-1-460ad3edb429@kernel.org Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-17testing: nvdimm: Add MODULE_DESCRIPTION() macrosIra Weiny2-0/+2
When building with W=1 the following errors are seen: WARNING: modpost: missing MODULE_DESCRIPTION() in tools/testing/nvdimm/test/nfit_test.o WARNING: modpost: missing MODULE_DESCRIPTION() in tools/testing/nvdimm/test/ndtest.o Add the required MODULE_DESCRIPTION() to the test platform device drivers. Suggested-by: Jeff Johnson <[email protected]> Reviewed-by: Jeff Johnson <[email protected]> Link: https://patch.msgid.link/r/[email protected] Signed-off-by: Ira Weiny <[email protected]>
2024-06-17testing: nvdimm: iomap: add MODULE_DESCRIPTION()Jeff Johnson1-0/+1
Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/../../tools/testing/nvdimm/test/iomap.o Signed-off-by: Jeff Johnson <[email protected]> Link: https://patch.msgid.link/r/[email protected] Signed-off-by: Ira Weiny <[email protected]>
2024-06-17resolve_btfids: Handle presence of .BTF.base sectionAlan Maguire1-0/+8
Now that btf_parse_elf() handles .BTF.base section presence, we need to ensure that resolve_btfids uses .BTF.base when present rather than the vmlinux base BTF passed in via the -B option. Detect .BTF.base section presence and unset the base BTF path to ensure that BTF ELF parsing will do the right thing. Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Reviewed-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-17libbpf: Make btf_parse_elf process .BTF.base transparentlyEduard Zingerman2-54/+111
Update btf_parse_elf() to check if .BTF.base section is present. The logic is as follows: if .BTF.base section exists: distilled_base := btf_new(.BTF.base) if distilled_base: btf := btf_new(.BTF, .base_btf=distilled_base) if base_btf: btf_relocate(btf, base_btf) else: btf := btf_new(.BTF) return btf In other words: - if .BTF.base section exists, load BTF from it and use it as a base for .BTF load; - if base_btf is specified and .BTF.base section exist, relocate newly loaded .BTF against base_btf. Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-17selftests/bpf: Extend distilled BTF tests to cover BTF relocationAlan Maguire1-0/+278
Ensure relocated BTF looks as expected; in this case identical to original split BTF, with a few duplicate anonymous types added to split BTF by the relocation process. Also add relocation tests for edge cases like missing type in base BTF and multiple types of the same name. Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-17libbpf: Split BTF relocationAlan Maguire6-1/+542
Map distilled base BTF type ids referenced in split BTF and their references to the base BTF passed in, and if the mapping succeeds, reparent the split BTF to the base BTF. Relocation is done by first verifying that distilled base BTF only consists of named INT, FLOAT, ENUM, FWD, STRUCT and UNION kinds; then we sort these to speed lookups. Once sorted, the base BTF is iterated, and for each relevant kind we check for an equivalent in distilled base BTF. When found, the mapping from distilled -> base BTF id and string offset is recorded. In establishing mappings, we need to ensure we check STRUCT/UNION size when the STRUCT/UNION is embedded in a split BTF STRUCT/UNION, and when duplicate names exist for the same STRUCT/UNION. Otherwise size is ignored in matching STRUCT/UNIONs. Once all mappings are established, we can update type ids and string offsets in split BTF and reparent it to the new base. Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-17selftests/bpf: Test distilled base, split BTF generationAlan Maguire1-0/+274
Test generation of split+distilled base BTF, ensuring that - named base BTF STRUCTs and UNIONs are represented as 0-vlen sized STRUCT/UNIONs - named ENUM[64]s are represented as 0-vlen named ENUM[64]s - anonymous struct/unions are represented in full in split BTF - anonymous enums are represented in full in split BTF - types unreferenced from split BTF are not present in distilled base BTF Also test that with vmlinux BTF and split BTF based upon it, we only represent needed base types referenced from split BTF in distilled base. Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-17libbpf: Add btf__distill_base() creating split BTF with distilled base BTFAlan Maguire3-6/+335
To support more robust split BTF, adding supplemental context for the base BTF type ids that split BTF refers to is required. Without such references, a simple shuffling of base BTF type ids (without any other significant change) invalidates the split BTF. Here the attempt is made to store additional context to make split BTF more robust. This context comes in the form of distilled base BTF providing minimal information (name and - in some cases - size) for base INTs, FLOATs, STRUCTs, UNIONs, ENUMs and ENUM64s along with modified split BTF that points at that base and contains any additional types needed (such as TYPEDEF, PTR and anonymous STRUCT/UNION declarations). This information constitutes the minimal BTF representation needed to disambiguate or remove split BTF references to base BTF. The rules are as follows: - INT, FLOAT, FWD are recorded in full. - if a named base BTF STRUCT or UNION is referred to from split BTF, it will be encoded as a zero-member sized STRUCT/UNION (preserving size for later relocation checks). Only base BTF STRUCT/UNIONs that are either embedded in split BTF STRUCT/UNIONs or that have multiple STRUCT/UNION instances of the same name will _need_ size checks at relocation time, but as it is possible a different set of types will be duplicates in the later to-be-resolved base BTF, we preserve size information for all named STRUCT/UNIONs. - if an ENUM[64] is named, a ENUM forward representation (an ENUM with no values) of the same size is used. - in all other cases, the type is added to the new split BTF. Avoiding struct/union/enum/enum64 expansion is important to keep the distilled base BTF representation to a minimum size. When successful, new representations of the distilled base BTF and new split BTF that refers to it are returned. Both need to be freed by the caller. So to take a simple example, with split BTF with a type referring to "struct sk_buff", we will generate distilled base BTF with a 0-member STRUCT sk_buff of the appropriate size, and the split BTF will refer to it instead. Tools like pahole can utilize such split BTF to populate the .BTF section (split BTF) and an additional .BTF.base section. Then when the split BTF is loaded, the distilled base BTF can be used to relocate split BTF to reference the current (and possibly changed) base BTF. So for example if "struct sk_buff" was id 502 when the split BTF was originally generated, we can use the distilled base BTF to see that id 502 refers to a "struct sk_buff" and replace instances of id 502 with the current (relocated) base BTF sk_buff type id. Distilled base BTF is small; when building a kernel with all modules using distilled base BTF as a test, overall module size grew by only 5.3Mb total across ~2700 modules. Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-17Merge tag 'mm-hotfixes-stable-2024-06-17-11-43' of ↵Linus Torvalds1-8/+16
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Mainly MM singleton fixes. And a couple of ocfs2 regression fixes" * tag 'mm-hotfixes-stable-2024-06-17-11-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: kcov: don't lose track of remote references during softirqs mm: shmem: fix getting incorrect lruvec when replacing a shmem folio mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick mm: fix possible OOB in numa_rebuild_large_mapping() mm/migrate: fix kernel BUG at mm/compaction.c:2761! selftests: mm: make map_fixed_noreplace test names stable mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC mm: mmap: allow for the maximum number of bits for randomizing mmap_base by default gcov: add support for GCC 14 zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING mm: huge_memory: fix misused mapping_large_folio_support() for anon folios lib/alloc_tag: fix RCU imbalance in pgalloc_tag_get() lib/alloc_tag: do not register sysctl interface when CONFIG_SYSCTL=n MAINTAINERS: remove Lorenzo as vmalloc reviewer Revert "mm: init_mlocked_on_free_v3" mm/page_table_check: fix crash on ZONE_DEVICE gcc: disable '-Warray-bounds' for gcc-9 ocfs2: fix NULL pointer dereference in ocfs2_abort_trigger() ocfs2: fix NULL pointer dereference in ocfs2_journal_dirty()
2024-06-17lkdtm/bugs: add test for hung smp_call_function_single()Mark Rutland1-0/+1
The CONFIG_CSD_LOCK_WAIT_DEBUG option enables debugging of hung smp_call_function*() calls (e.g. when the target CPU gets stuck within the callback function). Testing this option requires triggering such hangs. This patch adds an lkdtm test with a hung smp_call_function_single() callback, which can be used to test CONFIG_CSD_LOCK_WAIT_DEBUG and NMI backtraces (as CONFIG_CSD_LOCK_WAIT_DEBUG will attempt an NMI backtrace of the hung target CPU). On arm64 using pseudo-NMI, this looks like: | # mount -t debugfs none /sys/kernel/debug/ | # echo SMP_CALL_LOCKUP > /sys/kernel/debug/provoke-crash/DIRECT | lkdtm: Performing direct entry SMP_CALL_LOCKUP | smp: csd: Detected non-responsive CSD lock (#1) on CPU#1, waiting 5000000176 ns for CPU#00 __lkdtm_SMP_CALL_LOCKUP+0x0/0x8(0x0). | smp: csd: CSD lock (#1) handling this request. | Sending NMI from CPU 1 to CPUs 0: | NMI backtrace for cpu 0 | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.9.0-rc4-00001-gfdfd281212ec #1 | Hardware name: linux,dummy-virt (DT) | pstate: 60401005 (nZCv daif +PAN -UAO -TCO -DIT +SSBS BTYPE=--) | pc : __lkdtm_SMP_CALL_LOCKUP+0x0/0x8 | lr : __flush_smp_call_function_queue+0x1b0/0x290 | sp : ffff800080003f30 | pmr_save: 00000060 | x29: ffff800080003f30 x28: ffffa4ce961a4900 x27: 0000000000000000 | x26: fff000003fcfa0c0 x25: ffffa4ce961a4900 x24: ffffa4ce959aa140 | x23: ffffa4ce959aa140 x22: 0000000000000000 x21: ffff800080523c40 | x20: 0000000000000000 x19: 0000000000000000 x18: fff05b31aa323000 | x17: fff05b31aa323000 x16: ffff800080000000 x15: 0000330fc3fe6b2c | x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000279 | x11: 0000000000000040 x10: fff000000302d0a8 x9 : fff000000302d0a0 | x8 : fff0000003400270 x7 : 0000000000000000 x6 : ffffa4ce9451b810 | x5 : 0000000000000000 x4 : fff05b31aa323000 x3 : ffff800080003f30 | x2 : fff05b31aa323000 x1 : ffffa4ce959aa140 x0 : 0000000000000000 | Call trace: | __lkdtm_SMP_CALL_LOCKUP+0x0/0x8 | generic_smp_call_function_single_interrupt+0x14/0x20 | ipi_handler+0xb8/0x178 | handle_percpu_devid_irq+0x84/0x130 | generic_handle_domain_irq+0x2c/0x44 | gic_handle_irq+0x118/0x240 | call_on_irq_stack+0x24/0x4c | do_interrupt_handler+0x80/0x84 | el1_interrupt+0x44/0xc0 | el1h_64_irq_handler+0x18/0x24 | el1h_64_irq+0x78/0x7c | default_idle_call+0x40/0x60 | do_idle+0x23c/0x2d0 | cpu_startup_entry+0x38/0x3c | kernel_init+0x0/0x1d8 | start_kernel+0x51c/0x608 | __primary_switched+0x80/0x88 | CPU: 1 PID: 128 Comm: sh Not tainted 6.9.0-rc4-00001-gfdfd281212ec #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x90/0xe8 | show_stack+0x18/0x24 | dump_stack_lvl+0xac/0xe8 | dump_stack+0x18/0x24 | csd_lock_wait_toolong+0x268/0x338 | smp_call_function_single+0x1dc/0x2f0 | lkdtm_SMP_CALL_LOCKUP+0xcc/0xfc | lkdtm_do_action+0x1c/0x38 | direct_entry+0xbc/0x14c | full_proxy_write+0x60/0xb4 | vfs_write+0xd0/0x35c | ksys_write+0x70/0x104 | __arm64_sys_write+0x1c/0x28 | invoke_syscall+0x48/0x114 | el0_svc_common.constprop.0+0x40/0xe0 | do_el0_svc+0x1c/0x28 | el0_svc+0x38/0x108 | el0t_64_sync_handler+0x120/0x12c | el0t_64_sync+0x1a4/0x1a8 | smp: csd: Continued non-responsive CSD lock (#1) on CPU#1, waiting 10000064272 ns for CPU#00 __lkdtm_SMP_CALL_LOCKUP+0x0/0x8(0x0). | smp: csd: CSD lock (#1) handling this request. | smp: csd: Continued non-responsive CSD lock (#1) on CPU#1, waiting 15000064384 ns for CPU#00 __lkdtm_SMP_CALL_LOCKUP+0x0/0x8(0x0). | smp: csd: CSD lock (#1) handling this request. Signed-off-by: Mark Rutland <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Cc: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
2024-06-17Merge tag 'hyperv-fixes-signed-20240616' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V fixes from Wei Liu: - Some cosmetic changes for hv.c and balloon.c (Aditya Nagesh) - Two documentation updates (Michael Kelley) - Suppress the invalid warning for packed member alignment (Saurabh Sengar) - Two hv_balloon fixes (Michael Kelley) * tag 'hyperv-fixes-signed-20240616' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: Cosmetic changes for hv.c and balloon.c Documentation: hyperv: Improve synic and interrupt handling description Documentation: hyperv: Update spelling and fix typo tools: hv: suppress the invalid warning for packed member alignment hv_balloon: Enable hot-add for memblock sizes > 128 MiB hv_balloon: Use kernel macros to simplify open coded sequences
2024-06-17selftests/bpf: Add a few tests to coverYonghong Song1-0/+63
Add three unit tests in verifier_movsx.c to cover cases where missed var_off setting can cause unexpected verification success or failure. Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-06-15perf hist: Honor symbol_conf.skip_emptyNamhyung Kim3-5/+29
So that it can skip events with no sample according to the config value. This can omit the dummy event in the output of perf report --group. An example output: $ sudo perf mem record -a sleep 1 $ sudo perf report --group Before) # # Samples: 232 of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P, dummy:u' # Event count (approx.): 3089861 # # Overhead Command Shared Object Symbol # ........................ ........... ................. ..................................... # 9.29% 0.00% 0.00% swapper [kernel.kallsyms] [k] update_blocked_averages 5.26% 0.15% 0.00% swapper [kernel.kallsyms] [k] __update_load_avg_se 4.15% 0.00% 0.00% perf-exec [kernel.kallsyms] [k] slab_update_freelist.isra.0 3.87% 0.00% 0.00% perf-exec [kernel.kallsyms] [k] memcg_slab_post_alloc_hook 3.79% 0.17% 0.00% swapper [kernel.kallsyms] [k] enqueue_task_fair 3.63% 0.00% 0.00% sleep [kernel.kallsyms] [k] next_uptodate_page 2.86% 0.00% 0.00% swapper [kernel.kallsyms] [k] __update_load_avg_cfs_rq 2.78% 0.00% 0.00% swapper [kernel.kallsyms] [k] __schedule 2.34% 0.00% 0.00% swapper [kernel.kallsyms] [k] intel_idle 2.32% 0.97% 0.00% swapper [kernel.kallsyms] [k] psi_group_change After) # # Samples: 232 of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P' # Event count (approx.): 3089861 # # Overhead Command Shared Object Symbol # ................ ........... ................. ..................................... # 9.29% 0.00% swapper [kernel.kallsyms] [k] update_blocked_averages 5.26% 0.15% swapper [kernel.kallsyms] [k] __update_load_avg_se 4.15% 0.00% perf-exec [kernel.kallsyms] [k] slab_update_freelist.isra.0 3.87% 0.00% perf-exec [kernel.kallsyms] [k] memcg_slab_post_alloc_hook 3.79% 0.17% swapper [kernel.kallsyms] [k] enqueue_task_fair 3.63% 0.00% sleep [kernel.kallsyms] [k] next_uptodate_page 2.86% 0.00% swapper [kernel.kallsyms] [k] __update_load_avg_cfs_rq 2.78% 0.00% swapper [kernel.kallsyms] [k] __schedule 2.34% 0.00% swapper [kernel.kallsyms] [k] intel_idle 2.32% 0.97% swapper [kernel.kallsyms] [k] psi_group_change Now it doesn't have a column for the dummy event. Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-15perf hist: Add symbol_conf.skip_emptyNamhyung Kim9-24/+20
Add the skip_empty flag to symbol_conf and set the value from the report command to preserve the existing behavior. This makes the code simpler and will be needed other code which is hard to add a new argument. Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-15perf hist: Simplify __hpp_fmt() using hpp_fmt_dataNamhyung Kim1-39/+36
The struct hpp_fmt_data is to keep the values for each group members so it doesn't need to check the event index in the group. Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-15perf hist: Factor out __hpp__fmt_print()Namhyung Kim1-47/+36
Split the logic to print the histogram values according to the format string. This was used in 3 different places so it's better to move out the logic into a function. Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-15perf: sched map skips redundant lines with cpu filtersFernand Sieber1-6/+6
perf sched map supports cpu filter. However, even with cpu filters active, any context switch currently corresponds to a separate line. As result, context switches on irrelevant cpus result to redundant lines, which makes the output particlularly difficult to read on wide architectures. Fix it by skipping printing for irrelevant CPUs. Example snippet of output before fix: *B0 1.461147 secs B0 B0 B0 *G0 1.517139 secs After fix: *B0 1.461147 secs *G0 1.517139 secs Signed-off-by: Fernand Sieber <[email protected]> Acked-by: Namhyung Kim <[email protected]> Reviewed-and-tested-by: Madadi Vineeth Reddy <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-15selftests: mm: make map_fixed_noreplace test names stableMark Brown1-8/+16
KTAP parsers interpret the output of ksft_test_result_*() as being the name of the test. The map_fixed_noreplace test uses a dynamically allocated base address for the mmap()s that it tests and currently includes this in the test names that it logs so the test names that are logged are not stable between runs. It also uses multiples of PAGE_SIZE which mean that runs for kernels with different PAGE_SIZE configurations can't be directly compared. Both these factors cause issues for CI systems when interpreting and displaying results. Fix this by replacing the current test names with fixed strings describing the intent of the mappings that are logged, the existing messages with the actual addresses and sizes are retained as diagnostic prints to aid in debugging. Link: https://lkml.kernel.org/r/20240605-kselftest-mm-fixed-noreplace-v1-1-a235db8b9be9@kernel.org Fixes: 4838cf70e539 ("selftests/mm: map_fixed_noreplace: conform test to TAP format output") Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-06-14selftests: forwarding: Add test for minimum and maximum MTUAmit Cohen2-0/+284
Add cases to check minimum and maximum MTU which are exposed via "ip -d link show". Test configuration and traffic. Use VLAN devices as usually VLAN header (4 bytes) is not included in the MTU, and drivers should configure hardware correctly to send maximum MTU payload size in VLAN tagged packets. $ ./min_max_mtu.sh TEST: ping [ OK ] TEST: ping6 [ OK ] TEST: Test maximum MTU configuration [ OK ] TEST: Test traffic, packet size is maximum MTU [ OK ] TEST: Test minimum MTU configuration [ OK ] TEST: Test traffic, packet size is minimum MTU [ OK ] Signed-off-by: Amit Cohen <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: Petr Machata <[email protected]> Link: https://lore.kernel.org/r/89de8be8989db7a97f3b39e3c9da695673e78d2e.1718275854.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-14Merge tag 'for-netdev' of ↵Jakub Kicinski2-0/+43
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-06-14 We've added 8 non-merge commits during the last 2 day(s) which contain a total of 9 files changed, 92 insertions(+), 11 deletions(-). The main changes are: 1) Silence a syzkaller splat under CONFIG_DEBUG_NET=y in pskb_pull_reason() triggered via __bpf_try_make_writable(), from Florian Westphal. 2) Fix removal of kfuncs during linking phase which then throws a kernel build warning via resolve_btfids about unresolved symbols, from Tony Ambardar. 3) Fix a UML x86_64 compilation failure from BPF as pcpu_hot symbol is not available on User Mode Linux, from Maciej Żenczykowski. 4) Fix a register corruption in reg_set_min_max triggering an invariant violation in BPF verifier, from Daniel Borkmann. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Harden __bpf_kfunc tag against linker kfunc removal compiler_types.h: Define __retain for __attribute__((__retain__)) bpf: Avoid splat in pskb_pull_reason bpf: fix UML x86_64 compile failure selftests/bpf: Add test coverage for reg_set_min_max handling bpf: Reduce stack consumption in check_stack_write_fixed_off bpf: Fix reg_set_min_max corruption of fake_reg MAINTAINERS: mailmap: Update Stanislav's email address ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-14selftests/bpf: Add tests for add_constAlexei Starovoitov2-3/+249
Improve arena based tests and add several C and asm tests with specific pattern. These tests would have failed without add_const verifier support. Also add several loop_inside_iter*() tests that are not related to add_const, but nice to have. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-14bpf: Support can_loop/cond_break on big endianAlexei Starovoitov1-0/+28
Add big endian support for can_loop/cond_break macros. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-14bpf: Track delta between "linked" registers.Alexei Starovoitov1-11/+11
Compilers can generate the code r1 = r2 r1 += 0x1 if r2 < 1000 goto ... use knowledge of r2 range in subsequent r1 operations So remember constant delta between r2 and r1 and update r1 after 'if' condition. Unfortunately LLVM still uses this pattern for loops with 'can_loop' construct: for (i = 0; i < 1000 && can_loop; i++) The "undo" pass was introduced in LLVM https://reviews.llvm.org/D121937 to prevent this optimization, but it cannot cover all cases. Instead of fighting middle end optimizer in BPF backend teach the verifier about this pattern. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-06-14selftests/hid: add subprog call testBenjamin Tissoires2-0/+65
I got a weird verifier error with a subprog once, so let's have a test for it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]>
2024-06-14selftests/hid: convert the hid_bpf selftests with struct_opsBenjamin Tissoires3-65/+89
We drop the need for the attach() bpf syscall, but we need to set up the hid_id field before calling __load(). The .bpf.c part is mechanical: we create one struct_ops per HID-BPF program, as all the tests are for one program at a time. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]>
2024-06-13perf test pmu: Warn don't fail for legacy mixed case event namesIan Rogers1-3/+19
PowerPC has mixed case events matching legacy hardware cache events. Warn but don't fail in this case. Event parsing will still work in this case by matching the legacy case. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kajol Jain <[email protected]> Cc: James Clark <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-13tools/perf: Fix timing issue with parallel threads in perf bench ↵Athira Rajeev1-1/+1
wake-up-parallel perf bench futex fails as below and hangs intermittently when attempted to run on on a powerpc system: ./perf bench futex wake-parallel Running 'futex/wake-parallel' benchmark: Run summary [PID 88588]: blocking on 640 threads (at [private] futex 0x10464b8c), 640 threads waking up 1 at a time. [Run 1]: Avg per-thread latency (waking 1/640 threads) in 0.1309 ms (+-53.27%) [Run 2]: Avg per-thread latency (waking 1/640 threads) in 0.0120 ms (+-31.16%) [Run 3]: Avg per-thread latency (waking 1/640 threads) in 0.1474 ms (+-92.47%) [Run 4]: Avg per-thread latency (waking 1/640 threads) in 0.2883 ms (+-67.75%) [Run 5]: Avg per-thread latency (waking 1/640 threads) in 0.4108 ms (+-39.60%) [Run 6]: Avg per-thread latency (waking 1/640 threads) in 0.7843 ms (+-78.98%) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) perf: couldn't wakeup all tasks (0/1) In the system, where perf bench wake-up-parallel is has system configuration of 640 cpus. After debugging, this turned out to be a timing issue. The benchmark creates threads equal to number of cpus and issues a futex_wait. Then it does a usleep for .1 second before initiating futex_wake. In system configuration with more threads, the usleep time is not enough. Patch changes the usleep from 100000 to 200000 With the patch, ran multiple iterations and there were no issues further seen Reported-by: Disha Goel <[email protected]> Signed-off-by: Athira Rajeev <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Tested-by: Disha Goel <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-13tools/perf: Fix perf bench epoll to enable the run when some CPU's are offlineAthira Rajeev2-2/+2
Perf bench epoll fails as below when attempted to run on on a powerpc system: ./perf bench epoll wait Running 'epoll/wait' benchmark: Run summary [PID 627653]: 79 threads monitoring on 64 file-descriptors for 8 secs. perf: pthread_create: No such file or directory In the setup where this perf bench was ran, difference was that partition had 640 CPU's, but not all CPUs were online. 80 CPUs were online. While creating threads and using epoll_wait , code sets the affinity using cpumask. The cpumask size used is 80 which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the benchmark reports fail while setting affinity for cpu number which is greater than 80 or higher, because it attempts to set a bit position which is not allocated on the cpumask. Fix this by changing the size of cpumask to number of possible cpus and not the number of online cpus. Signed-off-by: Athira Rajeev <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Tested-by: Disha Goel <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-13tools/perf: Fix perf bench futex to enable the run when some CPU's are offlineAthira Rajeev5-5/+5
Perf bench futex fails as below when attempted to run on on a powerpc system: ./perf bench futex all Running futex/hash benchmark... Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs. perf: pthread_create: No such file or directory In the setup where this perf bench was ran, difference was that partition had 640 CPU's, but not all CPUs were online. 80 CPUs were online. While blocking the threads with futex_wait, code sets the affinity using cpumask. The cpumask size used is 80 which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the benchmark reports fail while setting affinity for cpu number which is greater than 80 or higher, because it attempts to set a bit position which is not allocated on the cpumask. Fix this by changing the size of cpumask to number of possible cpus and not the number of online cpus. Signed-off-by: Athira Rajeev <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Tested-by: Disha Goel <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]