aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2021-12-09selftests/powerpc/spectre_v2: Return skip code when miss_percent is highThadeu Lima de Souza Cascardo1-1/+1
A mis-match between reported and actual mitigation is not restricted to the Vulnerable case. The guest might also report the mitigation as "Software count cache flush" and the host will still mitigate with branch cache disabled. So, instead of skipping depending on the detected mitigation, simply skip whenever the detected miss_percent is the expected one for a fully mitigated system, that is, above 95%. Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-08selftests/bpf: Fix a compilation warningYonghong Song1-1/+1
The following warning is triggered when I used clang compiler to build the selftest. /.../prog_tests/btf_dedup_split.c:368:6: warning: variable 'btf2' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] if (!ASSERT_OK(err, "btf_dedup")) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ /.../prog_tests/btf_dedup_split.c:424:12: note: uninitialized use occurs here btf__free(btf2); ^~~~ /.../prog_tests/btf_dedup_split.c:368:2: note: remove the 'if' if its condition is always false if (!ASSERT_OK(err, "btf_dedup")) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /.../prog_tests/btf_dedup_split.c:343:25: note: initialize the variable 'btf2' to silence this warning struct btf *btf1, *btf2; ^ = NULL Initialize local variable btf2 = NULL and the warning is gone. Fixes: 9a49afe6f5a5 ("selftests/bpf: Add btf_dedup case with duplicated structs within CU") Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski3-13/+60
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix bogus compilter warning in nfnetlink_queue, from Florian Westphal. 2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel. 3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit groups, from Stefano Brivio. 4) Break rule evaluation on malformed TCP options. 5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh, also from Florian 6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: conntrack: annotate data-races around ct->timeout selftests: netfilter: switch zone stress to socat netfilter: nft_exthdr: break evaluation if setting TCP option fails selftests: netfilter: Add correctness test for mac,net set type nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups vrf: don't run conntrack on vrf with !dflt qdisc netfilter: nfnetlink_queue: silence bogus compiler warning ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-08Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski2-35/+605
Daniel Borkmann says: ==================== bpf 2021-12-08 We've added 12 non-merge commits during the last 22 day(s) which contain a total of 29 files changed, 659 insertions(+), 80 deletions(-). The main changes are: 1) Fix an off-by-two error in packet range markings and also add a batch of new tests for coverage of these corner cases, from Maxim Mikityanskiy. 2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh. 3) Fix two functional regressions and a build warning related to BTF kfunc for modules, from Kumar Kartikeya Dwivedi. 4) Fix outdated code and docs regarding BPF's migrate_disable() use on non- PREEMPT_RT kernels, from Sebastian Andrzej Siewior. 5) Add missing includes in order to be able to detangle cgroup vs bpf header dependencies, from Jakub Kicinski. 6) Fix regression in BPF sockmap tests caused by missing detachment of progs from sockets when they are removed from the map, from John Fastabend. 7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF dispatcher, from Björn Töpel. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add selftests to cover packet access corner cases bpf: Fix the off-by-two error in range markings treewide: Add missing includes masked by cgroup -> bpf dependency tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets bpf: Fix bpf_check_mod_kfunc_call for built-in modules bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL mips, bpf: Fix reference to non-existing Kconfig symbol bpf: Make sure bpf_disable_instrumentation() is safe vs preemption. Documentation/locking/locktypes: Update migrate_disable() bits. bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap bpf, sockmap: Attach map progs to psock early for feature probes bpf, x86: Fix "no previous prototype" warning ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-08perf/bpf_counter: Use bpf_map_create instead of bpf_create_mapSong Liu1-2/+16
bpf_create_map is deprecated. Replace it with bpf_map_create. Also add a __weak bpf_map_create() so that when older version of libbpf is linked as a shared library, it falls back to bpf_create_map(). Fixes: 992c4225419a ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()") Signed-off-by: Song Liu <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-12-08objtool: Add straight-line-speculation validationPeter Zijlstra5-6/+27
Teach objtool to validate the straight-line-speculation constraints: - speculation trap after indirect calls - speculation trap after RET Notable: when an instruction is annotated RETPOLINE_SAFE, indicating speculation isn't a problem, also don't care about sls for that instruction. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-08bpf: Add selftests to cover packet access corner casesMaxim Mikityanskiy1-16/+584
This commit adds BPF verifier selftests that cover all corner cases by packet boundary checks. Specifically, 8-byte packet reads are tested at the beginning of data and at the beginning of data_meta, using all kinds of boundary checks (all comparison operators: <, >, <=, >=; both permutations of operands: data + length compared to end, end compared to data + length). For each case there are three tests: 1. Length is just enough for an 8-byte read. Length is either 7 or 8, depending on the comparison. 2. Length is increased by 1 - should still pass the verifier. These cases are useful, because they failed before commit 2fa7d94afc1a ("bpf: Fix the off-by-two error in range markings"). 3. Length is decreased by 1 - should be rejected by the verifier. Some existing tests are just renamed to avoid duplication. Signed-off-by: Maxim Mikityanskiy <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-12-08ACPI: tools: Fix compilation when output directory is not presentChen Yu2-0/+2
Compiling the ACPI tools when output directory parameter is specified, but the output directory is not present, triggers the following error: make O=/data/test/tmp/ -C tools/power/acpi/ make: Entering directory '/data/src/kernel/linux/tools/power/acpi' DESCEND tools/acpidbg make[1]: Entering directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg' MKDIR include CP include CC tools/acpidbg/acpidbg.o Assembler messages: Fatal error: can't create /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o: No such file or directory make[1]: *** [../../Makefile.rules:24: /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o] Error 1 make[1]: Leaving directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg' make: *** [Makefile:18: acpidbg] Error 2 make: Leaving directory '/data/src/kernel/linux/tools/power/acpi' which occurs because the output directory has not been created yet. Fix this issue by creating the output directory before compiling. Reported-by: Andy Shevchenko <[email protected]> Signed-off-by: Chen Yu <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> [ rjw: New subject, changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2021-12-07selftests: tls: add missing AES256-GCM cipherVadim Fedorenko1-0/+18
Add tests for TLSv1.2 and TLSv1.3 with AES256-GCM cipher Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-07selftests: tls: add missing AES-CCM cipher testsVadim Fedorenko1-0/+18
Add tests for TLSv1.2 and TLSv1.3 with AES-CCM cipher. Signed-off-by: Vadim Fedorenko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-07perf vendor events arm64: Fix JSON indentation to 4 spaces standardAndrew Kilroy1-101/+101
Correct indentation to 4 spaces, same as the other JSON files. Reviewed-by: John Garry <[email protected]> Signed-off-by: Andrew Kilroy <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf stat: Support --cputype option for hybrid eventsJin Yao4-3/+35
In previous patch, we have supported the syntax which enables the event on a specified pmu, such as: cpu_core/<event>/ cpu_atom/<event>/ While this syntax is not very easy for applying on a set of events or applying on a group. In following example, we have to explicitly assign the pmu prefix. # ./perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' -- sleep 1 Performance counter stats for 'sleep 1': 1,158,545 cpu_core/cycles/ 1,003,113 cpu_core/instructions/ 1.002428712 seconds time elapsed A much easier way is: # ./perf stat --cputype core -e '{cycles,instructions}' -- sleep 1 Performance counter stats for 'sleep 1': 1,101,071 cpu_core/cycles/ 939,892 cpu_core/instructions/ 1.002363142 seconds time elapsed For this example, the '--cputype' enables the events from specified pmu (cpu_core). If '--cputype' conflicts with pmu prefix, '--cputype' is ignored. # ./perf stat --cputype core -e cycles,cpu_atom/instructions/ -a -- sleep 1 Performance counter stats for 'system wide': 21,003,407 cpu_core/cycles/ 367,886 cpu_atom/instructions/ 1.002203520 seconds time elapsed Signed-off-by: Jin Yao <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf tools: Drop requirement for libstdc++.so for libopencsd checkUwe Kleine-König1-1/+4
It's possible to link against libopencsd_c_api without having libstdc++.so available, only libstdc++.so.6.0.28 (or whatever version is in use) needs to be available. The same holds true for libopencsd.so. When -lstdc++ (or -lopencsd) is explicitly passed to the linker however the .so file must be available. So wrap adding the dependencies into a check for static linking that actually requires adding them all. The same construct is already used for some other tests in the same file to reduce dependencies in the dynamic linking case. Fixes: 573cf5c9a152 ("perf build: Add missing -lstdc++ when linking with libopencsd") Reviewed-by: James Clark <[email protected]> Signed-off-by: Uwe Kleine-König <[email protected]> Cc: Adrian Bunk <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Branislav Rankov <[email protected]> Cc: Diederik de Haas <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/all/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf parse-events: Architecture specific leader overrideIan Rogers3-1/+25
Currently topdown events must appear after a slots event: $ perf stat -e '{slots,topdown-fe-bound}' /bin/true Performance counter stats for '/bin/true': 3,183,090 slots 986,133 topdown-fe-bound Reversing the events yields: $ perf stat -e '{topdown-fe-bound,slots}' /bin/true Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-fe-bound). For metrics the order of events is determined by iterating over a hashmap, and so slots isn't guaranteed to be first which can yield this error. Change the set_leader in parse-events, called when a group is closed, so that rather than always making the first event the leader, if the slots event exists then it is made the leader. It is then moved to the head of the evlist otherwise it won't be opened in the correct order. The result is: $ perf stat -e '{topdown-fe-bound,slots}' /bin/true Performance counter stats for '/bin/true': 3,274,795 slots 1,001,702 topdown-fe-bound A problem with this approach is the slots event is identified by name, names can be overwritten like 'cpu/slots,name=foo/' and this causes the leader change to fail. The change also modifies and fixes mixed groups like, with the change: $ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2 Performance counter stats for 'system wide': 5574985410 slots 971981616 instructions 1348461887 topdown-fe-bound 2.001263120 seconds time elapsed Without the change: $ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2 Performance counter stats for 'system wide': <not counted> instructions <not counted> slots <not supported> topdown-fe-bound 2.006247990 seconds time elapsed Something that may be undesirable here is that the events are reordered in the output. Reviewed-by: Kajol Jain <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Vineet Singh <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf evlist: Allow setting arbitrary leaderIan Rogers3-9/+12
The leader of a group is the first, but allow it to be an arbitrary list member so that for Intel topdown events slots may always be the group leader. Reviewed-by: Kajol Jain <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Vineet Singh <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf metric: Reduce multiplexing with duration_timeIan Rogers1-9/+33
It is common to use the same counters with and without duration_time. The ID sharing code treats duration_time as if it were a hardware event placed in the same group. This causes unnecessary multiplexing such as in the following example where l3_cache_access isn't shared: $ perf stat -M l3 -a sleep 1 Performance counter stats for 'system wide': 3,117,007 l3_cache_miss # 199.5 MB/s l3_rd_bw # 43.6 % l3_hits # 56.4 % l3_miss (50.00%) 5,526,447 l3_cache_access (50.00%) 5,392,435 l3_cache_access # 5389191.2 access/s l3_access_rate (50.00%) 1,000,601,901 ns duration_time 1.000601901 seconds time elapsed Fix this by placing duration_time in all groups unless metric sharing has been disabled on the command line: $ perf stat -M l3 -a sleep 1 Performance counter stats for 'system wide': 3,597,972 l3_cache_miss # 230.3 MB/s l3_rd_bw # 48.0 % l3_hits # 52.0 % l3_miss 6,914,459 l3_cache_access # 6909935.9 access/s l3_access_rate 1,000,654,579 ns duration_time 1.000654579 seconds time elapsed $ perf stat --metric-no-merge -M l3 -a sleep 1 Performance counter stats for 'system wide': 3,501,834 l3_cache_miss # 53.5 % l3_miss (24.99%) 6,548,173 l3_cache_access (24.99%) 3,417,622 l3_cache_miss # 45.7 % l3_hits (25.04%) 6,294,062 l3_cache_access (25.04%) 5,923,238 l3_cache_access # 5919688.1 access/s l3_access_rate (24.99%) 1,000,599,683 ns duration_time 3,607,486 l3_cache_miss # 230.9 MB/s l3_rd_bw (49.97%) 1.000599683 seconds time elapsed v2. Doesn't count duration_time in the metric_list_cmp function that sorts larger metrics first. Without this a metric with duration_time and an event is sorted the same as a metric with two events, possibly not allowing the first metric to share with the second. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf trace: Enable ignore_missing_thread for traceGang Li1-0/+3
perf already support ignore_missing_thread for -u/-p, but not yet applied to `perf trace`. This patch enables ignore_missing_thread for `perf trace`. Signed-off-by: Gang Li <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf docs: Update link to AMD documentationSandipan Das1-4/+10
This updates the link to documentation on AMD processors. The new link points to a page where users can find the Processor Programming Reference (PPR) documents for the family and model codes corresponding to processors they are using. Signed-off-by: Sandipan Das <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Robert Richter <[email protected]> Cc: Santosh Shukla <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf docs: Add info on AMD raw event encodingSandipan Das4-8/+45
AMD processors have events with event select codes and unit masks larger than a byte. The core PMU, for example, uses 12-bit event select codes split between bits 0-7 and 32-35 of the PERF_CTL MSRs as can be seen from /sys/bus/event_sources/devices/cpu/format/*. The Processor Programming Reference (PPR) lists the event codes as unified 12-bit hexadecimal values instead and the split between the bits is not apparent to someone who is not aware of the layout of the PERF_CTL MSRs. 8-bit event select codes continue to work as the layout matches that of the PERF_CTL MSRs i.e. bits 0-7 for event select and 8-15 for unit mask. This adds more details in the perf man pages about using /sys/bus/event_sources/devices/*/format/* for determining the correct raw event encoding scheme. E.g. the "op_cache_hit_miss.op_cache_hit" event with code 0x28f and umask 0x03 can be programmed using its symbolic name as: $ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1 ------------------------------------------------------------ perf_event_attr: type 4 size 128 config 0x20000038f sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ [...] One might use a simple eventsel+umask combination based on what the current man pages say and incorrectly program the event as: $ sudo perf --debug perf-event-open stat -e r0328f sleep 1 ------------------------------------------------------------ perf_event_attr: type 4 size 128 config 0x328f sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ [...] When it should have been based on the format from sysfs: $ cat /sys/bus/event_source/devices/cpu/format/event config:0-7,32-35 $ sudo perf --debug perf-event-open stat -e r20000038f sleep 1 ------------------------------------------------------------ perf_event_attr: type 4 size 128 config 0x20000038f sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ [...] Reviewed-by: Kajol Jain <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Robert Richter <[email protected]> Cc: Santosh Shukla <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07libperf tests: Add test_stat_multiplexing testShunsuke Nakamura1-0/+157
Adds a test for a counter obtained using read() system call during multiplexing. $ sudo make tests -C ./tools/lib/perf/ V=1 make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c... Event 0 -- Raw count = 298049842, run = 270269503, enable = 456262127 Scaled count = 503160191 (59.24%, 270269503/456262127) Event 1 -- Raw count = 299134173, run = 271075173, enable = 456257234 Scaled count = 503484435 (59.41%, 271075173/456257234) Event 2 -- Raw count = 300461996, run = 272069283, enable = 456253417 Scaled count = 503867290 (59.63%, 272069283/456253417) Event 3 -- Raw count = 301308704, run = 273063387, enable = 456249352 Scaled count = 503443183 (59.85%, 273063387/456249352) Event 4 -- Raw count = 302531164, run = 274102932, enable = 456244712 Scaled count = 503563543 (60.08%, 274102932/456244712) Event 5 -- Raw count = 303710254, run = 275406214, enable = 456228165 Scaled count = 503115633 (60.37%, 275406214/456228165) Event 6 -- Raw count = 304531302, run = 276396076, enable = 456221130 Scaled count = 502661313 (60.58%, 276396076/456221130) Event 7 -- Raw count = 304486460, run = 276601890, enable = 456213754 Scaled count = 502205212 (60.63%, 276601890/456213754) Event 8 -- Raw count = 304116681, run = 276631326, enable = 456205562 Scaled count = 501532936 (60.64%, 276631326/456205562) Event 9 -- Raw count = 303567766, run = 276188567, enable = 456196839 Scaled count = 501420666 (60.54%, 276188567/456196839) Event 10 -- Raw count = 302238014, run = 275144001, enable = 456185300 Scaled count = 501106833 (60.31%, 275144001/456185300) Event 11 -- Raw count = 300805716, run = 273824589, enable = 456175608 Scaled count = 501124573 (60.03%, 273824589/456175608) Event 12 -- Raw count = 299959051, run = 272834556, enable = 456166593 Scaled count = 501517477 (59.81%, 272834556/456166593) Event 13 -- Raw count = 299037090, run = 271820805, enable = 456157086 Scaled count = 501830195 (59.59%, 271820805/456157086) Event 14 -- Raw count = 298327042, run = 270784311, enable = 456147546 Scaled count = 502544433 (59.36%, 270784311/456147546) Expected: 501614268 High: 503867290 Low: 298049842 Average: 502438527 Average Error = 0.16% OK - running tests/test-evsel.c... loop = 65536, count = 328182 loop = 131072, count = 660214 loop = 262144, count = 1315534 loop = 524288, count = 2635364 loop = 1048576, count = 5271971 loop = 65536, count = 491952 loop = 131072, count = 850061 loop = 262144, count = 1648608 loop = 524288, count = 3162059 loop = 1048576, count = 6353393 OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c... Event 0 -- Raw count = 300218292, run = 297528154, enable = 496789343 Scaled count = 501281125 (59.89%, 297528154/496789343) Event 1 -- Raw count = 301438606, run = 298515328, enable = 496784768 Scaled count = 501649643 (60.09%, 298515328/496784768) Event 2 -- Raw count = 302342618, run = 298798983, enable = 496782015 Scaled count = 502673648 (60.15%, 298798983/496782015) Event 3 -- Raw count = 303132319, run = 299230407, enable = 496778508 Scaled count = 503256412 (60.23%, 299230407/496778508) Event 4 -- Raw count = 302758195, run = 299218047, enable = 496774243 Scaled count = 502651743 (60.23%, 299218047/496774243) Event 5 -- Raw count = 303158458, run = 299204274, enable = 496769146 Scaled count = 503334281 (60.23%, 299204274/496769146) Event 6 -- Raw count = 303471397, run = 299197479, enable = 496763124 Scaled count = 503859189 (60.23%, 299197479/496763124) Event 7 -- Raw count = 303583387, run = 299196861, enable = 496756458 Scaled count = 504039405 (60.23%, 299196861/496756458) Event 8 -- Raw count = 303096897, run = 299186924, enable = 496748667 Scaled count = 503240507 (60.23%, 299186924/496748667) Event 9 -- Raw count = 301424173, run = 297845086, enable = 496739994 Scaled count = 502709122 (59.96%, 297845086/496739994) Event 10 -- Raw count = 300876415, run = 296851339, enable = 496729034 Scaled count = 503464297 (59.76%, 296851339/496729034) Event 11 -- Raw count = 300239338, run = 296547963, enable = 496719538 Scaled count = 502902612 (59.70%, 296547963/496719538) Event 12 -- Raw count = 299751948, run = 296547195, enable = 496710036 Scaled count = 502077926 (59.70%, 296547195/496710036) Event 13 -- Raw count = 299341883, run = 296549981, enable = 496700423 Scaled count = 501376663 (59.70%, 296549981/496700423) Event 14 -- Raw count = 299145476, run = 296561684, enable = 496690949 Scaled count = 501018366 (59.71%, 296561684/496690949) Expected: 501669431 High: 504039405 Low: 300218292 Average: 502635662 Average Error = 0.19% OK - running tests/test-evsel.c... loop = 65536, count = 329275 loop = 131072, count = 664638 loop = 262144, count = 1315367 loop = 524288, count = 2629617 loop = 1048576, count = 5273657 loop = 65536, count = 459641 loop = 131072, count = 978402 loop = 262144, count = 1581219 loop = 524288, count = 3774908 loop = 1048576, count = 7694417 OK make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07libperf: Remove scaling process from perf_mmap__read_self()Shunsuke Nakamura1-2/+0
Remove the scaling process from perf_mmap__read_self(), and unify the counters that can be obtained from perf_evsel__read() to "no scaling". Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07libperf: Adopt perf_counts_values__scale() from tools/perf/utilShunsuke Nakamura5-22/+24
Move perf_counts_values__scale() from tools/perf/util to tools/lib/perf so that it can be used with libperf. Committer notes: As noted by Jiri, use __s8 instead of s8 on the exported function. Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07tools build: Enable warnings through HOSTCFLAGSJohn Garry3-2/+7
The tools build system uses KBUILD_HOSTCFLAGS symbol for obvious purposes. However this is not set for anything under tools/ As such, host tools apps built have no compiler warnings enabled. Declare HOSTCFLAGS for perf tools build, and also use that symbol in declaration of host_c_flags. HOSTCFLAGS comes from EXTRA_WARNINGS, which is independent of target platform/arch warning flags. Suggested-by: Jiri Olsa <[email protected]> Signed-off-by: John Garry <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf test sigtrap: Print errno string when failingArnaldo Carvalho de Melo1-3/+6
Helps a bit the user figuring out why it is failing: Before: $ perf test sigtrap 73: Sigtrap : FAILED! $ perf test -v sigtrap 73: Sigtrap : --- start --- test child forked, pid 3816772 FAILED sys_perf_event_open() test child finished with -1 ---- end ---- Sigtrap: FAILED! $ After: $ perf test sigtrap 73: Sigtrap : FAILED! $ perf test -v sigtrap 73: Sigtrap : --- start --- test child forked, pid 3816772 FAILED sys_perf_event_open(): Permission denied test child finished with -1 ---- end ---- Sigtrap: FAILED! $ Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Fabian Hemmer <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Marco Elver <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf test sigtrap: Add basic stress test for sigtrap handlingMarco Elver4-0/+159
Add basic stress test for sigtrap handling as a perf tool built-in test. This allows sanity checking the basic sigtrap functionality from within the perf tool. Committer notes: Reported that !root was getting -EPERM, applied a fixup from Marco to set .exclude_{hv,kernel} that made it work. Signed-off-by: Marco Elver <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Fabian Hemmer <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07rcutorture: Combine n_max_cbs from all kthreads in a callback floodPaul E. McKenney1-1/+1
With the addition of multiple callback-flood kthreads, the maximum number of callbacks from any one of those kthreads is reported in the rcutorture run summary. This commit changes this to report the sum of each kthread's maximum number of callbacks in a given callback-flooding episode. Cc: Neeraj Upadhyay <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-07rcutorture: Test RCU-tasks multiqueue callback queueingPaul E. McKenney2-0/+2
This commit modifies the TASKS01 scenario to use four callback queues and the TRACE01 scenario to use two queues, thus providing testing of multiple queues by default. Cc: Neeraj Upadhyay <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-12-08selftests: netfilter: switch zone stress to socatFlorian Westphal1-6/+13
centos9 has nmap-ncat which doesn't like the '-q' option, use socat. While at it, mark test skipped if needed tools are missing. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-12-08selftests: netfilter: Add correctness test for mac,net set typeStefano Brivio1-3/+21
The existing net,mac test didn't cover the issue recently reported by Nikita Yushchenko, where MAC addresses wouldn't match if given as first field of a concatenated set with AVX2 and 8-bit groups, because there's a different code path covering the lookup of six 8-bit groups (MAC addresses) if that's the first field. Add a similar mac,net test, with MAC address and IPv4 address swapped in the set specification. Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-12-08vrf: don't run conntrack on vrf with !dflt qdiscNicolas Dichtel1-4/+26
After the below patch, the conntrack attached to skb is set to "notrack" in the context of vrf device, for locally generated packets. But this is true only when the default qdisc is set to the vrf device. When changing the qdisc, notrack is not set anymore. In fact, there is a shortcut in the vrf driver, when the default qdisc is set, see commit dcdd43c41e60 ("net: vrf: performance improvements for IPv4") for more details. This patch ensures that the behavior is always the same, whatever the qdisc is. To demonstrate the difference, a new test is added in conntrack_vrf.sh. Fixes: 8c9c296adfae ("vrf: run conntrack only in context of lower/physdev for locally generated packets") Signed-off-by: Nicolas Dichtel <[email protected]> Acked-by: Florian Westphal <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-12-07selftests: mptcp: check IP_TOS in/out are the sameFlorian Westphal1-0/+63
Check that getsockopt(IP_TOS) returns what setsockopt(IP_TOS) did set right before. Also check that socklen_t == 0 and -1 input values match those of normal tcp sockets. Reviewed-by: Matthieu Baerts <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-07selftests: mptcp: add inq test caseFlorian Westphal4-1/+645
client & server use a unix socket connection to communicate outside of the mptcp connection. This allows the consumer to know in advance how many bytes have been (or will be) sent by the peer. This allows stricter checks on the bytecounts reported by TCP_INQ cmsg. Suggested-by: Mat Martineau <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-07selftests: mptcp: add TCP_INQ supportFlorian Westphal2-3/+61
Do checks on the returned inq counter. Fail on: 1. Huge value (> 1 kbyte, test case files are 1 kb) 2. last hint larger than returned bytes when read was short 3. erronenous indication of EOF. 3) happens when a hint of X bytes reads X-1 on next call but next recvmsg returns more data (instead of EOF). Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-12-06perf bpf_skel: Do not use typedef to avoid error on old clangSong Liu3-20/+26
When building bpf_skel with clang-10, typedef causes confusions like: libbpf: map 'prev_readings': unexpected def kind var. Fix this by removing the typedef. Fixes: 7fac83aaf2eecc9e ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF") Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Song Liu <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distrosSong Liu3-5/+3
Arnaldo reported that building all his containers with BUILD_BPF_SKEL=1 to then make this the default he found problems in some distros where the system linux/bpf.h file was being used and lacked this: util/bpf_skel/bperf_leader.bpf.c:13:20: error: use of undeclared identifier 'BPF_F_PRESERVE_ELEMS' __uint(map_flags, BPF_F_PRESERVE_ELEMS); So use instead the vmlinux.h file generated by bpftool from BTF info. This fixed these as well, getting the build back working on debian:11, debian:experimental and ubuntu:21.10: In file included from In file included from util/bpf_skel/bperf_leader.bpf.cutil/bpf_skel/bpf_prog_profiler.bpf.c::33: : In file included from In file included from /usr/include/linux/bpf.h/usr/include/linux/bpf.h::1111: : /usr/include/linux/types.h/usr/include/linux/types.h::55::1010:: In file included from util/bpf_skel/bperf_follower.bpf.c:3fatal errorfatal error: : : In file included from /usr/include/linux/bpf.h:'asm/types.h' file not found11'asm/types.h' file not found: /usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found #include <asm/types.h>#include <asm/types.h> ^~~~~~~~~~~~~ ^~~~~~~~~~~~~ #include <asm/types.h> ^~~~~~~~~~~~~ 1 error generated. Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Song Liu <[email protected]> Tested-by: Athira Rajeev <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06perf header: Fix memory leaks when processing feature headersIan Rogers1-5/+10
These leaks were found with leak sanitizer running "perf pipe recording and injection test". In pipe mode feat_fd may hold onto an events struct that needs freeing. When string features are processed they may overwrite an already created string, so free this before the overwrite. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06perf test: Reset shadow counts before loadingIan Rogers1-0/+1
Otherwise load counting is an average. Without this change duration_time in test_memory_bandwidth will alter its value if an earlier test contains duration_time. This patch fixes an issue that's introduced in the proposed patch: https://lore.kernel.org/lkml/[email protected]/ in perf test "Parse and process metrics". Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06perf test: Fix 'Simple expression parser' test on arch without CPU die ↵Thomas Richter1-1/+3
topology info Some platforms do not have CPU die support, for example s390. Commit Cc: Ian Rogers <[email protected]> Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.") fails on s390: # perf test -Fv 7 ... # FAILED tests/expr.c:173 #num_dies >= #num_packages ---- end ---- Simple expression parser: FAILED! # Investigating this issue leads to these functions: build_cpu_topology() +--> has_die_topology(void) { struct utsname uts; if (uname(&uts) < 0) return false; if (strncmp(uts.machine, "x86_64", 6)) return false; .... } which always returns false on s390. The caller build_cpu_topology() checks has_die_topology() return value. On false the the struct cpu_topology::die_cpu_list is not contructed and has zero entries. This leads to the failing comparison: #num_dies >= #num_packages. s390 of course has a positive number of packages. Fix this and check if the function build_cpu_topology() did build up a die_cpus_list. The number of entries in this list should be larger than 0. If the number of list element is zero, the die_cpus_list has not been created and the check in function test__expr(): TEST_ASSERT_VAL("#num_dies >= #num_packages", \ num_dies >= num_packages) always fails. Output after: # perf test -Fv 7 7: Simple expression parser : --- start --- division by zero syntax error ---- end ---- Simple expression parser: Ok # Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.") Signed-off-by: Thomas Richter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] [ Added comment in the added 'if (num_dies)' line about architectures not having die topology ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06tools build: Remove needless libpython-version feature check that breaks ↵Arnaldo Carvalho de Melo5-23/+0
test-all fast path Since 66dfdff03d196e51 ("perf tools: Add Python 3 support") we don't use the tools/build/feature/test-libpython-version.c version in any Makefile feature check: $ find tools/ -type f | xargs grep feature-libpython-version $ The only place where this was used was removed in 66dfdff03d196e51: - ifneq ($(feature-libpython-version), 1) - $(warning Python 3 is not yet supported; please set) - $(warning PYTHON and/or PYTHON_CONFIG appropriately.) - $(warning If you also have Python 2 installed, then) - $(warning try something like:) - $(warning $(and ,)) - $(warning $(and ,) make PYTHON=python2) - $(warning $(and ,)) - $(warning Otherwise, disable Python support entirely:) - $(warning $(and ,)) - $(warning $(and ,) make NO_LIBPYTHON=1) - $(warning $(and ,)) - $(error $(and ,)) - else - LDFLAGS += $(PYTHON_EMBED_LDFLAGS) - EXTLIBS += $(PYTHON_EMBED_LIBADD) - LANG_BINDINGS += $(obj-perf)python/perf.so - $(call detected,CONFIG_LIBPYTHON) - endif And nowadays we either build with PYTHON=python3 or just install the python3 devel packages and perf will build against it. But the leftover feature-libpython-version check made the fast path feature detection to break in all cases except when python2 devel files were installed: $ rpm -qa | grep python.*devel python3-devel-3.9.7-1.fc34.x86_64 $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; $ make -C tools/perf O=/tmp/build/perf install-bin make: Entering directory '/var/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j32' parallel build HOSTCC /tmp/build/perf/fixdep.o <SNIP> $ cat /tmp/build/perf/feature/test-all.make.output In file included from test-all.c:18: test-libpython-version.c:5:10: error: #error 5 | #error | ^~~~~ $ ldd ~/bin/perf | grep python libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007fda6dbcf000) $ As python3 is the norm these days, fix this by just removing the unused feature-libpython-version feature check, making the test-all fast path to work with the common case. With this: $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; $ make -C tools/perf O=/tmp/build/perf install-bin |& head make: Entering directory '/var/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j32' parallel build HOSTCC /tmp/build/perf/fixdep.o HOSTLD /tmp/build/perf/fixdep-in.o LINK /tmp/build/perf/fixdep Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] $ ldd ~/bin/perf | grep python libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007f58800b0000) $ cat /tmp/build/perf/feature/test-all.make.output $ Reviewed-by: James Clark <[email protected]> Fixes: 66dfdff03d196e51 ("perf tools: Add Python 3 support") Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jaroslav Škarvada <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06perf tools: Fix SMT detection fast read pathIan Rogers1-1/+1
sysfs__read_int() returns 0 on success, and so the fast read path was always failing. Fixes: bb629484d924118e ("perf tools: Simplify checking if SMT is active.") Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06tools headers UAPI: Sync powerpc syscall table file changed by new ↵Arnaldo Carvalho de Melo1-0/+1
futex_waitv syscall To pick the changes in this cset: a0eb2da92b715d0c ("futex: Wireup futex_waitv syscall") That add support for this new syscall in tools such as 'perf trace'. For instance, this is now possible (adapted from the x86_64 test output): # perf trace -e futex_waitv ^C# # perf trace -v -e futex_waitv event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449) ^C# # perf trace -v -e futex* --max-events 10 event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 221 || id == 449) mmap size 528384B ? ( ): Timer/219310 ... [continued]: futex()) = -1 ETIMEDOUT (Connection timed out) 0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0 0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.088 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... 0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.088 ( 0.089 ms): Timer/219310 ... [continued]: futex()) = 0 0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.181 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... # That is the filter expression attached to the raw_syscalls:sys_{enter,exit} tracepoints. $ grep futex tools/perf/arch/powerpc/entry/syscalls/syscall.tbl 221 32 futex sys_futex_time32 221 64 futex sys_futex 221 spu futex sys_futex 422 32 futex_time64 sys_futex sys_futex 449 common futex_waitv sys_futex_waitv $ This addresses this perf build warnings: Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl' diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl Reviewed-by: Athira Rajeev <[email protected]> Tested-by: Athira Rajeev <[email protected]> Cc: Adrian Hunter <[email protected]>, Cc: André Almeida <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/YZ%[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06perf inject: Fix itrace space allowed for new attributesAdrian Hunter1-1/+1
The space allowed for new attributes can be too small if existing header information is large. That can happen, for example, if there are very many CPUs, due to having an event ID per CPU per event being stored in the header information. Fix by adding the existing header.data_offset. Also increase the extra space allowed to 8KiB and align to a 4KiB boundary for neatness. Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv ↵Arnaldo Carvalho de Melo1-0/+1
syscall To pick the changes in these csets: 6c122360cf2f4c5a ("s390: wire up sys_futex_waitv system call") That add support for this new syscall in tools such as 'perf trace'. For instance, this is now possible (adapted from the x86_64 test output): # perf trace -e futex_waitv ^C# # perf trace -v -e futex_waitv event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449) ^C# # perf trace -v -e futex* --max-events 10 event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 238 || id == 449) ? ( ): Timer/219310 ... [continued]: futex()) = -1 ETIMEDOUT (Connection timed out) 0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0 0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.088 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... 0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.088 ( 0.089 ms): Timer/219310 ... [continued]: futex()) = 0 0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.181 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... # That is the filter expression attached to the raw_syscalls:sys_{enter,exit} tracepoints. $ grep futex tools/perf/arch/s390/entry/syscalls/syscall.tbl 238 common futex sys_futex sys_futex_time32 422 32 futex_time64 - sys_futex 449 common futex_waitv sys_futex_waitv sys_futex_waitv $ This addresses this perf build warnings: Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl' diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl Acked-by: Heiko Carstens <[email protected]> Cc: Adrian Hunter <[email protected]>, Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: https://lore.kernel.org/lkml/YZ%2F2qRW%[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06Revert "perf bench: Fix two memory leaks detected with ASan"Jiri Olsa1-4/+0
This: This reverts commit 92723ea0f11d92496687db8c9725248e9d1e5e1d. # perf test 91 91: perf stat --bpf-counters test :RRRRRRRRRRRRR FAILED! # perf test 91 91: perf stat --bpf-counters test :RRRRRRRRRRRRR FAILED! # perf test 91 91: perf stat --bpf-counters test :RRRRRRRRRRRR FAILED! # perf test 91 91: perf stat --bpf-counters test :RRRRRRRRRRRRRRRRRR Ok # perf test 91 91: perf stat --bpf-counters test :RRRRRRRRR FAILED! # perf test 91 91: perf stat --bpf-counters test :RRRRRRRRRRR Ok # perf test 91 91: perf stat --bpf-counters test :RRRRRRRRRRRRRRR Ok yep, it seems the perf bench is broken so the counts won't correlated if I revert this one: 92723ea0f11d perf bench: Fix two memory leaks detected with ASan it works for me again.. it seems to break -t option [root@dell-r440-01 perf]# ./perf bench sched messaging -g 1 -l 100 -t # Running 'sched/messaging' benchmark: RRRperf: CLIENT: ready write: Bad file descriptor Rperf: SENDER: write: Bad file descriptor Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Sohaib Mohamed <[email protected]> Cc: Song Liu <[email protected]> Link: https://lore.kernel.org/lkml/YZev7KClb%2Fud43Lc@krava/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-06libbpf: Add doc comments in libbpf.hGrant Seltzer1-0/+53
This adds comments above functions in libbpf.h which document their uses. These comments are of a format that doxygen and sphinx can pick up and render. These are rendered by libbpf.readthedocs.org These doc comments are for: - bpf_object__open_file() - bpf_object__open_mem() - bpf_program__attach_uprobe() - bpf_program__attach_uprobe_opts() Signed-off-by: Grant Seltzer <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-12-06libbpf: Fix trivial typohuangxuesen1-2/+2
Fix typo in comment from 'bpf_skeleton_map' to 'bpf_map_skeleton' and from 'bpf_skeleton_prog' to 'bpf_prog_skeleton'. Signed-off-by: huangxuesen <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-12-06tools/perf: Add '__rel_loc' event field parsing supportMasami Hiramatsu7-0/+14
Add new '__rel_loc' dynamic data location attribute support. This type attribute is similar to the '__data_loc' but records the offset from the field itself. The libtraceevent adds TEP_FIELD_IS_RELATIVE to the 'tep_format_field::flags' with TEP_FIELD_IS_DYNAMIC for'__rel_loc'. Link: https://lkml.kernel.org/r/163757344810.510314.12449413842136229871.stgit@devnote2 Cc: Beau Belgrave <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Tom Zanussi <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2021-12-06libtraceevent: Add __rel_loc relative location attribute supportMasami Hiramatsu3-22/+47
Add '__rel_loc' new dynamic data location attribute which encodes the data location from the next to the field itself. This is similar to the '__data_loc' but the location offset is not from the event entry but from the next of the field. This patch adds '__rel_loc' decoding support in the libtraceevent. Link: https://lkml.kernel.org/r/163757343994.510314.13241077597729303802.stgit@devnote2 Cc: Beau Belgrave <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Tom Zanussi <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2021-12-05bpftool: Add debug mode for gen_loader.Alexei Starovoitov1-9/+11
Make -d flag functional for gen_loader style program loading. For example: $ bpftool prog load -L -d test_d_path.o ... // will print: libbpf: loading ./test_d_path.o libbpf: elf: section(3) fentry/security_inode_getattr, size 280, link 0, flags 6, type=1 ... libbpf: prog 'prog_close': found data map 0 (test_d_p.bss, sec 7, off 0) for insn 30 libbpf: gen: load_btf: size 5376 libbpf: gen: map_create: test_d_p.bss idx 0 type 2 value_type_id 118 libbpf: map 'test_d_p.bss': created successfully, fd=0 libbpf: gen: map_update_elem: idx 0 libbpf: sec 'fentry/filp_close': found 1 CO-RE relocations libbpf: record_relo_core: prog 1 insn[15] struct file 0:1 final insn_idx 15 libbpf: gen: prog_load: type 26 insns_cnt 35 progi_idx 0 libbpf: gen: find_attach_tgt security_inode_getattr 12 libbpf: gen: prog_load: type 26 insns_cnt 37 progi_idx 1 libbpf: gen: find_attach_tgt filp_close 12 libbpf: gen: finish 0 ... // at this point libbpf finished generating loader program 0: (bf) r6 = r1 1: (bf) r1 = r10 2: (07) r1 += -136 3: (b7) r2 = 136 4: (b7) r3 = 0 5: (85) call bpf_probe_read_kernel#113 6: (05) goto pc+104 ... // this is the assembly dump of the loader program 390: (63) *(u32 *)(r6 +44) = r0 391: (18) r1 = map[idx:0]+5584 393: (61) r0 = *(u32 *)(r1 +0) 394: (63) *(u32 *)(r6 +24) = r0 395: (b7) r0 = 0 396: (95) exit err 0 // the loader program was loaded and executed successfully (null) func#0 @0 ... // CO-RE in the kernel logs: CO-RE relocating STRUCT file: found target candidate [500] prog '': relo #0: kind <byte_off> (0), spec is [8] STRUCT file.f_path (0:1 @ offset 16) prog '': relo #0: matching candidate #0 [500] STRUCT file.f_path (0:1 @ offset 16) prog '': relo #0: patched insn #15 (ALU/ALU64) imm 16 -> 16 vmlinux_cand_cache:[11]file(500), module_cand_cache: ... // verifier logs when it was checking test_d_path.o program: R1 type=ctx expected=fp 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 ; int BPF_PROG(prog_close, struct file *file, void *id) 0: (79) r6 = *(u64 *)(r1 +0) func 'filp_close' arg0 has btf_id 500 type STRUCT 'file' 1: R1=ctx(id=0,off=0,imm=0) R6_w=ptr_file(id=0,off=0,imm=0) R10=fp0 ; pid_t pid = bpf_get_current_pid_tgid() >> 32; 1: (85) call bpf_get_current_pid_tgid#14 ... // if there are multiple programs being loaded by the loader program ... // only the last program in the elf file will be printed, since ... // the same verifier log_buf is used for all PROG_LOAD commands. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-12-05Merge tag 'x86_urgent_for_v5.16_rc4' of ↵Linus Torvalds2-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix a couple of SWAPGS fencing issues in the x86 entry code - Use the proper operand types in __{get,put}_user() to prevent truncation in SEV-ES string io - Make sure the kernel mappings are present in trampoline_pgd in order to prevent any potential accesses to unmapped memory after switching to it - Fix a trivial list corruption in objtool's pv_ops validation - Disable the clocksource watchdog for TSC on platforms which claim that the TSC is constant, doesn't stop in sleep states, CPU has TSC adjust and the number of sockets of the platform are max 2, to prevent erroneous markings of the TSC as unstable. - Make sure TSC adjust is always checked not only when going idle - Prevent a stack leak by initializing struct _fpx_sw_bytes properly in the FPU code - Fix INTEL_FAM6_RAPTORLAKE define naming to adhere to the convention * tag 'x86_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/xen: Add xenpv_restore_regs_and_return_to_usermode() x86/entry: Use the correct fence macro after swapgs in kernel CR3 x86/entry: Add a fence for kernel entry SWAPGS in paranoid_entry() x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qword x86/64/mm: Map all kernel memory into trampoline_pgd objtool: Fix pv_ops noinstr validation x86/tsc: Disable clocksource watchdog for TSC on qualified platorms x86/tsc: Add a timer to make sure TSC_adjust is always checked x86/fpu/signal: Initialize sw_bytes in save_xstate_epilog() x86/cpu: Drop spurious underscore from RAPTOR_LAKE #define