Age | Commit message (Collapse) | Author | Files | Lines |
|
When using the global aggregation mode, running perf script after perf
stat record can result in a segmentation fault as seen with commit
8b76a3188b85 ("perf stat: Remove unused perf_counts.aggr field").
Add a basic test to the existing suite of stat-related tests for
checking if that workflow runs without erroring out.
Signed-off-by: Sandipan Das <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ananth Narayan <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: https://lore.kernel.org/r/6a5429879764e3dac984cbb11ee2d95cc1604161.1683280603.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The script command does not support aggregation modes by itself although
that can be achieved using post-processing scripts. Because of this, it
does not allocate memory for aggregated event values.
Upon running perf stat record, the aggregation mode is set in the perf
data file. If the mode is AGGR_GLOBAL, the aggregated event values are
accessed and this leads to a segmentation fault since these were never
allocated to begin with. Set the mode to AGGR_NONE explicitly to avoid
this.
E.g.
$ perf stat record -e cycles true
$ perf script
Before:
Segmentation fault (core dumped)
After:
CPU THREAD VAL ENA RUN TIME EVENT
-1 231919 162831 362069 362069 935289 cycles:u
Fixes: 8b76a3188b85724f ("perf stat: Remove unused perf_counts.aggr field")
Signed-off-by: Sandipan Das <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ananth Narayan <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: [email protected] # v6.2+
Link: https://lore.kernel.org/r/83d6c6c05c54bf00c5a9df32ac160718efca0c7a.1683280603.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There are insufficient headers in tools/include to satisfy building BPF
programs and their header dependencies. Add the system include paths
from the non-BPF clang compile so that these headers can be found.
This code was taken from:
tools/testing/selftests/bpf/Makefile
Committer notes:
Had to adjust the '#ifndef NO_BPF_SKEL' to '#ifdef BUILD_BPF_SKEL' as
reverted that build BPF skels by default.
Also cope with the addition of -I$(srctree)/tools/include/uapi done by
Yang Jihong so that we prefer using the kernel sources headers instead
of older ones in the system.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>,
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Tom Rix <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently, vmlinux.h uses the bpf.h and perf_event.h header files in the
system path. If the header files in compilation environment are old,
compilation may fail. For example:
/home/yangjihong/linux/tools/perf/util/bpf_skel/.tmp/../vmlinux.h:151:27: error: field has incomplete type 'union perf_sample_weight'
union perf_sample_weight weight;
Use the bpf.h and perf_event.h files in the source code directory to
avoid compilation compatibility problems.
Signed-off-by: Yang Jihong <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tom Rix <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Do not assume which events may have a PMU name, allowing the logic to
keep an AUX event group together.
Example:
Before:
$ perf record --no-bpf-event -c 10 -e '{intel_pt//,tlb_flush.stlb_any/aux-sample-size=8192/pp}:u' -- sleep 0.1
WARNING: events were regrouped to match PMUs
Cannot add AUX area sampling to a group leader
$
After:
$ perf record --no-bpf-event -c 10 -e '{intel_pt//,tlb_flush.stlb_any/aux-sample-size=8192/pp}:u' -- sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.078 MB perf.data ]
$ perf script -F-dso,+addr | grep -C5 tlb_flush.stlb_any | head -11
sleep 20444 [003] 7939.510243: 1 branches:uH: 7f5350cc82a2 dl_main+0x9a2 => 7f5350cb38f0 _dl_add_to_namespace_list+0x0
sleep 20444 [003] 7939.510243: 1 branches:uH: 7f5350cb3908 _dl_add_to_namespace_list+0x18 => 7f5350cbb080 rtld_mutex_dummy+0x0
sleep 20444 [003] 7939.510243: 1 branches:uH: 7f5350cc8350 dl_main+0xa50 => 0 [unknown]
sleep 20444 [003] 7939.510244: 1 branches:uH: 7f5350cc83ca dl_main+0xaca => 7f5350caeb60 _dl_process_pt_gnu_property+0x0
sleep 20444 [003] 7939.510245: 1 branches:uH: 7f5350caeb60 _dl_process_pt_gnu_property+0x0 => 0 [unknown]
sleep 20444 7939.510245: 10 tlb_flush.stlb_any/aux-sample-size=8192/pp: 0 7f5350caeb60 _dl_process_pt_gnu_property+0x0
sleep 20444 [003] 7939.510254: 1 branches:uH: 7f5350cc87fe dl_main+0xefe => 7f5350ccd240 strcmp+0x0
sleep 20444 [003] 7939.510254: 1 branches:uH: 7f5350cc8862 dl_main+0xf62 => 0 [unknown]
sleep 20444 [003] 7939.510255: 1 branches:uH: 7f5350cc9cdc dl_main+0x23dc => 0 [unknown]
sleep 20444 [003] 7939.510257: 1 branches:uH: 7f5350cc89f6 dl_main+0x10f6 => 7f5350cb9530 _dl_setup_hash+0x0
sleep 20444 [003] 7939.510257: 1 branches:uH: 7f5350cc8a2d dl_main+0x112d => 7f5350cb3990 _dl_new_object+0x0
$
Fixes: 347c2f0a0988c59c ("perf parse-events: Sort and group parsed events")
Suggested-by: Ian Rogers <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
br_misp_retired.all_branches is supported on processors that support
Intel PT, so use it to test sample mode with an event that has been
given a PMU name.
Please note, the test fails prior to the fix "perf parse-events: Do not
break up AUX event group".
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If we have a group of {cycles,faults} then we need the faults software
event to appear to be on the same PMU as cycles so that we don't split
the group in parse_events__sort_events_and_fix_groups.
This case is relatively easy as cycles is the leader and will have a PMU
name. In the reverse case, {faults,cycles} we still need faults to
appear to have the PMU name of cycles but the old behavior is just to
return "cpu".
For hybrid this fails as cycles will be on "cpu_core" or "cpu_atom",
causing faults to be split into a different group.
Change the behavior for software events so that the whole group is
searched for the named PMU.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Adrian Hunter <[email protected]>
Cc: Ahmad Yasin <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Edward Baker <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Kang Minchul <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Samantha Alt <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Picking the changes from:
c68e3d473988b9af ("x86/include/asm/msr-index.h: Add IFS Array test bits")
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Signed-off-by: Yanteng Si <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/05778ab3c168c8030f6b20e60375dc803f0cd300.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Picking the changes from:
e65733b5c59a1ea2 ("KVM: x86: Redefine 'longmode' as a flag for KVM_EXIT_HYPERCALL")
30ec7997d175cd68 ("KVM: arm64: timers: Allow userspace to set the global counter offset")
821d935c87bc9525 ("KVM: arm64: Introduce support for userspace SMCCC filtering")
81dc9504a7006b48 ("KVM: arm64: nv: timers: Support hyp timer emulation")
a8308b3fc9494953 ("KVM: arm64: Refactor hvc filtering to support different actions")
0e5c9a9d6548e9b1 ("KVM: arm64: Expose SMC/HVC width to userspace")
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h
Signed-off-by: Yanteng Si <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/ac5adb58411d23b3360d436a65038fefe91c32a8.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Picking the changes from:
102882b5c62f6bfe ("ALSA: document that struct __snd_pcm_mmap_control64 is messed up")
9f656705c5faa18a ("ALSA: pcm: rewrite snd_pcm_playback_silence()")
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h
Signed-off-by: Yanteng Si <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/5606e7989bbb029c400117f2e455ab995208266f.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Picking the changes from:
31088f6f7906253e ("uapi/linux/const.h: prefer ISO-friendly __typeof__")
Silencing these perf build warnings::
Warning: Kernel ABI header at 'tools/include/uapi/linux/const.h' differs from latest version at 'include/uapi/linux/const.h'
diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h
Signed-off-by: Yanteng Si <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/33e963df304394f932d9108a1b0bb327f23a4eca.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Picking the changes from:
1cc064dce4ed0ff1 ("drm/i915/perf: Add support for OA media units")
c61d04c9eb435498 ("drm/i915/perf: Add engine class instance parameters to perf")
02abecdeebfcd384 ("drm/i915/uapi: Replace fake flex-array with flexible-array member")
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
Signed-off-by: Yanteng Si <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/4c0c150997ae1455f49094222daa121385643ae0.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Picking the changes from:
6068771673a38efc ("drm: document DRM_IOCTL_PRIME_HANDLE_TO_FD and PRIME_FD_TO_HANDLE")
61a55f8b1ebcde68 ("drm: document expectations for GETFB2 handles")
158350aae16a4e66 ("drm: document DRM_IOCTL_GEM_CLOSE")
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h
No changes in tooling as these are just C comment documentation changes.
Signed-off-by: Yanteng Si <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/7552c61660bf079f2979fdcbcef8e921255f877a.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Picking the changes from:
91d0b78c5177f3e4 ("inet: Add IP_LOCAL_PORT_RANGE socket option")
Silencing these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h
Signed-off-by: Yanteng Si <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/23aabc69956ac94fbf388b05c8be08a64e8c7ccc.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
clang isn't available
Build BPF skels require having a compiler able to generate BPF bytecode,
and so far this is only possible with clang, so check for its
availability and fail the build when the user explicitely ask for BPF
skels to be built.
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andrii Nakryiko <[email protected]>,
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Song Liu <[email protected]>
Yang: Yang Jihong <[email protected]>,
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Test case 'Test java symbol' might run for a long time. On Fedora 38 the
run time is very, very long:
Output before:
# time ./perf test 108
108: Test java symbol : Ok
real 22m15.775s
user 3m42.584s
sys 4m30.685s
#
The reason is a lookup for the server for debug symbols as shown in:
# cat /etc/debuginfod/elfutils.urls
https://debuginfod.fedoraproject.org/
#
This lookup is done for every symbol/sample, so about 3500 lookups
will take place.
To omit this lookup, which is not needed, unset environment variable
DEBUGINFOD_URLS=''.
Output after:
# time ./perf test 108
108: Test java symbol : Ok
real 0m6.242s
user 0m4.982s
sys 0m3.243s
#
Signed-off-by: Thomas Richter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The pmu_group_name by default returns "cpu" which on non-hybrid/ARM
means that ungrouped software, and hardware events are all going to
sort by the original insertion index.
However, on hybrid and ARM wildcard expansion may mean the PMU name is
set and events will be unnecessarily reordered - triggering the
reordering warning.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahmad Yasin <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Edward Baker <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kang Minchul <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Samantha Alt <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Some metric groups have metrics that don't have fully overlapping
events, meaning that the group's events become unique event groups that
may need to multiplex with each other. This can be particularly
unfortunate when the groups wouldn't need to multiplex because there are
sufficient hardware counters.
Add a flag so that if recording a metric group then the metrics within
the group needn't use groups for their events. The flag is added to
Intel TopdownL1 and TopdownL2 metrics.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahmad Yasin <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Edward Baker <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kang Minchul <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Samantha Alt <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'perf stat' with no arguments will use default events and metrics. These
events may fail to open even with kernel and hypervisor disabled. When
these fail then the permissions error appears even though they were
implicitly selected. This is particularly a problem with the automatic
selection of the TopdownL1 metric group on certain architectures like
Skylake:
$ perf stat true
Error:
Access to performance monitoring and observability operations is limited.
Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
access to performance monitoring and observability operations for processes
without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
More information can be found at 'Perf events and tool security' document:
https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
perf_event_paranoid setting is 2:
-1: Allow use of (almost) all events by all users
Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow raw and ftrace function tracepoint access
>= 1: Disallow CPU event access
>= 2: Disallow kernel profiling
To make the adjusted perf_event_paranoid setting permanent preserve it
in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
$
This patch adds skippable evsels that when they fail to open won't cause
termination and will appear as "<not supported>" in output. The
TopdownL1 events, from the metric group, are marked as skippable. This
turns the failure above to:
$ perf stat perf bench internals synthesize
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 49.287 usec (+- 0.083 usec)
Average num. events: 3.000 (+- 0.000)
Average time per event 16.429 usec
Average data synthesis took: 49.641 usec (+- 0.085 usec)
Average num. events: 11.000 (+- 0.000)
Average time per event 4.513 usec
Performance counter stats for 'perf bench internals synthesize':
1,222.38 msec task-clock:u # 0.993 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
162 page-faults:u # 132.529 /sec
774,445,184 cycles:u # 0.634 GHz (49.61%)
1,640,969,811 instructions:u # 2.12 insn per cycle (59.67%)
302,052,148 branches:u # 247.102 M/sec (59.69%)
1,807,718 branch-misses:u # 0.60% of all branches (59.68%)
5,218,927 CPU_CLK_UNHALTED.REF_XCLK:u # 4.269 M/sec
# 17.3 % tma_frontend_bound
# 56.4 % tma_retiring
# nan % tma_backend_bound
# nan % tma_bad_speculation (60.01%)
536,580,469 IDQ_UOPS_NOT_DELIVERED.CORE:u # 438.965 M/sec (60.33%)
<not supported> INT_MISC.RECOVERY_CYCLES_ANY:u
5,223,936 CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u # 4.274 M/sec (40.31%)
774,127,250 CPU_CLK_UNHALTED.THREAD:u # 633.297 M/sec (50.34%)
1,746,579,518 UOPS_RETIRED.RETIRE_SLOTS:u # 1.429 G/sec (50.12%)
1,940,625,702 UOPS_ISSUED.ANY:u # 1.588 G/sec (49.70%)
1.231055525 seconds time elapsed
0.258327000 seconds user
0.965749000 seconds sys
$
The event INT_MISC.RECOVERY_CYCLES_ANY:u is skipped as it can't be
opened with paranoia 2 on Skylake. With a lower paranoia, or as root,
all events/metrics are computed.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahmad Yasin <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Edward Baker <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kang Minchul <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Samantha Alt <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Division by zero causes expression parsing to fail and no metric to be
generated. This can mean for short running benchmarks metrics are not
shown. Change the behavior to make the value nan, which gets shown like:
'''
$ perf stat -M TopdownL2 true
Performance counter stats for 'true':
1,031,492 INST_RETIRED.ANY # nan % tma_fetch_bandwidth
# nan % tma_heavy_operations
# nan % tma_light_operations
29,304 CPU_CLK_UNHALTED.REF_XCLK # nan % tma_fetch_latency
# nan % tma_branch_mispredicts
# nan % tma_machine_clears
# nan % tma_core_bound
# nan % tma_memory_bound
2,658,319 IDQ_UOPS_NOT_DELIVERED.CORE
11,167 EXE_ACTIVITY.BOUND_ON_STORES
262,058 EXE_ACTIVITY.1_PORTS_UTIL
<not counted> BR_MISP_RETIRED.ALL_BRANCHES (0.00%)
<not counted> INT_MISC.RECOVERY_CYCLES_ANY (0.00%)
<not counted> CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE (0.00%)
<not counted> CPU_CLK_UNHALTED.THREAD (0.00%)
<not counted> UOPS_RETIRED.RETIRE_SLOTS (0.00%)
<not counted> CYCLE_ACTIVITY.STALLS_MEM_ANY (0.00%)
<not counted> UOPS_RETIRED.MACRO_FUSED (0.00%)
<not counted> IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE (0.00%)
<not counted> EXE_ACTIVITY.2_PORTS_UTIL (0.00%)
<not counted> CYCLE_ACTIVITY.STALLS_TOTAL (0.00%)
<not counted> MACHINE_CLEARS.COUNT (0.00%)
<not counted> UOPS_ISSUED.ANY (0.00%)
0.002864879 seconds time elapsed
0.003012000 seconds user
0.000000000 seconds sys
'''
When events aren't supported a count of 0 can be confusing and make
metrics look meaningful. Change these to be nan also which, with the
next change, gets shown like:
'''
$ perf stat true
Performance counter stats for 'true':
1.25 msec task-clock:u # 0.387 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
46 page-faults:u # 36.702 K/sec
255,942 cycles:u # 0.204 GHz (88.66%)
123,046 instructions:u # 0.48 insn per cycle
28,301 branches:u # 22.580 M/sec
2,489 branch-misses:u # 8.79% of all branches
4,719 CPU_CLK_UNHALTED.REF_XCLK:u # 3.765 M/sec
# nan % tma_frontend_bound
# nan % tma_retiring
# nan % tma_backend_bound
# nan % tma_bad_speculation
344,855 IDQ_UOPS_NOT_DELIVERED.CORE:u # 275.147 M/sec
<not supported> INT_MISC.RECOVERY_CYCLES_ANY:u
<not counted> CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u (0.00%)
<not counted> CPU_CLK_UNHALTED.THREAD:u (0.00%)
<not counted> UOPS_RETIRED.RETIRE_SLOTS:u (0.00%)
<not counted> UOPS_ISSUED.ANY:u (0.00%)
0.003238142 seconds time elapsed
0.000000000 seconds user
0.003434000 seconds sys
'''
Ensure that nan metric values are quoted as nan isn't a valid number
in JSON.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Kan Liang <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahmad Yasin <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Edward Baker <[email protected]>
Cc: Florian Fischer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kang Minchul <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Samantha Alt <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
"ip link set dev "$devbond1" nomaster"
This line code in bond-eth-type-change.sh is unnecessary.
Because $devbond1 was not added to any master device.
Signed-off-by: Liang Li <[email protected]>
Acked-by: Hangbin Liu <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
TEST: num_grat_arp (active-backup miimon num_grat_arp 10) [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 20) [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 30) [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 50) [ OK ]
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When run the test in netns, it's not easy to get the tc stats via
tc_rule_handle_stats_get(). With the new netns parameter, we can get
stats from specific netns like
num=$(tc_rule_handle_stats_get "dev eth0 ingress" 101 ".packets" "-n ns")
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Make sure flowtable interacts correctly with ingress and egress
chains, i.e. those get handled before and after flow table respectively.
Adds three more tests:
1. repeat flowtable test, but with 'ip dscp set cs3' done in
inet forward chain.
Expect that some packets have been mangled (before flowtable offload
became effective) while some pass without mangling (after offload
succeeds).
2. repeat flowtable test, but with 'ip dscp set cs3' done in
veth0:ingress.
Expect that all packets pass with cs3 dscp field.
3. same as 2, but use veth1:egress. Expect the same outcome.
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
When running nft_flowtable.sh in VM on a busy server we've found that
the time of the netcat file transfers vary wildly.
Therefore replace hardcoded 3 second sleep with the loop checking for
a change in the file sizes. Once no change in detected we test the results.
Nice side effect is that we shave 1 second sleep in the fast case
(hard-coded 3 second sleep vs two 1 second sleeps).
Acked-by: Florian Westphal <[email protected]>
Signed-off-by: Boris Sukholitko <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Doing wait with no parameters may interfere with some of the tests
having their own background processes.
Although no such test is currently present, the cleanup is useful
to rely on the nft_flowtable.sh for local development (e.g. running
background tcpdump command during the tests).
Signed-off-by: Boris Sukholitko <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Some ps commands (e.g. busybox derived) have no -x option. For the
purposes of hash calculation of the list of processes this option is
inessential.
Signed-off-by: Boris Sukholitko <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Some ps commands (e.g. busybox derived) have no -p option. Use /proc for
pid existence check.
Signed-off-by: Boris Sukholitko <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
The ftrace selftests do not currently produce KTAP output, they produce a
custom format much nicer for human consumption. This means that when run in
automated test systems we just get a single result for the suite as a whole
rather than recording results for individual test cases, making it harder
to look at the test data and masking things like inappropriate skips.
Address this by adding support for KTAP output to the ftracetest script and
providing a trivial wrapper which will be invoked by the kselftest runner
to generate output in this format by default, users using ftracetest
directly will continue to get the existing output.
This is not the most elegant solution but it is simple and effective. I
did consider implementing this by post processing the existing output
format but that felt more complex and likely to result in all output being
lost if something goes seriously wrong during the run which would not be
helpful. I did also consider just writing a separate runner script but
there's enough going on with things like the signal handling for that to
seem like it would be duplicating too much.
Acked-by: Steven Rostedt (Google) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Tested-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
The "test_encl.elf" file used by test_sgx is not installed in
INSTALL_PATH. Attempting to execute test_sgx causes false negative:
"
enclave executable open(): No such file or directory
main.c:188:unclobbered_vdso:Failed to load the test enclave.
"
Add "test_encl.elf" to TEST_FILES so that it will be installed.
Fixes: 2adcba79e69d ("selftests/x86: Add a selftest for SGX")
Signed-off-by: Yi Lai <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
System-wide TSC read could cause a drift in C0 percentage calculation.
Because if first TSC is read and then one by one mperf is read for all
cpus, this introduces drift between mperf reading of later CPUs and TSC
reading. To lower this drift read TSC per CPU and also just after mperf
read. This technique improves C0 percentage calculation in Mperf monitor.
Before fix: (System 100% busy)
| Mperf || RAPL || Idle_Stats
PKG|CORE| CPU| C0 | Cx | Freq || pack | core || POLL | C1 | C2
0| 0| 0| 87.15| 12.85| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 0| 256| 84.62| 15.38| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 1| 1| 87.15| 12.85| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 1| 257| 84.08| 15.92| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 2| 2| 86.61| 13.39| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 2| 258| 83.26| 16.74| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 3| 3| 86.61| 13.39| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 3| 259| 83.60| 16.40| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 4| 4| 86.33| 13.67| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 4| 260| 83.33| 16.67| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 5| 5| 86.06| 13.94| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 5| 261| 83.05| 16.95| 2695||168659003|3970468|| 0.00| 0.00| 0.00
0| 6| 6| 85.51| 14.49| 2695||168659003|3970468|| 0.00| 0.00| 0.00
After fix: (System 100% busy)
| Mperf || RAPL || Idle_Stats
PKG|CORE| CPU| C0 | Cx | Freq || pack | core || POLL | C1 | C2
0| 0| 0| 98.03| 1.97| 2415||163295480|3811189|| 0.00| 0.00| 0.00
0| 0| 256| 98.50| 1.50| 2394||163295480|3811189|| 0.00| 0.00| 0.00
0| 1| 1| 99.99| 0.01| 2401||163295480|3811189|| 0.00| 0.00| 0.00
0| 1| 257| 99.99| 0.01| 2375||163295480|3811189|| 0.00| 0.00| 0.00
0| 2| 2| 99.99| 0.01| 2401||163295480|3811189|| 0.00| 0.00| 0.00
0| 2| 258|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00
0| 3| 3|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00
0| 3| 259| 99.99| 0.01| 2435||163295480|3811189|| 0.00| 0.00| 0.00
0| 4| 4|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00
0| 4| 260|100.00| 0.00| 2435||163295480|3811189|| 0.00| 0.00| 0.00
0| 5| 5| 99.99| 0.01| 2401||163295480|3811189|| 0.00| 0.00| 0.00
0| 5| 261|100.00| 0.00| 2435||163295480|3811189|| 0.00| 0.00| 0.00
0| 6| 6|100.00| 0.00| 2401||163295480|3811189|| 0.00| 0.00| 0.00
0| 6| 262|100.00| 0.00| 2435||163295480|3811189|| 0.00| 0.00| 0.00
Cc: Thomas Renninger <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Fixes: 7fe2f6399a84 ("cpupowerutils - cpufrequtils extended with quite some features")
Signed-off-by: Wyes Karny <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
The sysfs_get_enabled() opened file processor not closed,
may cause a file handle leak.
Putting error handling and resource cleanup code together
makes the code easy to maintain and read.
Removed the unnecessary else if branch from the original
function, as it should return an error in cases other than '0'.
Signed-off-by: Hao Zeng <[email protected]>
Suggested-by: Shuah Khan <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
"Third version of perf tool updates, with the build problems with with
using a 'vmlinux.h' generated from the main build fixed, and the bpf
skeleton build disabled by default.
Build:
- Require libtraceevent to build, one can disable it using
NO_LIBTRACEEVENT=1.
It is required for tools like 'perf sched', 'perf kvm', 'perf
trace', etc.
libtraceevent is available in most distros so installing
'libtraceevent-devel' should be a one-time event to continue
building perf as usual.
Using NO_LIBTRACEEVENT=1 produces tooling that is functional and
sufficient for lots of users not interested in those libtraceevent
dependent features.
- Allow Python support in 'perf script' when libtraceevent isn't
linked, as not all features requires it, for instance Intel PT does
not use tracepoints.
- Error if the python interpreter needed for jevents to work isn't
available and NO_JEVENTS=1 isn't set, preventing a build without
support for JSON vendor events, which is a rare but possible
condition. The two check error messages:
$(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.)
$(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.)
- Make libbpf 1.0 the minimum required when building with out of
tree, distro provided libbpf.
- Use libsdtc++'s and LLVM's libcxx's __cxa_demangle, a portable C++
demangler, add 'perf test' entry for it.
- Make binutils libraries opt in, as distros disable building with it
due to licensing, they were used for C++ demangling, for instance.
- Switch libpfm4 to opt-out rather than opt-in, if libpfm-devel (or
equivalent) isn't installed, we'll just have a build warning:
Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev
- Add a feature test for scandirat(), that is not implemented so far
in musl and uclibc, disabling features that need it, such as
scanning for tracepoints in /sys/kernel/tracing/events.
perf BPF filters:
- New feature where BPF can be used to filter samples, for instance:
$ sudo ./perf record -e cycles --filter 'period > 1000' true
$ sudo ./perf script
perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms])
perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms])
perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms])
true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms])
true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
- In addition to 'period' (PERF_SAMPLE_PERIOD), the other
PERF_SAMPLE_ can be used for filtering, and also some other sample
accessible values, from tools/perf/Documentation/perf-record.txt:
Essentially the BPF filter expression is:
<term> <operator> <value> (("," | "||") <term> <operator> <value>)*
The <term> can be one of:
ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
mem_dtlb, mem_blk, mem_hops
The <operator> can be one of:
==, !=, >, >=, <, <=, &
The <value> can be one of:
<number> (for any term)
na, load, store, pfetch, exec (for mem_op)
l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
remote (for mem_remote)
na, locked (for mem_locked)
na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
na, by_data, by_addr (for mem_blk)
hops0, hops1, hops2, hops3 (for mem_hops)
perf lock contention:
- Show lock type with address.
- Track and show mmap_lock, siglock and per-cpu rq_lock with address.
This is done for mmap_lock by following the current->mm pointer:
$ sudo ./perf lock con -abl -- sleep 10
contended total wait max wait avg wait address symbol
...
16344 312.30 ms 2.22 ms 19.11 us ffff8cc702595640
17686 310.08 ms 1.49 ms 17.53 us ffff8cc7025952c0
3 84.14 ms 45.79 ms 28.05 ms ffff8cc78114c478 mmap_lock
3557 76.80 ms 68.75 us 21.59 us ffff8cc77ca3af58
1 68.27 ms 68.27 ms 68.27 ms ffff8cda745dfd70
9 54.53 ms 7.96 ms 6.06 ms ffff8cc7642a48b8 mmap_lock
14629 44.01 ms 60.00 us 3.01 us ffff8cc7625f9ca0
3481 42.63 ms 140.71 us 12.24 us ffffffff937906ac vmap_area_lock
16194 38.73 ms 42.15 us 2.39 us ffff8cd397cbc560
11 38.44 ms 10.39 ms 3.49 ms ffff8ccd6d12fbb8 mmap_lock
1 5.43 ms 5.43 ms 5.43 ms ffff8cd70018f0d8
1674 5.38 ms 422.93 us 3.21 us ffffffff92e06080 tasklist_lock
581 4.51 ms 130.68 us 7.75 us ffff8cc9b1259058
5 3.52 ms 1.27 ms 703.23 us ffff8cc754510070
112 3.47 ms 56.47 us 31.02 us ffff8ccee38b3120
381 3.31 ms 73.44 us 8.69 us ffffffff93790690 purge_vmap_area_lock
255 3.19 ms 36.35 us 12.49 us ffff8d053ce30c80
- Update default map size to 16384.
- Allocate single letter option -M for --map-nr-entries, as it is
proving being frequently used.
- Fix struct rq lock access for older kernels with BPF's CO-RE
(Compile once, run everywhere).
- Fix problems found with MSAn.
perf report/top:
- Add inline information when using --call-graph=fp or lbr, as was
already done to the --call-graph=dwarf callchain mode.
- Improve the 'srcfile' sort key performance by really using an
optimization introduced in 6.2 for the 'srcline' sort key that
avoids calling addr2line for comparision with each sample.
perf sched:
- Make 'perf sched latency/map/replay' to use "sched:sched_waking"
instead of "sched:sched_waking", consistent with 'perf record'
since d566a9c2d482 ("perf sched: Prefer sched_waking event when it
exists").
perf ftrace:
- Make system wide the default target for latency subcommand, run the
following command then generate some network traffic and press
control+C:
# perf ftrace latency -T __kfree_skb
^C
DURATION | COUNT | GRAPH |
0 - 1 us | 27 | ############# |
1 - 2 us | 22 | ########### |
2 - 4 us | 8 | #### |
4 - 8 us | 5 | ## |
8 - 16 us | 24 | ############ |
16 - 32 us | 2 | # |
32 - 64 us | 1 | |
64 - 128 us | 0 | |
128 - 256 us | 0 | |
256 - 512 us | 0 | |
512 - 1024 us | 0 | |
1 - 2 ms | 0 | |
2 - 4 ms | 0 | |
4 - 8 ms | 0 | |
8 - 16 ms | 0 | |
16 - 32 ms | 0 | |
32 - 64 ms | 0 | |
64 - 128 ms | 0 | |
128 - 256 ms | 0 | |
256 - 512 ms | 0 | |
512 - 1024 ms | 0 | |
1 - ... s | 0 | |
#
perf top:
- Add --branch-history (LBR: Last Branch Record) option, just like
already available for 'perf record'.
- Fix segfault in thread__comm_len() where thread->comm was being
used outside thread->comm_lock.
perf annotate:
- Allow configuring objdump and addr2line in ~/.perfconfig., so that
you can use alternative binaries, such as llvm's.
perf kvm:
- Add TUI mode for 'perf kvm stat report'.
Reference counting:
- Add reference count checking infrastructure to check for use after
free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs,
more to come.
To build with it use -DREFCNT_CHECKING=1 in the make command line
to build tools/perf. Documented at:
https://perf.wiki.kernel.org/index.php/Reference_Count_Checking
- The above caught, for instance, fix, present in this series:
- Fix maps use after put in 'perf test "Share thread maps"':
'maps' is copied from leader, but the leader is put on line 79
and then 'maps' is used to read the reference count below - so
a use after put, with the put of maps happening within
thread__put.
Fixed by reversing the order of puts so that the leader is put
last.
- Also several fixes were made to places where reference counts were
not being held.
- Make this one of the tests in 'make -C tools/perf build-test' to
regularly build test it and to make sure no direct access to the
reference counted structs are made, doing that via accessors to
check the validity of the struct pointer.
ARM64:
- Fix 'perf report' segfault when filtering coresight traces by
sparse lists of CPUs.
- Add support for 'simd' as a sort field for 'perf report', to show
ARM's NEON SIMD's predicate flags: "partial" and "empty".
arm64 vendor events:
- Add N1 metrics.
Intel vendor events:
- Add graniterapids, grandridge and sierraforrest events.
- Refresh events for: alderlake, aldernaken, broadwell, broadwellde,
broadwellx, cascadelakx, haswell, haswellx, icelake, icelakex,
jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids,
silvermont, skylake, tigerlake and westmereep-dp
- Refresh metrics for alderlake-n, broadwell, broadwellde,
broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and
skylakex.
perf stat:
- Implement --topdown using JSON metrics.
- Add TopdownL1 JSON metric as a default if present, but disable it
for now for some Intel hybrid architectures, a series of patches
addressing this is being reviewed and will be submitted for v6.5.
- Use metrics for --smi-cost.
- Update topdown documentation.
Vendor events (JSON) infrastructure:
- Add support for computing and printing metric threshold values. For
instance, here is one found in thesapphirerapids json file:
{
"BriefDescription": "Percentage of cycles spent in System Management Interrupts.",
"MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
"MetricGroup": "smi",
"MetricName": "smi_cycles",
"MetricThreshold": "smi_cycles > 0.1",
"ScaleUnit": "100%"
},
- Test parsing metric thresholds with the fake PMU in 'perf test
pmu-events'.
- Support for printing metric thresholds in 'perf list'.
- Add --metric-no-threshold option to 'perf stat'.
- Add rand (reverse and) and has_pmem (optane memory) support to
metrics.
- Sort list of input files to avoid depending on the order from
readdir() helping in obtaining reproducible builds.
S/390:
- Add common metrics: - CPI (cycles per instruction), prbstate (ratio
of instructions executed in problem state compared to total number
of instructions), l1mp (Level one instruction and data cache misses
per 100 instructions).
- Add cache metrics for z13, z14, z15 and z16.
- Add metric for TLB and cache.
ARM:
- Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE
(Memory Tagging Extension) and MOPS (Memory Operations) load/store.
Intel PT hardware tracing:
- Add event type names UINTR (User interrupt delivered) and UIRET
(Exiting from user interrupt routine), documented in table 32-50
"CFE Packet Type and Vector Fields Details" in the Intel Processor
Trace chapter of The Intel SDM Volume 3 version 078.
- Add support for new branch instructions ERETS and ERETU.
- Fix CYC timestamps after standalone CBR
ARM CoreSight hardware tracing:
- Allow user to override timestamp and contextid settings.
- Fix segfault in dso lookup.
- Fix timeless decode mode detection.
- Add separate decode paths for timeless and per-thread modes.
auxtrace:
- Fix address filter entire kernel size.
Miscellaneous:
- Fix use-after-free and unaligned bugs in the PLT handling routines.
- Use zfree() to reduce chances of use after free.
- Add missing 0x prefix for addresses printed in hexadecimal in 'perf
probe'.
- Suppress massive unsupported target platform errors in the unwind
code.
- Fix return incorrect build_id size in elf_read_build_id().
- Fix 'perf scripts intel-pt-events.py' IPC output for Python 2 .
- Add missing new parameter in kfree_skb tracepoint to the python
scripts using it.
- Add 'perf bench syscall fork' benchmark.
- Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in
'perf mem'.
- Fix wrong size expectation for perf test 'Setup struct
perf_event_attr' caused by the patch adding
perf_event_attr::config3.
- Fix some spelling mistakes"
* tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits)
Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"
Revert "perf build: Warn for BPF skeletons if endian mismatches"
perf metrics: Fix SEGV with --for-each-cgroup
perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE
perf stat: Separate bperf from bpf_profiler
perf test record+probe_libc_inet_pton: Fix call chain match on x86_64
perf test record+probe_libc_inet_pton: Fix call chain match on s390
perf tracepoint: Fix memory leak in is_valid_tracepoint()
perf cs-etm: Add fix for coresight trace for any range of CPUs
perf build: Fix unescaped # in perf build-test
perf unwind: Suppress massive unsupported target platform errors
perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it
perf script: Print raw ip instead of binary offset for callchain
perf symbols: Fix return incorrect build_id size in elf_read_build_id()
perf list: Modify the warning message about scandirat(3)
perf list: Fix memory leaks in print_tracepoint_events()
perf lock contention: Rework offset calculation with BPF CO-RE
perf lock contention: Fix struct rq lock access
perf stat: Disable TopdownL1 on hybrid
perf stat: Avoid SEGV on counter->name
...
|
|
This ensures that buffers retrieved from dynptr_data are allowed to be
passed in to helpers that take mem, like bpf_strncmp
Signed-off-by: Daniel Rosenberg <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This ensures we still reject invalid memory accesses in buffers that are
marked optional.
Signed-off-by: Daniel Rosenberg <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
bpf_dynptr_slice(_rw) no longer requires a buffer for verification. If the
buffer is needed, but not present, the function will return NULL.
Signed-off-by: Daniel Rosenberg <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This reverts commit a980755beb5aca9002e1c95ba519b83a44242b5b.
We need to better polish building with BPF skels, so revert back to
making it an experimental feature that has to be explicitely enabled
using BUILD_BPF_SKEL=1.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This reverts commit 51924ae69eea5bc90b5da525fbcf4bbd5f8551b3.
We need to better polish building with BPF skels, so revert back to
making it an experimental feature that has to be explicitely enabled
using BUILD_BPF_SKEL=1.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
test_progs:
Tests new kfunc bpf_task_under_cgroup().
The bpf program saves the new task's pid within a given cgroup to
the remote_pid, which is convenient for the user-mode program to
verify the test correctness.
The user-mode program creates its own mount namespace, and mounts the
cgroupsv2 hierarchy in there, call the fork syscall, then check if
remote_pid and local_pid are unequal.
Signed-off-by: Feng Zhou <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.
Current release - regressions:
- sched: act_pedit: free pedit keys on bail from offset check
Current release - new code bugs:
- pds_core:
- Kconfig fixes (DEBUGFS and AUXILIARY_BUS)
- fix mutex double unlock in error path
Previous releases - regressions:
- sched: cls_api: remove block_cb from driver_list before freeing
- nf_tables: fix ct untracked match breakage
- eth: mtk_eth_soc: drop generic vlan rx offload
- sched: flower: fix error handler on replace
Previous releases - always broken:
- tcp: fix skb_copy_ubufs() vs BIG TCP
- ipv6: fix skb hash for some RST packets
- af_packet: don't send zero-byte data in packet_sendmsg_spkt()
- rxrpc: timeout handling fixes after moving client call connection
to the I/O thread
- ixgbe: fix panic during XDP_TX with > 64 CPUs
- igc: RMW the SRRCTL register to prevent losing timestamp config
- dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621
- r8152:
- fix flow control issue of RTL8156A
- fix the poor throughput for 2.5G devices
- move setting r8153b_rx_agg_chg_indicate() to fix coalescing
- enable autosuspend
- ncsi: clear Tx enable mode when handling a Config required AEN
- octeontx2-pf: macsec: fixes for CN10KB ASIC rev
Misc:
- 9p: remove INET dependency"
* tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop()
pds_core: fix mutex double unlock in error path
net/sched: flower: fix error handler on replace
Revert "net/sched: flower: Fix wrong handle assignment during filter change"
net/sched: flower: fix filter idr initialization
net: fec: correct the counting of XDP sent frames
bonding: add xdp_features support
net: enetc: check the index of the SFI rather than the handle
sfc: Add back mailing list
virtio_net: suppress cpu stall when free_unused_bufs
ice: block LAN in case of VF to VF offload
net: dsa: mt7530: fix network connectivity with multiple CPU ports
net: dsa: mt7530: fix corrupt frames using trgmii on 40 MHz XTAL MT7621
9p: Remove INET dependency
netfilter: nf_tables: fix ct untracked match breakage
af_packet: Don't send zero-byte data in packet_sendmsg_spkt().
igc: read before write to SRRCTL register
pds_core: add AUXILIARY_BUS and NET_DEVLINK to Kconfig
pds_core: remove CONFIG_DEBUG_FS from makefile
ionic: catch failure from devlink_alloc
...
|
|
Ensure the metric threshold is copied correctly or else a use of
uninitialized memory happens.
Fixes: d0a3052f6faefffc ("perf metric: Compute and print threshold values")
Reported-by: Namhyung Kim <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
structs + CO-RE
Linus reported a build break due to using a vmlinux without a BTF elf
section to generate the vmlinux.h header with bpftool for use in the BPF
tools in tools/perf/util/bpf_skel/*.bpf.c.
Instead add a vmlinux.h file with the structs needed with the fields the
tools need, marking the structs with __attribute__((preserve_access_index)),
so that libbpf's CO-RE code can fixup the struct field offsets.
In some cases the vmlinux.h file that was being generated by bpftool
from the kernel BTF information was not needed at all, just including
linux/bpf.h, sometimes linux/perf_event.h was enough as non-UAPI
types were not being used.
To keep te patch small, include those UAPI headers from the trimmed down
vmlinux.h file, that then provides the tools with just the structs and
the subset of its fields needed for them.
Testing it:
# perf lock contention -b find / > /dev/null
^C contended total wait max wait avg wait type caller
7 53.59 us 10.86 us 7.66 us rwlock:R start_this_handle+0xa0
2 30.35 us 21.99 us 15.17 us rwsem:R iterate_dir+0x52
1 9.04 us 9.04 us 9.04 us rwlock:W start_this_handle+0x291
1 8.73 us 8.73 us 8.73 us spinlock raw_spin_rq_lock_nested+0x1e
#
# perf lock contention -abl find / > /dev/null
^C contended total wait max wait avg wait address symbol
1 262.96 ms 262.96 ms 262.96 ms ffff8e67502d0170 (mutex)
12 244.24 us 39.91 us 20.35 us ffff8e6af56f8070 mmap_lock (rwsem)
7 30.28 us 6.85 us 4.33 us ffff8e6c865f1d40 rq_lock (spinlock)
3 7.42 us 4.03 us 2.47 us ffff8e6c864b1d40 rq_lock (spinlock)
2 3.72 us 2.19 us 1.86 us ffff8e6c86571d40 rq_lock (spinlock)
1 2.42 us 2.42 us 2.42 us ffff8e6c86471d40 rq_lock (spinlock)
4 2.11 us 559 ns 527 ns ffffffff9a146c80 rcu_state (spinlock)
3 1.45 us 818 ns 482 ns ffff8e674ae8384c (rwlock)
1 870 ns 870 ns 870 ns ffff8e68456ee060 (rwlock)
1 663 ns 663 ns 663 ns ffff8e6c864f1d40 rq_lock (spinlock)
1 573 ns 573 ns 573 ns ffff8e6c86531d40 rq_lock (spinlock)
1 472 ns 472 ns 472 ns ffff8e6c86431740 (spinlock)
1 397 ns 397 ns 397 ns ffff8e67413a4f04 (spinlock)
#
# perf test offcpu
95: perf record offcpu profiling tests : Ok
#
# perf kwork latency --use-bpf
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(w)flush_memcg_stats_dwork | 0000 | 1056.212 ms | 2 | 2112.345 ms | 550113.229573 s | 550115.341919 s |
(w)toggle_allocation_gate | 0000 | 10.144 ms | 62 | 416.389 ms | 550113.453518 s | 550113.869907 s |
(w)0xffff8e6748e28080 | 0002 | 0.623 ms | 1 | 0.623 ms | 550110.989841 s | 550110.990464 s |
(w)vmstat_shepherd | 0000 | 0.586 ms | 10 | 2.828 ms | 550111.971536 s | 550111.974364 s |
(w)vmstat_update | 0007 | 0.363 ms | 5 | 1.634 ms | 550113.222520 s | 550113.224154 s |
(w)vmstat_update | 0000 | 0.324 ms | 10 | 2.827 ms | 550111.971526 s | 550111.974354 s |
(w)0xffff8e674c5f4a58 | 0002 | 0.102 ms | 5 | 0.134 ms | 550110.989839 s | 550110.989972 s |
(w)psi_avgs_work | 0001 | 0.086 ms | 3 | 0.107 ms | 550114.957852 s | 550114.957959 s |
(w)psi_avgs_work | 0000 | 0.079 ms | 5 | 0.100 ms | 550118.605668 s | 550118.605768 s |
(w)kfree_rcu_monitor | 0006 | 0.079 ms | 1 | 0.079 ms | 550110.925821 s | 550110.925900 s |
(w)psi_avgs_work | 0004 | 0.079 ms | 1 | 0.079 ms | 550109.581835 s | 550109.581914 s |
(w)psi_avgs_work | 0001 | 0.078 ms | 1 | 0.078 ms | 550109.197809 s | 550109.197887 s |
(w)psi_avgs_work | 0002 | 0.077 ms | 5 | 0.086 ms | 550110.669819 s | 550110.669905 s |
<SNIP>
# strace -e bpf -o perf-stat-bpf-counters.output perf stat -e cycles --bpf-counters sleep 1
Performance counter stats for 'sleep 1':
6,197,983 cycles
1.003922848 seconds time elapsed
0.000000000 seconds user
0.002032000 seconds sys
# head -7 perf-stat-bpf-counters.output
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/perf_attr_map", bpf_fd=0, file_flags=0}, 16) = 3
bpf(BPF_OBJ_GET_INFO_BY_FD, {info={bpf_fd=3, info_len=88, info=0x7ffcead64990}}, 16) = 0
bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=3, key=0x24129e0, value=0x7ffcead65a48, flags=BPF_ANY}, 32) = 0
bpf(BPF_LINK_GET_FD_BY_ID, {link_id=1252}, 12) = -1 ENOENT (No such file or directory)
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65780, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0,
+func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 116) = 4
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffcead65920, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0,
+func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 128) = 4
bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 28) = 4
#
Reported-by: Linus Torvalds <[email protected]>
Suggested-by: Andrii Nakryiko <[email protected]>
Tested-by: Namhyung Kim <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Co-developed-by: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It seems that perf stat -b <prog id> doesn't produce any results:
$ perf stat -e cycles -b 4 -I 10000 -vvv
Control descriptor is not initialized
cycles: 0 0 0
time counts unit events
10.007641640 <not supported> cycles
Looks like this happens because fentry/fexit progs are getting loaded, but the
corresponding perf event is not enabled and not added into the events bpf map.
I think there is some mixing up between two type of bpf support, one for bperf
and one for bpf_profiler. Both are identified via evsel__is_bpf, based on which
perf events are enabled, but for the latter (bpf_profiler) a perf event is
required. Using evsel__is_bperf to check only bperf produces expected results:
$ perf stat -e cycles -b 4 -I 10000 -vvv
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
size 136
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
[...perf_event_attr for other CPUs...]
------------------------------------------------------------
cycles: 309426 169009 169009
time counts unit events
10.010091271 309426 cycles
The final numbers correspond (at least in the level of magnitude) to the
same metric obtained via bpftool.
Fixes: 112cb56164bc2108 ("perf stat: Introduce config stat.bpf-counter-events")
Reviewed-by: Song Liu <[email protected]>
Signed-off-by: Dmitrii Dolgov <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now that precision propagation is supported fully in the presence of
subprogs, there is no need to work around iter test. Revert original
workaround.
This reverts be7dbd275dc6 ("selftests/bpf: avoid mark_all_scalars_precise() trigger in one of iter tests").
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Add a bunch of tests validating verifier's precision backpropagation
logic in the presence of subprog calls and/or callback-calling
helpers/kfuncs.
We validate the following conditions:
- subprog_result_precise: static subprog r0 result precision handling;
- global_subprog_result_precise: global subprog r0 precision
shortcutting, similar to BPF helper handling;
- callback_result_precise: similarly r0 marking precise for
callback-calling helpers;
- parent_callee_saved_reg_precise, parent_callee_saved_reg_precise_global:
propagation of precision for callee-saved registers bypassing
static/global subprogs;
- parent_callee_saved_reg_precise_with_callback: same as above, but in
the presence of callback-calling helper;
- parent_stack_slot_precise, parent_stack_slot_precise_global:
similar to above, but instead propagating precision of stack slot
(spilled SCALAR reg);
- parent_stack_slot_precise_with_callback: same as above, but in the
presence of callback-calling helper;
- subprog_arg_precise: propagation of precision of static subprog's
input argument back to caller;
- subprog_spill_into_parent_stack_slot_precise: negative test
validating that verifier currently can't support backtracking of stack
access with non-r10 register, we validate that we fallback to
forcing precision for all SCALARs.
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
When precision backtracking bails out due to some unsupported sequence
of instructions (e.g., stack access through register other than r10), we
need to mark all SCALAR registers as precise to be safe. Currently,
though, we mark SCALARs precise only starting from the state we detected
unsupported condition, which could be one of the parent states of the
actual current state. This will leave some registers potentially not
marked as precise, even though they should. So make sure we start
marking scalars as precise from current state (env->cur_state).
Further, we don't currently detect a situation when we end up with some
stack slots marked as needing precision, but we ran out of available
states to find the instructions that populate those stack slots. This is
akin the `i >= func->allocated_stack / BPF_REG_SIZE` check and should be
handled similarly by falling back to marking all SCALARs precise. Add
this check when we run out of states.
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Teach __mark_chain_precision logic to maintain register/stack masks
across all active frames when going from child state to parent state.
Currently this should be mostly no-op, as precision backtracking usually
bails out when encountering subprog entry/exit.
It's not very apparent from the diff due to increased indentation, but
the logic remains the same, except everything is done on specific `fr`
frame index. Calls to bt_clear_reg() and bt_clear_slot() are replaced
with frame-specific bt_clear_frame_reg() and bt_clear_frame_slot(),
where frame index is passed explicitly, instead of using current frame
number.
We also adjust logging to emit affected frame number. And we also add
better logging of human-readable register and stack slot masks, similar
to previous patch.
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Add helper to format register and stack masks in more human-readable
format. Adjust logging a bit during backtrack propagation and especially
during forcing precision fallback logic to make it clearer what's going
on (with log_level=2, of course), and also start reporting affected
frame depth. This is in preparation for having more than one active
frame later when precision propagation between subprog calls is added.
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Sometimes during debugging it's important that BPF program is loaded
with BPF_F_TEST_STATE_FREQ flag set to force verifier to do frequent
state checkpointing. Teach veristat to do this when -t ("test state")
flag is specified.
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
To make comments about arc and riscv arch in bpf_tracing.h accurate,
this patch fixes the comment about arc and adds the comment for riscv.
Signed-off-by: Kenjiro Nakayama <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|