| Age | Commit message (Collapse) | Author | Files | Lines |
|
Show the branch speculation info if provided by the branch recording
hardware feature. This can be useful for purposes of code optimization.
E.g.
$ perf record -j any,u ./test_branch
$ perf report --dump-raw-trace
Before:
[...]
8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0
... branch stack: nr:16
..... 0: 00000000004b52fd -> 00000000004f82c0 0 cycles P 0
..... 1: ffffffff8220137c -> 00000000004b52f0 0 cycles M 0
..... 2: 000000000041d1c4 -> 00000000004b52f0 0 cycles P 0
..... 3: 00000000004e7ead -> 000000000041d1b0 0 cycles M 0
..... 4: 00000000004e7f91 -> 00000000004e7ead 0 cycles P 0
..... 5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles P 0
..... 6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M 0
..... 7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M 0
..... 8: 00000000004e7f60 -> 00000000004e7df0 0 cycles P 0
..... 9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M 0
..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles P 0
..... 11: 000000000043306a -> 000000000041d840 0 cycles P 0
..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M 0
..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles P 0
..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M 0
..... 15: 000000000041d89b -> 000000000041e487 0 cycles P 0
... thread: test_branch:7952
...... dso: /data/sandipan/test_branch
[...]
After:
[...]
8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0
... branch stack: nr:16
..... 0: 00000000004b52fd -> 00000000004f82c0 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 1: ffffffff8220137c -> 00000000004b52f0 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 2: 000000000041d1c4 -> 00000000004b52f0 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 3: 00000000004e7ead -> 000000000041d1b0 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 4: 00000000004e7f91 -> 00000000004e7ead 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M 0 SPEC_CORRECT_PATH
..... 7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 8: 00000000004e7f60 -> 00000000004e7df0 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 11: 000000000043306a -> 000000000041d840 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles P 0 NON_SPEC_CORRECT_PATH
..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M 0 NON_SPEC_CORRECT_PATH
..... 15: 000000000041d89b -> 000000000041e487 0 cycles P 0 NON_SPEC_CORRECT_PATH
... thread: test_branch:7952
...... dso: /data/sandipan/test_branch
[...]
With the addition of new branch flags, the "brstacksym" fields in perf
script output now shows speculation information after the branch type.
Change the regular expressions accordingly for the test to pass. Since
branch speculation information may vary across platforms, the test does
not look for specific values.
E.g.
$ perf test -v 110
Before:
110: Check branch stack sampling :
--- start ---
test child forked, pid 54154
Testing user branch stack sampling
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL$ /tmp/__perf_test.program.AfhUI/perf.script
+ cleanup
+ rm -rf /tmp/__perf_test.program.AfhUI
test child finished with -1
---- end ----
Check branch stack sampling: FAILED!
After:
110: Check branch stack sampling :
--- start ---
test child forked, pid 43716
Testing user branch stack sampling
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x66/brstack_foo+0x0/P/-/-/0/IND_CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_foo\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_foo+0x1b/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x58/brstack_foo+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x5d/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bar\+[^ ]*/brstack_foo\+[^ ]*/RET/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bar+0x31/brstack_foo+0x20/P/-/-/0/RET/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_foo\+[^ ]*/brstack_bench\+[^ ]*/RET/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_foo+0x36/brstack_bench+0x5d/P/-/-/0/RET/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack_bench\+[^ ]*/brstack_bench\+[^ ]*/COND/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack_bench+0x76/brstack_bench+0x7d/P/-/-/0/COND/NON_SPEC_CORRECT_PATH
+ grep -E -m1 ^brstack\+[^ ]*/brstack\+[^ ]*/UNCOND/.*$ /tmp/__perf_test.program.xgzAi/perf.script
brstack+0x5a/brstack+0x41/P/-/-/0/UNCOND/NON_SPEC_CORRECT_PATH
+ set +x
Testing branch stack filtering permutation (any_call,CALL|IND_CALL|COND_CALL|SYSCALL|IRQ)
Testing branch stack filtering permutation (call,CALL|SYSCALL)
Testing branch stack filtering permutation (cond,COND)
Testing branch stack filtering permutation (any_ret,RET|COND_RET|SYSRET|ERET)
Testing branch stack filtering permutation (call,cond,CALL|SYSCALL|COND)
Testing branch stack filtering permutation (any_call,cond,CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND)
Testing branch stack filtering permutation (cond,any_call,any_ret,COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET)
test child finished with 0
---- end ----
Check branch stack sampling: Ok
Signed-off-by: Sandipan Das <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ananth Narayan <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Santosh Shukla <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/048d67c9de3cc8e3dbf19aaa7ff718dec91364c5.1675333809.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The latest version of grep claims the egrep is now obsolete so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
fix this up by moving the related file to use "grep -E" instead.
sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/perf`
Here are the steps to install the latest grep:
wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
tar xf grep-3.8.tar.gz
cd grep-3.8 && ./configure && make
sudo make install
export PATH=/usr/local/bin:$PATH
Signed-off-by: Tiezhu Yang <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that it can get rid of requirement of a compiler. Also rename the
symbols to match with the perf test workload.
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: James Clark <[email protected]>
Acked-by: German Gomez <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Commit f4a2aade6809c657 ("perf tests powerpc: Fix branch stack sampling
test to include sanity check for branch filter") added a skip if certain
branch options aren't available.
But the change added both -b (--branch-any) and --branch-filter options
at the same time, which will always result in a failure on any platform
because the arguments can't be used together.
Fix this by removing -b (--branch-any) and leaving --branch-filter which
already specifies 'any'. Also add warning messages to the test and perf
tool.
Output on x86 before this fix:
$ sudo ./perf test branch
108: Check branch stack sampling : Skip
After:
$ sudo ./perf test branch
108: Check branch stack sampling : Ok
Fixes: f4a2aade6809c657 ("perf tests powerpc: Fix branch stack sampling test to include sanity check for branch filter")
Signed-off-by: James Clark <[email protected]>
Tested-by: Athira Jajeev <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
for branch filter
Commit b55878c90ab92a24 ("perf test: Add test for branch stack
sampling") added test for branch stack sampling. There is a sanity check
in the beginning to skip the test if the hardware doesn't support branch
stack sampling.
Snippet
<<>>
skip the test if the hardware doesn't support branch stack sampling
perf record -b -o- -B true > /dev/null 2>&1 || exit 2
<<>>
But the testcase also uses branch sample types: save_type, any. if any
platform doesn't support the branch filters used in the test, the testcase
will fail. In powerpc, currently mutliple branch filters are not supported
and hence this test fails in powerpc. Fix the sanity check to look at
the support for branch filters used in this test before proceeding with
the test.
Fixes: b55878c90ab92a24 ("perf test: Add test for branch stack sampling")
Reported-by: Disha Goel <[email protected]>
Reviewed-by: Kajol Jain <[email protected]>
Signed-off-by: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nageswara R Sastry <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
same event
On linux-next tree 'perf test 95' ("Check branch stack sampling") was
added recently.
s390 does not support branch sampling at all and the test case fails
despite for checking branch support before hand.
The check for support of branching uses the software event named "dummy",
as seen in the line:
perf record -b -o- -e dummy -B true > /dev/null 2>&1 || exit 2
However when the branch recording is actually done, a different event is
used, as seen in the line:
perf record -o $TMPDIR/... --branch-filter any,save_type,u -- ...
The event is omitted and for "perf record" the default event is cycles,
which is not supported by s390 and this fails when executed on s390:
# perf record --branch-filter any,save_type,u -- /tmp/__perf_test.program.iDSmQ/a.out
Error:
cycles: PMU Hardware or event type doesn't support branch stack sampling.
#
Therefore fix this and use the same event cycles for testing support
and actually running the test.
Output before:
# ./perf test -Fv 95
95: Check branch stack sampling :
--- start ---
Testing user branch stack sampling
---- end ----
Check branch stack sampling: FAILED!
#
Output after:
# ./perf test -Fv 95
95: Check branch stack sampling :
--- start ---
---- end ----
Check branch stack sampling: Skip
#
Fixes: b55878c90ab92a24 ("perf test: Add test for branch stack sampling")
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Thomas Richter <[email protected]>
Acked-by: German Gomez <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a self test for branch stack sampling, to check that we get the
expected branch types, and filters behave as expected.
Suggested-by: Anshuman Khandual <[email protected]>
Signed-off-by: German Gomez <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|