Age | Commit message (Collapse) | Author | Files | Lines |
|
If no selection is made on the 'Selected branches' dialog, then the
output is the same as the 'All branches' report. That is not really an
error, and is not desirable for future reports, so remove it.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove SQLTableDialogDataItem as it is no longer used.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Create new dialog data item classes to replace SQLTableDialogDataItem.
This separates out different dialog data items and makes it easier to
add new ones. SQLTableDialogDataItem is removed in a separate patch
because it makes the diff more readable.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The report name is a report variable so move it into into ReportVars.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Factor out ReportVars to provide a single container for information from
report dialogs.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Factor out ReportDialogBase so it can be re-used.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move column headers from SQLAutoTableModel into SQLTableModel so that
they can be used for other models based on SQLTableModel.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
calls table
The Call Graph depends on the calls table which is optional when exporting
data, so hide the Call Graph option if there is no calls table.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove leftover debugging prints.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
exported-sql-viewer.py is a standalone python script and requires a
shebang. Also only python2 is supported at present. Restore the shebang
but use the more flexible 'env' form.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Fixes: a38352de4495 ("perf script python: Remove explicit shebang from Python script")
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
x86 retpoline functions pollute the call graph by showing up everywhere
there is an indirect branch, but they do not really mean anything. Make
changes so that the default retpoline functions will no longer appear in
the call graph. Note this only affects the call graph, since all the
original branches are left unchanged.
This does not handle function return thunks, nor is there any
improvement for the handling of inline thunks or extern thunks.
Example:
$ cat simple-retpoline.c
__attribute__((noinline)) int bar(void)
{
return -1;
}
int foo(void)
{
return bar() + 1;
}
__attribute__((indirect_branch("thunk"))) int main()
{
int (*volatile fn)(void) = foo;
fn();
return fn();
}
$ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
$ objdump -d simple-retpoline
<SNIP>
0000000000001040 <main>:
1040: 48 83 ec 18 sub $0x18,%rsp
1044: 48 8d 05 25 01 00 00 lea 0x125(%rip),%rax # 1170 <foo>
104b: 48 89 44 24 08 mov %rax,0x8(%rsp)
1050: 48 8b 44 24 08 mov 0x8(%rsp),%rax
1055: e8 1f 01 00 00 callq 1179 <__x86_indirect_thunk_rax>
105a: 48 8b 44 24 08 mov 0x8(%rsp),%rax
105f: 48 83 c4 18 add $0x18,%rsp
1063: e9 11 01 00 00 jmpq 1179 <__x86_indirect_thunk_rax>
<SNIP>
0000000000001160 <bar>:
1160: b8 ff ff ff ff mov $0xffffffff,%eax
1165: c3 retq
<SNIP>
0000000000001170 <foo>:
1170: e8 eb ff ff ff callq 1160 <bar>
1175: 83 c0 01 add $0x1,%eax
1178: c3 retq
0000000000001179 <__x86_indirect_thunk_rax>:
1179: e8 07 00 00 00 callq 1185 <__x86_indirect_thunk_rax+0xc>
117e: f3 90 pause
1180: 0f ae e8 lfence
1183: eb f9 jmp 117e <__x86_indirect_thunk_rax+0x5>
1185: 48 89 04 24 mov %rax,(%rsp)
1189: c3 retq
<SNIP>
$ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
$ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
2019-01-08 14:03:37.851655 Creating database...
2019-01-08 14:03:37.863256 Writing records...
2019-01-08 14:03:38.069750 Adding indexes
2019-01-08 14:03:38.078799 Done
$ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db
Before:
main
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> foo
-> bar
After:
main
-> foo
-> bar
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Remove (sym->name != NULL) test, this is not a pointer and breaks the build with clang version 7.0.1 (Fedora 7.0.1-2.fc30) ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Improve thread_stack__no_call_return() to better handle 'returns' that
do not match the stack i.e. 'no call'. See code comments for details.
The example below shows how retpolines are affected:
Example:
$ cat simple-retpoline.c
__attribute__((noinline)) int bar(void)
{
return -1;
}
int foo(void)
{
return bar() + 1;
}
__attribute__((indirect_branch("thunk"))) int main()
{
int (*volatile fn)(void) = foo;
fn();
return fn();
}
$ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
$ objdump -d simple-retpoline
<SNIP>
0000000000001040 <main>:
1040: 48 83 ec 18 sub $0x18,%rsp
1044: 48 8d 05 25 01 00 00 lea 0x125(%rip),%rax # 1170 <foo>
104b: 48 89 44 24 08 mov %rax,0x8(%rsp)
1050: 48 8b 44 24 08 mov 0x8(%rsp),%rax
1055: e8 1f 01 00 00 callq 1179 <__x86_indirect_thunk_rax>
105a: 48 8b 44 24 08 mov 0x8(%rsp),%rax
105f: 48 83 c4 18 add $0x18,%rsp
1063: e9 11 01 00 00 jmpq 1179 <__x86_indirect_thunk_rax>
<SNIP>
0000000000001160 <bar>:
1160: b8 ff ff ff ff mov $0xffffffff,%eax
1165: c3 retq
<SNIP>
0000000000001170 <foo>:
1170: e8 eb ff ff ff callq 1160 <bar>
1175: 83 c0 01 add $0x1,%eax
1178: c3 retq
0000000000001179 <__x86_indirect_thunk_rax>:
1179: e8 07 00 00 00 callq 1185 <__x86_indirect_thunk_rax+0xc>
117e: f3 90 pause
1180: 0f ae e8 lfence
1183: eb f9 jmp 117e <__x86_indirect_thunk_rax+0x5>
1185: 48 89 04 24 mov %rax,(%rsp)
1189: c3 retq
<SNIP>
$ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
$ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
2019-01-08 14:03:37.851655 Creating database...
2019-01-08 14:03:37.863256 Writing records...
2019-01-08 14:03:38.069750 Adding indexes
2019-01-08 14:03:38.078799 Done
$ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db
Before:
main
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> bar
After:
main
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> foo
-> bar
Committer testing:
Chose "Reports", Then "Context-Sensitive Call Graph" and then go on
expanding:
Before:
simple-retpolin
PID:PID
_start
_start
__libc_start_main
main
__x86_indirect_thunk_rax
__x86_indirect_thunk_rax
bar
After:
Remove the "simple.retpoline.db" file, run again the 'perf script' line
to regenerate the .db file and run the exported-sql-viewer.py again to
get the same all the way to 'main', then, from there, including 'main':
main
__x86_indirect_thunk_rax
__x86_indirect_thunk_rax
foo
bar
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The output of "perf annotate -l --stdio xxx" changed since commit 425859ff0de33
("perf annotate: No need to calculate notes->start twice") removed notes->start
assignment in symbol__calc_lines(). It will get failed in
find_address_in_section() from symbol__tty_annotate() subroutine as the
a2l->addr is wrong. So the annotate summary doesn't report the line number of
source code correctly.
Before fix:
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c
void hotspot_1(void)
{
volatile int i;
for (i = 0; i < 0x10000000; i++);
for (i = 0; i < 0x10000000; i++);
for (i = 0; i < 0x10000000; i++);
}
int main(void)
{
hotspot_1();
return 0;
}
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ]
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio
Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
----------------------------------------------
19.30 common_while_1[32]
19.03 common_while_1[4e]
19.01 common_while_1[16]
5.04 common_while_1[13]
4.99 common_while_1[4b]
4.78 common_while_1[2c]
4.77 common_while_1[10]
4.66 common_while_1[2f]
4.59 common_while_1[51]
4.59 common_while_1[35]
4.52 common_while_1[19]
4.20 common_while_1[56]
0.51 common_while_1[48]
Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period)
-----------------------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 00000000000005fa <hotspot_1>:
: hotspot_1():
: void hotspot_1(void)
: {
0.00 : 5fa: push %rbp
0.00 : 5fb: mov %rsp,%rbp
: volatile int i;
:
: for (i = 0; i < 0x10000000; i++);
0.00 : 5fe: movl $0x0,-0x4(%rbp)
0.00 : 605: jmp 610 <hotspot_1+0x16>
0.00 : 607: mov -0x4(%rbp),%eax
common_while_1[10] 4.77 : 60a: add $0x1,%eax
common_while_1[13] 5.04 : 60d: mov %eax,-0x4(%rbp)
common_while_1[16] 19.01 : 610: mov -0x4(%rbp),%eax
common_while_1[19] 4.52 : 613: cmp $0xfffffff,%eax
0.00 : 618: jle 607 <hotspot_1+0xd>
: for (i = 0; i < 0x10000000; i++);
...
After fix:
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ]
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio
Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
----------------------------------------------
33.34 common_while_1.c:5
33.34 common_while_1.c:6
33.32 common_while_1.c:7
Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period)
-----------------------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 00000000000005fa <hotspot_1>:
: hotspot_1():
: void hotspot_1(void)
: {
0.00 : 5fa: push %rbp
0.00 : 5fb: mov %rsp,%rbp
: volatile int i;
:
: for (i = 0; i < 0x10000000; i++);
0.00 : 5fe: movl $0x0,-0x4(%rbp)
0.00 : 605: jmp 610 <hotspot_1+0x16>
0.00 : 607: mov -0x4(%rbp),%eax
common_while_1.c:5 4.70 : 60a: add $0x1,%eax
4.89 : 60d: mov %eax,-0x4(%rbp)
common_while_1.c:5 19.03 : 610: mov -0x4(%rbp),%eax
common_while_1.c:5 4.72 : 613: cmp $0xfffffff,%eax
0.00 : 618: jle 607 <hotspot_1+0xd>
: for (i = 0; i < 0x10000000; i++);
0.00 : 61a: movl $0x0,-0x4(%rbp)
0.00 : 621: jmp 62c <hotspot_1+0x32>
0.00 : 623: mov -0x4(%rbp),%eax
common_while_1.c:6 4.54 : 626: add $0x1,%eax
4.73 : 629: mov %eax,-0x4(%rbp)
common_while_1.c:6 19.54 : 62c: mov -0x4(%rbp),%eax
common_while_1.c:6 4.54 : 62f: cmp $0xfffffff,%eax
...
Signed-off-by: Wei Li <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: 425859ff0de33 ("perf annotate: No need to calculate notes->start twice")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Let rm_rf() remove a file if it's provided by path, not just
directories.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So it does not screw up single -v verbose output.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a missing new line into pr_debug call in perf_event__synthesize_bpf_events(),
so that the error message does not screw the verbose output.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add support to add/remove fields for specific event types in -F option.
It's now possible to use '+-' after event type, like:
# cat > test.c
#include <stdio.h>
int main(void)
{
printf("Hello world\n");
while(1) {}
}
^D
# gcc -g -o test test.c
# perf probe -x test 'test.c:5'
# perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
...
# perf script -Ftrace:+period,-cpu
test 3859 396291.117343: 10275 cpu/cpu-cycles,period=10000/: 7f..
test 3859 396291.118234: 11041 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118234: 1 probe_test:main:
test 3859 396291.118248: 8668 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118263: 10139 cpu/cpu-cycles,period=10000/: ffffff..
Committer testing:
Couldn't make the test above work, but tested it with:
# perf probe -x hello main
Added new event:
probe_hello:main (on main in /home/acme/c/hello)
You can now use it in all perf tools, such as:
perf record -e probe_hello:main -aR sleep 1
# perf record -e probe_hello:main ./hello
hello, world
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ]
# perf script
hello 21454 [002] 254116.874005: probe_hello:main: (401126)
#
# perf script -Ftrace:+period,-cpu
hello 21454 254116.874005: 1 probe_hello:main: (401126)
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Force sample_type setup for slave events in group leader sessions.
We don't get sample for slave events, we make them when delivering group
leader sample. Set the slave event to follow the master sample_type to
ease up report.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There's no reason to deliver a sample with zero period. It means there
was no value for slave event since its last group leader sample.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Initial use case:
Dumping the maps setup by tools/perf/examples/bpf/augmented_raw_syscalls.c,
which so far are just booleans, showing just non-zeroed entries:
# cat ~/.perfconfig
[llvm]
dump-obj = true
clang-opt = -g
[trace]
#add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
add_events = /wb/augmented_raw_syscalls.o
$ date
Tue Feb 19 16:29:33 -03 2019
$ ls -la /wb/augmented_raw_syscalls.o
-rwxr-xr-x. 1 root root 14048 Jan 24 12:09 /wb/augmented_raw_syscalls.o
$ file /wb/augmented_raw_syscalls.o
/wb/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
$
# trace -e recvmmsg,sendmmsg --map-dump foobar
ERROR: BPF map "foobar" not found
# trace -e recvmmsg,sendmmsg --map-dump filtered_pids
ERROR: BPF map "filtered_pids" not found
# trace -e recvmmsg,sendmmsg --map-dump pids_filtered
[2583] = 1,
[2267] = 1,
^Z
[1]+ Stopped trace -e recvmmsg,sendmmsg --map-dump pids_filtered
# pidof trace
2267
# ps ax|grep gnome-terminal|grep -v grep
2583 ? Ssl 58:33 /usr/libexec/gnome-terminal-server
^C
# trace -e recvmmsg,sendmmsg --map-dump syscalls
[299] = 1,
[307] = 1,
^C
# grep x64_recvmmsg arch/x86/entry/syscalls/syscall_64.tbl
299 64 recvmmsg __x64_sys_recvmmsg
# grep x64_sendmmsg arch/x86/entry/syscalls/syscall_64.tbl
307 64 sendmmsg __x64_sys_sendmmsg
#
Next step probably will be something like 'perf stat's --interval-print and
--interval-clear.
Cc: Adrian Hunter <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
At some point I'll suggest moving this to libbpf, for now I'll
experiment with ways to dump BPF maps set by events in 'perf trace',
starting with a very basic dumper for the current very limited needs
of the augmented_raw_syscalls code: dumping booleans.
Having functions that apply to the map keys and values and do table
lookup in things like syscall id to string tables should come next.
Cc: Adrian Hunter <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Commit 489338a717a0 ("perf tests evsel-tp-sched: Fix bitwise operator")
causes test case 14 "Parse sched tracepoints fields" to fail on s390.
This test succeeds on x86.
In fact this test now fails on all architectures with type char treated
as type unsigned char.
The root cause is the signed-ness of character arrays in the tracepoints
sched_switch for structure members prev_comm and next_comm.
On s390 the output of:
[root@m35lp76 perf]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 287
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
...
field:char prev_comm[16]; offset:8; size:16; signed:0;
...
field:char next_comm[16]; offset:40; size:16; signed:0;
reveals the character arrays prev_comm and next_comm are per
default unsigned char and have values in the range of 0..255.
On x86 both fields are signed as this output shows:
[root@f29]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 287
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
...
field:char prev_comm[16]; offset:8; size:16; signed:1;
...
field:char next_comm[16]; offset:40; size:16; signed:1;
and the character arrays prev_comm and next_comm are per default signed
char and have values in the range of -1..127. The implementation of
type char is architecture specific.
Since the character arrays in both tracepoints sched_switch and
sched_wakeup should contain ascii characters, simply omit the check for
signedness in the test case.
Output before:
[root@m35lp76 perf]# ./perf test -F 14
14: Parse sched tracepoints fields :
--- start ---
sched:sched_switch: "prev_comm" signedness(0) is wrong, should be 1
sched:sched_switch: "next_comm" signedness(0) is wrong, should be 1
sched:sched_wakeup: "comm" signedness(0) is wrong, should be 1
---- end ----
14: Parse sched tracepoints fields : FAILED!
[root@m35lp76 perf]#
Output after:
[root@m35lp76 perf]# ./perf test -Fv 14
14: Parse sched tracepoints fields :
--- start ---
---- end ----
Parse sched tracepoints fields: Ok
[root@m35lp76 perf]#
Fixes: 489338a717a0 ("perf tests evsel-tp-sched: Fix bitwise operator")
Signed-off-by: Thomas Richter <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
According to the current documentation the flags section is placed after
the file header itself but the code assumes to find the flags section
after the data section. This change updates the documentation to that
assumption.
Signed-off-by: Jonas Rabenstein <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Richter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The content of the HEADER_CMDLINE feature header is a perf_header_string_list
of the argument vector and not a perf_header_string of the commandline.
Signed-off-by: Jonas Rabenstein <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Richter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We can't assume inlined symbols with the same name are equal, because
their address range may be different. This will cause the symbols with
different addresses be shadowed when adding to the hist entry, and lead
to ERANGE error when checking the symbol address during sample parse,
the addr should be within the range of [sym.start, sym.end].
The error message is like: "0x36aea60 [0x8]: failed to process type: 68".
The second parameter of symbol__new() is the length of the fake symbol
for the inline frame, which is the subtraction of the end and start
address of base_sym.
Signed-off-by: He Kuang <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Milian Wolff <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: aa441895f7b4 ("perf report: Compare symbol name for inlined frames when sorting")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use sysfs__mountpoint() when reading sysfs files to obtain cpu/numa
topologies.
Also use scnprintf instead of sprintf as suggested by Namhyung.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add the numa_topology object to return the list of numa nodes together
with their cpus. It will replace the numa code in header.c and will be
used from 'perf record' in the following patches.
Add the following interface functions to load numa details:
struct numa_topology *numa_topology__new(void);
void numa_topology__delete(struct numa_topology *tp);
And replace the current (copied) local interface, with no functional
changes.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Make struct cpu_topo global and rename it to 'struct cpu_topology', so
that it can be used from the 'perf record' command in the following
patches.
Add the following interface functions to load/free cpu topology details:
struct cpu_topology *cpu_topology__new(void);
void cpu_topology__delete(struct cpu_topology *tp);
Move it to a separate source file cputopo.c together with numa related
object in the following patches.
No functional change, the new interface will be used in upcoming changes.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We are currently passing the node index instead of the real node number.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: fbe96f29ce4b ("perf tools: Make perf.data more self-descriptive (v8)"
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The netfilter conflicts were rather simple overlapping
changes.
However, the cls_tcindex.c stuff was a bit more complex.
On the 'net' side, Cong is fixing several races and memory
leaks. Whilst on the 'net-next' side we have Vlad adding
the rtnl-ness support.
What I've decided to do, in order to resolve this, is revert the
conversion over to using a workqueue that Cong did, bringing us back
to pure RCU. I did it this way because I believe that either Cong's
races don't apply with have Vlad did things, or Cong will have to
implement the race fix slightly differently.
Signed-off-by: David S. Miller <[email protected]>
|
|
If perf was built without trace support, the trace+probe_vfs_getname.sh
'perf test' entry fails:
# perf trace -h
perf: 'trace' is not a perf-command. See 'perf --help'
# perf test 64
64: Check open filename arg using perf trace + vfs_getname: FAILED!
Check trace support, so that we'll skip the test in that case:
# perf test 64
64: Check open filename arg using perf trace + vfs_getname: Skip
Signed-off-by: Tommi Rantala <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Not used at all.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Simplifying the code a bit.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Display metric expression itself when --details is specified.
Current list with no details:
# perf list metrics
...
TopDownL1:
IPC
[Instructions Per Cycle (per logical thread)]
SLOTS
[Total issue-pipeline slots]
...
Detailed output with metric formula:
# perf list --details metrics
...
TopDownL1:
IPC
[Instructions Per Cycle (per logical thread)]
[inst_retired.any / cpu_clk_unhalted.thread]
SLOTS
[Total issue-pipeline slots]
[4*(( cpu_clk_unhalted.thread_any / 2 ) if #smt_on else cycles)]
...
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fixing legacy symbol events parsing. We can't support single slash
separator, like 'cycles/u', because it conflicts with non empty terms,
like 'cycles/period/u'.
Keeping only '//' and ':' separator for these events:
cycles//u
cycles:k
And removing '/' separator support, which is not working
anymore. Also adding automated tests for above events.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Rename build libperf to perf, because it's used to build perf.
The libperf build object name will be used for libperf library.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Simple rename, no functional change.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There's no need for perf build to use libperf.a,
we can use directly libperf-in.o.
The libperf.a stays as a target if needed:
$ make libperf.a
...
CC util/pmu.o
CC util/pmu-flex.o
LD util/libperf-in.o
LD libperf-in.o
AR libperf.a
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Making the auxtrace_buffer fetch function modular so that it can be
called from different decoding context (timeless vs. non-timeless),
avoiding to repeat code.
No change in functionality is introduced by this patch.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Making the main packet processing loop modular so that it can be called
from different decoding context (timeless vs. non-timless), avoiding to
repeat code.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Making the main decoder block modular so that it can be called from
different decoding context (timeless vs. non-timeless), avoiding
to repeat code.
No change in functionality is introduced by this patch.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch makes decoding of auxtrace buffer centered around a struct
cs_etm_queue. This eliminates surperflous variables and is a precursor
for work that simplifies the main decoder loop.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Moving initialisation of the kernel start address to function
cs_etm__setup_queues(), considered to be the common denominator for
queue initialisation. That way we don't have to repeat the same code
at different places.
No change of functionatlity is introduced by this patch.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Function cs_etm__alloc_queue() should only be concerned with the allocation
of memory for the etmq and accompanying decoder. Everything else should
be done in the calling function.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The comment just before initialising the decoder is plane wrong since it
is part of the decoding queue setup function and the operation code
specifically mention that trace data is to be decoded rather than printed
out.
This patch simply fix the comment to prevent people from getting really
confused.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The trace parameter initialisation code is repeated in two different
places, something that bloats the file and can lead to errors. This
is fixed by introducing a helper function and calling the right
protocol initialisation code when required.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Memory allocated for variable 't_params' isn't released properly in the
error path of function cs_etm_queue *cs_etm__alloc_queue() and
cs_etm__dump_event(), something this patch addresses.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introducing function cs_etm_decoder__init_dparams() to avoid repeating
code at two different places.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Function cs_etm__mem_access() is supposed to return a u32 but the error
path returns negative values at a couple of places, something that really
throws off the clients using it. Fix the situation by return '0'.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Field "time" and "timestamp" in structure cs_etm_queue are no longer
used and need to be removed.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki K Poulouse <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|