Age | Commit message (Collapse) | Author | Files | Lines |
|
To pick the changes in:
344c6c804703 ("KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH")
dee04eee9182 ("KVM: RISC-V: Add KVM_REG_RISCV for ONE_REG interface")
These trigger the rebuild of this object:
CC /tmp/build/perf/trace/beauty/ioctl.o
But do not result in any change in tooling, as the additions are not
being used in any table generatator.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Adrian Hunter <[email protected]>
Cc: Anup Patel <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Tianyu Lan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes in:
0cb8410b90e7 ("kvm: svm: Intercept RDPRU")
That trigger a rebuild in too in tooling:
CC /tmp/build/perf/arch/x86/util/kvm-stat.o
But this time around no changes in tooling results, as SVM_EXIT_RDPRU
wasn't added to SVM_EXIT_REASONS, that is used in kvm-stat.c.
And addresses this perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h'
diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h
Cc: Adrian Hunter <[email protected]>
Cc: Jim Mattson <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes in:
bf653b78f960 ("KVM: vmx: Introduce handle_unexpected_vmexit and handle WAITPKG vmexit")
That trigger these changes in tooling:
CC /tmp/build/perf/arch/x86/util/kvm-stat.o
INSTALL GTK UI
DESCEND plugins
make[3]: Nothing to be done for '/tmp/build/perf/plugins/libtraceevent-dynamic-list'.
INSTALL trace_plugins
LD /tmp/build/perf/arch/x86/util/perf-in.o
LD /tmp/build/perf/arch/x86/perf-in.o
LD /tmp/build/perf/arch/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
And this is not just because that header is included, kvm-stat.c
uses the VMX_EXIT_REASONS define and it got changed by the above cset.
And addresses this perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h'
diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Tao Xu <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There is a memory leak problem in the failure paths of
build_cl_output(), so fix it.
Signed-off-by: Yunfeng Ye <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Feilong Lin <[email protected]>
Cc: Hu Shiyuan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
slow_copyfile() opens the file by name, so "write" permissions must not
be removed in copyfile_mode_ns() before calling slow_copyfile().
Example:
Before:
$ sudo chmod +r /proc/kcore
$ sudo setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_sys_rawio=ep" tools/perf/perf
$ tools/perf/perf buildid-cache -k /proc/kcore
Couldn't add /proc/kcore
After:
$ sudo chmod +r /proc/kcore
$ sudo setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_sys_rawio=ep" tools/perf/perf
$ tools/perf/perf buildid-cache -v -k /proc/kcore
kcore added to build-id cache directory /home/ahunter/.debug/[kernel.kcore]/37e340b1b5a7cf4f57ba8de2bc777359588a957f/2019100709562289
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Store SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF in variable *ret*, instead
of returning in the middle of the function and leaking multiple
resources: prog_linfo, btf, s and bfdf.
Addresses-Coverity-ID: 1454832 ("Structurally dead code")
Fixes: 11aad897f6d1 ("perf annotate: Don't return -1 for error when doing BPF disassembly")
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/20191014171047.GA30850@embeddedor
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Both build_mem_topology() and rm_rf_depth_pat() have resource leaks of
closedir() on the error paths.
Fix this by calling closedir() before function returns.
Fixes: e2091cedd51b ("perf tools: Add MEM_TOPOLOGY feature to perf data file")
Fixes: cdb6b0235f17 ("perf tools: Add pattern name checking to rm_rf")
Signed-off-by: Yunfeng Ye <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Feilong Lin <[email protected]>
Cc: Hu Shiyuan <[email protected]>
Cc: Igor Lubashev <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In the earlier fix for the memory overrun of id arrays I managed to typo
the wrong event in the fix.
Of course we need to close the current event in the loop, not the
original failing event.
The same test case as in the original patch still passes.
Fixes: 7834fa948beb ("perf evlist: Fix access of freed id arrays")
Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The build of file libperf-jvmti.so succeeds but the resulting
object fails to load:
# ~/linux/tools/perf/perf record -k mono -- java \
-XX:+PreserveFramePointer \
-agentpath:/root/linux/tools/perf/libperf-jvmti.so \
hog 100000 123450
Error occurred during initialization of VM
Could not find agent library /root/linux/tools/perf/libperf-jvmti.so
in absolute path, with error:
/root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype
Add the missing _ctype symbol into the build script.
Fixes: 79743bc927f6 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()")
Signed-off-by: Thomas Richter <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Just like strace has:
# trace -s sleep 1
Summary of events:
sleep (32370), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.402 1000.402 1000.402 1000.402 0.00%
mmap 8 0 0.023 0.002 0.003 0.004 8.49%
close 5 0 0.015 0.001 0.003 0.009 51.39%
mprotect 4 0 0.014 0.002 0.003 0.005 16.95%
openat 3 0 0.013 0.003 0.004 0.005 14.29%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
read 4 0 0.005 0.001 0.001 0.002 16.83%
brk 4 0 0.004 0.001 0.001 0.002 20.82%
access 1 1 0.004 0.004 0.004 0.004 0.00%
fstat 3 0 0.003 0.001 0.001 0.001 12.17%
lseek 3 0 0.003 0.001 0.001 0.001 11.45%
arch_prctl 2 1 0.002 0.001 0.001 0.001 2.30%
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
# perf trace -S sleep 1
? ... [continued]: execve()) = 0
0.028 brk(brk: NULL) = 0x559f5bd96000
0.033 arch_prctl(option: 0x3001, arg2: 0x7ffda8b715a0) = -1 EINVAL (Invalid argument)
0.046 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)
0.055 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3
0.060 fstat(fd: 3, statbuf: 0x7ffda8b707a0) = 0
0.062 mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3aedfc4000
0.066 close(fd: 3) = 0
0.079 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) = 3
0.085 read(fd: 3, buf: 0x7ffda8b70948, count: 832) = 832
0.088 lseek(fd: 3, offset: 792, whence: SET) = 792
0.090 read(fd: 3, buf: 0x7ffda8b70810, count: 68) = 68
0.093 fstat(fd: 3, statbuf: 0x7ffda8b707f0) = 0
0.095 mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7f3aedfc2000
0.101 lseek(fd: 3, offset: 792, whence: SET) = 792
0.103 read(fd: 3, buf: 0x7ffda8b70450, count: 68) = 68
0.105 lseek(fd: 3, offset: 864, whence: SET) = 864
0.107 read(fd: 3, buf: 0x7ffda8b70470, count: 32) = 32
0.110 mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3, off: 0) = 0x7f3aeddfc000
0.114 mprotect(start: 0x7f3aede1e000, len: 1679360, prot: NONE) = 0
0.121 mmap(addr: 0x7f3aede1e000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000) = 0x7f3aede1e000
0.127 mmap(addr: 0x7f3aedf6b000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000) = 0x7f3aedf6b000
0.131 mmap(addr: 0x7f3aedfb8000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000) = 0x7f3aedfb8000
0.138 mmap(addr: 0x7f3aedfbe000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7f3aedfbe000
0.147 close(fd: 3) = 0
0.158 arch_prctl(option: SET_FS, arg2: 0x7f3aedfc3580) = 0
0.210 mprotect(start: 0x7f3aedfb8000, len: 16384, prot: READ) = 0
0.230 mprotect(start: 0x559f5b27d000, len: 4096, prot: READ) = 0
0.236 mprotect(start: 0x7f3aee00f000, len: 4096, prot: READ) = 0
0.240 munmap(addr: 0x7f3aedfc4000, len: 134346) = 0
0.300 brk(brk: NULL) = 0x559f5bd96000
0.302 brk(brk: 0x559f5bdb7000) = 0x559f5bdb7000
0.305 brk(brk: NULL) = 0x559f5bdb7000
0.310 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) = 3
0.315 fstat(fd: 3, statbuf: 0x7f3aedfbdac0) = 0
0.318 mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3ae0e52000
0.325 close(fd: 3) = 0
0.358 nanosleep(rqtp: 0x7ffda8b714b0, rmtp: NULL) = 0
1000.622 close(fd: 1) = 0
1000.641 close(fd: 2) = 0
1000.664 exit_group(error_code: 0) = ?
Summary of events:
sleep (722), 80 events, 93.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1000.194 1000.194 1000.194 1000.194 0.00%
mmap 8 0 0.025 0.002 0.003 0.005 10.17%
close 5 0 0.018 0.001 0.004 0.010 50.18%
mprotect 4 0 0.016 0.003 0.004 0.006 16.81%
openat 3 0 0.011 0.003 0.004 0.004 6.57%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.005 0.001 0.001 0.002 20.72%
read 4 0 0.005 0.001 0.001 0.002 16.71%
access 1 1 0.005 0.005 0.005 0.005 0.00%
fstat 3 0 0.004 0.001 0.001 0.002 14.82%
lseek 3 0 0.003 0.001 0.001 0.001 11.66%
arch_prctl 2 1 0.002 0.001 0.001 0.001 3.59%
execve 1 0 0.000 0.000 0.000 0.000 0.00%
#
Works for system wide, e.g. for 1ms:
# perf trace -s -a sleep 0.001
Summary of events:
sleep (768), 94 events, 37.9%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
nanosleep 1 0 1.133 1.133 1.133 1.133 0.00%
execve 7 6 0.351 0.003 0.050 0.316 88.53%
mmap 8 0 0.024 0.002 0.003 0.004 8.86%
mprotect 4 0 0.017 0.003 0.004 0.006 16.02%
openat 3 0 0.013 0.004 0.004 0.005 8.34%
munmap 1 0 0.010 0.010 0.010 0.010 0.00%
brk 4 0 0.007 0.001 0.002 0.002 10.99%
close 5 0 0.005 0.001 0.001 0.002 11.69%
read 5 0 0.005 0.000 0.001 0.002 30.53%
access 1 1 0.004 0.004 0.004 0.004 0.00%
fstat 3 0 0.004 0.001 0.001 0.002 10.74%
lseek 3 0 0.003 0.001 0.001 0.001 10.20%
arch_prctl 2 1 0.002 0.001 0.001 0.001 3.34%
Web Content (21258), 46 events, 18.5%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
recvmsg 12 12 0.015 0.001 0.001 0.002 8.50%
futex 2 0 0.008 0.003 0.004 0.005 27.08%
poll 6 0 0.006 0.000 0.001 0.002 22.14%
read 2 0 0.006 0.002 0.003 0.003 26.08%
write 1 0 0.002 0.002 0.002 0.002 0.00%
Web Content (4365), 36 events, 14.5%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
recvmsg 10 10 0.015 0.001 0.002 0.003 11.83%
poll 5 0 0.006 0.000 0.001 0.002 28.44%
futex 2 0 0.005 0.001 0.003 0.004 48.29%
read 1 0 0.003 0.003 0.003 0.003 0.00%
Timer (21275), 14 events, 5.6%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
futex 6 1 0.240 0.000 0.040 0.149 64.58%
write 1 0 0.008 0.008 0.008 0.008 0.00%
Timer (4383), 14 events, 5.6%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
futex 6 2 0.186 0.000 0.031 0.181 96.45%
write 1 0 0.010 0.010 0.010 0.010 0.00%
Web Content (20354), 28 events, 11.3%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
recvmsg 8 8 0.010 0.001 0.001 0.002 15.24%
poll 4 0 0.004 0.000 0.001 0.002 35.68%
futex 1 0 0.003 0.003 0.003 0.003 0.00%
read 1 0 0.003 0.003 0.003 0.003 0.00%
Timer (20371), 10 events, 4.0%
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
futex 4 1 0.077 0.000 0.019 0.075 95.46%
write 1 0 0.005 0.005 0.005 0.005 0.00%
[root@quaco ~]#
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'perf record' has supported --all-kernel / --all-user to configure all
used events to run in kernel space or run in user space. But 'perf stat'
doesn't support these options.
It would be useful to support these options in 'perf stat' too to keep
the same semantics available in both tools.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The build of file libperf-jvmti.so succeeds but the resulting
object fails to load:
# ~/linux/tools/perf/perf record -k mono -- java \
-XX:+PreserveFramePointer \
-agentpath:/root/linux/tools/perf/libperf-jvmti.so \
hog 100000 123450
Error occurred during initialization of VM
Could not find agent library /root/linux/tools/perf/libperf-jvmti.so
in absolute path, with error:
/root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype
Add the missing _ctype symbol into the build script.
Fixes: 79743bc927f6 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()")
Signed-off-by: Thomas Richter <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove redirection of objdump's stderr to /dev/null to help diagnose
failures.
Fix the '--no-show-raw' flag to be '--no-show-raw-insn' which binutils
is permissive and allows, but fails with LLVM objdump.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Avoiding a pipe allows objdump command failures to surface. Move to the
caller of symbol__parse_objdump_line the call to strim that removes
leading and trailing tabs. Add a new expand_tabs function that if a tab
is present allocate a new line in which tabs are expanded. In
symbol__parse_objdump_line the line had no leading spaces, so simplify
the line_ip processing.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Simplify the objdump command by not piping the output of objdump through
grep. Instead, drop lines that match the grep pattern during the reading
loop.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Reduce duplicated logic by using the subcmd library. Ensure when errors
occur they are reported to the caller. Before this patch, if no lines
are read the error status is 0.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
[ merged follow up fix for NULL termination as in the 2nd link above ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Objdump output is parsed using getline which allocates memory for the
read. Getline will realloc if the memory is too small, but currently the
line is always freed after the call.
Simplify parse_objdump_line by performing the reading in symbol__disassemble.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We received a user report that call-graph DWARF mode was enabled in
'perf record' but 'perf report' didn't unwind the callstack correctly.
The reason was, libunwind was not compiled in.
We can use 'perf -vv' to check the compiled libraries but it would be
valuable to report a warning to user directly (especially valuable for
a perf newbie).
The warning is:
Warning:
Please install libunwind development packages during the perf build.
Both TUI and stdio are supported.
Signed-off-by: Jin Yao <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When executing the task exit testing case, perf gets stuck in an endless
loop this case and doesn't return back on Arm64 Juno board.
After digging into this issue, since Juno board has Arm's big.LITTLE
CPUs, thus the PMUs are not compatible between the big CPUs and little
CPUs. This leads to a PMU event that cannot be enabled properly when
the traced task is migrated from one variant's CPU to another variant.
Finally, the test case runs into infinite loop for cannot read out any
event data after return from polling.
Eventually, we need to work out formal solution to allow PMU events can
be freely migrated from one CPU variant to another, but this is a
difficult task and a different topic. This patch tries to fix the Perf
test case to avoid infinite loop, when the testing detects 1000 times
retrying for reading empty events, it will directly bail out and return
failure. This allows the Perf tool can continue its other test cases.
Signed-off-by: Leo Yan <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When fail to mmap events in task exit case, it misses to set 'err' to
-1; thus the testing will not report failure for it.
This patch sets 'err' to -1 when fails to mmap events, thus Perf tool
can report correct result.
Fixes: d723a55096b8 ("perf test: Add test case for checking number of EXIT events")
Signed-off-by: Leo Yan <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In the earlier fix for the memory overrun of id arrays I managed to typo
the wrong event in the fix.
Of course we need to close the current event in the loop, not the
original failing event.
The same test case as in the original patch still passes.
Fixes: 7834fa948beb ("perf evlist: Fix access of freed id arrays")
Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
My earlier patch to just enable --reltime with --time was a little too
optimistic. The --time parsing would accept absolute time, which is
very confusing to the user.
Support relative time in --time parsing too. This only works with recent
perf record that records the first sample time. Otherwise we error out.
Fixes: 3714437d3fcc ("perf script: Allow --time with --reltime")
Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
By using "make TCMALLOC=1" you can enable perf to be build for usage
with libtcmalloc.so (gperftools).
Get heap profile (tools/perf directory):
$ <install gperftools>
$ make TCMALLOC=1 DEBUG=1
$ HEAPPROFILE=/tmp/heapprof ./perf ...
$ pprof ./perf /tmp/heapprof.000*
(pprof) top
Total: 2335.5 MB
1735.1 74.3% 74.3% 1735.1 74.3% memdup
402.0 17.2% 91.5% 402.0 17.2% zalloc
140.2 6.0% 97.5% 145.8 6.2% map__new
33.6 1.4% 98.9% 33.6 1.4% symbol__new
12.4 0.5% 99.5% 12.4 0.5% alloc_event
6.2 0.3% 99.7% 6.2 0.3% nsinfo__new
5.5 0.2% 100.0% 5.5 0.2% nsinfo__copy
0.3 0.0% 100.0% 0.3 0.0% dso__new
0.1 0.0% 100.0% 0.1 0.0% do_read_string
0.0 0.0% 100.0% 0.0 0.0% __GI__IO_file_doallocate
See callstack:
$ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf
$ pprof --web ./perf /tmp/heapprof.00*
Committer testing:
Install gperftools, on fedora:
# dnf install gperftools-devel
Then build:
$ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin
Verify that it linked against the right library:
$ ldd ~/bin/perf | grep tcma
libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000)
$
Run 'perf trace' system wide for 1 minute:
# HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m
<SNIP>
59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable)
59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable)
59981.956 (10.143 ms): SCTP timer/21716 ... [continued]: select()) = 0 (Timeout)
59985.549 ( ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ...
0.926 (59999.481 ms): sleep/29764 ... [continued]: nanosleep()) = 0
59992.133 ( ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80) ...
60000.477 ( 0.009 ms): sleep/29764 close(fd: 1) = 0
60000.493 ( 0.005 ms): sleep/29764 close(fd: 2) = 0
60000.514 ( ): sleep/29764 exit_group() = ?
Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use)
[root@quaco ~]#
Install pprof:
# dnf install pprof
And run it:
# pprof ~/bin/perf /tmp/heapprof.0001.heap
Using local file /root/bin/perf.
Using local file /tmp/heapprof.0001.heap.
Welcome to pprof! For help, type 'help'.
(pprof) top
Total: 4.0 MB
1.7 42.0% 42.0% 2.2 54.1% map__new
0.9 23.3% 65.3% 0.9 23.3% zalloc
0.5 11.4% 76.7% 0.5 11.4% dso__new
0.2 5.6% 82.3% 0.3 8.5% trace__sys_enter
0.2 4.9% 87.2% 0.2 4.9% __GI___strdup
0.2 3.8% 91.0% 0.2 3.8% new_term
0.1 2.2% 93.2% 0.4 10.1% __perf_pmu__new_alias
0.0 1.0% 94.3% 0.0 1.2% event_read_fields
0.0 0.8% 95.1% 0.0 0.8% nsinfo__new
0.0 0.7% 95.8% 0.1 3.2% trace__read_syscall_info
(pprof)
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a test that checks that if pid namespaces are configured the fdinfo
file of a pidfd contains an NSpid: entry containing the process id in
the current and additionally all nested namespaces. In the case that
a pidfd is from a pid namespace not in the same namespace hierarchy as
the process accessing the fdinfo file, ensure the 'NSpid' shows 0 for
that pidfd, analogous to the 'Pid' entry.
Signed-off-by: Christian Kellner <[email protected]>
Acked-by: Christian Brauner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf trace:
Arnaldo Carvalho de Melo:
- Reuse the strace-like syscall_arg_fmt->scnprintf() beautification routines
(convert integer arguments into strings, like open flags, etc) in tracepoint
arguments.
For now the type based scnprintf routines (pid_t, umode_t, etc) and the
ones based in well known arg name based ("fd", etc) gets associated with
tracepoint args of that type.
A tracepoint only arg, "msr", for the msr:{write,read}_msr gets added as
an initial step.
- Introduce syscall_arg_fmt->strtoul() methods to be the reverse operation
of ->scnprintf(), i.e. to go from a string to an integer.
- Implement --filter, just like in 'perf record', that affects the tracepoint
events specied thus far in the command line, use the ->strtoul() methods
to allow strings in tables associated with beautifiers to the integers
the in-kernel tracepoint (eBPF later) filters expect, e.g.:
# perf trace --max-events 1 -e sched:*ipi --filter="cpu==1 || cpu==2"
0.000 as/24630 sched:sched_wake_idle_without_ipi(cpu: 1)
#
# perf trace --max-events 1 --max-stack=32 -e msr:* --filter="msr==IA32_TSC_DEADLINE"
207.000 cc1/19963 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 5442316760822)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
lapic_next_deadline ([kernel.kallsyms])
clockevents_program_event ([kernel.kallsyms])
hrtimer_interrupt ([kernel.kallsyms])
smp_apic_timer_interrupt ([kernel.kallsyms])
apic_timer_interrupt ([kernel.kallsyms])
[0x6ff66c] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x7047c3] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x707708] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
execute_one_pass (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x4f3d37] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x4f3d49] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
execute_pass_list (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
cgraph_node::expand (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x2625b4] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
symbol_table::finalize_compilation_unit (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x5ae8b9] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
toplev::main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x26b6a] (/usr/lib/x86_64-linux-gnu/libc-2.29.so)
#
# perf trace --max-events 8 -e msr:* --filter="msr==IA32_SPEC_CTRL"
0.000 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.063 migration/3/25 msr:write_msr(msr: IA32_SPEC_CTRL)
0.217 kworker/u16:1-/4826 msr:write_msr(msr: IA32_SPEC_CTRL)
0.687 rcu_sched/11 msr:write_msr(msr: IA32_SPEC_CTRL)
0.696 :13280/13280 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.305 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.355 :13274/13274 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
2.743 kworker/u16:0-/6711 msr:write_msr(msr: IA32_SPEC_CTRL)
#
# perf trace --max-events 8 --cpu 1 -e msr:* --filter="msr!=IA32_SPEC_CTRL && msr!=IA32_TSC_DEADLINE && msr != FS_BASE"
0.000 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 68719479037)
0.096 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
238.925 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 8589936893)
511.010 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
1005.052 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
1235.131 CPU 0/KVM/3750 msr:write_msr(msr: 0x830, val: 4294969595)
1235.195 CPU 0/KVM/3750 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199023037952)
1235.201 CPU 0/KVM/3750 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
#
- Default to not using libtraceevent and its plugins for beautifying
tracepoint arguments, since now we're reusing the strace-like beatufiers.
Use --libtraceevent_print (using just --libtrace is unambiguous and can
be used as a short hand) to go back to those beautifiers.
This will help in the transition, as can be seen in some of the sched tracepoints
that still need some work in the libbeauty based mode:
# trace --no-inherit -e msr:*,*sleep,sched:* sleep 1
0.000 ( ): sched:sched_waking(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
0.006 ( ): sched:sched_wakeup(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
0.348 ( ): sched:sched_process_exec(filename: 140212596720100, pid: 3319 (sleep), old_pid: 3319 (sleep))
0.490 ( ): msr:write_msr(msr: FS_BASE, val: 139631189321088)
0.670 ( ): nanosleep(rqtp: 0x7ffc52c23bc0) ...
0.674 ( ): sched:sched_stat_runtime(comm: "sleep", pid: 3319 (sleep), runtime: 659259, vruntime: 78942418342)
0.675 ( ): sched:sched_switch(prev_comm: "sleep", prev_pid: 3319 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/0", next_prio: 120)
1001.059 ( ): sched:sched_waking(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
1001.098 ( ): sched:sched_wakeup(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
0.670 (1000.504 ms): ... [continued]: nanosleep()) = 0
1001.456 ( ): sched:sched_process_exit(comm: "sleep", pid: 3319 (sleep), prio: 120)
# trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
# trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
0.000 ( ): sched:sched_waking(comm=trace pid=3323 prio=120 target_cpu=000)
0.007 ( ): sched:sched_wakeup(comm=trace pid=3323 prio=120 target_cpu=000)
0.382 ( ): sched:sched_process_exec(filename=/usr/bin/sleep pid=3323 old_pid=3323)
0.525 ( ): msr:write_msr(c0000100, value 7f5d508a0580)
0.713 ( ): nanosleep(rqtp: 0x7fff487fb4a0) ...
0.717 ( ): sched:sched_stat_runtime(comm=sleep pid=3323 runtime=617722 [ns] vruntime=78957731636 [ns])
0.719 ( ): sched:sched_switch(prev_comm=sleep prev_pid=3323 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120)
1001.117 ( ): sched:sched_waking(comm=sleep pid=3323 prio=120 target_cpu=000)
1001.157 ( ): sched:sched_wakeup(comm=sleep pid=3323 prio=120 target_cpu=000)
0.713 (1000.522 ms): ... [continued]: nanosleep()) = 0
1001.538 ( ): sched:sched_process_exit(comm=sleep pid=3323 prio=120)
#
- Make -v (verbose) mode be honoured for .perfconfig based trace.add_events,
to help in diagnosing problems with building eBPF events (-e source.c).
- When using eBPF syscall payload augmentation do not show strace-like
syscalls when all the user specified was some tracepoint event, bringing
the behaviour in line with that of when not using eBPF augmentation.
Intel PT:
exported-sql-viewer GUI:
Adrian Hunter:
- Add LookupModel, HBoxLayout, VBoxLayout, global time range calculations
so as to add a time chart by CPU.
perf script:
Andi Kleen:
- Allow --time (to specify a time span of interest) with --reltime
perf diff:
Jin Yao:
- Report noise for cycles diff, i.e. a histogram + stddev.
(timestamps relative to start).
perf annotate:
Arnaldo Carvalho de Melo:
- Initialize env->cpuid when running in live mode (perf top), as it
is used in some of the per arch annotation init routines.
samples bpf:
Björn Töpel:
- Fixup fallout of using tools/perf/perf-sys. from outside tools/perf.
Core:
Ian Rogers:
- Avoid 'sample_reg_masks' being const + weak, as this breaks with some
compilers that constant-propagate from the weak symbol.
libperf:
- First part of moving the perf_mmap class from tools/perf to libperf.
- Propagate CFLAGS to libperf from the tools/perf Makefile.
Vendor events:
John Garry:
- Add entry in MAINTAINERS with reviewers for the for perf tool arm64
pmu-events files.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf-next 2019-10-14
The following pull-request contains BPF updates for your *net-next* tree.
12 days of development and
85 files changed, 1889 insertions(+), 1020 deletions(-)
The main changes are:
1) auto-generation of bpf_helper_defs.h, from Andrii.
2) split of bpf_helpers.h into bpf_{helpers, helper_defs, endian, tracing}.h
and move into libbpf, from Andrii.
3) Track contents of read-only maps as scalars in the verifier, from Andrii.
4) small x86 JIT optimization, from Daniel.
5) cross compilation support, from Ivan.
6) bpf flow_dissector enhancements, from Jakub and Stanislav.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Fixes test module build.
Reported-by: Jan Kiszka <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
|
|
Given lots of selftests won't work without recent enough Clang/LLVM that
fully supports BTF, there is no point in maintaining outdated BTF
support detection and fall-back to pahole logic. Just assume we have
everything we need.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Given BPF programs rely on libbpf's bpf_helper_defs.h, which is
auto-generated during libbpf build, libbpf build has to happen before
we attempt progs/*.c build. Enforce it as order-only dependency.
Fixes: 24f25763d6de ("libbpf: auto-generate list of BPF helper definitions")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
In case of C/LDFLAGS there is no way to pass them correctly to build
command, for instance when --sysroot is used or external libraries
are used, like -lelf, wich can be absent in toolchain. This can be
used for samples/bpf cross-compiling allowing to get elf lib from
sysroot.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
No need to use C++ for test_libbpf target when libbpf is on C and it
can be tested with C, after this change the CXXFLAGS in makefiles can
be avoided, at least in bpf samples, when sysroot is used, passing
same C/LDFLAGS as for lib.
Add "return 0" in test_libbpf to avoid warn, but also remove spaces at
start of the lines to keep same style and avoid warns while apply.
Signed-off-by: Ivan Khoronzhuk <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also a couple of updates for new Intel
models (which are technically hw-enablement, but to users it's a fix
to perf behavior on those new CPUs - hope this is fine), an AUX
inheritance fix, event time-sharing fix, and a fix for lost non-perf
NMI events on AMD systems"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf/x86/cstate: Add Tiger Lake CPU support
perf/x86/msr: Add Tiger Lake CPU support
perf/x86/intel: Add Tiger Lake CPU support
perf/x86/cstate: Update C-state counters for Ice Lake
perf/x86/msr: Add new CPU model numbers for Ice Lake
perf/x86/cstate: Add Comet Lake CPU support
perf/x86/msr: Add Comet Lake CPU support
perf/x86/intel: Add Comet Lake CPU support
perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp
perf/core: Fix corner case in perf_rotate_context()
perf/core: Rework memory accounting in perf_mmap()
perf/core: Fix inheritance of aux_output groups
perf annotate: Don't return -1 for error when doing BPF disassembly
perf annotate: Return appropriate error code for allocation failures
perf annotate: Fix arch specific ->init() failure errors
perf annotate: Propagate the symbol__annotate() error return
perf annotate: Fix the signedness of failure returns
perf annotate: Propagate perf_env__arch() error
perf evsel: Fall back to global 'perf_env' in perf_evsel__env()
perf tools: Propagate get_cpuid() error
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix a kernel crash in spufs_create_root() on Cell machines, since the
new mount API went in.
Fix a regression in our KVM code caused by our recent PCR changes.
Avoid a warning message about a failing hypervisor API on systems that
don't have that API.
A couple of minor build fixes.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Desnes A. Nunes do
Rosario, Emmanuel Nicolet, Jordan Niethe, Laurent Dufour, Stephen
Rothwell"
* tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
spufs: fix a crash in spufs_create_root()
powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host
selftests/powerpc: Fix compile error on tlbie_test due to newer gcc
powerpc/pseries: Remove confusing warning message.
powerpc/64s/radix: Fix build failure with RADIX_MMU=n
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2019-10-12
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) a bunch of small fixes. Nothing critical.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Add basic tests to verify functionality of netdevsim reporters.
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Old GCC versions are producing invalid typedef for __gnuc_va_list
pointing to void. Special-case this and emit valid:
typedef __builtin_va_list __gnuc_va_list;
Reported-by: John Fastabend <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Existing BPF_CORE_READ() macro generates slightly suboptimal code. If
there are intermediate pointers to be read, initial source pointer is
going to be assigned into a temporary variable and then temporary
variable is going to be uniformly used as a "source" pointer for all
intermediate pointer reads. Schematically (ignoring all the type casts),
BPF_CORE_READ(s, a, b, c) is expanded into:
({
const void *__t = src;
bpf_probe_read(&__t, sizeof(*__t), &__t->a);
bpf_probe_read(&__t, sizeof(*__t), &__t->b);
typeof(s->a->b->c) __r;
bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})
This initial `__t = src` makes calls more uniform, but causes slightly
less optimal register usage sometimes when compiled with Clang. This can
cascase into, e.g., more register spills.
This patch fixes this issue by generating more optimal sequence:
({
const void *__t;
bpf_probe_read(&__t, sizeof(*__t), &src->a); /* <-- src here */
bpf_probe_read(&__t, sizeof(*__t), &__t->b);
typeof(s->a->b->c) __r;
bpf_probe_read(&__r, sizeof(*__r), &__t->c);
})
Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Make sure a new flow dissector program can be attached to replace the old
one with a single syscall. Also check that attaching the same program twice
is prohibited.
Signed-off-by: Jakub Sitnicki <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Stanislav Fomichev <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
This patch prints the stddev and hist for the cycles diff of program
block. It can help us to understand if the cycles is noisy or not.
This patch is inspired by Andi Kleen's patch:
https://lwn.net/Articles/600471/
We create new option '--cycles-hist'.
Example:
perf record -b ./div
perf record -b ./div
perf diff -c cycles
# Baseline [Program Block Range] Cycles Diff Shared Object Symbol
# ........ .......................................................... .... ................. ............................
#
46.72% [div.c:40 -> div.c:40] 0 div [.] main
46.72% [div.c:42 -> div.c:44] 0 div [.] main
46.72% [div.c:42 -> div.c:39] 0 div [.] main
20.54% [random_r.c:357 -> random_r.c:394] 1 libc-2.27.so [.] __random_r
20.54% [random_r.c:357 -> random_r.c:380] 0 libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:391] 0 libc-2.27.so [.] __random_r
17.04% [random.c:288 -> random.c:291] 0 libc-2.27.so [.] __random
17.04% [random.c:291 -> random.c:291] 0 libc-2.27.so [.] __random
17.04% [random.c:293 -> random.c:293] 0 libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random
17.04% [random.c:298 -> random.c:298] 0 libc-2.27.so [.] __random
8.40% [div.c:22 -> div.c:25] 0 div [.] compute_flag
8.40% [div.c:27 -> div.c:28] 0 div [.] compute_flag
5.14% [rand.c:26 -> rand.c:27] 0 libc-2.27.so [.] rand
5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand
2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt
0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax
0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap
0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15
0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 [kernel.kallsyms] [k] native_sched_clock
0.00% [native_write_msr+0 -> native_write_msr+16] -13 [kernel.kallsyms] [k] native_write_msr
When we enable the option '--cycles-hist', the output is
perf diff -c cycles --cycles-hist
# Baseline [Program Block Range] Cycles Diff stddev/Hist Shared Object Symbol
# ........ .......................................................... .... ................. ................. ............................
#
46.72% [div.c:40 -> div.c:40] 0 ± 37.8% ▁█▁▁██▁█ div [.] main
46.72% [div.c:42 -> div.c:44] 0 ± 49.4% ▁▁▂█▂▂▂▂ div [.] main
46.72% [div.c:42 -> div.c:39] 0 ± 24.1% ▃█▂▄▁▃▂▁ div [.] main
20.54% [random_r.c:357 -> random_r.c:394] 1 ± 33.5% ▅▂▁█▃▁▂▁ libc-2.27.so [.] __random_r
20.54% [random_r.c:357 -> random_r.c:380] 0 ± 39.4% ▁▁█▁██▅▁ libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r
20.54% [random_r.c:388 -> random_r.c:391] 0 ± 41.2% ▁▃▁▂█▄▃▁ libc-2.27.so [.] __random_r
17.04% [random.c:288 -> random.c:291] 0 ± 48.8% ▁▁▁▁███▁ libc-2.27.so [.] __random
17.04% [random.c:291 -> random.c:291] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random
17.04% [random.c:293 -> random.c:293] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random
17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random
17.04% [random.c:298 -> random.c:298] 0 ± 75.6% ▃█▁▁▁▁▁▁ libc-2.27.so [.] __random
8.40% [div.c:22 -> div.c:25] 0 ± 42.1% ▁▃▁▁███▁ div [.] compute_flag
8.40% [div.c:27 -> div.c:28] 0 ± 41.8% ██▁▁▄▁▁▄ div [.] compute_flag
5.14% [rand.c:26 -> rand.c:27] 0 ± 37.8% ▁▁▁████▁ libc-2.27.so [.] rand
5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand
2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt
0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax
0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap
0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap
0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15
0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 ± 38.5% ▄█▁ [kernel.kallsyms] [k] native_sched_clock
0.00% [native_write_msr+0 -> native_write_msr+16] -13 ± 47.1% ▁█▇▃▁▁ [kernel.kallsyms] [k] native_write_msr
v8:
---
Rebase to perf/core branch
v7:
---
1. v6 got Jiri's ACK.
2. Rebase to latest perf/core branch.
v6:
---
1. Jiri provides better code for using data__hpp_register() in ui_init().
Use this code in v6.
v5:
---
1. Refine the use of data__hpp_register() in ui_init() according to
Jiri's suggestion.
v4:
---
1. Rename the new option from '--noisy' to '--cycles-hist'
2. Remove the option '-n'.
3. Only update the spark value and stats when '--cycles-hist' is enabled.
4. Remove the code of printing '..'.
v3:
---
1. Move the histogram to a separate column
2. Move the svals[] out of struct stats
v2:
---
Jiri got a compile error,
CC builtin-diff.o
builtin-diff.c: In function ‘compute_cycles_diff’:
builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
712 | labs(pair->block_info->cycles_spark[i] -
| ^~~~
Because the result of u64 - u64 is still u64. Now we change the type of
cycles_spark[] to s64.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Andi reported that 'make DEBUG=1' does not propagate to the libbperf
code. It's true also for the other flags. Changing the code to propagate
the global build flags to libperf compilation.
Reported-by: Andi Kleen <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
fix test module build.
Signed-off-by: Michael S. Tsirkin <[email protected]>
|
|
Test virtual server via ipip tunnel.
Tested:
# selftests: netfilter: ipvs.sh
# Testing DR mode...
# Testing NAT mode...
# Testing Tunnel mode...
# ipvs.sh: PASS
ok 6 selftests: netfilter: ipvs.sh
Signed-off-by: Haishuang Yan <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
|
|
Test virtual server via NAT.
Tested:
# selftests: netfilter: ipvs.sh
# Testing DR mode...
# Testing NAT mode...
# ipvs.sh: PASS
Signed-off-by: Haishuang Yan <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
|
|
Test virutal server via directing routing for IPv4.
Tested:
# selftests: netfilter: ipvs.sh
# Testing DR mode...
# ipvs.sh: PASS
ok 6 selftests: netfilter: ipvs.sh
Signed-off-by: Haishuang Yan <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
|
|
Add tests checking that verifier does proper constant propagation for
read-only maps. If constant propagation didn't work, skipp_loop and
part_loop BPF programs would be rejected due to BPF verifier otherwise
not being able to prove they ever complete. With constant propagation,
though, they are succesfully validated as properly terminating loops.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Added test case for layered IP operation for a single source IP4/IP6
address and a single destination IP4/IP6 address.
Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Introduce the perf_evlist__filter_pollfd function and export it in the
perf/evlist.h header, so that libperf users can check if the descriptor
is still alive.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a static perf_evlist__purge() function to purge evsels from a evlist.
Add also perf_evlist__for_each_entry_safe() which is used by
perf_evlist__purge().
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add the perf_evlist__exit() function, so far it's not exported and added
only for internal use for perf and libperf.
USe it to release cpus/threads and pollfd array.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's needed in libperf only, so move it to the perf_evlist__mmap_ops()
function.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|