path: root/tools/perf
Age | Commit message | Author | Files | Lines
2018-04-13 | perf record: Change warning for missing sysfs entry to debug | Thomas Richter | 1 | -1/+2
Using perf on a 4.16.0 kernel on s390 shows this warning:

    failed: can't open node sysfs data

each time I run the command perf record ..., for example:

    [root@s35lp76 perf]# ./perf record -e rB0000 -- sleep 1
    [ perf record: Woken up 1 times to write data ]
    failed: can't open node sysfs data
    [ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ]
    [root@s35lp76 perf]#

It turns out commit e2091cedd51bf ("perf tools: Add MEM_TOPOLOGY feature to perf data file") tries to open the directory /sys/devices/system/node/, which does not exist on s390.

This is the call stack:

    __cmd_record
    +---> perf_session__write_header
          +---> perf_header__adds_write
                +---> do_write_feat
                      +---> write_mem_topology
                            +---> build_mem_topology
                                  prints warning

The issue starts in do_write_feat(), which unconditionally loops over all features, now including HEADER_MEM_TOPOLOGY, and calls write_mem_topology(). Function record__init_features() at the beginning of __cmd_record() sets all features and then turns off some of them.

Fix this by changing the warning to a level 2 debug output statement, so that it is only shown when debug level 2 or higher is set.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180412133246.92801-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
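The fix described above demotes a message so that it only appears at a given verbosity. A minimal sketch of that pattern follows; `verbose_level`, `debug_at()` and `probe_node_dir()` are hypothetical names made up for this illustration and are not taken from the perf sources.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global verbosity, standing in for a tool's -v counter. */
static int verbose_level;

/*
 * Print only when the current verbosity is at least `level`.
 * Returns the number of characters written, or 0 if suppressed.
 */
static int debug_at(int level, FILE *out, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (verbose_level < level)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(out, fmt, ap);
	va_end(ap);
	return n;
}

/*
 * Hypothetical probe: a missing path is reported at debug level 2
 * instead of as an unconditional warning.
 */
static int probe_node_dir(const char *path, FILE *out)
{
	FILE *f = fopen(path, "r");

	if (!f) {
		debug_at(2, out, "can't open %s, skipping\n", path);
		return -1;
	}
	fclose(f);
	return 0;
}
```

With `verbose_level` left at 0 or 1 the probe fails silently; raising it to 2 makes the same message visible, which matches the behaviour the commit message asks for.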
2018-04-12 | perf tests: Disable breakpoint accounting test for powerpc | Sandipan Das | 1 | -0/+1
We disable this test as instruction breakpoints (HW_BREAKPOINT_X) are not available for powerpc.

Before applying patch:

    21: Breakpoint accounting :
    --- start ---
    test child forked, pid 3635
    failed opening event 0
    failed opening event 0
    watchpoints count 1, breakpoints count 0, has_ioctl 1, share 0
    test child finished with -2
    ---- end ----
    Breakpoint accounting: Skip

After applying patch:

    21: Breakpoint accounting : Disabled

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180412162140.2992-1-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 | perf sched: Fix documentation for timehist | Takuya Yamamoto | 1 | -2/+2
Fixed an incorrect option and usage to match those shown by "perf sched timehist -h", i.e. the default is really --call-graph, which is equivalent to -g.

Signed-off-by: Takuya Yamamoto <tkydevel@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-8fzo0dlsi1mku5aqx8brep5s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 | perf version: Print status for syscall_table | Jin Yao | 1 | -0/+3
This patch doesn't print the "libaudit" line if HAVE_SYSCALL_TABLE_SUPPORT is available, and adds a line for HAVE_SYSCALL_TABLE_SUPPORT.

For example:

    $ ./perf -vv
    perf version 4.13.rc5.gc2f8af9
    dwarf: [ on ] # HAVE_DWARF_SUPPORT
    dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
    glibc: [ on ] # HAVE_GLIBC_SUPPORT
    gtk2: [ on ] # HAVE_GTK2_SUPPORT
    syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT
    libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
    libelf: [ on ] # HAVE_LIBELF_SUPPORT
    libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
    numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
    libperl: [ on ] # HAVE_LIBPERL_SUPPORT
    libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
    libslang: [ on ] # HAVE_SLANG_SUPPORT
    libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
    libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
    libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
    zlib: [ on ] # HAVE_ZLIB_SUPPORT
    lzma: [ on ] # HAVE_LZMA_SUPPORT
    get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
    bpf: [ on ] # HAVE_LIBBPF_SUPPORT

The line "syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT" is newly added.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1523269609-28824-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
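The rule in this commit (always print the syscall_table line, print the libaudit line only when no syscall table is available) can be sketched as below. This is an illustration, not the perf implementation: the function names are invented, the "OFF" spelling for a disabled feature and the HAVE_LIBAUDIT_SUPPORT label are assumptions, and runtime flags stand in for what the real code decides at compile time.

```c
#include <stdio.h>

/* Format one status line in the style of the 'perf -vv' output quoted above. */
static int format_status(char *buf, size_t size, const char *name,
			 const char *macro, int enabled)
{
	/* "OFF" for a disabled feature is an assumption of this sketch. */
	return snprintf(buf, size, "%s: [ %s ] # %s",
			name, enabled ? "on" : "OFF", macro);
}

/*
 * Print the syscall-related status lines. The libaudit line is only
 * printed when there is no syscall table. Returns the number of lines
 * printed.
 */
static int print_syscall_status(FILE *out, int have_syscall_table,
				int have_libaudit)
{
	char line[128];
	int printed = 0;

	if (!have_syscall_table) {
		format_status(line, sizeof(line), "libaudit",
			      "HAVE_LIBAUDIT_SUPPORT", have_libaudit);
		fprintf(out, "%s\n", line);
		printed++;
	}

	format_status(line, sizeof(line), "syscall_table",
		      "HAVE_SYSCALL_TABLE_SUPPORT", have_syscall_table);
	fprintf(out, "%s\n", line);
	return printed + 1;
}
```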
2018-04-12 | perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT | Jin Yao | 5 | -8/+8
To be consistent with other HAVE_XXX_SUPPORT uses in Makefile.config, this patch renames HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT and updates the C code accordingly. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1523269609-28824-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 | perf script: Use HAVE_LIBXXX_SUPPORT to replace NO_LIBXXX | Jin Yao | 2 | -4/+4
In Makefile.config, we define the conditional compilation variables HAVE_LIBPERL_SUPPORT and HAVE_LIBPYTHON_SUPPORT. To make the C code more consistent, this patch replaces NO_LIBPERL/NO_LIBPYTHON in C code with HAVE_LIBPERL_SUPPORT/ HAVE_LIBPYTHON_SUPPORT. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1523269609-28824-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
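As an illustration of the positive-form convention, the sketch below keys C code on a HAVE_LIBPERL_SUPPORT define rather than on the absence of a NO_LIBPERL one. It assumes the build system passes -DHAVE_LIBPERL_SUPPORT when the library is found; the function and its return strings are hypothetical.

```c
#include <stdio.h>

/*
 * Assumption: the build passes -DHAVE_LIBPERL_SUPPORT when libperl is
 * detected. The older style would have tested "#ifndef NO_LIBPERL".
 */
static const char *perl_scripting_status(void)
{
#ifdef HAVE_LIBPERL_SUPPORT
	return "perl scripting built in";
#else
	return "perl scripting not available";
#endif
}
```

Compiling this file with and without -DHAVE_LIBPERL_SUPPORT flips the returned string, and the same macro name can then be reused when reporting build options.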
2018-04-12 | perf tests bpf: Remove unused ptrace.h include from LLVM test | Arnaldo Carvalho de Melo | 1 | -1/+0
The bpf-script-test-kbuild.c script, used in one of the LLVM subtests, includes ptrace.h unnecessarily, and that ends up making it include a header that uses asm(_ASM_SP), a feature that is not supported by clang <= 4.0, breaking that 'perf test' entry. This ended up leading to the ca26cffa4e4a ("x86/asm: Allow again using asm.h when building for the 'bpf' clang target"), adding an ifndef __BPF__ to the arch/x86/include/asm/asm.h file. Newer clang versions accept that asm(_ASM_SP) construct, so just remove the ptrace.h include, which paves the way for reverting ca26cffa4e4a ("x86/asm: Allow again using asm.h when building for the 'bpf' clang target"). Suggested-by: Yonghong Song <yhs@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/r/613f0a0d-c433-8f4d-dcc1-c9889deae39e@fb.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-clbcnzbakdp18ibme4wt43ib@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 | perf jvmti: Give hints about package names needed to build | Arnaldo Carvalho de Melo | 1 | -1/+1
Give examples of package names to install on Fedora and Debian so that this gets built, to help the user a bit. The part from 'e.g.' onwards is new:

    No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel

Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/n/tip-edbi4r2pvzn7no6ebxbtczng@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 | perf annotate browser: Allow showing offsets in more than just jump targets | Arnaldo Carvalho de Melo | 1 | -0/+5
Jesper wanted to see offsets at callq sites when doing some performance investigation related to retpolines, so save him some time by providing a 'O' hotkey to allow showing offsets from function start at call instructions or in all instructions, just go on pressing 'O' till the offsets you need appear. Example: Starts with: Samples: 64 of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963 ixgbe_read_reg /proc/kcore Percent│ ↑ je 2a │ ┌──cmp $0xffffffff,%r13d │ ├──je d0 │ │ mov $0x53e3,%edi │ │→ callq __const_udelay │ │ sub $0x1,%r15d │ │↑ jne 83 │ │ mov 0x8(%rbp),%rax │ │ testb $0x20,0x1799(%rax) │ │↑ je 2a │ │ mov 0x200(%rax),%rdi │ │ mov %r13d,%edx │ │ mov $0xffffffffc02595d8,%rsi │ │→ callq netdev_warn │ │↑ jmpq 2a │d0:└─→mov 0x8(%rbp),%rsi │ mov %rbp,%rdi │ mov %eax,0x4(%rsp) │ → callq ixgbe_remove_adapter.isra.77 │ mov 0x4(%rsp),%eax Press 'h' for help on key bindings ============================================================================ Pess 'O': Samples: 64 of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963 ixgbe_read_reg /proc/kcore Percent│ ↑ je 2a │ ┌──cmp $0xffffffff,%r13d │ ├──je d0 │ │ mov $0x53e3,%edi │99:│→ callq __const_udelay │ │ sub $0x1,%r15d │ │↑ jne 83 │ │ mov 0x8(%rbp),%rax │ │ testb $0x20,0x1799(%rax) │ │↑ je 2a │ │ mov 0x200(%rax),%rdi │ │ mov %r13d,%edx │ │ mov $0xffffffffc02595d8,%rsi │c6:│→ callq netdev_warn │ │↑ jmpq 2a │d0:└─→mov 0x8(%rbp),%rsi │ mov %rbp,%rdi │ mov %eax,0x4(%rsp) │db: → callq ixgbe_remove_adapter.isra.77 │ mov 0x4(%rsp),%eax Press 'h' for help on key bindings ============================================================================ Press 'O' again: Samples: 64 of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963 ixgbe_read_reg /proc/kcore Percent│8c: ↑ je 2a │8e:┌──cmp $0xffffffff,%r13d │92:├──je d0 │94:│ mov $0x53e3,%edi │99:│→ callq __const_udelay │9e:│ sub $0x1,%r15d │a2:│↑ jne 83 │a4:│ mov 0x8(%rbp),%rax │a8:│ testb $0x20,0x1799(%rax) │af:│↑ je 2a │b5:│ mov 
0x200(%rax),%rdi │bc:│ mov %r13d,%edx │bf:│ mov $0xffffffffc02595d8,%rsi │c6:│→ callq netdev_warn │cb:│↑ jmpq 2a │d0:└─→mov 0x8(%rbp),%rsi │d4: mov %rbp,%rdi │d7: mov %eax,0x4(%rsp) │db: → callq ixgbe_remove_adapter.isra.77 │e0: mov 0x4(%rsp),%eax Press 'h' for help on key bindings ============================================================================ Press 'O' again and it will show just jump target offsets. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-upp6pfdetwlsx18ec2uf1od4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 | perf annotate: Allow showing offsets in more than just jump targets | Arnaldo Carvalho de Melo | 2 | -2/+18
Jesper wanted to see offsets at callq sites when doing some performance investigation related to retpolines, so save him some time by providing a 'struct annotation_options' to control where offsets should appear: just on jump targets? That + call instructions? All?

This puts in place the logic to show the offsets, now we need to wire this up in the TUI browser (next patch) and on the 'perf annotate --stdio2' interface, where we need a more general mechanism to set up the 'annotation_options' struct from the command line.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-m3jc9c3swobye9tj08gnh5i7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
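The three choices named in this commit (jump targets only, jump targets plus calls, all instructions) could be modelled roughly as follows. All identifiers here are hypothetical; the commit only states that the setting lives in 'struct annotation_options', not how it is spelled.

```c
#include <stdbool.h>

/* Hypothetical level names for where offsets are shown. */
enum offset_level {
	OFFSET_JUMP_TARGETS = 1,	/* default: only lines that are jump targets */
	OFFSET_CALLS,			/* jump targets plus call instructions */
	OFFSET_ALL,			/* every instruction */
};

/* Hypothetical per-line flags a disassembly line might carry. */
struct insn_line {
	bool is_jump_target;
	bool is_call;
};

/* Decide whether the offset column is filled in for this line. */
static bool show_offset(enum offset_level level, const struct insn_line *line)
{
	if (level >= OFFSET_ALL)
		return true;
	if (level >= OFFSET_CALLS && line->is_call)
		return true;
	return line->is_jump_target;
}

/* What a hotkey handler could do: advance, wrapping back to the default. */
static enum offset_level next_offset_level(enum offset_level level)
{
	return level == OFFSET_ALL ? OFFSET_JUMP_TARGETS : level + 1;
}
```

Ordering the levels so that each one includes the previous keeps `show_offset()` to a few comparisons and makes cycling through them a simple increment.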
2018-04-12 | perf tests: Run dwarf unwind test on arm32 | Kim Phillips | 3 | -0/+30
Enable the unwind test on arm32: $ perf test unwind 58: DWARF unwind : Ok Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Brian Robbins <brianrob@microsoft.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180410191624.a3a468670dd4548c66d3d094@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 | perf stat: Enable 1ms interval for printing event counters values | Alexey Budankov | 2 | -13/+3
Currently the print interval for performance counter values is limited to a minimum of 10ms, so reading the values at frequencies higher than 100Hz is restricted by the tool.

This change makes perf stat -I possible at frequencies up to 1KHz and, to some extent, puts perf stat -I on par with perf record sampling profiling.

When running perf stat -I to monitor e.g. PCIe uncore counters while at the same time profiling some I/O workload with perf record, e.g. for cpu-cycles and context switches, it is then possible to observe a consolidated CPU/OS/IO (Uncore) performance picture for that workload.

The tool overhead warning printed when specifying the -v option can be missed due to screen scrolling when output goes to the console, so the message is moved into the help available by running perf stat -h.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/b842ad6a-d606-32e4-afe5-974071b5198e@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
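The relation between the print interval and the read frequency quoted in this commit (10ms gives 100Hz, 1ms gives 1KHz) is simple arithmetic, sketched below. The helper names are hypothetical, and treating an interval of 0 as "interval printing disabled" is an assumption of this sketch.

```c
/* Convert a print interval in milliseconds to reads per second. */
static unsigned int interval_to_hz(unsigned int interval_ms)
{
	return interval_ms ? 1000 / interval_ms : 0;
}

/*
 * Check an interval against a minimum. Assumption: 0 means interval
 * printing is off and is therefore always accepted.
 */
static int interval_is_valid(unsigned int interval_ms, unsigned int min_ms)
{
	return interval_ms == 0 || interval_ms >= min_ms;
}
```

With the old 10ms minimum a request for 5ms is rejected; with a 1ms minimum it is accepted and corresponds to 200 reads per second.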
2018-04-09 | perf tests clang: Fix function name for clang IR test | Sandipan Das | 1 | -1/+1
As stated in tests/llvm-src-base.c, the name of the bpf function should be "bpf_func__SyS_epoll_pwait" but this clang test fails as it tries to look up "bpf_func__SyS_epoll_wait".

Before applying patch:

    55: builtin clang support :
    55.1: builtin clang compile C source to IR : FAILED!
    55.2: builtin clang compile C source to ELF object : Skip

After applying patch:

    55: builtin clang support :
    55.1: builtin clang compile C source to IR : Ok
    55.2: builtin clang compile C source to ELF object : Ok

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Fixes: e67d52d411c3 ("perf clang: Update test case to use real BPF script")
Link: http://lkml.kernel.org/r/20180404180419.19056-3-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-09 | perf clang: Add support for recent clang versions | Sandipan Das | 1 | -1/+10
The clang API calls used by perf have changed in recent releases and builds succeed with libclang-3.9 only. This introduces compatibility with libclang-4.0 and above. Without this patch, we will see the following compilation errors with libclang-4.0+: util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’: util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope Opts.Inputs.emplace_back(Path, IK_C); ^~~~ util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::Module> perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)’: util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’ Clang.setInvocation(&*CI); ^ In file included from util/c++/clang.cpp:14:0: /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr<clang::CompilerInvocation>) void setInvocation(std::shared_ptr<CompilerInvocation> Value); ^~~~~~~~~~~~~ Committer testing: Tested on Fedora 27 after installing the clang-devel and llvm-devel packages, versions: # rpm -qa | egrep llvm\|clang llvm-5.0.1-6.fc27.x86_64 clang-libs-5.0.1-5.fc27.x86_64 clang-5.0.1-5.fc27.x86_64 clang-tools-extra-5.0.1-5.fc27.x86_64 llvm-libs-5.0.1-6.fc27.x86_64 llvm-devel-5.0.1-6.fc27.x86_64 clang-devel-5.0.1-5.fc27.x86_64 # Make sure you don't have some older version lying around in /usr/local, etc, then: $ make LIBCLANGLLVM=1 -C tools/perf install-bin And in the end perf will be linked agains these libraries: # ldd ~/bin/perf | egrep -i llvm\|clang libclangAST.so.5 => /lib64/libclangAST.so.5 (0x00007f8bb2eb4000) libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x00007f8bb29e3000) libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x00007f8bb23f7000) libclangDriver.so.5 => /lib64/libclangDriver.so.5 
(0x00007f8bb2060000) libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x00007f8bb1d06000) libclangLex.so.5 => /lib64/libclangLex.so.5 (0x00007f8bb1a3e000) libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x00007f8bb17d4000) libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x00007f8bb15c5000) libclangSema.so.5 => /lib64/libclangSema.so.5 (0x00007f8bb0cc9000) libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x00007f8bb0a23000) libclangParse.so.5 => /lib64/libclangParse.so.5 (0x00007f8bb0725000) libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x00007f8bb039a000) libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x00007f8bace98000) libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x00007f8bab735000) libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x00007f8bab4b2000) libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x00007f8bab2a1000) libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x00007f8bab08e000) # Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Fixes: 00b86691c77c ("perf clang: Add builtin clang support ant test case") Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandipan@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-09 | perf tools: Fix perf builds with clang support | Sandipan Das | 1 | -1/+2
For libclang, some distro packages provide static libraries (.a) while some provide shared libraries (.so). Currently, perf code can only be linked with static libraries. This makes perf build possible for both cases. Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Fixes: d58ac0bf8d1e ("perf build: Add clang and llvm compile and linking support") Link: http://lkml.kernel.org/r/20180404180419.19056-1-sandipan@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-09 | perf tools: No need to include namespaces.h in util.h | Arnaldo Carvalho de Melo | 1 | -2/+2
The only thing that is needed there is a forward declaration for 'struct nsinfo', so disentangle this, which in turn allows built-in clang builds, i.e. 'make LIBCLANGLLVM=1 -C tools/perf'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-vq26rsuwq1cqylpcyvq89c84@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
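A forward declaration is enough whenever a header only passes pointers to a struct around, which is what lets the include be dropped. A minimal sketch, where the accessor name and the struct member are invented for the illustration and do not reflect the real 'struct nsinfo' layout:

```c
/* What the header needs: only the name of the type. */
struct nsinfo;	/* forward declaration, incomplete type */

/* Hypothetical accessor: pointers to incomplete types are fine in prototypes. */
int nsinfo_sketch__pid(const struct nsinfo *nsi);

/* What the defining file provides (member invented for this sketch). */
struct nsinfo {
	int pid;
};

int nsinfo_sketch__pid(const struct nsinfo *nsi)
{
	return nsi ? nsi->pid : -1;
}
```

Only code that dereferences the pointer or needs the struct size has to see the full definition, so headers that merely mention the type stay free of the heavier include.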
2018-04-06 | perf hists browser: Remove leftover from row returned from refresh | Arnaldo Carvalho de Melo | 1 | -8/+2
The per-browser screen refresh routine (ui_browser->refresh()) should return the first row that should be cleaned after the rows just printed, in case not all rows available on the screen get filled.

When moving the extra title lines logic from the hists browser to the generic ui_browser class, one piece of that logic remained in the hists browser. As a result, when going back from the annotate browser to the hists browser in a case where fewer lines were displayed in the hists browser, for instance when filtering the entries per substring, one line of the annotate browser would remain on the screen. Fix that.

Example of the screen artifact:

    ================================================================================
    Samples: 73K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 45172901394
    Overhead  Shared O  Symbol
       0.30%  [kernel]  [k] __indirect_thunk_start
       0.09%  [kernel]  [k] __x86_indirect_thunk_r10
           │      lfence
    ================================================================================

Here from 'perf top' the view was zoomed with '/thunk' to functions having that substring, then the first was annotated and from the annotate browser ESC was pressed, then the first lines were overwritten, but the 'lfence' line remained due to the off by one bug fixed in this cset.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ef9ff6017e3c ("perf ui browser: Move the extra title lines from the hists browser")
Link: https://lkml.kernel.org/n/tip-odryfso74eaarm0z3e4v9owx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-06 | perf hists browser: Show extra_title_lines in the 'D' debug hotkey | Arnaldo Carvalho de Melo | 1 | -1/+2
To help in fixing problems in the browser. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-uj0n76yqh5bf98i0edckd47t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-06 | perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering | Adrian Hunter | 1 | -10/+10
In preparation for supporting AUX area sampling buffers, auxtrace_queues__add_buffer() needs to be more generic. To that end, move CPU filtering into it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05 | perf report: Remove duplicated 'samples' in lost samples warning | Arnaldo Carvalho de Melo | 1 | -1/+1
The following message, emitted when samples are lost due to system overload, had one 'samples' too many, ditch it:

    Processed 25333 samples and lost 20.88% samples!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lkml.kernel.org/n/tip-oev1469y02hmfere6r2kkxp6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05 | perf ui browser: Fixup cleaning unused lines at the bottom | Arnaldo Carvalho de Melo | 1 | -2/+2
Now that we can have extra title lines we should use ui_browser->rows and not ->height when drawing lines, as well as adding ui_browser->extra_title_lines to browser->y when cleaning unused lines at the bottom, otherwise we end up clobbering with spaces the last line just shown by ui_browser->refresh() routine. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ef9ff6017e3c ("perf ui browser: Move the extra title lines from the hists browser") Link: https://lkml.kernel.org/n/tip-dfcpokt1pm5ixm8n9pxwtstz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
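The row arithmetic at stake can be sketched with a toy screen buffer: content row r lives at screen row y + extra_title_lines + r, so the cleanup after a refresh must start from that translated position or it blanks the last line just printed. The struct, its fields and the function below are hypothetical and only model the idea; they are not the ui_browser code.

```c
#include <string.h>

#define SCREEN_ROWS 8
#define SCREEN_COLS 16

/* Hypothetical, simplified browser geometry. */
struct browser_sketch {
	int y;			/* first screen row below the title line */
	int extra_title_lines;	/* header rows between title and content */
	int rows;		/* content rows available */
};

/*
 * Blank the content rows [first_unused, rows), i.e. the rows a refresh
 * routine did not fill. Rows printed before first_unused, the header
 * rows and anything below the content area are left untouched.
 */
static void clear_unused_rows(char screen[][SCREEN_COLS + 1],
			      const struct browser_sketch *b,
			      int first_unused)
{
	for (int row = first_unused; row < b->rows; row++) {
		int screen_row = b->y + b->extra_title_lines + row;

		memset(screen[screen_row], ' ', SCREEN_COLS);
		screen[screen_row][SCREEN_COLS] = '\0';
	}
}
```

Dropping the `extra_title_lines` term from `screen_row` reproduces the kind of off-by-one described above: the blanking starts one row too early and overwrites the last printed entry.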
2018-04-05 | perf annotate browser: Fixup vertical line separating metrics from instructions | Arnaldo Carvalho de Melo | 1 | -1/+1
Now that we can have extra title lines we should use ui_browser->rows and not ->height when drawing lines, as it will use ui_browser__gotorc() and that will take the extra title lines into account, which was causing an off by one at the end of the vertical line drawn by __ui_browser__vline(), fix it. The visual effect was that the last line, with status messages, was being overwritten by the vertical line, looking like: Press 'h' for help on│key bindings Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ef9ff6017e3c ("perf ui browser: Move the extra title lines from the hists browser") Link: https://lkml.kernel.org/n/tip-08y1ln3xjn76zvizz1i1dsvn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05 | perf annotate: Show group details on the title line | Arnaldo Carvalho de Melo | 1 | -2/+5
To match what is shown in the main 'perf report/top' title lines, i.e. if a group is being shown, either a real group (recorded with "-e '{a,b,c}') or a forced group (using 'perf report --group' for a perf.data file recorded without {}) we will show multiple columns, one per event, but we were failing to show the group details, so, for: # perf report --header-only | grep cmdline # cmdline : /home/acme/bin/perf record -e {cycles,instructions,cache-misses} # perf report --group The first line was showing just "cycles", now it shows the correct line, which is: Samples: 578 of events 'anon group { cycles, instructions, cache-misses }', 4000 Hz, Event count (approx.): 487421794 syscall_return_via_sysret /lib/modules/4.16.0-rc7/build/vmlinux 0.22 2.97 0.00 │ ↓ jmp 6c │ mov %cr3,%rdi 1.33 10.89 4.00 │ ↓ jmp 62 │ mov %rdi,%rax <SNIP> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 6920e2854e9a ("perf annotate browser: Show extra title line with event information") Link: https://lkml.kernel.org/n/tip-i41tqh17c2dabnyzjh99r1oz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05 | perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer | Adrian Hunter | 1 | -30/+24
In preparation for supporting AUX area sampling buffers, auxtrace_queues__add_buffer() needs to be more generic. To that end, move memory allocation for struct buffer into it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-04 | Merge tag 'perf-core-for-mingo-4.17-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core | Ingo Molnar | 20 | -116/+382
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

- Show only failing syscalls with 'perf trace --failure' (Arnaldo Carvalho de Melo)

  e.g: See what 'openat' syscalls are failing:

    # perf trace --failure -e openat
       762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory
    <SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? >
       790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory
    ^C#

- Show information about the event (freq, nr_samples, total period/nr_events) in the annotate --tui and --stdio2 'perf annotate' output, similar to the first line in the 'perf report --tui', but just for the samples for the annotated symbol (Arnaldo Carvalho de Melo)

- Introduce 'perf version --build-options' to show what features were linked, also available as the shorter alias 'perf -vv' (Jin Yao)

- Add a "dso_size" sort order (Kim Phillips)

- Remove redundant ')' in the tracepoint output in 'perf trace' (Changbin Du)

- Synchronize x86's cpufeatures.h, no effect on tools (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-04-03 | perf trace: Remove redundant ')' | Changbin Du | 1 | -1/+1
There is a redundant ')' at the tail of each event, so remove it.

    $ sudo perf trace --no-syscalls -e 'kmem:*' -a
       899.342 kmem:kfree:(vfs_writev+0xb9) call_site=ffffffff9c453979 ptr=(nil))
       899.344 kmem:kfree:(___sys_recvmsg+0x188) call_site=ffffffff9c9b8b88 ptr=(nil))

Signed-off-by: Changbin Du <changbin.du@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1520937601-24952-1-git-send-email-changbin.du@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03 | perf annotate stdio2: Print more descriptive event information header | Arnaldo Carvalho de Melo | 1 | -7/+3
To match the recently added event header information to --tui, e.g.: # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 128 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 48617682 _raw_spin_lock_irqsave() /proc/kcore 0.78 nop 7.03 push %rbx 3.12 pushfq 6.25 pop %rax nop mov %rax,%rbx 3.12 cli nop xor %eax,%eax mov $0x1,%edx 79.69 lock cmpxchg %edx,(%rdi) test %eax,%eax ↓ jne 2b mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq *ffffffffb30eaed0 mov %rbx,%rax pop %rbx ← retq # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ujy46x7cldyhyxelyf2b9quy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03 | perf annotate browser: Show extra title line with event information | Arnaldo Carvalho de Melo | 1 | -4/+27
So at the top we'll have two lines, like this, from 'perf report': # perf report --group --ignore-vmlinux ===================================================================================================== Samples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895 _raw_spin_lock_irqsave /proc/kcore Percent │ nop │ push %rbx 0.00 14.29 0.00 │ pushfq 9.09 0.00 0.00 │ pop %rax 9.09 0.00 20.00 │ nop │ mov %rax,%rbx │ cli 4.55 7.14 0.00 │ nop │ xor %eax,%eax │ mov $0x1,%edx │ lock cmpxchg %edx,(%rdi) 77.27 78.57 70.00 │ test %eax,%eax │ ↓ jne 2b │ mov %rbx,%rax 0.00 0.00 10.00 │ pop %rbx │ ← retq │2b: mov %eax,%esi │ → callq queued_spin_lock_slowpath │ mov %rbx,%rax │ pop %rbx Press 'h' for help on│key bindings ===================================================================================================== 9.09 + 9.09 + 4.55 + 77.27 = 100 14.29 + 7.14 + 78.57 = 100 20 + 70 + 10 = 100 We can do the math by using 't' to toggle from 'percent' to nr ===================================================================================================== Samples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895 _raw_spin_lock_irqsave /proc/kcore Period │ nop │ push %rbx 0 79273 0 │ pushfq 190455 0 0 │ pop %rax 198038 0 3045 │ nop │ mov %rax,%rbx │ cli 217233 32562 0 │ nop │ xor %eax,%eax │ mov $0x1,%edx │ lock cmpxchg %edx,(%rdi) 3421649 979174 28273 │ test %eax,%eax │ ↓ jne 2b │ mov %rbx,%rax 0 0 5193 │ pop %rbx │ ← retq │2b: mov %eax,%esi │ → callq queued_spin_lock_slowpath │ mov %rbx,%rax │ pop %rbx Press 'h' for help on│key bindings ===================================================================================================== 79273 + 190455 + 198038 + 3045 + 217233 + 32562 + 3421649 + 979174 + 28273 + 5193 = 5154895 Or number of samples: ===================================================================================================== ooSamples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895 _raw_spin_lock_irqsave 
/proc/kcore Samples │ nop │ push %rbx 0 2 0 │ pushfq 2 0 0 │ pop %rax 2 0 2 │ nop │ mov %rax,%rbx │ cli 1 1 0 │ nop │ xor %eax,%eax │ mov $0x1,%edx │ lock cmpxchg %edx,(%rdi) 17 11 7 │ test %eax,%eax │ ↓ jne 2b │ mov %rbx,%rax 0 0 1 │ pop %rbx │ ← retq │2b: mov %eax,%esi │ → callq queued_spin_lock_slowpath │ mov %rbx,%rax │ pop %rbx Press 'h' for help on key bindings ===================================================================================================== 2 + 2 + 2 + 2 + 1 + 1 + 17 + 11 + 7 + 1 = 46 Suggested-by: Martin Liška <mliska@suse.cz> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-ezccyxld50wtwyt66np6aomo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
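The sums quoted in the commit message above can be checked mechanically. The following is a minimal sketch, not part of perf: it copies the non-zero per-line sample counts and periods from the listings and recomputes the totals and percentages. The variable names are illustrative, and the assumption that each percentage is a line's share of the samples in its own event column is inferred from the numbers, not stated in the commit.

```python
# Non-zero per-line sample counts copied from the 'Samples' listing,
# one list per event column (left to right).
samples = [
    [2, 2, 1, 17],   # first event column
    [2, 1, 11],      # second event column
    [2, 7, 1],       # third event column
]

# Non-zero per-line periods copied from the 'Period' listing, all columns.
periods = [79273, 190455, 198038, 3045, 217233, 32562,
           3421649, 979174, 28273, 5193]

total_samples = sum(sum(column) for column in samples)
total_period = sum(periods)


def percentages(column):
    """Share of each line within its own event column, rounded to 2 places."""
    total = sum(column)
    return [round(100.0 * n / total, 2) for n in column]


for column in samples:
    print(percentages(column))
print(total_samples, total_period)
```

Under that assumption the recomputed values agree with the 'Percent' listing, and the two totals agree with the title line (46 samples, event count 5154895).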
2018-04-03perf annotate: Introduce annotation__scnprintf_samples_period() methodArnaldo Carvalho de Melo2-0/+50
This prints a string with the total period (nr_events) and the number of samples for a given annotation, i.e. for a given symbol. It is the counterpart to hists__scnprintf_samples_period(), which covers all the samples in a session (be it a live session, think 'perf top', or a perf.data file, think 'perf report'). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-goj2wu4fxutc8vd46mw3yg14@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf ui browser: Move the extra title lines from the hists browserArnaldo Carvalho de Melo3-20/+33
This will be useful for the annotate browser as well, which wants to have extra title lines. The current ui_browser unconditionally reserves the first line for a browser title and the last one for status messages. But some browsers, like the buckets one (hists browser), need extra lines to show headers, which can be shown or hidden: press 'H' in 'perf top' or 'perf report' to see this feature. So move that logic to the core ui_browser used by the hists_browser (the main interface of 'perf top' and 'perf report') so that it can be used by the annotate browser too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-r38xm3ut37ulbg1o5tn5iise@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf hists: Move hists__scnprintf_title() away from the TUI codeArnaldo Carvalho de Melo2-79/+81
The previous patch made this function useful to non-TUI parts of the tools, but left it next to the function from which it was carved, so that the patch showed the process more clearly. Now just move it outside the TUI parts so that we can finally use it, even when the TUI code doesn't get built/linked. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-hqj7hvcr3mu5lvcqp3cssio6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf hists: Introduce hists__scnprint_title()Arnaldo Carvalho de Melo2-4/+17
This function does not use any struct hists_browser internals, so that it can be shared with the other UIs and tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-w8mczjnqnbcj9yzfkv9ja6ro@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf hists browser: Rename perf_evsel_browser_title to a more descriptive nameArnaldo Carvalho de Melo1-5/+3
Rename it to hists_browser__scnprintf_title() to better reflect that it provides a scnprintf-like function operating on a hists_browser instance. This paves the way to have a non-hists_browser specific function to scnprintf format a title with per evsel information to use in other tools or UIs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-sntpyzxsnme9jvuz2qntwoh2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02Merge tag 'arch-removal' of ↵Linus Torvalds1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull removal of obsolete architecture ports from Arnd Bergmann: "This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users. In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees. [ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ] The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases.
After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
2018-04-02perf version: Add man pageJin Yao1-0/+24
A new option '--build-options' was created for 'perf version', so document it. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-7-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf tools: Add 'perf -vv' as an alias to 'perf version --build-options'Jin Yao2-0/+7
We keep getting bug reports from users who build perf on their own without installing some needed libraries, such as libelf or libbfd/libiberty. perf still builds, but it is missing important functionality. This patch provides a new option '-vv' for perf which prints the compiled-in status of libraries. 'perf -vv' is mapped to 'perf version --build-options'. For example: $ ./perf -vv perf version 4.13.rc5.g6727c5 dwarf: [ on ] # HAVE_DWARF_SUPPORT dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT glibc: [ on ] # HAVE_GLIBC_SUPPORT gtk2: [ on ] # HAVE_GTK2_SUPPORT libaudit: [ OFF ] # HAVE_LIBAUDIT_SUPPORT libbfd: [ on ] # HAVE_LIBBFD_SUPPORT libelf: [ on ] # HAVE_LIBELF_SUPPORT libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT libperl: [ on ] # HAVE_LIBPERL_SUPPORT libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT libslang: [ on ] # HAVE_SLANG_SUPPORT libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT zlib: [ on ] # HAVE_ZLIB_SUPPORT lzma: [ on ] # HAVE_LZMA_SUPPORT get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT bpf: [ on ] # HAVE_LIBBPF_SUPPORT v3: Fix a bug found in v2, which didn't process options like '-vabc' correctly. v2: Use a global variable version_verbose to record the number of 'v'. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-6-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
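The '-vv' handling described above, including the '-vabc' case mentioned in the v3 note, can be modelled as follows. This is a minimal sketch under stated assumptions, not perf's option parser: count_version_verbose() and expand_alias() are hypothetical names, and treating three or more 'v's like '-vv' is an assumption of this sketch.

```python
def count_version_verbose(arg):
    """Return the number of 'v's if arg is '-' followed only by 'v's, else 0.

    Hypothetical helper: '-v' gives 1, '-vv' gives 2, '-vabc' gives 0.
    """
    if len(arg) < 2 or not arg.startswith("-") or arg.startswith("--"):
        return 0
    flags = arg[1:]
    return len(flags) if set(flags) == {"v"} else 0


def expand_alias(argv):
    """Rewrite a leading '-v' / '-vv' into the equivalent 'version' command."""
    if not argv:
        return argv
    level = count_version_verbose(argv[0])
    if level == 0:
        return argv                      # not a pure run of 'v's: leave untouched
    expanded = ["version"]
    if level >= 2:                       # assumption: '-vvv' behaves like '-vv'
        expanded.append("--build-options")
    return expanded + argv[1:]


print(expand_alias(["-vv"]))
print(expand_alias(["-vabc"]))
```

Counting only arguments that consist purely of 'v's is what keeps an option such as '-vabc' from being mistaken for the alias.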
2018-04-02perf version: Print the compiled-in status of librariesJin Yao1-1/+81
This patch checks the values passed by CFLAGS (-DHAVE_XXX) and then prints the status of libraries. For example, if HAVE_DWARF_SUPPORT is defined, the library "dwarf" is compiled-in. The patch prints the status "on" for this library, otherwise it prints the status "OFF". A new option '--build-options' for 'perf version' prints the library status. For example: $ ./perf version --build-options or ./perf --version --build-options or ./perf -v --build-options perf version 4.13.rc5.g6727c5 dwarf: [ on ] # HAVE_DWARF_SUPPORT dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT glibc: [ on ] # HAVE_GLIBC_SUPPORT gtk2: [ on ] # HAVE_GTK2_SUPPORT libaudit: [ OFF ] # HAVE_LIBAUDIT_SUPPORT libbfd: [ on ] # HAVE_LIBBFD_SUPPORT libelf: [ on ] # HAVE_LIBELF_SUPPORT libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT libperl: [ on ] # HAVE_LIBPERL_SUPPORT libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT libslang: [ on ] # HAVE_SLANG_SUPPORT libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT zlib: [ on ] # HAVE_ZLIB_SUPPORT lzma: [ on ] # HAVE_LZMA_SUPPORT get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT bpf: [ on ] # HAVE_LIBBPF_SUPPORT v4: 1. Also print the macro name. That makes it easier to grep around in the source looking for where code related to a particular feature is located. 2. Update since HAVE_DWARF_GETLOCATIONS is renamed to HAVE_DWARF_GETLOCATIONS_SUPPORT v3: Remove the following unnecessary help message. 1. [ on ]: library is compiled-in [ OFF ]: library is disabled in make configuration OR library is not installed in build environment 2. Create '--build-options' option. 3. Use standard option parsing API 'parse_options'. v2: 1. Use IS_BUILTIN macro to replace #ifdef/#endif block. 2. Print color for on/OFF.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Suggested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-5-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
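The on/OFF report described above amounts to a lookup of each feature's macro in the set of macros that were defined at build time. The sketch below models that in Python purely for illustration; the real code is C and, per the v2 note, uses the IS_BUILTIN macro. The function name, the FEATURES table, and the single-space layout (taken from the flattened output above, so column alignment is not reproduced) are assumptions.

```python
# (library name, macro) pairs, a small subset of those in the output above.
FEATURES = [
    ("dwarf", "HAVE_DWARF_SUPPORT"),
    ("libaudit", "HAVE_LIBAUDIT_SUPPORT"),
    ("libelf", "HAVE_LIBELF_SUPPORT"),
]


def build_option_line(name, macro, defined_macros):
    """Format one status line: 'on' if the macro was defined, else 'OFF'."""
    status = "on" if macro in defined_macros else "OFF"
    return f"{name}: [ {status} ] # {macro}"


# Hypothetical build in which libaudit was not available.
defined = {"HAVE_DWARF_SUPPORT", "HAVE_LIBELF_SUPPORT"}
for name, macro in FEATURES:
    print(build_option_line(name, macro, defined))
```

Printing the macro name next to the status is what the v4 note asks for: it gives a string to grep for in the source.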
2018-04-02perf config: Rename to HAVE_DWARF_GETLOCATIONS_SUPPORTJin Yao2-2/+2
In Makefile.config, to make all libraries flags have _SUPPORT suffix, rename HAVE_DWARF_GETLOCATIONS to HAVE_DWARF_GETLOCATIONS_SUPPORT Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf config: Add some new -DHAVE_XXX to CFLAGSJin Yao1-0/+6
For most libraries, perf.config records -DHAVE_XXX in CFLAGS depending on whether the library is compiled-in, so the C code can tell whether the library is compiled-in or not. For glibc, however, no -DHAVE_GLIBC_SUPPORT exists. For the python and perl libraries, only -DNO_PYTHON and -DNO_LIBPERL exist. To make the code more consistent, the patch creates -DHAVE_LIBPYTHON_SUPPORT and -DHAVE_LIBPERL_SUPPORT if the python and perl libraries are compiled-in. Since the existing flags -DNO_PYTHON and -DNO_LIBPERL are used in many places in the C code, this patch doesn't remove them. In a follow-up patch, we will reconstruct the C code to use HAVE_XXX instead. v3: Move 'CFLAGS += -DHAVE_LIBPYTHON_SUPPORT' and 'CFLAGS += -DHAVE_LIBPERL_SUPPORT' to other places to avoid duplicated feature checking. v2: Create -DHAVE_GLIBC_SUPPORT, -DHAVE_LIBPYTHON_SUPPORT and -DHAVE_LIBPERL_SUPPORT. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf trace: Show only failing syscallsArnaldo Carvalho de Melo2-3/+9
For instance: # perf probe "vfs_getname=getname_flags:72 pathname=result->name:string" Added new event: probe:vfs_getname (on getname_flags:72 with pathname=result->name:string) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname -aR sleep 1 # perf trace --failure sleep 1 0.043 ( 0.010 ms): sleep/10978 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory For reference, here are all the syscalls in this case: # perf trace sleep 1 ? ( ): sleep/10976 ... [continued]: execve()) = 0 0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000 0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory 0.057 ( 0.006 ms): sleep/10976 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3 0.064 ( 0.002 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b370) = 0 0.067 ( 0.003 ms): sleep/10976 mmap(len: 111457, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec8615000 0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0 0.080 ( 0.007 ms): sleep/10976 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3 0.088 ( 0.002 ms): sleep/10976 read(fd: 3, buf: 0x7fffac22b538, count: 832) = 832 0.092 ( 0.001 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b3d0) = 0 0.094 ( 0.002 ms): sleep/10976 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7feec8613000 0.099 ( 0.004 ms): sleep/10976 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7feec8057000 0.104 ( 0.007 ms): sleep/10976 mprotect(start: 0x7feec8203000, len: 2097152) = 0 0.112 ( 0.005 ms): sleep/10976 mmap(addr: 0x7feec8403000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1753088) = 0x7feec8403000 0.120 ( 0.003 ms): sleep/10976 mmap(addr: 0x7feec8409000, len: 14976, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7feec8409000 0.128 ( 0.001 ms): sleep/10976 close(fd: 3) = 0 0.139 ( 0.001 ms): sleep/10976 arch_prctl(option: 4098, arg2: 
140663540761856) = 0 0.186 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8403000, len: 16384, prot: READ) = 0 0.204 ( 0.003 ms): sleep/10976 mprotect(start: 0x55bdc0ec3000, len: 4096, prot: READ) = 0 0.209 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8631000, len: 4096, prot: READ) = 0 0.214 ( 0.010 ms): sleep/10976 munmap(addr: 0x7feec8615000, len: 111457) = 0 0.269 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000 0.271 ( 0.002 ms): sleep/10976 brk(brk: 0x55bdc2d25000) = 0x55bdc2d25000 0.274 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d25000 0.278 ( 0.007 ms): sleep/10976 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 0.288 ( 0.001 ms): sleep/10976 fstat(fd: 3</usr/lib/locale/locale-archive>, statbuf: 0x7feec8408aa0) = 0 0.290 ( 0.003 ms): sleep/10976 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec1488000 0.297 ( 0.001 ms): sleep/10976 close(fd: 3</usr/lib/locale/locale-archive>) = 0 0.325 (1000.193 ms): sleep/10976 nanosleep(rqtp: 0x7fffac22c0b0) = 0 1000.560 ( 0.006 ms): sleep/10976 close(fd: 1) = 0 1000.573 ( 0.005 ms): sleep/10976 close(fd: 2) = 0 1000.596 ( ): sleep/10976 exit_group() # And can be done systemwide, etc, with backtraces: # perf trace --max-stack=16 --failure sleep 1 0.048 ( 0.015 ms): sleep/11092 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory __access (inlined) dl_main (/usr/lib64/ld-2.26.so) # Or for some specific syscalls: # perf trace --max-stack=16 -e openat --failure cat /tmp/rien cat: /tmp/rien: No such file or directory 0.251 ( 0.012 ms): cat/11106 openat(dfd: CWD, filename: /tmp/rien) = -1 ENOENT No such file or directory __libc_open64 (inlined) main (/usr/bin/cat) __libc_start_main (/usr/lib64/libc-2.26.so) _start (/usr/bin/cat) # Look for inotify* syscalls that fail, system wide, for 2 seconds, with backtraces: # perf trace -a --max-stack=16 --failure -e inotify* sleep 2 819.165 ( 0.058 ms): gmain/1724 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: 
/home/acme/~, mask: 16789454) = -1 ENOENT No such file or directory __GI_inotify_add_watch (inlined) _ik_watch (/usr/lib64/libgio-2.0.so.0.5400.3) _ip_start_watching (/usr/lib64/libgio-2.0.so.0.5400.3) im_scan_missing (/usr/lib64/libgio-2.0.so.0.5400.3) g_timeout_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3) g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3) g_main_context_iterate.isra.23 (/usr/lib64/libglib-2.0.so.0.5400.3) g_main_context_iteration (/usr/lib64/libglib-2.0.so.0.5400.3) glib_worker_main (/usr/lib64/libglib-2.0.so.0.5400.3) g_thread_proxy (/usr/lib64/libglib-2.0.so.0.5400.3) start_thread (/usr/lib64/libpthread-2.26.so) __GI___clone (inlined) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-8f7d3mngaxvi7tlzloz3n7cs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
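The effect of '--failure' in the examples above can be pictured as a filter over syscall exit events. The following is a minimal sketch, not perf's implementation: the Syscall tuple and failed_only() are hypothetical, and the rule "a negative return value means failure" is an assumption that matches the '= -1 ENOENT' lines shown in the output.

```python
from collections import namedtuple

# Hypothetical record for one completed syscall: its name and return value.
Syscall = namedtuple("Syscall", ["name", "ret"])


def failed_only(events):
    """Keep only the syscalls whose return value signals an error."""
    return [event for event in events if event.ret < 0]


# A few entries taken from the 'perf trace sleep 1' output above.
events = [
    Syscall("brk", 0x55bdc2d04000),
    Syscall("access", -1),       # ENOENT on /etc/ld.so.preload
    Syscall("openat", 3),
    Syscall("close", 0),
]

for event in failed_only(events):
    print(event.name, event.ret)
```

Applied to those four entries, only the access() call remains, as in the '--failure' run.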
2018-04-02perf tools: Add a "dso_size" sort orderKim Phillips5-0/+48
Add DSO size to perf report/top sort output list. This includes adding a map__size fn to map.h, which is approximately equal to the DSO data file_size: DSO file size map (end-start) file / (end-start) libwebkit2gtk-4.0.so.37.24.9 43260072 41295872 95% libglib-2.0.so.0.5400.1 1125680 1118208 99% libc-2.26.so 1960656 1925120 101% libdbus-1.so.3.14.13 309456 303104 102% Sample output: $ ./perf report -s dso_size,dso Samples: 2K of event 'cycles:uppp', Event count (approx.): 128373340 Overhead DSO size Shared Object 90.62% unknown [unknown] 2.87% 1118208 libglib-2.0.so.0.5400.1 1.92% 303104 libdbus-1.so.3.14.13 1.42% 1925120 libc-2.26.so 0.77% 41295872 libwebkit2gtk-4.0.so.37.24.9 0.61% 335872 libgobject-2.0.so.0.5400.1 0.41% 1052672 libgdk-3.so.0.2200.25 0.36% 106496 libpthread-2.26.so 0.29% 221184 dbus-daemon 0.17% 159744 ld-2.26.so 0.13% 49152 libwayland-client.so.0.3.0 0.12% 1642496 libgio-2.0.so.0.5400.1 0.09% 7327744 libgtk-3.so.0.2200.25 0.09% 12324864 libmozjs-52.so.0.0.0 0.05% 4796416 perf 0.04% 843776 libgjs.so.0.0.0 0.03% 1409024 libmutter-clutter-1.so Committer testing: To sort by DSO size, use: # perf report -F dso_size,dso,overhead -s dso_size <SNIP> 3465216 libdns-export.so.174.0.1 0.00% 3522560 libgc.so.1.0.3 0.00% 3538944 libbfd-2.29-13.fc27.so 0.59% 3670016 libunistring.so.2.1.0 0.00% 3723264 libguile-2.0.so.22.8.1 0.00% 3776512 libgio-2.0.so.0.5400.3 0.00% 3891200 libc-2.26.so 0.96% 3944448 libmozjs-17.0.so 0.00% 4218880 libperl.so.5.26.1 0.18% 4452352 libpython2.7.so.1.0 0.02% 4472832 perf 0.02% 4603904 git 0.01% 4751360 libcrypto.so.1.1.0g 0.00% 5005312 libslang.so.2.3.1 0.00% 7315456 libgtk-3.so.0.2200.26 0.09% 8818688 i965_dri.so 2.46% 8818688 i965_dri.so (deleted) 1.26% 12414976 libmozjs-52.so.0.0.0 0.03% 23642112 cc1 2.02% 27889664 [kernel.kallsyms] 25.41% 80834560 libxul.so (deleted) 15.68% 98078720 chrome 32.03% 1056964608 [kernel.kallsyms] 1.59% # Signed-off-by: Kim Phillips <kim.phillips@arm.com> Tested-by: Arnaldo Carvalho de Melo 
<acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180327060956.1c01ebe67a2a941bb4468c6f@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
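The "dso_size" key described above is derived from the extent of the map, end minus start. Here is a minimal sketch of that computation and of ordering DSOs by it. The map_size() name mirrors the map__size helper mentioned in the commit, but the Python form, the dictionary layout, and the start addresses are hypothetical; only the resulting sizes are taken from the sample output.

```python
def map_size(start, end):
    """Size of a mapping, computed as the distance between its bounds."""
    return end - start


# Hypothetical start addresses; the sizes match the sample output above.
maps = {
    "libglib-2.0.so.0.5400.1": (0x7F0000000000, 0x7F0000000000 + 1118208),
    "libdbus-1.so.3.14.13": (0x7F0010000000, 0x7F0010000000 + 303104),
    "libc-2.26.so": (0x7F0020000000, 0x7F0020000000 + 1925120),
}

# Order the DSOs by map size, smallest first.
by_size = sorted(maps, key=lambda name: map_size(*maps[name]))
for name in by_size:
    print(map_size(*maps[name]), name)
```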
2018-03-27perf vendor events s390: Add JSON files for IBM z14Thomas Richter4-0/+469
Add CPU measurement counter facility event description files (json files) for IBM z14. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-5-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM z13Thomas Richter4-0/+511
Add CPU measurement counter facility event description files (json files) for IBM z13. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-4-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM zEC12 zBC12Thomas Richter4-0/+385
Add CPU measurement counter facility event description files (json files) for IBM zEC12 and zBC12. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-3-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM z196Thomas Richter4-0/+319
Add CPU measurement counter facility event description files (json files) for IBM z196. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-2-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM z10EC z10BCThomas Richter4-0/+284
Add CPU measurement counter facility event description files (JSON files) for IBM z10EC and z10BC. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf mmap: Be consistent when checking for an unmapped ring bufferArnaldo Carvalho de Melo1-1/+12
The previous patch is insufficient to cure the reported 'perf trace' segfault, as it only cures the perf_mmap__read_done() case, moving the segfault to the perf_mmap__read_init() function. Fix it by doing the same refcount check there. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 8872481bd048 ("perf mmap: Introduce perf_mmap__read_init()") Link: https://lkml.kernel.org/r/20180326144127.GF18897@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done()Kan Liang1-0/+6
There is a segmentation fault when running 'perf trace'. For example: [root@jouet e]# perf trace -e *chdir -o /tmp/bla perf report --ignore-vmlinux -i ../perf.data The perf_mmap__consume() could unmap the mmap. It needs to check the refcnt in perf_mmap__read_done(). Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ee023de05f35 ("perf mmap: Introduce perf_mmap__read_done()") Link: http://lkml.kernel.org/r/1522071729-16776-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
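The two mmap fixes above share one idea: look at the reference count before touching a ring buffer that may already have been unmapped. The sketch below is a toy model of that guard, not the perf code. The RingBuffer class, the put() method, and the use of -ENOENT as the error value are assumptions made for illustration.

```python
import errno


class RingBuffer:
    """Toy stand-in for a reference-counted, mmap'ed ring buffer."""

    def __init__(self, size=16):
        self.refcnt = 1
        self.data = bytearray(size)

    def put(self):
        """Drop one reference; 'unmap' (discard the buffer) when it hits zero."""
        self.refcnt -= 1
        if self.refcnt == 0:
            self.data = None


def read_init(rb):
    """Refuse to start a read on a buffer that is no longer mapped."""
    if rb.refcnt == 0:
        return -errno.ENOENT
    return 0


def read_done(rb):
    """Finish a read; skip the bookkeeping if the buffer is already gone."""
    if rb.refcnt == 0:
        return False
    rb.data[0] = 0            # stands in for updating the ring buffer tail
    return True


rb = RingBuffer()
print(read_init(rb), read_done(rb))
rb.put()                      # the consumer dropped the last reference
print(read_init(rb), read_done(rb))
```

Without the checks, read_done() in this model would index into a buffer that no longer exists, which is the analogue of the reported segfault.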
2018-03-27perf build: Fix check-headers.sh opts assignmentJiri Olsa1-0/+1
Currently the "opts" variable is not zero-ed and we keep on adding to it, ending up with: $ check-headers.sh 2>&1 + opts=' "-B"' + opts=' "-B" "-B"' + opts=' "-B" "-B" "-B"' + opts=' "-B" "-B" "-B" "-B"' + opts=' "-B" "-B" "-B" "-B" "-B"' + opts=' "-B" "-B" "-B" "-B" "-B" "-B"' Fix this by initializing it in the check() function, right before starting the loop. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180321140515.2252-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
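The accumulation shown in the trace above is the usual symptom of state that outlives the function using it. The sketch below reproduces the pattern in Python as a model only; the real script is shell, and check_buggy(), check_fixed() and the module-level list are hypothetical stand-ins for the check() function and its "opts" variable.

```python
_opts = []  # module-level state, playing the role of the shell global


def check_buggy(path, *flags):
    """Append to shared state that is never reset, so flags pile up."""
    for flag in flags:
        _opts.append(flag)
    return list(_opts)


def check_fixed(path, *flags):
    """Initialize the option list on every call, right before the loop."""
    opts = []
    for flag in flags:
        opts.append(flag)
    return opts
```

Calling check_buggy() once per header yields one more "-B" each time, while check_fixed() always returns just the flags it was given.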
2018-03-23perf annotate: Use absolute addresses to calculate jump target offsetsArnaldo Carvalho de Melo1-3/+2
These types of jumps were confusing the annotate browser: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ffffffff81a00020: swapgs <SNIP> │ffffffff81a00128: ↓ jae ffffffff81a00139 <syscall_return_via_sysret+0x53> <SNIP> │ffffffff81a00155: → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> I.e. the syscall_return_via_sysret function is actually "inside" the entry_SYSCALL_64 function, and the offsets in jumps like these (+0x53) are relative to syscall_return_via_sysret, not to entry_SYSCALL_64. Or this may be some artifact in how the assembler marks the start and end of a function and how this ends up in the ELF symtab for vmlinux, i.e. syscall_return_via_sysret() isn't "inside" entry_SYSCALL_64, but just right after it. From readelf -sw vmlinux: 80267: ffffffff81a00020 315 NOTYPE GLOBAL DEFAULT 1 entry_SYSCALL_64 316: ffffffff81a000e6 0 NOTYPE LOCAL DEFAULT 1 syscall_return_via_sysret 0xffffffff81a00020 + 315 > 0xffffffff81a000e6 So instead of looking for offsets after that last '+' sign, calculate offsets for jump target addresses that are inside the function being disassembled from the absolute address, 0xffffffff81a00139 in this case, subtracting from it the objdump address for the start of the function being disassembled, entry_SYSCALL_64() in this case.
So, before this patch: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↑ jmp 6c │ mov %cr3,%rdi │ ↑ jmp 62 │ mov %rdi,%rax │ and $0x7ff,%rdi │ bt %rdi,%gs:0x2219a │ ↑ jae 53 │ btr %rdi,%gs:0x2219a │ mov %rax,%rdi │ ↑ jmp 5b After: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux 0.65 │ → jne swapgs_restore_regs_and_return_to_usermode │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> With those at least navigating to the right destination, an improvement for these cases seems to be to somehow mark those inner functions, which in this case could be: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux │syscall_return_via_sysret: │ pop %r15 │ pop %r14 │ pop %r13 │ pop %r12 │ pop %rbp │ pop %rbx │ pop %rsi │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> This all gets much
better viewed if one uses 'perf report --ignore-vmlinux' forcing the usage of /proc/kcore + /proc/kallsyms, when the above actually gets down to: # perf report --ignore-vmlinux ## do '/64', will show the function names containing '64', ## navigate to /entry_SYSCALL_64_after_hwframe.annotation, ## press 'A' to annotate, then 'P' to print that annotation ## to a file ## From another xterm (or see on screen, this 'P' thing is for ## getting rid of those right side scroll bars/spaces): # cat /entry_SYSCALL_64_after_hwframe.annotation entry_SYSCALL_64_after_hwframe() /proc/kcore Event: cycles:ppp Percent Disassembly of section load0: ffffffff9aa00044 <load0>: 11.97 push %rax 4.85 push %rdi push %rsi 2.59 push %rdx 2.27 push %rcx 0.32 pushq $0xffffffffffffffda 1.29 push %r8 xor %r8d,%r8d 1.62 push %r9 0.65 xor %r9d,%r9d 1.62 push %r10 xor %r10d,%r10d 5.50 push %r11 xor %r11d,%r11d 3.56 push %rbx xor %ebx,%ebx 4.21 push %rbp xor %ebp,%ebp 2.59 push %r12 0.97 xor %r12d,%r12d 3.24 push %r13 xor %r13d,%r13d 2.27 push %r14 xor %r14d,%r14d 4.21 push %r15 xor %r15d,%r15d 0.97 mov %rsp,%rdi 5.50 → callq do_syscall_64 14.56 mov 0x58(%rsp),%rcx 7.44 mov 0x80(%rsp),%r11 0.32 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 shl $0x10,%rcx 0.32 sar $0x10,%rcx 3.24 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 2.27 cmpq $0x33,0x88(%rsp) 1.29 → jne swapgs_restore_regs_and_return_to_usermode mov 0x30(%rsp),%r11 8.74 cmp %r11,0x90(%rsp) → jne swapgs_restore_regs_and_return_to_usermode 0.32 test $0x10100,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 cmpq $0x2b,0xa0(%rsp) 0.65 → jne swapgs_restore_regs_and_return_to_usermode I.e. 
using kallsyms makes the function start/end be determined differently than when using what is in the vmlinux ELF symtab, and the hits actually go to entry_SYSCALL_64_after_hwframe, which is a GLOBAL() after the start of entry_SYSCALL_64: ENTRY(entry_SYSCALL_64) UNWIND_HINT_EMPTY <SNIP> pushq $__USER_CS /* pt_regs->cs */ pushq %rcx /* pt_regs->ip */ GLOBAL(entry_SYSCALL_64_after_hwframe) pushq %rax /* pt_regs->orig_ax */ PUSH_AND_CLEAR_REGS rax=$-ENOSYS And it goes and ends at: cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */ jne swapgs_restore_regs_and_return_to_usermode /* * We win! This label is here just for ease of understanding * perf profiles. Nothing jumps here. */ syscall_return_via_sysret: /* rcx and r11 are already restored (see code above) */ UNWIND_HINT_EMPTY POP_REGS pop_rdi=0 skip_r11rcx=1 So perhaps some people should really just play with '--ignore-vmlinux' to force /proc/kcore + kallsyms. One idea is to do both, i.e. have a vmlinux annotation and a kcore+kallsyms one, when possible, and even show the patched location, etc. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-r11knxv8voesav31xokjiuo6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>