aboutsummaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2019-01-21perf report: Display arch specific diagnostic counter sets, starting with s390Thomas Richter7-1/+277
On s390 the event bc000 (also named CF_DIAG) extracts the CPU Measurement Facility diagnostic counter sets and displays them as counter number and counter value pairs sorted by counter set number. Output: [root@s35lp76 perf]# ./perf report -D --stdio [00000000] Counterset:0 Counters:6 Counter:000 Value:0x000000000085ec36 Counter:001 Value:0x0000000000796c94 Counter:002 Value:0x0000000000005ada Counter:003 Value:0x0000000000092460 Counter:004 Value:0x0000000000006073 Counter:005 Value:0x00000000001a9a73 [0x000038] Counterset:1 Counters:2 Counter:000 Value:0x000000000007c59f Counter:001 Value:0x000000000002fad6 [0x000050] Counterset:2 Counters:16 Counter:000 Value:000000000000000000 Counter:001 Value:000000000000000000 Counter:002 Value:000000000000000000 Counter:003 Value:000000000000000000 Counter:004 Value:000000000000000000 Counter:005 Value:000000000000000000 Counter:006 Value:000000000000000000 Counter:007 Value:000000000000000000 Counter:008 Value:000000000000000000 Counter:009 Value:000000000000000000 Counter:010 Value:000000000000000000 Counter:011 Value:000000000000000000 Counter:012 Value:000000000000000000 Counter:013 Value:000000000000000000 Counter:014 Value:000000000000000000 Counter:015 Value:000000000000000000 [0x0000d8] Counterset:3 Counters:128 Counter:000 Value:0x000000000000020f Counter:001 Value:0x00000000000001d8 Counter:002 Value:0x000000000000d7fa Counter:003 Value:0x000000000000008b ... The number in brackets is the offset into the raw data field of the sample. New functions trace_event_sample_raw__init() and s390_sample_raw() are introduced in the code path to enable interpretation on non s390 platforms. This event bc000 attached raw data is generated only on s390 platform. Correct display on other platforms requires correct endianness handling. Committer notes: Added a init function that sets up a evlist function pointer to avoid repeated tests on evlist->env and calls to perf_env__name() that involves normalizing, etc, for each PERF_RECORD_SAMPLE. Removed needless __maybe_unused from the trace_event_raw() prototype in session.h, move it to be an static function in evlist. The 'offset' variable is a size_t, not an u64, fix it to avoid this on some arches: CC /tmp/build/perf/util/s390-sample-raw.o util/s390-sample-raw.c: In function 's390_cpumcfdg_testctr': util/s390-sample-raw.c:77:4: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'size_t' [-Werror=format=] pr_err("Invalid counter set entry at %#" PRIx64 "\n", ^ cc1: all warnings being treated as errors Signed-off-by: Thomas Richter <[email protected]> Reviewed-by: Hendrik Brueckner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Cc: Martin Schwidefsky <[email protected]> Cc: Heiko Carstens <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf tools: Remove duplicate headersBrajeswar Ghosh4-4/+0
Remove duplicate headers which are included more than once in the same file. Signed-off-by: Brajeswar Ghosh <[email protected]> Acked-by: Souptick Joarder <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Colin King <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sabyasachi Gupta <[email protected]> Link: http://lkml.kernel.org/r/20190115135916.GA3629@hp-pavilion-15-notebook-pc-brajeswar Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf session: Add reader__process_events functionJiri Olsa1-24/+38
The reader object is defined by file's fd, data offset and data size. Now we can simply define a reader object for an arbitrary file data portion and pass it to reader__process_events(). Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf session: Add 'data_offset' member to reader objectJiri Olsa1-4/+5
Add 'data_offset' member to reader object. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf session: Add 'data_size' member to reader objectJiri Olsa1-1/+3
Add a 'data_size' member to the reader object. Keep the 'data_size' variable instead of replacing it with rd.data_size as it will be used in the following patch. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf session: Add reader objectJiri Olsa1-2/+8
Add a session private reader object to encapsulate the reading of the event data block. Starting with a 'fd' field. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf session: Get rid of file_size variableJiri Olsa1-7/+5
It's not needed and removing it makes the code a little simpler for the upcoming changes. It's safe to replace file_size with data_size, because the perf_data__size() value is never smaller than data_offset + data_size. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf session: Rearrange perf_session__process_events functionJiri Olsa1-13/+7
To reduce function arguments and the code. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf tools: Replace automatic const char[] variables by staticsRasmus Villemoes6-11/+11
An automatic const char[] variable gets initialized at runtime, just like any other automatic variable. For long strings, that uses a lot of stack and wastes time building the string; e.g. for the "No %s allocation events..." case one has: 444516: 48 b8 4e 6f 20 25 73 20 61 6c movabs $0x6c61207325206f4e,%rax # "No %s al" ... 444674: 48 89 45 80 mov %rax,-0x80(%rbp) 444678: 48 b8 6c 6f 63 61 74 69 6f 6e movabs $0x6e6f697461636f6c,%rax # "location" 444682: 48 89 45 88 mov %rax,-0x78(%rbp) 444686: 48 b8 20 65 76 65 6e 74 73 20 movabs $0x2073746e65766520,%rax # " events " 444690: 66 44 89 55 c4 mov %r10w,-0x3c(%rbp) 444695: 48 89 45 90 mov %rax,-0x70(%rbp) 444699: 48 b8 66 6f 75 6e 64 2e 20 20 movabs $0x20202e646e756f66,%rax Make them all static so that the compiler just references objects in .rodata. Committer testing: Ok, using dwarves's codiff tool: $ codiff --functions /tmp/perf.before ~/bin/perf builtin-sched.c: cmd_sched | -48 1 function changed, 48 bytes removed, diff: -48 builtin-report.c: cmd_report | -32 1 function changed, 32 bytes removed, diff: -32 builtin-kmem.c: cmd_kmem | -64 build_alloc_func_list | -50 2 functions changed, 114 bytes removed, diff: -114 builtin-c2c.c: perf_c2c__report | -390 1 function changed, 390 bytes removed, diff: -390 ui/browsers/header.c: tui__header_window | -104 1 function changed, 104 bytes removed, diff: -104 /home/acme/bin/perf: 9 functions changed, 688 bytes removed, diff: -688 Signed-off-by: Rasmus Villemoes <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf script: Fix crash when processing recorded stat dataTony Jones1-5/+2
While updating perf to work with Python3 and Python2 I noticed that the stat-cpi script was dumping core. $ perf stat -e cycles,instructions record -o /tmp/perf.data /bin/false Performance counter stats for '/bin/false': 802,148 cycles 604,622 instructions 802,148 cycles 604,622 instructions 0.001445842 seconds time elapsed $ perf script -i /tmp/perf.data -s scripts/python/stat-cpi.py Segmentation fault (core dumped) ... ... rblist=rblist@entry=0xb2a200 <rt_stat>, new_entry=new_entry@entry=0x7ffcb755c310) at util/rblist.c:33 ctx=<optimized out>, type=<optimized out>, create=<optimized out>, cpu=<optimized out>, evsel=<optimized out>) at util/stat-shadow.c:118 ctx=<optimized out>, type=<optimized out>, st=<optimized out>) at util/stat-shadow.c:196 count=count@entry=727442, cpu=cpu@entry=0, st=0xb2a200 <rt_stat>) at util/stat-shadow.c:239 config=config@entry=0xafeb40 <stat_config>, counter=counter@entry=0x133c6e0) at util/stat.c:372 ... ... The issue is that since 1fcd03946b52 perf_stat__update_shadow_stats now calls update_runtime_stat passing rt_stat rather than calling update_stats but perf_stat__init_shadow_stats has never been called to initialize rt_stat in the script path processing recorded stat data. Since I can't see any reason why perf_stat__init_shadow_stats() is presently initialized like it is in builtin-script.c::perf_sample__fprint_metric() [4bd1bef8bba2f] I'm proposing it instead be initialized once in __cmd_script Committer testing: After applying the patch: # perf script -i /tmp/perf.data -s tools/perf/scripts/python/stat-cpi.py 0.001970: cpu -1, thread -1 -> cpi 1.709079 (1075684/629394) # No segfault. Signed-off-by: Tony Jones <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Fixes: 1fcd03946b52 ("perf stat: Update per-thread shadow stats") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf top: Fix wrong hottest instruction highlightedHe Kuang1-6/+10
The annotation line percentage is compared and inserted into the rbtree, but the percent field of 'struct annotation_data' is an array, the comparison result between them is the address difference. This patch compares the right slot of percent array according to opts->percent_type and makes things right. The problem can be reproduced by pressing 'H' in perf top annotation view. It should highlight the instruction line which has the highest sampling percentage. Signed-off-by: He Kuang <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf tools: Handle TOPOLOGY headers with no CPUStephane Eranian1-2/+9
This patch fixes an issue in cpumap.c when used with the TOPOLOGY header. In some configurations, some NUMA nodes may have no CPU (empty cpulist). Yet a cpumap map must be created otherwise perf abort with an error. This patch handles this case by creating a dummy map. Before: $ perf record -o - -e cycles noploop 2 | perf script -i - 0x6e8 [0x6c]: failed to process type: 80 After: $ perf record -o - -e cycles noploop 2 | perf script -i - noploop for 2 seconds Signed-off-by: Stephane Eranian <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf/doc: Update design.txt for exclude_{host|guest} flagsAndrew Murray1-0/+4
Update design.txt to reflect the presence of the exclude_host and exclude_guest perf flags. Signed-off-by: Andrew Murray <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matt Turner <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Russell King <[email protected]> Cc: Sascha Hauer <[email protected]> Cc: Shawn Guo <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-01-19Merge tag 'powerpc-5.0-3' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A couple of weeks of fixes. There's one fix for an oops on Power9 machines with Open CAPI adapters. And a fix for probable memory corruption in some of the new NPU code, caught by smatch though and not seen in the wild. Plus a few other minor fixes. There's one non-fix which is the perf_regs change. That was sent during the merge window but I accidentally only merged the first of two patches in the series. It's been in linux-next so hopefully doesn't conflict with anything in acme's tree. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Breno Leitao, Christian Lamparter, Christophe Leroy, Dan Carpenter, Frederic Barrat, Greg Kurz, Jason A. Donenfeld, Madhavan Srinivasan" * tag 'powerpc-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/syscalls: Fix syscall tracing powerpc/pseries: Fix build break due to pnv_npu2_init() powerpc/4xx/ocm: Fix fix for phys_addr_t printf warnings powerpc/powernv/npu: Fix oops in pnv_try_setup_npu_table_group() powerpc/tm: Limit TM code inside PPC_TRANSACTIONAL_MEM powerpc/8xx: fix setting of pagetable for Abatron BDI debug tool. powerpc/powernv/npu: Allocate enough memory in pnv_try_setup_npu_table_group() powerpc/perf: Update perf_regs structure to include MMCRA
2019-01-18perf python: Remove -fstack-clash-protection when building with some clang ↵Arnaldo Carvalho de Melo1-0/+2
versions These options are not present in some (all?) clang versions, so when we build for a distro that has a gcc new enough to have these options and that the distro python build config settings use them but clang doesn't support, b00m. This is the case with fedora rawhide (now gearing towards f30), so check if clang has the and remove the missing ones from CFLAGS. Cc: Eduardo Habkost <[email protected]> Cc: Thiago Macieira <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-18perf script: Fix crash with printing mixed trace point and other eventsAndi Kleen1-1/+1
'perf script' crashes currently when printing mixed trace points and other events because the trace format does not handle events without trace meta data. Add a simple check to avoid that. % cat > test.c main() { printf("Hello world\n"); } ^D % gcc -g -o test test.c % sudo perf probe -x test 'test.c:3' % perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test % perf script <segfault> Committer testing: Before: # perf probe -x /lib64/libc-2.28.so malloc Added new event: probe_libc:malloc (on malloc in /usr/lib64/libc-2.28.so) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc -aR sleep 1 # perf probe -l probe_libc:malloc (on __libc_malloc@malloc/malloc.c in /usr/lib64/libc-2.28.so) # perf record -e '{cpu/cpu-cycles,period=10000/,probe_libc:*}:S' sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.023 MB perf.data (40 samples) ] # perf script Segmentation fault (core dumped) ^C # After: # perf script | head -6 sleep 2888 94796.944981: 16198 cpu/cpu-cycles,period=10000/: ffffffff925dc04f get_random_u32+0x1f (/lib/modules/5.0.0-rc2+/build/vmlinux) sleep 2888 [-01] 94796.944981: probe_libc:malloc: sleep 2888 94796.944983: 4713 cpu/cpu-cycles,period=10000/: ffffffff922763af change_protection+0xcf (/lib/modules/5.0.0-rc2+/build/vmlinux) sleep 2888 [-01] 94796.944983: probe_libc:malloc: sleep 2888 94796.944986: 9934 cpu/cpu-cycles,period=10000/: ffffffff922777e0 move_page_tables+0x0 (/lib/modules/5.0.0-rc2+/build/vmlinux) sleep 2888 [-01] 94796.944986: probe_libc:malloc: # Signed-off-by: Andi Kleen <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-17perf ordered_events: Fix crash in ordered_events__freeJiri Olsa1-2/+4
Song Liu reported crash in 'perf record': > #0 0x0000000000500055 in ordered_events(float, long double,...)(...) () > #1 0x0000000000500196 in ordered_events.reinit () > #2 0x00000000004fe413 in perf_session.process_events () > #3 0x0000000000440431 in cmd_record () > #4 0x00000000004a439f in run_builtin () > #5 0x000000000042b3e5 in main ()" This can happen when we get out of buffers during event processing. The subsequent ordered_events__free() call assumes oe->buffer != NULL and crashes. Add a check to prevent that. Reported-by: Song Liu <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Reviewed-by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Fixes: d5ceb62b3654 ("perf ordered_events: Add 'struct ordered_events_buffer' layer") Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-10tools headers powerpc: Remove unistd.hRavi Bangoria1-1/+0
We use syscall.tbl to generate system call table on powerpc. The unistd.h copy is no longer required now. Remove it. Signed-off-by: Ravi Bangoria <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-10perf powerpc: Rework syscall table generationRavi Bangoria3-14/+450
Commit aff850393200 ("powerpc: add system call table generation support") changed how systemcall table is generated for powerpc. Incorporate these changes into perf as well. Committer testing: $ podman run --entrypoint=/bin/sh --privileged -v /home/acme/git:/git --rm -ti docker.io/acmel/linux-perf-tools-build-ubuntu:18.04-x-powerpc64 perfbuilder@d7a7af166a80:/git/perf$ head -2 /etc/os-release NAME="Ubuntu" VERSION="18.04.1 LTS (Bionic Beaver)" perfbuilder@d7a7af166a80:/git/perf$ perfbuilder@d7a7af166a80:/git/perf$ make ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu- EXTRA_CFLAGS= -C /git/linux/tools/perf O=/tmp/build/perf make: Entering directory '/git/linux/tools/perf' BUILD: Doing 'make -j8' parallel build HOSTCC /tmp/build/perf/fixdep.o HOSTLD /tmp/build/perf/fixdep-in.o LINK /tmp/build/perf/fixdep Warning: Kernel ABI header at 'tools/include/uapi/linux/mman.h' differs from latest version at 'include/uapi/linux/mman.h' diff -u tools/include/uapi/linux/mman.h include/uapi/linux/mman.h sh: 1: command: Illegal option -c Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] ... gtk2: [ OFF ] ... libaudit: [ OFF ] ... libbfd: [ OFF ] ... libelf: [ on ] ... libnuma: [ OFF ] ... numa_num_possible_cpus: [ OFF ] ... libperl: [ OFF ] ... libpython: [ OFF ] ... libslang: [ OFF ] ... libcrypto: [ OFF ] ... libunwind: [ OFF ] ... libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ OFF ] ... get_cpuid: [ OFF ] ... bpf: [ on ] Makefile.config:445: No sys/sdt.h found, no SDT events are defined, please install systemtap-sdt-devel or systemtap-sdt-dev Makefile.config:491: No libunwind found. Please install libunwind-dev[el] >= 1.1 and/or set LIBUNWIND_DIR Makefile.config:583: No libcrypto.h found, disables jitted code injection, please install libssl-devel or libssl-dev Makefile.config:598: slang not found, disables TUI support. Please install slang-devel, libslang-dev or libslang2-dev Makefile.config:612: GTK2 not found, disables GTK2 support. Please install gtk2-devel or libgtk2.0-dev Makefile.config:639: Missing perl devel files. Disabling perl scripting support, please install perl-ExtUtils-Embed/libperl-dev Makefile.config:666: No python interpreter was found: disables Python support - please install python-devel/python-dev Makefile.config:721: No bfd.h/libbfd found, please install binutils-dev[el]/zlib-static/libiberty-dev to gain symbol demangling Makefile.config:750: No liblzma found, disables xz kernel module decompression, please install xz-devel/liblzma-dev Makefile.config:763: No numa.h found, disables 'perf bench numa mem' benchmark, please install numactl-devel/libnuma-devel/libnuma-dev Makefile.config:814: No libbabeltrace found, disables 'perf data' CTF format support, please install libbabeltrace-dev[el]/libbabeltrace-ctf-dev Makefile.config:840: No alternatives command found, you need to set JDIR= to point to the root of your Java directory GEN /tmp/build/perf/common-cmds.h <SNIP> CC /tmp/build/perf/util/syscalltbl.o <SNIP> LD /tmp/build/perf/libperf-in.o AR /tmp/build/perf/libperf.a LINK /tmp/build/perf/perf make: Leaving directory '/git/linux/tools/perf' perfbuilder@d7a7af166a80:/git/perf$ head /tmp/build/perf/arch/powerpc/include/generated/asm/syscalls_64.c static const char *syscalltbl_powerpc_64[] = { [0] = "restart_syscall", [1] = "exit", [2] = "fork", [3] = "read", [4] = "write", [5] = "open", [6] = "close", [7] = "waitpid", [8] = "creat", perfbuilder@d7a7af166a80:/git/perf$ tail /tmp/build/perf/arch/powerpc/include/generated/asm/syscalls_64.c [381] = "pwritev2", [382] = "kexec_file_load", [383] = "statx", [384] = "pkey_alloc", [385] = "pkey_free", [386] = "pkey_mprotect", [387] = "rseq", [388] = "io_pgetevents", }; #define SYSCALLTBL_POWERPC_64_MAX_ID 388 perfbuilder@d7a7af166a80:/git/perf$ head /tmp/build/perf/arch/powerpc/include/generated/asm/syscalls_32.c static const char *syscalltbl_powerpc_32[] = { [0] = "restart_syscall", [1] = "exit", [2] = "fork", [3] = "read", [4] = "write", [5] = "open", [6] = "close", [7] = "waitpid", [8] = "creat", perfbuilder@d7a7af166a80:/git/perf$ tail /tmp/build/perf/arch/powerpc/include/generated/asm/syscalls_32.c [381] = "pwritev2", [382] = "kexec_file_load", [383] = "statx", [384] = "pkey_alloc", [385] = "pkey_free", [386] = "pkey_mprotect", [387] = "rseq", [388] = "io_pgetevents", }; #define SYSCALLTBL_POWERPC_32_MAX_ID 388 perfbuilder@d7a7af166a80:/git/perf$ Signed-off-by: Ravi Bangoria <[email protected]> Reported-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-09perf symbols: Add 'arch_cpu_idle' to the list of kernel idle symbolsArnaldo Carvalho de Melo1-0/+1
When testing 'perf top' on a armhf system (32-bit, Orange Pi Zero), I noticed that 'arch_cpu_idle' dominated, add it to the list of idle symbols, so that we can see what is that being done when not idle. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.hArnaldo Carvalho de Melo1-2/+2
As now we'll update our fs.h copy and what tools/perf/trace/beauty/mount_flags.sh needs just got moved to mount.h, use that instead. Cc: Adrian Hunter <[email protected]> Cc: David Howells <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08tools include uapi: Grab a copy of linux/mount.hArnaldo Carvalho de Melo1-0/+1
We were using a copy of uapi/linux/fs.h to create the mount syscall 'flags' string table to use in 'perf trace', to convert from the number obtained via the raw_syscalls:sys_enter into a string, using tools/perf/trace/beauty/mount_flags.sh, but in e262e32d6bde ("vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled") those defines got moved to linux/mount.h, so grab a copy of mount.h too. Keep the uapi/linux/fs.h as we'll use it for the SEEK_ constants. Cc: Adrian Hunter <[email protected]> Cc: David Howells <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08perf top: Lift restriction on using callchains without "sym" in --sortArnaldo Carvalho de Melo1-6/+1
This restriction is not present in 'perf report' and since 'perf top' uses the same hists browser, remove it from it as well. With this we create per event buckets with callchain trees, so that # perf top --sort dso -g --no-children Bucketizes samples by DSO and below it shows the callchains leading to functions in this DSO. Try also: # perf top -e sched:*switch -g --no-children To see the callchains leading to sched switches, pressing 'E' to expand all one can quickly see the most common scheduler switches and what leads to them, for instance, calls to IO, futexes, etc. Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08perf tests: Add a test for the ARM 32-bit [vectors] pageFlorian Fainelli4-0/+34
perf on ARM requires CONFIG_KUSER_HELPERS to be turned on to allow some independance with respect to the ARM CPU being used. Add a test which tries to locate the [vectors] page, created when CONFIG_KUSER_HELPERS is turned on to help asses the system's health. Signed-off-by: Florian Fainelli <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chris Healy <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08perf tools: Make find_vdso_map() more modularFlorian Fainelli4-12/+11
In preparation for checking that the vectors page on the ARM architecture, refactor the find_vdso_map() function to accept finding an arbitrary string and create a dedicated helper function for that under util/find-map.c and update the filename to find-map.c and all references to it: perf-read-vdso.c and util/vdso.c. Signed-off-by: Florian Fainelli <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chris Healy <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08perf trace: Fix alignment for [continued] linesArnaldo Carvalho de Melo1-2/+3
We were not taking into account the "... [continued]" printed characters, fix it. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08perf trace: Fix ')' placement in "interrupted" syscall linesArnaldo Carvalho de Melo1-2/+8
When we get the sys_enter for a syscall we check if the last one is still waiting for its matching sys_exit, if so we print this: 468.753 ( ): firefox/32382 poll(ufds: 0x7f3988d3dd00, nfds: 7, timeout_msecs: 4294967295) ... 449.575 ( 0.004 ms): Softwar~cThrea/32434 futex(uaddr: 0x7f39a18a9b70, op: WAKE|PRIVATE_FLAG, val: 1) = 0 At some point we'll get that poll sys_exit event and will print a "[continued]" line. While making the sizing of the alignment after the syscall arg list and its result configurable, so that we can mimic strace, which uses a smaller alingment by default, a bug was introduced where the closing parens appeared before the syscall name and its arg list, fix it. Fixes: 4b8a240ed5e0 ("perf trace: Add alignment spaces after the closing parens") Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08Merge tag 'perf-core-for-mingo-4.21-20190104' of ↵Ingo Molnar9-22/+34
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: perf annotate: Ivan Krylov: - Pass filename to objdump via execl, fixing usage with filenames with special characters. perf report: Jin Yao: Fix wrong iteration count in --branch-history perf stat: Jin Yao: - Fix endless wait for child process perf test: Arnaldo Carvalho de Melo: - Use a fallback to get the pathname in vfs_getname in tools build: Jiri Olsa: - Allow overriding CFLAGS assignments. Misc: Arnaldo Carvalho de Melo: - Syncronize UAPI headers Mattias Jacobsson: - Remove redundant va_end() in strbuf_addv() Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2019-01-08powerpc/perf: Update perf_regs structure to include MMCRAMadhavan Srinivasan2-1/+3
On each sample, Monitor Mode Control Register A (MMCRA) content is saved in pt_regs. MMCRA does not have a entry as-is in the pt_regs but instead, MMCRA content is saved in the "dsisr" register of pt_regs. Patch adds another entry to the perf_regs structure to include the "MMCRA" printing which internally maps to the "dsisr" of pt_regs. It also check for the MMCRA availability in the platform and present value accordingly mpe: This was the 2nd patch in a series with commit 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") but I accidentally only merged the 1st patch, so merge this one now. Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-01-06Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds20-147/+400
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling updates form Ingo Molnar: "A final batch of perf tooling changes: mostly fixes and small improvements" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) perf session: Add comment for perf_session__register_idle_thread() perf thread-stack: Fix thread stack processing for the idle task perf thread-stack: Allocate an array of thread stacks perf thread-stack: Factor out thread_stack__init() perf thread-stack: Allow for a thread stack array perf thread-stack: Avoid direct reference to the thread's stack perf thread-stack: Tidy thread_stack__bottom() usage perf thread-stack: Simplify some code in thread_stack__process() tools gpio: Allow overriding CFLAGS tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command tools thermal tmon: Allow overriding CFLAGS assignments tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command perf c2c: Increase the HITM ratio limit for displayed cachelines perf c2c: Change the default coalesce setup perf trace beauty ioctl: Beautify USBDEVFS_ commands perf trace beauty: Export function to get the files for a thread perf trace: Wire up ioctl's USBDEBFS_ cmd table generator perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands tools headers uapi: Grab a copy of usbdevice_fs.h perf trace: Store the major number for a file when storing its pathname ...
2019-01-04perf test shell: Use a fallback to get the pathname in vfs_getnameArnaldo Carvalho de Melo1-1/+2
Some kernels, like 4.19.13-300.fc29.x86_64 in fedora 29, fail with the existing probe definition asking for the contents of result->name, working when we ask for the 'filename' variable instead, so add a fallback to that. Now those tests are back working on fedora 29 systems with that kernel: # perf test vfs_getname 65: Use vfs_getname probe to get syscall args filenames : Ok 66: Add vfs_getname probe to get syscall args filenames : Ok 67: Check open filename arg using perf trace + vfs_getname: Ok # Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-04perf python: Make sure the python binding output directory is in placeArnaldo Carvalho de Melo1-1/+3
Instead of doing an unconditional mkdir, use a dummy Makefile variable to check if the directory is there and if not, create it. This is better than what we had and will help with other python bindings that are in development, like one involved with python backtraces. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-04perf strbuf: Remove redundant va_end() in strbuf_addv()Mattias Jacobsson1-1/+0
Each call to va_copy() should have one, and only one, corresponding call to va_end(). In strbuf_addv() some code paths result in va_end() getting called multiple times. Remove the superfluous va_end(). Signed-off-by: Mattias Jacobsson <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sanskriti Sharma <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Fixes: ce49d8436cff ("perf strbuf: Match va_{add,copy} with va_end") Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-04perf annotate: Pass filename to objdump via execlIvan Krylov1-4/+4
The symbol__disassemble() function uses shell to launch objdump and filter its output via grep. Passing filenames by interpolating them into the command line via "%s" may lead to problems if said filenames contain special characters. Instead, pass the filename as a command line argument where it is not subject to any kind of interpretation, then use quoted shell interpolation to build the strings we need safely. Signed-off-by: Ivan Krylov <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20181014111803.5d83b806@Tarkus Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-04perf report: Fix wrong iteration count in --branch-historyJin Yao3-13/+22
By calculating the removed loops, we can get the iteration count. But the iteration count could be reported incorrectly, reporting impossibly high counts. That's because previous code uses the number of removed LBR entries for the iteration count. That's not good. Fix this by increasing the iteration count when a loop is detected. When matching the chain, the iteration count would be added up, finally we need to compute the average value when printing out. For example, $ perf report --branch-history --stdio --no-children Before: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:2968 avg_cycles:3) | f1 +0 | main +22 (cycles:1 iter:2968 avg_cycles:3) | main +17 | main +38 (cycles:1 iter:2968 avg_cycles:3) 2968 is an impossible high iteration count and avg_cycles is too small. After: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:1 avg_cycles:23) | f1 +0 | main +22 (cycles:1 iter:1 avg_cycles:23) | main +17 | main +38 (cycles:1 iter:1 avg_cycles:23) avg_cycles:23 is the average cycles of this iteration. Fixes: c4ee06251d42 ("perf report: Calculate the average cycles of iterations") Signed-off-by: Jin Yao <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-03Remove 'type' argument from access_ok() functionLinus Torvalds1-1/+1
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <[email protected]>
2019-01-03tools beauty: Make the prctl option table generator catch all PR_ optionsArnaldo Carvalho de Melo1-1/+1
In ba8308856564 ("arm64: add prctl control for resetting ptrauth keys") the PR_PAC_RESET_KEYS prctl option was introduced, get that into the regex in addition to PR_GET_* and PR_SET_*: So just get everything that matches '^#define PR_\w+' this ends up adding these entries: $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after --- before 2019-01-03 14:58:51.541807353 -0300 +++ after 2019-01-03 15:17:05.909583804 -0300 @@ -19,12 +19,18 @@ [20] = "SET_ENDIAN", [21] = "GET_SECCOMP", [22] = "SET_SECCOMP", + [23] = "CAPBSET_READ", + [24] = "CAPBSET_DROP", [25] = "GET_TSC", [26] = "SET_TSC", [27] = "GET_SECUREBITS", [28] = "SET_SECUREBITS", [29] = "SET_TIMERSLACK", [30] = "GET_TIMERSLACK", + [31] = "TASK_PERF_EVENTS_DISABLE", + [32] = "TASK_PERF_EVENTS_ENABLE", + [33] = "MCE_KILL", + [34] = "MCE_KILL_GET", [35] = "SET_MM", [36] = "SET_CHILD_SUBREAPER", [37] = "GET_CHILD_SUBREAPER", @@ -33,8 +39,13 @@ [40] = "GET_TID_ADDRESS", [41] = "SET_THP_DISABLE", [42] = "GET_THP_DISABLE", + [43] = "MPX_ENABLE_MANAGEMENT", + [44] = "MPX_DISABLE_MANAGEMENT", [45] = "SET_FP_MODE", [46] = "GET_FP_MODE", + [47] = "CAP_AMBIENT", + [50] = "SVE_SET_VL", + [51] = "SVE_GET_VL", [52] = "GET_SPECULATION_CTRL", [53] = "SET_SPECULATION_CTRL", [54] = "PAC_RESET_KEYS", $ Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kristina Martsenko <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-03perf stat: Fix endless wait for child processJin Yao1-1/+2
We hit a 'perf stat' issue by using following script: #!/bin/bash sleep 1000 & exec perf stat -a -e cycles -I1000 -- sleep 5 Since "perf stat" is launched by exec, the "sleep 1000" would be the child process of "perf stat". The wait4() call will not return because it's waiting for the child process "sleep 1000" to end. So 'perf stat' doesn't return even after 5s passes. This patch lets 'perf stat' return when the specified child process ends (in this case, the specified child process is "sleep 5"). Committer testing: # cat test.sh #!/bin/bash sleep 10 & exec perf stat -a -e cycles -I1000 -- sleep 5 # Before: # time ./test.sh # time counts unit events 1.001113090 108,453,351 cycles 2.002062196 142,075,435 cycles 3.002896194 164,801,068 cycles 4.003731666 107,062,140 cycles 5.002068867 112,241,832 cycles real 0m10.066s user 0m0.016s sys 0m0.101s # After: # time ./test.sh # time counts unit events 1.001016096 91,412,027 cycles 2.002014963 124,063,708 cycles 3.002883964 125,993,929 cycles 4.003706470 120,465,734 cycles 5.002006778 163,560,355 cycles real 0m5.123s user 0m0.014s sys 0m0.105s # Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf session: Add comment for perf_session__register_idle_thread()Adrian Hunter1-0/+7
Add a comment to perf_session__register_idle_thread() to bring attention to a pitfall with the idle task thread structure. The pitfall is that there should really be a 'struct thread' for the idle task of each cpu, but there is only one that can have pid == tid == 0. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf thread-stack: Fix thread stack processing for the idle taskAdrian Hunter5-25/+69
perf creates a single 'struct thread' to represent the idle task. That is because threads are identified by PID and TID, and the idle task always has PID == TID == 0. However, there are actually separate idle tasks for each CPU. That creates a problem for thread stack processing which assumes that each thread has a single stack, not one stack per CPU. Fix that by passing through the CPU number, and in the case of the idle "thread", pick the thread stack from an array based on the CPU number. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf thread-stack: Allocate an array of thread stacksAdrian Hunter1-12/+17
In preparation for fixing thread stack processing for the idle task, allocate an array of thread stacks. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ No need to check for NULL when calling zfree(), noticed by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf thread-stack: Factor out thread_stack__init()Adrian Hunter1-7/+19
In preparation for fixing thread stack processing for the idle task, factor out thread_stack__init(). Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf thread-stack: Allow for a thread stack arrayAdrian Hunter1-6/+34
In preparation for fixing thread stack processing for the idle task, allow for a thread stack array. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf thread-stack: Avoid direct reference to the thread's stackAdrian Hunter1-32/+49
In preparation for fixing thread stack processing for the idle task, avoid direct reference to the thread's stack. The thread stack will change to an array of thread stacks, at which point the meaning of the direct reference will change. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Rename thread_stack__ts() to thread__stack() since this operates on a 'thread' struct ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf thread-stack: Tidy thread_stack__bottom() usageAdrian Hunter1-4/+3
In preparation for fixing thread stack processing for the idle task, tidy thread_stack__bottom() usage. Specifically, the parameter 'thread' is not needed. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-02perf thread-stack: Simplify some code in thread_stack__process()Adrian Hunter1-11/+7
In preparation for fixing thread stack processing for the idle task, simplify some code in thread_stack__process(). Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-29Merge tag 'nds32-for-linus-4.21' of ↵Linus Torvalds5-0/+336
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux Pull nds32 updates from Greentime Hu: - Perf support - Power management support - FPU support - Hardware prefetcher support - Build error fixed - Performance enhancement * tag 'nds32-for-linus-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux: nds32: support hardware prefetcher nds32: Fix the items of hwcap_str ordering issue. math-emu/soft-fp.h: (_FP_ROUND_ZERO) cast 0 to void to fix warning math-emu/op-2.h: Use statement expressions to prevent negative constant shift nds32: support denormalized result through FP emulator nds32: Support FP emulation nds32: nds32 FPU port nds32: Remove duplicated include from pm.c nds32: Power management for nds32 nds32: Add document for NDS32 PMU. nds32: Add perf call-graph support. nds32: Perf porting nds32: Fix bug in bitfield.h nds32: Fix gcc 8.0 compiler option incompatible. nds32: Fill all TLB entries with kernel image mapping nds32: Remove the redundant assignment
2018-12-28perf c2c: Increase the HITM ratio limit for displayed cachelinesJiri Olsa1-1/+1
The cachelines being reported are the ones with percentages all the way down to 0.05%. That makes for very long output files. Raising that to 0.1%. The user can always specify --show-all if they want all the cachelines with hits. Suggested-by: Joe Mario <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-28perf c2c: Change the default coalesce setupJiri Olsa1-1/+1
Joe suggested to have the coalesce default set just to 'iaddr', because it's easier to read on the default 'perf c2c report' output. By removing the "pid" field from the default -c/--coalesce option, the 'perf c2c' report will group all the relevant PIDs under the instruction address ('iaddr') bucket. User can always run "-c pid,iaddr" for a more fine grained output on particular PIDs. Suggested-by: Joe Mario <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-28perf trace beauty ioctl: Beautify USBDEVFS_ commandsArnaldo Carvalho de Melo1-0/+22
For instance, while debugging the 'galileo' python utility to synchronize fitbit trackers: # perf trace -e ioctl ./run --force ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28666420) = 0 ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0 ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0 ioctl(2</dev/pts/8>, TCSETS, 0x7ffe28666290) = 0 ioctl(3</home/acme/hg/galileo/run>, TCSETS, 0x7ffe286663f0) = -1 ENOTTY (Inappropriate ioctl for device) ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286655a0) = 0 ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665470) = 0 ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665470) = 0 ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286654a0) = 0 ioctl(1</dev/pts/8>, TCSETS, 0x7ffe286654a0) = 0 ioctl(1</dev/pts/8>, TCSETS, 0x7ffe28665400) = 0 ioctl(1</dev/pts/8>, TIOCSWINSZ, 0x7ffe286654c0) = 0 ioctl(0</dev/pts/8>, TIOCSWINSZ, 0x7ffe28665560) = 0 ioctl(0</dev/pts/8>, TIOCSWINSZ, 0x7ffe28665560) = 0 ioctl(0</dev/pts/8>, TIOCMGET, 0x7ffe28665560) = 0 ioctl(0</dev/pts/8>, TCSETS, 0x7ffe28665530) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GET_CAPABILITIES, 0x561468dad048) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GETDRIVER, 0x7ffe28665500) = -1 ENODATA (No data available) ioctl(10</dev/bus/usb/001/011>, USBDEVFS_GETDRIVER, 0x7ffe28665500) = -1 ENODATA (No data available) ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SETCONFIGURATION, 0x7ffe2866513c) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_CLAIMINTERFACE, 0x7ffe286647bc) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468dace40) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664c10) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664c10) = -1 EAGAIN (Resource temporarily unavailable) ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468dace40) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664dd0) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664dd0) = -1 EAGAIN (Resource temporarily unavailable) <SNIP> ioctl(10</dev/bus/usb/001/011>, USBDEVFS_SUBMITURB, 0x561468e72ec0) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664cc0) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_REAPURBNDELAY, 0x7ffe28664cc0) = -1 EAGAIN (Resource temporarily unavailable) ioctl(10</dev/bus/usb/001/011>, USBDEVFS_RELEASEINTERFACE, 0x7ffe2866463c) = 0 ioctl(10</dev/bus/usb/001/011>, USBDEVFS_RELEASEINTERFACE, 0x7ffe2866463c) = 0 Tracker: 813F4690C3D1: Synchronisation successful # Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>