aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2018-04-05perf ui browser: Fixup cleaning unused lines at the bottomArnaldo Carvalho de Melo1-2/+2
Now that we can have extra title lines we should use ui_browser->rows and not ->height when drawing lines, as well as adding ui_browser->extra_title_lines to browser->y when cleaning unused lines at the bottom, otherwise we end up clobbering with spaces the last line just shown by ui_browser->refresh() routine. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ef9ff6017e3c ("perf ui browser: Move the extra title lines from the hists browser") Link: https://lkml.kernel.org/n/tip-dfcpokt1pm5ixm8n9pxwtstz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05perf annotate browser: Fixup vertical line separating metrics from instructionsArnaldo Carvalho de Melo1-1/+1
Now that we can have extra title lines we should use ui_browser->rows and not ->height when drawing lines, as it will use ui_browser__gotorc() and that will take the extra title lines into account, which was causing an off by one at the end of the vertical line drawn by __ui_browser__vline(), fix it. The visual effect was that the last line, with status messages, was being overwritten by the vertical line, looking like: Press 'h' for help on│key bindings Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ef9ff6017e3c ("perf ui browser: Move the extra title lines from the hists browser") Link: https://lkml.kernel.org/n/tip-08y1ln3xjn76zvizz1i1dsvn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05perf annotate: Show group details on the title lineArnaldo Carvalho de Melo1-2/+5
To match what is shown in the main 'perf report/top' title lines, i.e. if a group is being shown, either a real group (recorded with "-e '{a,b,c}') or a forced group (using 'perf report --group' for a perf.data file recorded without {}) we will show multiple columns, one per event, but we were failing to show the group details, so, for: # perf report --header-only | grep cmdline # cmdline : /home/acme/bin/perf record -e {cycles,instructions,cache-misses} # perf report --group The first line was showing just "cycles", now it shows the correct line, which is: Samples: 578 of events 'anon group { cycles, instructions, cache-misses }', 4000 Hz, Event count (approx.): 487421794 syscall_return_via_sysret /lib/modules/4.16.0-rc7/build/vmlinux 0.22 2.97 0.00 │ ↓ jmp 6c │ mov %cr3,%rdi 1.33 10.89 4.00 │ ↓ jmp 62 │ mov %rdi,%rax <SNIP> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 6920e2854e9a ("perf annotate browser: Show extra title line with event information") Link: https://lkml.kernel.org/n/tip-i41tqh17c2dabnyzjh99r1oz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct bufferAdrian Hunter1-30/+24
In preparation for supporting AUX area sampling buffers, auxtrace_queues__add_buffer() needs to be more generic. To that end, move memory allocation for struct buffer into it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-04Merge tag 'perf-core-for-mingo-4.17-20180403' of ↵Ingo Molnar22-116/+418
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Show only failing syscalls with 'perf trace --failure' (Arnaldo Carvalho de Melo) e.g: See what 'openat' syscalls are failing: # perf trace --failure -e openat 762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory <SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? > 790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory ^C# - Show information about the event (freq, nr_samples, total period/nr_events) in the annotate --tui and --stdio2 'perf annotate' output, similar to the first line in the 'perf report --tui', but just for the samples for a the annotated symbol (Arnaldo Carvalho de Melo) - Introduce 'perf version --build-options' to show what features were linked, aliased as well as a shorter 'perf -vv' (Jin Yao) - Add a "dso_size" sort order (Kim Phillips) - Remove redundant ')' in the tracepoint output in 'perf trace' (Changbin Du) - Synchronize x86's cpufeatures.h, no effect on toolss (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-04-03perf trace: Remove redundant ')'Changbin Du1-1/+1
There is a redundant ')' at the tail of each event. So remove it. $ sudo perf trace --no-syscalls -e 'kmem:*' -a 899.342 kmem:kfree:(vfs_writev+0xb9) call_site=ffffffff9c453979 ptr=(nil)) 899.344 kmem:kfree:(___sys_recvmsg+0x188) call_site=ffffffff9c9b8b88 ptr=(nil)) Signed-off-by: Changbin Du <changbin.du@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1520937601-24952-1-git-send-email-changbin.du@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf annotate stdio2: Print more descriptive event information headerArnaldo Carvalho de Melo1-7/+3
To match the recently added event header information to --tui, e.g.: # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 128 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 48617682 _raw_spin_lock_irqsave() /proc/kcore 0.78 nop 7.03 push %rbx 3.12 pushfq 6.25 pop %rax nop mov %rax,%rbx 3.12 cli nop xor %eax,%eax mov $0x1,%edx 79.69 lock cmpxchg %edx,(%rdi) test %eax,%eax ↓ jne 2b mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq *ffffffffb30eaed0 mov %rbx,%rax pop %rbx ← retq # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ujy46x7cldyhyxelyf2b9quy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf annotate browser: Show extra title line with event informationArnaldo Carvalho de Melo1-4/+27
So at the top we'll have two lines, like this, from 'perf report': # perf report --group --ignore-vmlinux ===================================================================================================== Samples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895 _raw_spin_lock_irqsave /proc/kcore Percent │ nop │ push %rbx 0.00 14.29 0.00 │ pushfq 9.09 0.00 0.00 │ pop %rax 9.09 0.00 20.00 │ nop │ mov %rax,%rbx │ cli 4.55 7.14 0.00 │ nop │ xor %eax,%eax │ mov $0x1,%edx │ lock cmpxchg %edx,(%rdi) 77.27 78.57 70.00 │ test %eax,%eax │ ↓ jne 2b │ mov %rbx,%rax 0.00 0.00 10.00 │ pop %rbx │ ← retq │2b: mov %eax,%esi │ → callq queued_spin_lock_slowpath │ mov %rbx,%rax │ pop %rbx Press 'h' for help on│key bindings ===================================================================================================== 9.09 + 9.09 + 4.55 + 77.27 = 100 14.29 + 7.14 + 78.57 = 100 20 + 70 + 10 = 100 We can do the math by using 't' to toggle from 'percent' to nr ===================================================================================================== Samples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895 _raw_spin_lock_irqsave /proc/kcore Period │ nop │ push %rbx 0 79273 0 │ pushfq 190455 0 0 │ pop %rax 198038 0 3045 │ nop │ mov %rax,%rbx │ cli 217233 32562 0 │ nop │ xor %eax,%eax │ mov $0x1,%edx │ lock cmpxchg %edx,(%rdi) 3421649 979174 28273 │ test %eax,%eax │ ↓ jne 2b │ mov %rbx,%rax 0 0 5193 │ pop %rbx │ ← retq │2b: mov %eax,%esi │ → callq queued_spin_lock_slowpath │ mov %rbx,%rax │ pop %rbx Press 'h' for help on│key bindings ===================================================================================================== 79273 + 190455 + 198038 + 3045 + 217233 + 32562 + 3421649 + 979174 + 28273 + 5193 = 5154895 Or number of samples: ===================================================================================================== ooSamples: 46 of events 'cycles', 4000 Hz, Event count (approx.): 5154895 _raw_spin_lock_irqsave /proc/kcore Samples │ nop │ push %rbx 0 2 0 │ pushfq 2 0 0 │ pop %rax 2 0 2 │ nop │ mov %rax,%rbx │ cli 1 1 0 │ nop │ xor %eax,%eax │ mov $0x1,%edx │ lock cmpxchg %edx,(%rdi) 17 11 7 │ test %eax,%eax │ ↓ jne 2b │ mov %rbx,%rax 0 0 1 │ pop %rbx │ ← retq │2b: mov %eax,%esi │ → callq queued_spin_lock_slowpath │ mov %rbx,%rax │ pop %rbx Press 'h' for help on key bindings ===================================================================================================== 2 + 2 + 2 + 2 + 1 + 1 + 17 + 11 + 7 + 1 = 46 Suggested-by: Martin Liška <mliska@suse.cz> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-ezccyxld50wtwyt66np6aomo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf annotate: Introduce annotation__scnprintf_samples_period() methodArnaldo Carvalho de Melo2-0/+50
To print a string using the total period (nr_events) and the number of samples for a given annotation, i.e. for a given symbol, the counterpart to hists__scnprintf_samples_period(), that is for all the samples in a session (be it a live session, think 'perf top' or a perf.data file, think 'perf report'). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-goj2wu4fxutc8vd46mw3yg14@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf ui browser: Move the extra title lines from the hists browserArnaldo Carvalho de Melo3-20/+33
This will be useful for the annotate browser as well, that wants to have extra title lines, i.e. the current ui_browser unconditionally reserves the first line for a browser title and the last one for status messages. But some browsers, like the buckets one (hists browser) needs extra lines to show headers, allowing it to be shown or not, press 'H' in 'perf top' or 'perf report' to see this feature. So move that logic to the core ui_browser used by the hists_browser ('perf top' and 'perf report' main interface) so that it can be used by the annotate browser too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-r38xm3ut37ulbg1o5tn5iise@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf hists: Move hists__scnprintf_title() away from the TUI codeArnaldo Carvalho de Melo2-79/+81
The previous patch made this function useful to non-TUI parts of the tools, but left it where the function from what it was carved, so that the patch showed more clearly the process. Now just move it outside the TUI parts so that we can finally use it, even when the TUI code doesn't get built/linked. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-hqj7hvcr3mu5lvcqp3cssio6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf hists: Introduce hists__scnprint_title()Arnaldo Carvalho de Melo2-4/+17
That is not use any struct hists_browser internals, so that it can be shared with the other UIs and tools. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-w8mczjnqnbcj9yzfkv9ja6ro@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-03perf hists browser: Rename perf_evsel_browser_title to a more descriptive nameArnaldo Carvalho de Melo1-5/+3
Rename it to hists_browser__scnprintf_title() to better reflect that it provides a scnprintf-like function operating on a hists_browser instance. This paves the way to have a non-hists_browser specific function to scnprintf format a title with per evsel information to use in other tools or UIs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196935 Link: https://lkml.kernel.org/n/tip-sntpyzxsnme9jvuz2qntwoh2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02Merge tag 'arch-removal' of ↵Linus Torvalds16-146/+3
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pul removal of obsolete architecture ports from Arnd Bergmann: "This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users. In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees. [ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ] The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases. After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
2018-04-02Merge branch 'perf-core-for-linus' of ↵Linus Torvalds177-1920/+8122
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main kernel side changes were: - Modernize the kprobe and uprobe creation/destruction tooling ABIs: The existing text based APIs (kprobe_events and uprobe_events in tracefs), are naive, limited ABIs in that they require user-space to clean up after themselves, which is both difficult and fragile if the tool is buggy or exits unexpectedly. In other words they are not really suited for modern, robust tooling. So introduce a modern, file descriptor based ABI that does not have these limitations: introduce the 'perf_kprobe' and 'perf_uprobe' PMUs and extend the perf_event_open() syscall to create events with a kprobe/uprobe attached to them. These [k,u]probe are associated with this file descriptor, so they are not available in tracefs. (Song Liu) - Intel Cannon Lake CPU support (Harry Pan) - Intel PT cleanups (Alexander Shishkin) - Improve the performance of pinned/flexible event groups by using RB trees (Alexey Budankov) - Add PERF_EVENT_IOC_MODIFY_ATTRIBUTES which allows the modification of hardware breakpoints, which new ABI variant massively speeds up existing tooling that uses hardware breakpoints to instrument (and debug) memory usage. (Milind Chabbi, Jiri Olsa) - Various Intel PEBS handling fixes and improvements, and other Intel PMU improvements (Kan Liang) - Various perf core improvements and optimizations (Peter Zijlstra) - ... misc cleanups, fixes and updates. There's over 200 tooling commits, here's an (imperfect) list of highlights: - 'perf annotate' improvements: * Recognize and handle jumps to other functions as calls, which improves the navigation along jumps and back. (Arnaldo Carvalho de Melo) * Add the 'P' hotkey in TUI annotation to dump annotation output into a file, to ease e-mail reporting of annotation details. (Arnaldo Carvalho de Melo) * Add an IPC/cycles column to the TUI (Jin Yao) * Improve s390 assembly annotation (Thomas Richter) * Refactor the output formatting logic to better separate it into interactive and non-interactive features and add the --stdio2 output variant to demonstrate this. (Arnaldo Carvalho de Melo) - 'perf script' improvements: * Add Python 3 support (Jaroslav Škarvada) * Add --show-round-event (Jiri Olsa) - 'perf c2c' improvements: * Add NUMA analysis support (Jiri Olsa) - 'perf trace' improvements: * Improve PowerPC support (Ravi Bangoria) - 'perf inject' improvements: * Integrate ARM CoreSight traces (Robert Walker) - 'perf stat' improvements: * Add the --interval-count option (yuzhoujian) * Add the --timeout option (yuzhoujian) - 'perf sched' improvements (Changbin Du) - Vendor events improvements : * Add IBM s390 vendor events (Thomas Richter) * Add and improve arm64 vendor events (John Garry, Ganapatrao Kulkarni) * Update POWER9 vendor events (Sukadev Bhattiprolu) - Intel PT tooling improvements (Adrian Hunter) - PMU handling improvements (Agustin Vega-Frias) - Record machine topology in perf.data (Jiri Olsa) - Various overwrite related cleanups (Kan Liang) - Add arm64 dwarf post unwind support (Kim Phillips, Jean Pihet) - ... and lots of other changes, cleanups and fixes, see the shortlog and Git history for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits) perf/x86/intel: Enable C-state residency events for Cannon Lake perf/x86/intel: Add Cannon Lake support for RAPL profiling perf/x86/pt, coresight: Clean up address filter structure perf vendor events s390: Add JSON files for IBM z14 perf vendor events s390: Add JSON files for IBM z13 perf vendor events s390: Add JSON files for IBM zEC12 zBC12 perf vendor events s390: Add JSON files for IBM z196 perf vendor events s390: Add JSON files for IBM z10EC z10BC perf mmap: Be consistent when checking for an unmaped ring buffer perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done() perf build: Fix check-headers.sh opts assignment perf/x86: Update rdpmc_always_available static key to the modern API perf annotate: Use absolute addresses to calculate jump target offsets perf annotate: Defer searching for comma in raw line till it is needed perf annotate: Support jumping from one function to another perf annotate: Add "_local" to jump/offset validation routines perf python: Reference Py_None before returning it perf annotate: Mark jumps to outher functions with the call arrow perf annotate: Pass function descriptor to its instruction parsing routines perf annotate: No need to calculate notes->start twice ...
2018-04-02Merge branch 'locking-core-for-linus' of ↵Linus Torvalds40-0/+4234
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in the locking subsystem in this cycle were: - Add the Linux Kernel Memory Consistency Model (LKMM) subsystem, which is an an array of tools in tools/memory-model/ that formally describe the Linux memory coherency model (a.k.a. Documentation/memory-barriers.txt), and also produce 'litmus tests' in form of kernel code which can be directly executed and tested. Here's a high level background article about an earlier version of this work on LWN.net: https://lwn.net/Articles/718628/ The design principles: "There is reason to believe that Documentation/memory-barriers.txt could use some help, and a major purpose of this patch is to provide that help in the form of a design-time tool that can produce all valid executions of a small fragment of concurrent Linux-kernel code, which is called a "litmus test". This tool's functionality is roughly similar to a full state-space search. Please note that this is a design-time tool, not useful for regression testing. However, we hope that the underlying Linux-kernel memory model will be incorporated into other tools capable of analyzing large bodies of code for regression-testing purposes." [...] "A second tool is klitmus7, which converts litmus tests to loadable kernel modules for direct testing. As with herd7, the klitmus7 code is freely available from http://diy.inria.fr/sources/index.html (and via "git" at https://github.com/herd/herdtools7)" [...] Credits go to: "This patch was the result of a most excellent collaboration founded by Jade Alglave and also including Alan Stern, Andrea Parri, and Luc Maranget." ... and to the gents listed in the MAINTAINERS entry: LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM) M: Alan Stern <stern@rowland.harvard.edu> M: Andrea Parri <parri.andrea@gmail.com> M: Will Deacon <will.deacon@arm.com> M: Peter Zijlstra <peterz@infradead.org> M: Boqun Feng <boqun.feng@gmail.com> M: Nicholas Piggin <npiggin@gmail.com> M: David Howells <dhowells@redhat.com> M: Jade Alglave <j.alglave@ucl.ac.uk> M: Luc Maranget <luc.maranget@inria.fr> M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> The LKMM project already found several bugs in Linux locking primitives and improved the understanding and the documentation of the Linux memory model all around. - Add KASAN instrumentation to atomic APIs (Dmitry Vyukov) - Add RWSEM API debugging and reorganize the lock debugging Kconfig (Waiman Long) - ... misc cleanups and other smaller changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) locking/Kconfig: Restructure the lock debugging menu locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches lockdep: Make the lock debug output more useful locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() locking/atomic, asm-generic, x86: Add comments for atomic instrumentation locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference() tools/memory-model: Add documentation of new litmus test tools/memory-model: Remove mention of docker/gentoo image locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more locking/lockdep: Show unadorned pointers mutex: Drop linkage.h from mutex.h tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference tools/memory-model: Convert underscores to hyphens tools/memory-model: Add a S lock-based external-view litmus test tools/memory-model: Add required herd7 version to README file ...
2018-04-02Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds11-43/+44
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU subsystem changes in this cycle were: - Miscellaneous fixes, perhaps most notably removing obsolete code whose only purpose in life was to gather information for the now-removed RCU debugfs facility. Other notable changes include removing NO_HZ_FULL_ALL in favor of the nohz_full kernel boot parameter, minor optimizations for expedited grace periods, some added tracing, creating an RCU-specific workqueue using Tejun's new WQ_MEM_RECLAIM flag, and several cleanups to code and comments. - SRCU cleanups and optimizations. - Torture-test updates, perhaps most notably the adding of ARMv8 support, but also including numerous cleanups and usability fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) rcu: Create RCU-specific workqueues with rescuers torture: Provide more sensible nreader/nwriter defaults for rcuperf torture: Grace periods do not piggyback off of themselves torture: Adjust rcuperf trace processing to allow for workqueues torture: Default jitter off when running rcuperf torture: Specify qemu memory size with --memory argument rcutorture: Add basic ARM64 support to run scripts rcutorture: Update kvm.sh header comment rcutorture: Record which grace-period primitives are tested rcutorture: Re-enable testing of dynamic expediting rcutorture: Avoid fake-writer use of undefined primitives rcutorture: Abstract function and module names rcutorture: Replace multi-instance kzalloc() with kcalloc() rcu: Remove SRCU throttling srcu: Remove dead code in srcu_gp_end() srcu: Reduce scans of srcu_data in counter wrap check srcu: Prevent sdp->srcu_gp_seq_needed_exp counter wrap srcu: Abstract function name rcu: Make expedited RCU CPU selection avoid unnecessary stores rcu: Trace expedited GP delays due to transitioning CPUs ...
2018-04-02perf version: Add man pageJin Yao1-0/+24
Since a new option '--build-options' is created for 'perf version', so we need to document it. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-7-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf tools: Add 'perf -vv' as an alias to 'perf version --build-options'Jin Yao2-0/+7
We keep having bug reports that when users build perf on their own, but they don't install some needed libraries such as libelf, libbfd/libibery. The perf can build, but it is missing important functionality. This patch provides a new option '-vv' for perf which will print the compiled-in status of libraries. The 'perf -vv' is mapped to 'perf version --build-options'. For example: $ ./perf -vv perf version 4.13.rc5.g6727c5 dwarf: [ on ] # HAVE_DWARF_SUPPORT dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT glibc: [ on ] # HAVE_GLIBC_SUPPORT gtk2: [ on ] # HAVE_GTK2_SUPPORT libaudit: [ OFF ] # HAVE_LIBAUDIT_SUPPORT libbfd: [ on ] # HAVE_LIBBFD_SUPPORT libelf: [ on ] # HAVE_LIBELF_SUPPORT libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT libperl: [ on ] # HAVE_LIBPERL_SUPPORT libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT libslang: [ on ] # HAVE_SLANG_SUPPORT libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT zlib: [ on ] # HAVE_ZLIB_SUPPORT lzma: [ on ] # HAVE_LZMA_SUPPORT get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT bpf: [ on ] # HAVE_LIBBPF_SUPPORT v3: One bug is found in v2. It didn't process the option like '-vabc' correctly. Fix this bug. v2: Use a global variable version_verbose to record the number of 'v'. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-6-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf version: Print the compiled-in status of librariesJin Yao1-1/+81
This patch checks the values passed by CFLAGS (-DHAVE_XXX) and then print the status of libraries. For example, if HAVE_DWARF_SUPPORT is defined, that means the library "dwarf" is compiled-in. The patch will print the status "on" for this library otherwise it print the status "OFF". A new option '--build-options' created for 'perf version' supports the printing of library status. For example: $ ./perf version --build-options or ./perf --version --build-options or ./perf -v --build-options perf version 4.13.rc5.g6727c5 dwarf: [ on ] # HAVE_DWARF_SUPPORT dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT glibc: [ on ] # HAVE_GLIBC_SUPPORT gtk2: [ on ] # HAVE_GTK2_SUPPORT libaudit: [ OFF ] # HAVE_LIBAUDIT_SUPPORT libbfd: [ on ] # HAVE_LIBBFD_SUPPORT libelf: [ on ] # HAVE_LIBELF_SUPPORT libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT libperl: [ on ] # HAVE_LIBPERL_SUPPORT libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT libslang: [ on ] # HAVE_SLANG_SUPPORT libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT zlib: [ on ] # HAVE_ZLIB_SUPPORT lzma: [ on ] # HAVE_LZMA_SUPPORT get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT bpf: [ on ] # HAVE_LIBBPF_SUPPORT v4: 1. Also print the macro name. That would make it easier to grep around in the source looking for where code related a particular features is located. 2. Update since HAVE_DWARF_GETLOCATIONS is renamed to HAVE_DWARF_GETLOCATIONS_SUPPORT v3: Remove following unnecessary help message. 1. [ on ]: library is compiled-in [ OFF ]: library is disabled in make configuration OR library is not installed in build environment 2. Create '--build-options' option. 3. Use standard option parsing API 'parse_options'. v2: 1. Use IS_BUILTIN macro to replace #ifdef/#endif block. 2. Print color for on/OFF. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Suggested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-5-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf config: Rename to HAVE_DWARF_GETLOCATIONS_SUPPORTJin Yao2-2/+2
In Makefile.config, to make all libraries flags have _SUPPORT suffix, rename HAVE_DWARF_GETLOCATIONS to HAVE_DWARF_GETLOCATIONS_SUPPORT Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf config: Add some new -DHAVE_XXX to CFLAGSJin Yao1-0/+6
For most of libraries, in perf.config, they are recorded with -DHAVE_XXX in CFLAGS according to if the libraries are compiled-in. Then C code then will know if the library is compiled-in or not. While for glibc, no -DHAVE_GLIBC_SUPPORT exists. For python and perl libraries, only -DNO_PYTHON and -DNO_LIBPERL exist. To make the code more consistent, the patch creates -DHAVE_LIBPYTHON_SUPPORT and -DHAVE_LIBPERL_SUPPORT if the python and perl libraries are compiled-in. Since the existing flags -DNO_PYTHON and -DNO_LIBPERL are being used in many places in C code, this patch doesn't remove them. In a follow-up patch, we will recontruct the C code and then use HAVE_XXX instead. v3: Move 'CFLAGS += -DHAVE_LIBPYTHON_SUPPORT' and 'CFLAGS += -DHAVE_LIBPERL_SUPPORT' to other places to avoid duplicated feature checking. v2: Create -DHAVE_GLIBC_SUPPORT, -DHAVE_LIBPYTHON_SUPPORT and -DHAVE_LIBPERL_SUPPORT. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02tools include: Add config.h header fileJiri Olsa1-0/+34
Adding IS_BUILTIN macro and its dependencies into tools world. It's taken from kernel's include/linux/kconfig.h, which can't be taken completely due to its kconfig dependencies. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1522402036-22915-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf trace: Show only failing syscallsArnaldo Carvalho de Melo2-3/+9
For instance: # perf probe "vfs_getname=getname_flags:72 pathname=result->name:string" Added new event: probe:vfs_getname (on getname_flags:72 with pathname=result->name:string) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname -aR sleep 1 # perf trace --failure sleep 1 0.043 ( 0.010 ms): sleep/10978 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory For reference, here are all the syscalls in this case: # perf trace sleep 1 ? ( ): sleep/10976 ... [continued]: execve()) = 0 0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000 0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory 0.057 ( 0.006 ms): sleep/10976 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3 0.064 ( 0.002 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b370) = 0 0.067 ( 0.003 ms): sleep/10976 mmap(len: 111457, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec8615000 0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0 0.080 ( 0.007 ms): sleep/10976 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3 0.088 ( 0.002 ms): sleep/10976 read(fd: 3, buf: 0x7fffac22b538, count: 832) = 832 0.092 ( 0.001 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b3d0) = 0 0.094 ( 0.002 ms): sleep/10976 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7feec8613000 0.099 ( 0.004 ms): sleep/10976 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7feec8057000 0.104 ( 0.007 ms): sleep/10976 mprotect(start: 0x7feec8203000, len: 2097152) = 0 0.112 ( 0.005 ms): sleep/10976 mmap(addr: 0x7feec8403000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1753088) = 0x7feec8403000 0.120 ( 0.003 ms): sleep/10976 mmap(addr: 0x7feec8409000, len: 14976, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7feec8409000 0.128 ( 0.001 ms): sleep/10976 close(fd: 3) = 0 0.139 ( 0.001 ms): sleep/10976 arch_prctl(option: 4098, arg2: 140663540761856) = 0 0.186 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8403000, len: 16384, prot: READ) = 0 0.204 ( 0.003 ms): sleep/10976 mprotect(start: 0x55bdc0ec3000, len: 4096, prot: READ) = 0 0.209 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8631000, len: 4096, prot: READ) = 0 0.214 ( 0.010 ms): sleep/10976 munmap(addr: 0x7feec8615000, len: 111457) = 0 0.269 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000 0.271 ( 0.002 ms): sleep/10976 brk(brk: 0x55bdc2d25000) = 0x55bdc2d25000 0.274 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d25000 0.278 ( 0.007 ms): sleep/10976 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3 0.288 ( 0.001 ms): sleep/10976 fstat(fd: 3</usr/lib/locale/locale-archive>, statbuf: 0x7feec8408aa0) = 0 0.290 ( 0.003 ms): sleep/10976 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec1488000 0.297 ( 0.001 ms): sleep/10976 close(fd: 3</usr/lib/locale/locale-archive>) = 0 0.325 (1000.193 ms): sleep/10976 nanosleep(rqtp: 0x7fffac22c0b0) = 0 1000.560 ( 0.006 ms): sleep/10976 close(fd: 1) = 0 1000.573 ( 0.005 ms): sleep/10976 close(fd: 2) = 0 1000.596 ( ): sleep/10976 exit_group() # And can be done systemwide, etc, with backtraces: # perf trace --max-stack=16 --failure sleep 1 0.048 ( 0.015 ms): sleep/11092 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory __access (inlined) dl_main (/usr/lib64/ld-2.26.so) # Or for some specific syscalls: # perf trace --max-stack=16 -e openat --failure cat /tmp/rien cat: /tmp/rien: No such file or directory 0.251 ( 0.012 ms): cat/11106 openat(dfd: CWD, filename: /tmp/rien) = -1 ENOENT No such file or directory __libc_open64 (inlined) main (/usr/bin/cat) __libc_start_main (/usr/lib64/libc-2.26.so) _start (/usr/bin/cat) # Look for inotify* syscalls that fail, system wide, for 2 seconds, with backtraces: # perf trace -a --max-stack=16 --failure -e inotify* sleep 2 819.165 ( 0.058 ms): gmain/1724 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454) = -1 ENOENT No such file or directory __GI_inotify_add_watch (inlined) _ik_watch (/usr/lib64/libgio-2.0.so.0.5400.3) _ip_start_watching (/usr/lib64/libgio-2.0.so.0.5400.3) im_scan_missing (/usr/lib64/libgio-2.0.so.0.5400.3) g_timeout_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3) g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3) g_main_context_iterate.isra.23 (/usr/lib64/libglib-2.0.so.0.5400.3) g_main_context_iteration (/usr/lib64/libglib-2.0.so.0.5400.3) glib_worker_main (/usr/lib64/libglib-2.0.so.0.5400.3) g_thread_proxy (/usr/lib64/libglib-2.0.so.0.5400.3) start_thread (/usr/lib64/libpthread-2.26.so) __GI___clone (inlined) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-8f7d3mngaxvi7tlzloz3n7cs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02tools headers: Synchronize x86's cpufeatures.hArnaldo Carvalho de Melo1-0/+2
Due to these commits: 1da961d72ab0 ("x86/cpufeatures: Add Intel Total Memory Encryption cpufeature") 7958b2246fad ("x86/cpufeatures: Add Intel PCONFIG cpufeature") To silence this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' Nothing in those csets requires changes in tools/perf/, so just sync it to silence the build. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-m2yl8wj0uxs8pncq2ncfcx46@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-02perf tools: Add a "dso_size" sort orderKim Phillips5-0/+48
Add DSO size to perf report/top sort output list. This includes adding a map__size fn to map.h, which is approximately equal to the DSO data file_size: DSO file size map (end-start) file / (end-start) libwebkit2gtk-4.0.so.37.24.9 43260072 41295872 95% libglib-2.0.so.0.5400.1 1125680 1118208 99% libc-2.26.so 1960656 1925120 101% libdbus-1.so.3.14.13 309456 303104 102% Sample output: $ ./perf report -s dso_size,dso Samples: 2K of event 'cycles:uppp', Event count (approx.): 128373340 Overhead DSO size Shared Object 90.62% unknown [unknown] 2.87% 1118208 libglib-2.0.so.0.5400.1 1.92% 303104 libdbus-1.so.3.14.13 1.42% 1925120 libc-2.26.so 0.77% 41295872 libwebkit2gtk-4.0.so.37.24.9 0.61% 335872 libgobject-2.0.so.0.5400.1 0.41% 1052672 libgdk-3.so.0.2200.25 0.36% 106496 libpthread-2.26.so 0.29% 221184 dbus-daemon 0.17% 159744 ld-2.26.so 0.13% 49152 libwayland-client.so.0.3.0 0.12% 1642496 libgio-2.0.so.0.5400.1 0.09% 7327744 libgtk-3.so.0.2200.25 0.09% 12324864 libmozjs-52.so.0.0.0 0.05% 4796416 perf 0.04% 843776 libgjs.so.0.0.0 0.03% 1409024 libmutter-clutter-1.so Committer testing: To sort by DSO size, use: # perf report -F dso_size,dso,overhead -s dso_size <SNIP> 3465216 libdns-export.so.174.0.1 0.00% 3522560 libgc.so.1.0.3 0.00% 3538944 libbfd-2.29-13.fc27.so 0.59% 3670016 libunistring.so.2.1.0 0.00% 3723264 libguile-2.0.so.22.8.1 0.00% 3776512 libgio-2.0.so.0.5400.3 0.00% 3891200 libc-2.26.so 0.96% 3944448 libmozjs-17.0.so 0.00% 4218880 libperl.so.5.26.1 0.18% 4452352 libpython2.7.so.1.0 0.02% 4472832 perf 0.02% 4603904 git 0.01% 4751360 libcrypto.so.1.1.0g 0.00% 5005312 libslang.so.2.3.1 0.00% 7315456 libgtk-3.so.0.2200.26 0.09% 8818688 i965_dri.so 2.46% 8818688 i965_dri.so (deleted) 1.26% 12414976 libmozjs-52.so.0.0.0 0.03% 23642112 cc1 2.02% 27889664 [kernel.kallsyms] 25.41% 80834560 libxul.so (deleted) 15.68% 98078720 chrome 32.03% 1056964608 [kernel.kallsyms] 1.59% # Signed-off-by: Kim Phillips <kim.phillips@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180327060956.1c01ebe67a2a941bb4468c6f@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-31Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PTI fixes from Ingo Molnar: "Two fixes: a relatively simple objtool fix that makes Clang built kernels work with ORC debug info, plus an alternatives macro fix" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Fixup alternative_call_2 objtool: Add Clang support
2018-03-31Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar6-6/+224
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-1/+1
Pull networking fixes from David Miller: 1) Fix RCU locking in xfrm_local_error(), from Taehee Yoo. 2) Fix return value assignments and thus error checking in iwl_mvm_start_ap_ibss(), from Johannes Berg. 3) Don't count header length twice in vti4, from Stefano Brivio. 4) Fix deadlock in rt6_age_examine_exception, from Eric Dumazet. 5) Fix out-of-bounds access in nf_sk_lookup_slow{v4,v6}() from Subash Abhinov. 6) Check nladdr size in netlink_connect(), from Alexander Potapenko. 7) VF representor SQ numbers are 32 not 16 bits, in mlx5 driver, from Or Gerlitz. 8) Out of bounds read in skb_network_protocol(), from Eric Dumazet. 9) r8169 driver sets driver data pointer after register_netdev() which is too late. Fix from Heiner Kallweit. 10) Fix memory leak in mlx4 driver, from Moshe Shemesh. 11) The multi-VLAN decap fix added a regression when dealing with device that lack a MAC header, such as tun. Fix from Toshiaki Makita. 12) Fix integer overflow in dynamic interrupt coalescing code. From Tal Gilboa. 13) Use after free in vrf code, from David Ahern. 14) IPV6 route leak between VRFs fix, also from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits) net: mvneta: fix enable of all initialized RXQs net/ipv6: Fix route leaking between VRFs vrf: Fix use after free and double free in vrf_finish_output ipv6: sr: fix seg6 encap performances with TSO enabled net/dim: Fix int overflow vlan: Fix vlan insertion for packets without ethernet header net: Fix untag for vlan packets without ethernet header atm: iphase: fix spelling mistake: "Receiverd" -> "Received" vhost: validate log when IOTLB is enabled qede: Do not drop rx-checksum invalidated packets. hv_netvsc: enable multicast if necessary ip_tunnel: Resolve ipsec merge conflict properly. lan78xx: Crash in lan78xx_writ_reg (Workqueue: events lan78xx_deferred_multicast_write) qede: Fix barrier usage after tx doorbell write. vhost: correctly remove wait queue during poll failure net/mlx4_core: Fix memory leak while delete slave's resources net/mlx4_en: Fix mixed PFC and Global pause user control requests net/smc: use announced length in sock_recvmsg() llc: properly handle dev_queue_xmit() return value strparser: Fix sign of err codes ...
2018-03-29Merge branch 'perf/urgent' into perf/coreIngo Molnar5-2/+196
Conflicts: kernel/events/hw_breakpoint.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-27perf vendor events s390: Add JSON files for IBM z14Thomas Richter4-0/+469
Add CPU measurement counter facility event description files (json files) for IBM z14. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-5-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM z13Thomas Richter4-0/+511
Add CPU measurement counter facility event description files (json files) for IBM z13. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-4-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM zEC12 zBC12Thomas Richter4-0/+385
Add CPU measurement counter facility event description files (json files) for IBM zEC12 and zBC12. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-3-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM z196Thomas Richter4-0/+319
Add CPU measurement counter facility event description files (json files) for IBM z196. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-2-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf vendor events s390: Add JSON files for IBM z10EC z10BCThomas Richter4-0/+284
Add CPU measurement counter facility event description files (JSON files) for IBM z10EC and z10BC. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180326082538.2258-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf mmap: Be consistent when checking for an unmaped ring bufferArnaldo Carvalho de Melo1-1/+12
The previous patch is insufficient to cure the reported 'perf trace' segfault, as it only cures the perf_mmap__read_done() case, moving the segfault to perf_mmap__read_init() functio, fix it by doing the same refcount check. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 8872481bd048 ("perf mmap: Introduce perf_mmap__read_init()") Link: https://lkml.kernel.org/r/20180326144127.GF18897@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done()Kan Liang1-0/+6
There is a segmentation fault when running 'perf trace'. For example: [root@jouet e]# perf trace -e *chdir -o /tmp/bla perf report --ignore-vmlinux -i ../perf.data The perf_mmap__consume() could unmap the mmap. It needs to check the refcnt in perf_mmap__read_done(). Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: ee023de05f35 ("perf mmap: Introduce perf_mmap__read_done()") Link: http://lkml.kernel.org/r/1522071729-16776-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27perf build: Fix check-headers.sh opts assignmentJiri Olsa1-0/+1
Currently the "opts" variable is not zero-ed and we keep on adding to it, ending up with: $ check-headers.sh 2>&1 + opts=' "-B"' + opts=' "-B" "-B"' + opts=' "-B" "-B" "-B"' + opts=' "-B" "-B" "-B" "-B"' + opts=' "-B" "-B" "-B" "-B" "-B"' + opts=' "-B" "-B" "-B" "-B" "-B" "-B"' Fix this by initializing it in the check() function, right before starting the loop. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180321140515.2252-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-27objtool: Add Clang supportJosh Poimboeuf1-0/+11
Since the ORC unwinder was made the default on x86_64, Clang-built defconfig kernels have triggered some new objtool warnings: drivers/gpu/drm/i915/i915_gpu_error.o: warning: objtool: i915_error_printf()+0x6c: return with modified stack frame drivers/gpu/drm/i915/intel_display.o: warning: objtool: pipe_config_err()+0xa6: return with modified stack frame The problem is that objtool has never seen clang-built binaries before. Shockingly enough, objtool is apparently able to follow the code flow mostly fine, except for one instruction sequence. Instead of a LEAVE instruction, clang restores RSP and RBP the long way: 67c: 48 89 ec mov %rbp,%rsp 67f: 5d pop %rbp Teach objtool about this new code sequence. Reported-and-test-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/fce88ce81c356eedcae7f00ed349cfaddb3363cc.1521741586.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-26ktest: remove obsolete architecturesArnd Bergmann1-30/+1
A number of architectures are being removed from the kernel, so we no longer need to test them. Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-25Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds1-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 and PTI fixes from Ingo Molnar: "Misc fixes: - fix EFI pagetables freeing - fix vsyscall pagetable setting on Xen PV guests - remove ancient CONFIG_X86_PPRO_FENCE=y - x86 is TSO again - fix two binutils (ld) development version related incompatibilities - clean up breakpoint handling - fix an x86 self-test" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64: Don't use IST entry for #BP stack x86/efi: Free efi_pgd with free_pages() x86/vsyscall/64: Use proper accessor to update P4D entry x86/cpu: Remove the CONFIG_X86_PPRO_FENCE=y quirk x86/boot/64: Verify alignment of the LOAD segment x86/build/64: Force the linker to use 2MB page size selftests/x86/ptrace_syscall: Fix for yet more glibc interference
2018-03-24tools: bpftool: don't use hex numbers in JSON outputJakub Kicinski1-1/+1
JSON does not accept hex numbers with 0x prefix. Simply print as decimal numbers, JSON should be primarily machine-readable. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Fixes: 831a0aafe5c3 ("tools: bpftool: add JSON output for `bpftool map *` commands") Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-03-23Merge tag 'trace-v4.16-rc4' of ↵Linus Torvalds3-0/+186
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull kprobe fixes from Steven Rostedt: "The documentation for kprobe events says that symbol offets can take both a + and - sign to get to befor and after the symbol address. But in actuality, the code does not support the minus. This fixes that issue, and adds a few more selftests to kprobe events" * tag 'trace-v4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: selftests: ftrace: Add a testcase for probepoint selftests: ftrace: Add a testcase for string type with kprobe_event selftests: ftrace: Add probe event argument syntax testcase tracing: probeevent: Fix to support minus offset from symbol
2018-03-23perf annotate: Use absolute addresses to calculate jump target offsetsArnaldo Carvalho de Melo1-3/+2
These types of jumps were confusing the annotate browser: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ffffffff81a00020: swapgs <SNIP> │ffffffff81a00128: ↓ jae ffffffff81a00139 <syscall_return_via_sysret+0x53> <SNIP> │ffffffff81a00155: → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> I.e. the syscall_return_via_sysret function is actually "inside" the entry_SYSCALL_64 function, and the offsets in jumps like these (+0x53) are relative to syscall_return_via_sysret, not to syscall_return_via_sysret. Or this may be some artifact in how the assembler marks the start and end of a function and how this ends up in the ELF symtab for vmlinux, i.e. syscall_return_via_sysret() isn't "inside" entry_SYSCALL_64, but just right after it. From readelf -sw vmlinux: 80267: ffffffff81a00020 315 NOTYPE GLOBAL DEFAULT 1 entry_SYSCALL_64 316: ffffffff81a000e6 0 NOTYPE LOCAL DEFAULT 1 syscall_return_via_sysret 0xffffffff81a00020 + 315 > 0xffffffff81a000e6 So instead of looking for offsets after that last '+' sign, calculate offsets for jump target addresses that are inside the function being disassembled from the absolute address, 0xffffffff81a00139 in this case, subtracting from it the objdump address for the start of the function being disassembled, entry_SYSCALL_64() in this case. So, before this patch: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux Percent│ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↑ jmp 6c │ mov %cr3,%rdi │ ↑ jmp 62 │ mov %rdi,%rax │ and $0x7ff,%rdi │ bt %rdi,%gs:0x2219a │ ↑ jae 53 │ btr %rdi,%gs:0x2219a │ mov %rax,%rdi │ ↑ jmp 5b After: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux 0.65 │ → jne swapgs_restore_regs_and_return_to_usermode │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> With those at least navigating to the right destination, an improvement for these cases seems to be to be to somehow mark those inner functions, which in this case could be: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux │syscall_return_via_sysret: │ pop %r15 │ pop %r14 │ pop %r13 │ pop %r12 │ pop %rbp │ pop %rbx │ pop %rsi │ pop %r10 │ pop %r9 │ pop %r8 │ pop %rax │ pop %rsi │ pop %rdx │ pop %rsi │ mov %rsp,%rdi │ mov %gs:0x5004,%rsp │ pushq 0x28(%rdi) │ pushq (%rdi) │ push %rax │ ↓ jmp 132 │ mov %cr3,%rdi │ ┌──jmp 128 │ │ mov %rdi,%rax │ │ and $0x7ff,%rdi │ │ bt %rdi,%gs:0x2219a │ │↓ jae 119 │ │ btr %rdi,%gs:0x2219a │ │ mov %rax,%rdi │ │↓ jmp 121 │119:│ mov %rax,%rdi │ │ bts $0x3f,%rdi │121:│ or $0x800,%rdi │128:└─→or $0x1000,%rdi │ mov %rdi,%cr3 │132: pop %rax │ pop %rdi │ pop %rsp │ → jmpq *0x825d2d(%rip) # ffffffff82225e88 <pv_cpu_ops+0xe8> This all gets much better viewed if one uses 'perf report --ignore-vmlinux' forcing the usage of /proc/kcore + /proc/kallsyms, when the above actually gets down to: # perf report --ignore-vmlinux ## do '/64', will show the function names containing '64', ## navigate to /entry_SYSCALL_64_after_hwframe.annotation, ## press 'A' to annotate, then 'P' to print that annotation ## to a file ## From another xterm (or see on screen, this 'P' thing is for ## getting rid of those right side scroll bars/spaces): # cat /entry_SYSCALL_64_after_hwframe.annotation entry_SYSCALL_64_after_hwframe() /proc/kcore Event: cycles:ppp Percent Disassembly of section load0: ffffffff9aa00044 <load0>: 11.97 push %rax 4.85 push %rdi push %rsi 2.59 push %rdx 2.27 push %rcx 0.32 pushq $0xffffffffffffffda 1.29 push %r8 xor %r8d,%r8d 1.62 push %r9 0.65 xor %r9d,%r9d 1.62 push %r10 xor %r10d,%r10d 5.50 push %r11 xor %r11d,%r11d 3.56 push %rbx xor %ebx,%ebx 4.21 push %rbp xor %ebp,%ebp 2.59 push %r12 0.97 xor %r12d,%r12d 3.24 push %r13 xor %r13d,%r13d 2.27 push %r14 xor %r14d,%r14d 4.21 push %r15 xor %r15d,%r15d 0.97 mov %rsp,%rdi 5.50 → callq do_syscall_64 14.56 mov 0x58(%rsp),%rcx 7.44 mov 0x80(%rsp),%r11 0.32 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 shl $0x10,%rcx 0.32 sar $0x10,%rcx 3.24 cmp %rcx,%r11 → jne swapgs_restore_regs_and_return_to_usermode 2.27 cmpq $0x33,0x88(%rsp) 1.29 → jne swapgs_restore_regs_and_return_to_usermode mov 0x30(%rsp),%r11 8.74 cmp %r11,0x90(%rsp) → jne swapgs_restore_regs_and_return_to_usermode 0.32 test $0x10100,%r11 → jne swapgs_restore_regs_and_return_to_usermode 0.32 cmpq $0x2b,0xa0(%rsp) 0.65 → jne swapgs_restore_regs_and_return_to_usermode I.e. using kallsyms makes the function start/end be done differently than using what is in the vmlinux ELF symtab and actually the hits goes to entry_SYSCALL_64_after_hwframe, which is a GLOBAL() after the start of entry_SYSCALL_64: ENTRY(entry_SYSCALL_64) UNWIND_HINT_EMPTY <SNIP> pushq $__USER_CS /* pt_regs->cs */ pushq %rcx /* pt_regs->ip */ GLOBAL(entry_SYSCALL_64_after_hwframe) pushq %rax /* pt_regs->orig_ax */ PUSH_AND_CLEAR_REGS rax=$-ENOSYS And it goes and ends at: cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */ jne swapgs_restore_regs_and_return_to_usermode /* * We win! This label is here just for ease of understanding * perf profiles. Nothing jumps here. */ syscall_return_via_sysret: /* rcx and r11 are already restored (see code above) */ UNWIND_HINT_EMPTY POP_REGS pop_rdi=0 skip_r11rcx=1 So perhaps some people should really just play with '--ignore-vmlinux' to force /proc/kcore + kallsyms. One idea is to do both, i.e. have a vmlinux annotation and a kcore+kallsyms one, when possible, and even show the patched location, etc. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-r11knxv8voesav31xokjiuo6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Defer searching for comma in raw line till it is neededArnaldo Carvalho de Melo1-1/+2
That strchr() in jump__scnprintf() needs to be nuked somehow, as it, IIRC is already done in jump__parse() and if needed at scnprintf() time, should be stashed in the struct filled in parse() time. For now jus defer it to just before where it is used. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j0t5hagnphoz9xw07bh3ha3g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Support jumping from one function to anotherArnaldo Carvalho de Melo2-7/+22
For instance: entry_SYSCALL_64 /lib/modules/4.16.0-rc5-00086-gdf09348f78dc/build/vmlinux 5.50 │ → callq do_syscall_64 14.56 │ mov 0x58(%rsp),%rcx 7.44 │ mov 0x80(%rsp),%r11 0.32 │ cmp %rcx,%r11 │ → jne swapgs_restore_regs_and_return_to_usermode 0.32 │ shl $0x10,%rcx 0.32 │ sar $0x10,%rcx 3.24 │ cmp %rcx,%r11 │ → jne swapgs_restore_regs_and_return_to_usermode 2.27 │ cmpq $0x33,0x88(%rsp) 1.29 │ → jne swapgs_restore_regs_and_return_to_usermode │ mov 0x30(%rsp),%r11 8.74 │ cmp %r11,0x90(%rsp) │ → jne swapgs_restore_regs_and_return_to_usermode 0.32 │ test $0x10100,%r11 │ → jne swapgs_restore_regs_and_return_to_usermode 0.32 │ cmpq $0x2b,0xa0(%rsp) 0.65 │ → jne swapgs_restore_regs_and_return_to_usermode It'll behave just like a "call" instruction, i.e. press enter or right arrow over one such line and the browser will navigate to the annotated disassembly of that function, which when exited, via left arrow or esc, will come back to the calling function. Now to support jump to an offset on a different function... Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-78o508mqvr8inhj63ddtw7mo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf annotate: Add "_local" to jump/offset validation routinesArnaldo Carvalho de Melo3-9/+16
Because they all really check if we can access data structures/visual constructs where a "jump" instruction targets code in the same function, i.e. things like: __pthread_mutex_lock /usr/lib64/libpthread-2.26.so 1.95 │ mov __pthread_force_elision,%ecx │ ┌──test %ecx,%ecx 0.07 │ ├──je 60 │ │ test $0x300,%esi │ │↓ jne 60 │ │ or $0x100,%esi │ │ mov %esi,0x10(%rdi) │ 42:│ mov %esi,%edx │ │ lea 0x16(%r8),%rsi │ │ mov %r8,%rdi │ │ and $0x80,%edx │ │ add $0x8,%rsp │ │→ jmpq __lll_lock_elision │ │ nop 0.29 │ 60:└─→and $0x80,%esi 0.07 │ mov $0x1,%edi 0.29 │ xor %eax,%eax 2.53 │ lock cmpxchg %edi,(%r8) And not things like that "jmpq __lll_lock_elision", that instead should behave like a "call" instruction and "jump" to the disassembly of "___lll_lock_elision". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-3cwx39u3h66dfw9xjrlt7ca2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23perf python: Reference Py_None before returning itPetr Machata1-1/+3
Python None objects are handled just like all the other objects with respect to their reference counting. Before returning Py_None, its reference count thus needs to be bumped. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Machata <petrm@mellanox.com> Link: http://lkml.kernel.org/r/b1e565ecccf68064d8d54f37db5d028dda8fa522.1521675563.git.petrm@mellanox.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-23selftests: ftrace: Add a testcase for probepointMasami Hiramatsu1-0/+43
Add a testcase for probe point definition. This tests symbol, address and symbol+offset syntax. The offset must be positive and smaller than UINT_MAX. Link: http://lkml.kernel.org/r/152129043097.31874.14273580606301767394.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-03-23selftests: ftrace: Add a testcase for string type with kprobe_eventMasami Hiramatsu1-0/+46
Add a testcase for string type with kprobe event. This tests good/bad syntax combinations and also the traced data is correct in several way. Link: http://lkml.kernel.org/r/152129038381.31874.9201387794548737554.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>