path: root/tools
Age | Commit message | Author | Files | Lines
2014-12-10 | net, lib: kill arch_fast_hash library bits | Daniel Borkmann | 1 | -6/+0
As there are now no remaining users of arch_fast_hash(), let's kill it entirely. This basically reverts commit 71ae8aac3e19 ("lib: introduce arch optimized hash library") and follow-up work, that is, for example, commit 237217546d44 ("lib: hash: follow-up fixups for arch hash"), commit e3fec2f74f7f ("lib: Add missing arch generic-y entries for asm-generic/hash.h") and last but not least commit 6a02652df511 ("perf tools: Fix include for non x86 architectures"). Cc: Francesco Fusco <[email protected]> Cc: Thomas Graf <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-12-10 | perf kvm stat live: Mark events as (x86 only) in help output | Alexander Yarygin | 1 | -1/+2
The mmio and ioport events are useful only on x86. Signed-off-by: Alexander Yarygin <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 2 | -6/+6
Pull leftover perf fixes from Ingo Molnar:
 "Two perf fixes left over from the previous cycle"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf session: Do not fail on processing out of order event
  x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs
2014-12-09 | Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 90 | -946/+4555
Pull perf events update from Ingo Molnar:
 "On the kernel side there are few changes; the one that stands out is PEBS machine state sampling support on x86, by Stephane Eranian.

  On the tooling side:

  User visible tooling changes:

   - Don't open the DWARF info multiple times, keeping instead a dwfl handle in struct dso, greatly speeding up 'perf report' on powerpc. (Sukadev Bhattiprolu)
   - Introduce PARSE_OPT_DISABLED option flag and use it to avoid showing undesired options in tools that provide frontends to 'perf record', like sched, kvm, etc (Namhyung Kim)
   - Fallback to kallsyms when using the minimal 'ELF' loader (Arnaldo Carvalho de Melo)
   - Fix annotation with kcore (Adrian Hunter)
   - Support source line numbers in annotate using a hotkey (Andi Kleen)
   - Callchain improvements including:
       * Enable printing the srcline in the history
       * Make get_srcline fall back to sym+offset (Andi Kleen)
   - TUI hist_entry browser fixes, including showing missing overhead value for first level callchain. Detected comparing the output of --stdio/--gui (that matched) with --tui, that had this problem. (Namhyung Kim)
   - Support handling complete branch stacks as histograms (Andi Kleen)

  Tooling infrastructure changes:

   - Prep work for supporting per-pkg and snapshot counters in 'perf stat' (Jiri Olsa)
   - 'perf stat' refactorings, moving stuff from it to evsel.c to use in per-pkg/snapshot format changes (Jiri Olsa)
   - Add per-pkg format file parsing (Matt Fleming)
   - Clean up libelf feature support code (Namhyung Kim)
   - Add gzip decompression support for kernel modules (Namhyung Kim)
   - More prep patches for Intel PT, including a thread stack and more stuff made available via the database export mechanism (Adrian Hunter)
   - More Intel PT work, including a facility to export sample data (comms, threads, symbol names, etc) in a database friendly way, with a script to use this to create a postgresql database. (Adrian Hunter)
   - Make sure that thread->mg->machine points to the machine where the thread exists (it was being set only for the kmaps kernel modules case, do it as well for the mmaps) and use it to shorten function signatures (Arnaldo Carvalho de Melo)

  ... and lots of other fixes and smaller improvements"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (91 commits)
  perf report: In branch stack mode use address history sorting
  perf report: Add --branch-history option
  perf callchain: Support handling complete branch stacks as histograms
  perf stat: Add support for snapshot counters
  perf stat: Add support for per-pkg counters
  perf tools: Remove perf_evsel__read interface
  perf stat: Use read_counter in read_counter_aggr
  perf stat: Make read_counter work over the thread dimension
  perf stat: Use perf_evsel__read_cb in read_counter
  perf tools: Add snapshot format file parsing
  perf tools: Add per-pkg format file parsing
  perf evsel: Introduce perf_evsel__read_cb function
  perf evsel: Introduce perf_counts_values__scale function
  perf evsel: Introduce perf_evsel__compute_deltas function
  perf tools: Allow to force redirect pr_debug to stderr.
  perf tools: Fix segfault due to invalid kernel dso access
  perf callchain: Make get_srcline fall back to sym+offset
  perf symbols: Move bfd_demangle stubbing to its only user
  perf callchain: Enable printing the srcline in the history
  perf tools: Collapse first level callchain entry if it has sibling
  ...
2014-12-09 | Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 91 | -1633/+36
Pull RCU updates from Ingo Molnar:
 "These are the main changes in this cycle:

   - Streamline RCU's use of per-CPU variables, shifting from "cpu" arguments to functions to "this_"-style per-CPU variable accessors.
   - signal-handling RCU updates.
   - real-time updates.
   - torture-test updates.
   - miscellaneous fixes.
   - documentation updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  rcu: Fix FIXME in rcu_tasks_kthread()
  rcu: More info about potential deadlocks with rcu_read_unlock()
  rcu: Optimize cond_resched_rcu_qs()
  rcu: Add sparse check for RCU_INIT_POINTER()
  documentation: memory-barriers.txt: Correct example for reorderings
  documentation: Add atomic_long_t to atomic_ops.txt
  documentation: Additional restriction for control dependencies
  documentation: Document RCU self test boot params
  rcutorture: Fix rcu_torture_cbflood() memory leak
  rcutorture: Remove obsolete kversion param in kvm.sh
  rcutorture: Remove stale test configurations
  rcutorture: Enable RCU self test in configs
  rcutorture: Add early boot self tests
  torture: Run Linux-kernel binary out of results directory
  cpu: Avoid puts_pending overflow
  rcu: Remove "cpu" argument to rcu_cleanup_after_idle()
  rcu: Remove "cpu" argument to rcu_prepare_for_idle()
  rcu: Remove "cpu" argument to rcu_needs_cpu()
  rcu: Remove "cpu" argument to rcu_note_context_switch()
  rcu: Remove "cpu" argument to rcu_preempt_check_callbacks()
  ...
2014-12-09 | perf tests: Fix attr tests size values to cope with machine state on interrupt ABI changes | Jiri Olsa | 2 | -2/+2
The following change adjusted 'struct perf_event_attr', but left the attr test's sizes untouched:

  60e2364e60e8 perf: Add ability to sample machine state on interrupt

  [jolsa@krava perf]$ ./perf test attr -vv
  --- start ---
  test child forked, pid 9719
  running './tests/attr/test-stat-group1'
  'PERF_TEST_ATTR=/tmp/tmp4drvul ./perf stat -o /tmp/tmp4drvul/perf.data -e '{cycles,instructions}' kill >/dev/null 2>&1' ret 1
  expected size=96, got 104
  FAILED './tests/attr/test-stat-group1' - match failure

Adjusting test size values for attr test.

Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | calloc/xcalloc: Fix argument order | Arjun Sreedharan | 2 | -5/+5
The calloc() and xcalloc() functions take @nmemb first and then @size. Fix all w/ pattern "calloc\s*(\s*sizeof". Signed-off-by: Arjun Sreedharan <[email protected]> Cc: "Yann E. MORIN" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
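As a small illustration of the argument order described above, the sketch below allocates a zeroed array with the element count first and the element size second. The `struct sample` type and the `samples__new()` helper are hypothetical names invented for this example; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type, used only for this illustration. */
struct sample {
	uint64_t ip;
	uint64_t period;
};

/*
 * Allocate a zeroed array of nr samples.
 * calloc() takes the element count (nmemb) first, then the element size.
 */
static struct sample *samples__new(size_t nr)
{
	assert(nr > 0);
	return calloc(nr, sizeof(struct sample));
}
```

Swapping the two arguments requests the same number of bytes, so this kind of fix is about matching the documented prototype rather than changing behavior.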
2014-12-09 | perf callchain: Move cpumode resolve code to add_callchain_ip | Kan Liang | 1 | -37/+35
Using flag to distinguish between branch_history and normal callchain. Move the cpumode to add_callchain_ip function. No change in behavior. Signed-off-by: Kan Liang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | perf callchain: Fixup parameter handling error message | Kan Liang | 1 | -1/+1
Fix up parse_callchain_record_opt error message for 'fp', in the past using '-g fp' was a valid alternative to '--call-graph fp', which is not the case since: commit 09b0fd45ff63413df94cbd832a765076b201edbb Author: Jiri Olsa <[email protected]> Date: Sat Oct 26 16:25:33 2013 +0200 perf record: Split -g and --call-graph I.e. -g means "use the configured unwind data collection method" which has as default 'fp', while --call-graph requires passing the method to use. Signed-off-by: Kan Liang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ split this from a larger patch related to LBR based unwinding ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | perf tools: Add --buildid-dir option to set cache directory | Jiri Olsa | 4 | -6/+22
Adding --buildid-dir to be able to set specific cache directory. It's going to be handy for buildid tests coming in shortly. Signed-off-by: Jiri Olsa <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | perf buildid cache: Fix -a segfault related to kcore handling | Jiri Olsa | 1 | -1/+1
The kcore_filename variable is uninitialized, and a garbage value could trigger the build_id_cache__add_kcore function, ending in a segfault. Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
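The class of bug described here can be sketched as follows: a pointer that is only assigned when an option is present must start out as NULL, otherwise a later "was it given?" check reads garbage. The parser, the flag spelling and the function names below are hypothetical stand-ins, not the perf code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser: sets *kcore only when "--kcore <file>" appears. */
static void parse_args(int argc, const char **argv, const char **kcore)
{
	for (int i = 0; i + 1 < argc; i++)
		if (strcmp(argv[i], "--kcore") == 0)
			*kcore = argv[i + 1];
}

/* Returns 1 if the kcore path would be taken, 0 otherwise. */
static int would_add_kcore(int argc, const char **argv)
{
	const char *kcore_filename = NULL;	/* explicit "not given" state */

	assert(argc == 0 || argv != NULL);
	parse_args(argc, argv, &kcore_filename);
	return kcore_filename != NULL;
}
```

Without the `= NULL` initializer, the final check would depend on whatever happened to be on the stack.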
2014-12-09 | perf buildid-cache: Remove extra debugdir variables | Jiri Olsa | 2 | -13/+7
There's no need to copy over the buildid_dir into separate variable with no change. This is leftover from commit: 45de34bbe3e1 perf buildid: add perfconfig option to specify buildid cache dir that added global buildid_dir variable that holds cache directory, but did not cleanup the debugdir copies. Signed-off-by: Jiri Olsa <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | perf tools: Use single strcmp call instead of two | Jiri Olsa | 1 | -1/+1
There's no need to use 2 strcmp calls, one is enough. Signed-off-by: Jiri Olsa <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | perf hists browser: Change print format from %lu to %PRIu64 | Tom Huynh | 1 | -1/+1
The nr_events variable in tools/perf/ui/browsers/hists.c is of type u64, so the print format (%lu) causes 'perf report' to show 0 event count when running with 32-bit userspace without redirection. This patch fixes that problem by printing nr_events as PRIu64. Signed-off-by: Tom Huynh <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Matt Mullins <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
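To make the format mismatch concrete: on a 32-bit userspace `unsigned long` is typically 32 bits wide, so passing a 64-bit value to `%lu` is a varargs type mismatch. A minimal sketch of the portable form, using the standard `PRIu64` macro from `<inttypes.h>`, follows. The `format_nr_events()` helper and the "Events: " label are made up for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit event count portably; buf and len are caller supplied. */
static int format_nr_events(char *buf, size_t len, uint64_t nr_events)
{
	/* PRIu64 expands to the right conversion for uint64_t on this ABI. */
	int n = snprintf(buf, len, "Events: %" PRIu64, nr_events);

	assert(n >= 0);
	return n;
}
```

The same call works unchanged on 32-bit and 64-bit builds, which is the property the patch is after.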
2014-12-09 | perf bench: Fix memcpy/memset output | Rabin Vincent | 1 | -8/+10
The memcpy and memset benchmarks return bogus results when iterations > 0 because the iterations value is not taken into account when calculating the final result:

  $ perf bench mem memset --only-prefault --length 1GB --iterations 1
  # Running 'mem/memset' benchmark:
  # Copying 1GB Bytes ...

        20.798669 GB/Sec (with prefault)

  $ perf bench mem memset --only-prefault --length 1GB --iterations 10
  # Running 'mem/memset' benchmark:
  # Copying 1GB Bytes ...

        2.086576 GB/Sec (with prefault)

  $ perf bench mem memset --only-prefault --length 1GB --iterations 100
  # Running 'mem/memset' benchmark:
  # Copying 1GB Bytes ...

        212.840917 MB/Sec (with prefault)

Fix this.

Signed-off-by: Rabin Vincent <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rabin Vincent <[email protected]>
Cc: Rabin Vincent <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
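The tenfold drop per tenfold increase in iterations is what one would expect if the elapsed time covers all iterations while the byte count covers only one. A minimal sketch of a throughput calculation that accounts for the iteration count is shown below; `bytes_per_sec()` is a hypothetical helper, not the function used in perf bench.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes per second for a benchmark that processed `len` bytes
 * `iterations` times in `elapsed_sec` seconds in total.
 */
static double bytes_per_sec(size_t len, int iterations, double elapsed_sec)
{
	assert(iterations > 0 && elapsed_sec > 0.0);
	return (double)len * iterations / elapsed_sec;
}
```

With this form, running ten times as many iterations in ten times the wall time reports the same rate.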
2014-12-09 | perf bench: Merge memset into memcpy | Rabin Vincent | 3 | -305/+90
The memset benchmark is largely copy-pasted from the memcpy benchmark. Merge the two now that memcpy is made more generic. Signed-off-by: Rabin Vincent <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rabin Vincent <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | perf bench: Prepare memcpy for merge | Rabin Vincent | 1 | -78/+104
The memset benchmark is largely copy-pasted from the memcpy benchmark. Prepare the memcpy file for merge with memset by extracting out a generic function. Signed-off-by: Rabin Vincent <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rabin Vincent <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-09 | virtio: add support for 64 bit features. | Michael S. Tsirkin | 1 | -1/+1
Change u32 to u64, and use BIT_ULL and 1ULL everywhere. Note: transports are unchanged, and only set the low 32 bits. This guarantees that no transport sets e.g. VERSION_1 by mistake without proper support. Based on patch by Rusty. Signed-off-by: Rusty Russell <[email protected]> Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Cornelia Huck <[email protected]>
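The guarantee mentioned above can be sketched in userspace C: if a transport only ever supplies a 32-bit word, widening it to 64 bits leaves every bit from 32 upward clear. `FEATURE_BIT_ULL`, `EXAMPLE_F_VERSION_1`, `has_feature()` and `transport_get_features()` are stand-ins defined for this illustration; the value 32 matches the VERSION_1 feature bit in the virtio 1.0 specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL() macro. */
#define FEATURE_BIT_ULL(nr) (1ULL << (nr))

/* Illustrative name for feature bit 32. */
#define EXAMPLE_F_VERSION_1 32

static bool has_feature(uint64_t features, unsigned int fbit)
{
	assert(fbit < 64);
	/* A plain (1 << fbit) would be undefined for fbit >= 32 on 32-bit int. */
	return (features & FEATURE_BIT_ULL(fbit)) != 0;
}

/* A transport limited to 32 feature bits only ever reports the low word. */
static uint64_t transport_get_features(uint32_t low_bits)
{
	return (uint64_t)low_bits;
}
```

Even with all 32 low bits set, the widened value cannot claim a feature numbered 32 or higher.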
2014-12-09 | virtio: use u32, not bitmap for features | Michael S. Tsirkin | 4 | -33/+12
It seemed like a good idea to use bitmap for features in struct virtio_device, but it's actually a pain, and seems to become even more painful when we get more than 32 feature bits. Just change it to a u32 for now. Based on patch by Rusty. Suggested-by: David Hildenbrand <[email protected]> Signed-off-by: Rusty Russell <[email protected]> Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Cornelia Huck <[email protected]>
2014-12-08 | Merge remote-tracking branch 'scsi-queue/core-for-3.19' into for-linus | James Bottomley | 1 | -3/+6
2014-12-05 | tools: cpupower: fix return checks for sysfs_get_idlestate_count() | Prarit Bhargava | 1 | -4/+4
Red Hat and Fedora use a bug reporting tool that gathers data about "broken" systems called sosreport. Among other things, it includes the output of 'cpupower idle-info'. Executing 'cpupower idle-info' on a system that has cpuidle disabled via 'cpuidle.off=1' results in a 300 second hang in the cpupower application, i.e.:

  [root@intel-brickland-05]# cpupower idle-info
  Could not determine cpuidle driver
  Analyzing CPU 0:
  Number of idle states: -19
  [hang]

The problem is that the cpupower code only checks for a zero return from sysfs_get_idlestate_count(). The function can return -ENODEV (-19) as above. This patch fixes callers to sysfs_get_idlestate_count() to check the right return values.

Signed-off-by: Prarit Bhargava <[email protected]>
Signed-off-by: Thomas Renninger <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
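A sketch of the return-value check, under the assumption that the count is later used as a loop bound: a negative errno that slips past a "== 0" test can turn into a very large bound once it is treated as unsigned. `fake_idlestate_count()` is a hypothetical stand-in for sysfs_get_idlestate_count(), and `visit_idlestates()` is invented for the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in: returns a state count, or a negative errno. */
static int fake_idlestate_count(int cpuidle_disabled)
{
	return cpuidle_disabled ? -ENODEV : 3;
}

/* Returns how many states were visited; never loops on an error code. */
static unsigned int visit_idlestates(int ret)
{
	unsigned int visited = 0;

	if (ret <= 0)	/* 0: no states, < 0: -errno such as -ENODEV */
		return 0;

	for (unsigned int i = 0; i < (unsigned int)ret; i++)
		visited++;

	assert(visited == (unsigned int)ret);
	return visited;
}
```

Checking `ret <= 0` instead of `ret == 0` is the whole change in this sketch; the rest only demonstrates that the error case exits early.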
2014-12-04 | ftracetest: Add --verbose option for showing echo output | Masami Hiramatsu | 1 | -10/+23
Add --verbose/-v option for showing echo output in testcases. This is good for checking the progress of testcases which take a longer time to run. To implement this feature, all the testcase failures are captured in ftracetest and send signal to set SIG_RESULT=FAIL. Link: http://lkml.kernel.org/r/[email protected] Suggested-by: Steven Rostedt <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2014-12-04 | ftracetest: Fix to show descriptions on dash | Masami Hiramatsu | 1 | -1/+2
The ftracetest doesn't show testcase's descriptions when it is executed on dash. This fixes that to show the descriptions on dash correctly by passing it via a variable instead of directly passing the grep command output. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2014-12-03 | selftest: size: Add size test for Linux kernel | Tim Bird | 4 | -0/+114
This test shows the amount of memory used by the system. Note that this is dependent on the user-space that is loaded when this program runs. Optimally, this program would be run as the init program itself. The program is optimized for size itself, to avoid conflating its own execution with that of the system software. The code is compiled statically, with no stdlibs. On my x86_64 system, this results in a statically linked binary of less than 5K. Signed-off-by: Tim Bird <[email protected]> Reviewed-by: Josh Triplett <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2014-12-02 | usbip: remove unneeded structure | Julia Lawall | 1 | -2/+0
Delete a local structure that is only used to be initialized by memset.

A semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)

  // <smpl>
  @@
  identifier x,i;
  @@

  {
  ... when any
  -struct i x;
  <+... when != x
  - memset(&x,...);
  ...+>
  }
  // </smpl>

Signed-off-by: Julia Lawall <[email protected]>
Acked-by: Valentina Manea <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
2014-12-02 | selftests/kcmp: Always try to build the test | Michael Ellerman | 1 | -16/+2
Don't prevent the test building on non-x86. Just try and build it and let the chips fall where they may. Add support for CROSS_COMPILE while we're at it. Also we don't need a custom rule for building kcmp_test. Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Christopher Covington <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2014-12-02 | selftests/kcmp: Don't include kernel headers | Michael Ellerman | 1 | -4/+0
The kcmp test mucks with the include path to bring in the kernel headers, and x86 headers too for reasons that are not clear. Now that kcmp.h is exported none of that should be necessary. Signed-off-by: Michael Ellerman <[email protected]> Acked-by: Cyrill Gorcunov <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2014-12-02 | mnt: Update unprivileged remount test | Eric W. Biederman | 1 | -30/+142
- MNT_NODEV should be irrelevant except when reading back mount flags, no longer specify MNT_NODEV on remount.
- Test MNT_NODEV on devpts where it is meaningful even for unprivileged mounts.
- Add a test to verify that remount of a preexisting mount with the same flags is allowed and does not change those flags.
- Clean up the definitions of MS_REC, MS_RELATIME, MS_STRICTATIME that are used when the code is built in an environment without them.
- Correct the test error messages when tests fail. There were not 5 tests that tested MS_RELATIME.

Cc: [email protected]
Signed-off-by: Eric W. Biederman <[email protected]>
2014-12-01 | perf report: In branch stack mode use address history sorting | Andi Kleen | 1 | -0/+1
Enable CCKEY_ADDRESS address history sorting with --branch-history. This makes get_srcline display the source lines correctly, otherwise all history entries for a function are hunked into one. Signed-off-by: Andi Kleen <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-01 | perf report: Add --branch-history option | Andi Kleen | 2 | -4/+27
Add a --branch-history option to perf report that changes all the settings necessary for using the branches in callstacks. This is just a short cut to make this nicer to use, it does not enable any functionality by itself. v2: Change sort order. Rename option to --branch-history to be less confusing. v3: Updates v4: Fix conflict with newer perf base v5: Port to latest tip v6: Add more comments. Remove CCKEY_ADDRESS setting. Remove unnecessary branch_mode setting. Use a boolean. Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-01 | perf callchain: Support handling complete branch stacks as histograms | Andi Kleen | 6 | -13/+132
Currently branch stacks can be only shown as edge histograms for individual branches. I never found this display particularly useful.

This implements an alternative mode that creates histograms over complete branch traces, instead of individual branches, similar to how normal callgraphs are handled. This is done by putting it in front of the normal callgraph and then using the normal callgraph histogram infrastructure to unify them.

This way in complex functions we can understand the control flow that led to a particular sample, and may even see some control flow in the caller for short functions.

Example (simplified, of course for such simple code this is usually not needed), please run this after the whole patchkit is in, as at this point in the patch order there is no --branch-history, that will be added in a patch after this one:

tcall.c:

  volatile a = 10000, b = 100000, c;

  __attribute__((noinline)) f2()
  {
          c = a / b;
  }

  __attribute__((noinline)) f1()
  {
          f2();
          f2();
  }

  main()
  {
          int i;
          for (i = 0; i < 1000000; i++)
                  f1();
  }

  % perf record -b -g ./tsrc/tcall
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
  % perf report --no-children --branch-history
  ...
      54.91%  tcall.c:6  [.] f2                      tcall
              |
              |--65.53%-- f2 tcall.c:5
              |          |
              |          |--70.83%-- f1 tcall.c:11
              |          |          f1 tcall.c:10
              |          |          main tcall.c:18
              |          |          main tcall.c:18
              |          |          main tcall.c:17
              |          |          main tcall.c:17
              |          |          f1 tcall.c:13
              |          |          f1 tcall.c:13
              |          |          f2 tcall.c:7
              |          |          f2 tcall.c:5
              |          |          f1 tcall.c:12
              |          |          f1 tcall.c:12
              |          |          f2 tcall.c:7
              |          |          f2 tcall.c:5
              |          |          f1 tcall.c:11
              |          |
              |           --29.17%-- f1 tcall.c:12
              |                     f1 tcall.c:12
              |                     f2 tcall.c:7
              |                     f2 tcall.c:5
              |                     f1 tcall.c:11
              |                     f1 tcall.c:10
              |                     main tcall.c:18
              |                     main tcall.c:18
              |                     main tcall.c:17
              |                     main tcall.c:17
              |                     f1 tcall.c:13
              |                     f1 tcall.c:13
              |                     f2 tcall.c:7
              |                     f2 tcall.c:5
              |                     f1 tcall.c:12

The default output is unchanged.

This is only implemented in perf report, no change to record or anywhere else.

This adds the basic code to report:

- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set include the LBR into the callstack in machine.c. The rest of the history code is unchanged and doesn't know the difference between LBR entry and normal call entry.
- detect overlaps with the callchain
- remove small loop duplicates in the LBR

Current limitations:

- The LBR flags (mispredict etc.) are not shown in the history and LBR entries have no special marker.
- It would be nice if annotate marked the LBR entries somehow (e.g. with arrows)

v2: Various fixes.
v3: Merge further patches into this one. Fix white space.
v4: Improve manpage. Address review feedback.
v5: Rename functions. Better error message without -g. Fix crash without -b.
v6: Rebase
v7: Rebase. Use NO_ENTRY in memset.
v8: Port to latest tip. Move add_callchain_ip to separate patch. Skip initial entries in callchain. Minor cleanups.

Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-01 | perf stat: Add support for snapshot counters | Jiri Olsa | 1 | -2/+4
The .snapshot file indicates that the provided event value is a snapshot value. Bypassing the delta computation logic for such event. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-01 | perf stat: Add support for per-pkg counters | Jiri Olsa | 2 | -0/+50
The .per-pkg file indicates that all but one value per socket should be discarded. Adding the logic of skipping the rest of the socket once first value was read. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
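One way to picture the "keep one value per socket" rule is a bitmask of sockets already seen. The sketch below assumes at most 64 sockets and a caller-provided CPU-to-socket map; `per_pkg_skip()` and `per_pkg_sum()` are hypothetical helpers written for this example, not the perf implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SOCKETS 64	/* assumption made for this sketch */

/* Returns true if the reading for this socket should be skipped. */
static bool per_pkg_skip(uint64_t *seen_mask, int socket)
{
	assert(socket >= 0 && socket < MAX_SOCKETS);
	if (*seen_mask & (1ULL << socket))
		return true;		/* socket already contributed */
	*seen_mask |= 1ULL << socket;
	return false;
}

/* Sum one value per socket given per-CPU values and a cpu-to-socket map. */
static uint64_t per_pkg_sum(const uint64_t *vals, const int *cpu_socket,
			    int ncpus)
{
	uint64_t seen = 0, sum = 0;

	for (int cpu = 0; cpu < ncpus; cpu++)
		if (!per_pkg_skip(&seen, cpu_socket[cpu]))
			sum += vals[cpu];
	return sum;
}
```

With four CPUs on two sockets, each socket's value is counted once rather than once per CPU.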
2014-12-01 | perf tools: Remove perf_evsel__read interface | Jiri Olsa | 2 | -63/+0
Removing the perf_evsel__read interfaces because we replaced the only user in the stat command code. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-01 | perf stat: Use read_counter in read_counter_aggr | Jiri Olsa | 1 | -2/+16
Use the read_counter function as the values retrieval function for aggr counter values thus eliminating the use of __perf_evsel__read function. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-01 | perf stat: Make read_counter work over the thread dimension | Jiri Olsa | 1 | -4/+11
The read function will be used later for both aggr and cpu counters, so we need to make it work over threads as well. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-12-01 | perf stat: Use perf_evsel__read_cb in read_counter | Jiri Olsa | 1 | -6/+21
Replacing __perf_evsel__read_on_cpu function with perf_evsel__read_cb function. The read_cb callback will be used later for global aggregation counter values as well. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-11-28 | perf session: Do not fail on processing out of order event | Jiri Olsa | 2 | -6/+6
Linus reported perf report command being interrupted due to processing of 'out of order' event, with following error: Timestamp below last timeslice flush 0x5733a8 [0x28]: failed to process type: 3 I could reproduce the issue and in my case it was caused by one CPU (mmap) being behind during record and userspace mmap reader seeing the data after other CPUs data were already stored. This is expected under some circumstances because we need to limit the number of events that we queue for reordering when we receive a PERF_RECORD_FINISHED_ROUND or when we force flush due to memory pressure. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-11-26tools: hv: ignore ENOBUFS and ENOMEM in the KVP daemonDexuan Cui1-0/+14
Under high memory pressure and very high KVP R/W test pressure, the netlink recvfrom() may transiently return ENOBUFS to the daemon -- we found this during a 2-week stress test. We'd better not terminate the daemon on the failure, because a typical KVP user will re-try the R/W and hopefully it will succeed next time. We can also ignore the errors on sending. Cc: K. Y. Srinivasan <[email protected]> Signed-off-by: Dexuan Cui <[email protected]> Reviewed-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
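The decision described in this message, which errors the daemon should survive, can be sketched as a small predicate. This is only an illustration under assumptions: the helper name `kvp_error_is_transient` is hypothetical and does not come from the patch, and the daemon's real control flow may differ.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not taken from the patch): decide whether a failed
 * netlink receive or send should be ignored so the daemon keeps running.
 */
static bool kvp_error_is_transient(int err)
{
    /* Both codes signal temporary resource exhaustion, not a broken socket. */
    return err == ENOBUFS || err == ENOMEM;
}
```

A caller would typically log the error and continue its main loop when the predicate returns true, and terminate only on other failures.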
2014-11-26Tools: hv: vssdaemon: skip all filesystems mounted readonlyVitaly Kuznetsov1-1/+1
Instead of extending the list of exceptions for read-only filesystems beyond the iso9660 entry we already have, it is better to skip the freeze operation for all filesystems mounted read-only. Signed-off-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: K. Y. Srinivasan <[email protected]> Acked-by: Dexuan Cui <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
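One way to detect a read-only mount is to look for the "ro" option in a mount table entry with glibc's hasmntopt(). The sketch below assumes that approach; the helper name `mount_is_readonly` is hypothetical and the patch itself may perform the check differently.

```c
#include <mntent.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when a mount entry carries the "ro" option. */
static bool mount_is_readonly(const struct mntent *ent)
{
    /* hasmntopt() returns a pointer into mnt_opts on a match, NULL otherwise. */
    return hasmntopt(ent, MNTOPT_RO) != NULL;
}
```

In a getmntent() loop, entries for which this returns true would simply be skipped before any freeze ioctl is attempted.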
2014-11-26Tools: hv: vssdaemon: report freeze errorsVitaly Kuznetsov1-4/+12
When ioctl(fd, FIFREEZE, 0) results in an error we cannot report it to syslog instantly since that can cause a write to a frozen disk. However, the name of the filesystem which caused the error and errno are valuable and we would like to get a nice human-readable message in the log. Save errno before calling vss_operate(VSS_OP_THAW) and report the error right after. Unfortunately, FITHAW errors cannot be reported the same way as we need to finish thawing all filesystems before calling syslog(). We should also avoid calling endmntent() for the second time in case we encountered an error during freezing of '/' as it usually results in SIGSEGV. Signed-off-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: K. Y. Srinivasan <[email protected]> Acked-by: Dexuan Cui <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
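The ordering described here (save errno, thaw, then report) can be sketched as follows. Everything in this block is an assumption made for illustration: the function-pointer type, the helper `freeze_or_report`, the two stub functions, and the message format are hypothetical, and the real daemon calls ioctl() and syslog() directly.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the daemon's freeze and thaw steps. */
typedef int (*fs_op_fn)(const char *mount_dir);

/*
 * Freeze one filesystem; on failure remember errno, thaw first, and only
 * then format the message, so nothing is logged while a disk is frozen.
 * Returns 0 on success, or the saved errno with a message in buf.
 */
static int freeze_or_report(const char *dir, fs_op_fn freeze, fs_op_fn thaw_all,
                            char *buf, size_t len)
{
    int saved;

    if (freeze(dir) == 0)
        return 0;

    saved = errno;      /* the thaw step may clobber errno, so copy it now */
    thaw_all(dir);
    snprintf(buf, len, "FREEZE of %s failed; error:%d %s",
             dir, saved, strerror(saved));
    return saved;
}

/* Demonstration stubs: a freeze that always fails and a thaw that resets errno. */
static int fake_freeze_fail(const char *dir) { (void)dir; errno = EBUSY; return -1; }
static int fake_thaw(const char *dir) { (void)dir; errno = 0; return 0; }
```

With the stubs, `freeze_or_report("/data", fake_freeze_fail, fake_thaw, buf, sizeof(buf))` returns EBUSY even though the thaw step overwrote errno, which is the point of saving it first.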
2014-11-24perf tools: Add snapshot format file parsingJiri Olsa4-11/+40
The .snapshot file indicates that the provided event value is a snapshot value and we have to bypass the delta computation logic. Adding support to check for this file and set the event flag accordingly. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
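What "bypass the delta computation" means in practice can be sketched with a simplified counter. The struct and function below are hypothetical and are not perf's real data structures; they only show the difference between an interval counter and a snapshot value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified counter state; not perf's real structures. */
struct counter_state {
    bool     snapshot;  /* set when the event's .snapshot file is present */
    uint64_t prev;      /* last raw reading */
};

/* Return the value to report for a new raw reading. */
static uint64_t counter_update(struct counter_state *st, uint64_t raw)
{
    uint64_t out;

    if (st->snapshot)
        return raw;         /* already an absolute value: no delta */

    out = raw - st->prev;   /* interval count since the last read */
    st->prev = raw;
    return out;
}
```

For a raw reading of 150 after a previous reading of 100, a regular counter reports 50 while a snapshot counter reports 150.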
2014-11-24perf tools: Add per-pkg format file parsingMatt Fleming4-0/+31
The .per-pkg file indicates that all but one value per socket should be discarded. Adding support to check for this file and set the event flag accordingly. This patch is part of Matt's original patch: http://marc.info/?l=linux-kernel&m=141527675002139&w=2 It contains only the file parsing part; the rest is solved differently. Signed-off-by: Matt Fleming <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
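Discarding all but one value per socket can be sketched as a first-seen filter over per-CPU readings. This patch only adds the file parsing, so the aggregation below is an illustration of the stated semantics, not the code perf uses; all names and the MAX_SOCKETS limit are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SOCKETS 64  /* illustrative limit, not from the patch */

/*
 * Sum per-CPU readings of a per-package event, keeping only the first
 * value seen for each socket. cpu_socket[i] is the socket id of CPU i.
 */
static uint64_t sum_per_pkg(const uint64_t *vals, const int *cpu_socket,
                            size_t ncpus)
{
    bool seen[MAX_SOCKETS] = { false };
    uint64_t total = 0;

    for (size_t i = 0; i < ncpus; i++) {
        int s = cpu_socket[i];

        if (s < 0 || s >= MAX_SOCKETS || seen[s])
            continue;   /* duplicate or out-of-range socket: discard */
        seen[s] = true;
        total += vals[i];
    }
    return total;
}
```

With four CPUs on two sockets reporting 10, 10, 7 and 7, the sum is 17 rather than 34, because each package's value is counted once.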
2014-11-24perf evsel: Introduce perf_evsel__read_cb functionJiri Olsa2-0/+23
Adding perf_evsel__read_cb read function that returns count values via callback. It will be used later in the stat command as the single way to retrieve counter values. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
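The general shape of a callback-based read can be sketched as below. The types and signatures are simplified assumptions made for illustration and may not match perf_evsel__read_cb; the point is only that the reader hands each value to a caller-supplied function instead of storing it itself.

```c
#include <stdint.h>

/* Hypothetical, simplified types; perf's real signatures may differ. */
struct counts { uint64_t val, ena, run; };
typedef int (*read_cb_t)(void *priv, int cpu, const struct counts *c);

/* Hand each per-CPU reading to the callback; stop on the first error. */
static int read_all_cb(const struct counts *per_cpu, int ncpus,
                       read_cb_t cb, void *priv)
{
    for (int cpu = 0; cpu < ncpus; cpu++) {
        int err = cb(priv, cpu, &per_cpu[cpu]);

        if (err)
            return err;
    }
    return 0;
}

/* Example callback: aggregate values into a running total. */
static int sum_cb(void *priv, int cpu, const struct counts *c)
{
    (void)cpu;
    *(uint64_t *)priv += c->val;
    return 0;
}
```

The same reader can then serve per-CPU printing and global aggregation just by swapping the callback.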
2014-11-24perf evsel: Introduce perf_counts_values__scale functionJiri Olsa2-25/+25
Factoring out scale logic into perf_counts_values__scale function. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
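The message does not spell out the scale logic. A common approach for multiplexed counters is to extrapolate the raw count by the ratio of time enabled to time running; the sketch below assumes that approach, and its struct, field names, and return convention are hypothetical rather than copied from the patch.

```c
#include <stdint.h>

/* Hypothetical, simplified struct; field names are assumptions. */
struct counts { uint64_t val, ena, run; };

/*
 * Extrapolate a multiplexed counter: if the event was enabled for `ena`
 * but only ran for `run`, estimate val * ena / run (rounded to nearest).
 * Returns -1 if it never ran, 1 if the value was scaled, 0 otherwise.
 */
static int counts_scale(struct counts *c)
{
    if (c->run == 0) {
        c->val = 0;
        return -1;
    }
    if (c->run < c->ena) {
        c->val = (uint64_t)((double)c->val * c->ena / c->run + 0.5);
        return 1;
    }
    return 0;
}
```

A counter that read 100 while running for half of its enabled time is reported as an estimated 200.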
2014-11-24perf evsel: Introduce perf_evsel__compute_deltas functionJiri Olsa2-5/+7
Making the compute_deltas function global and renaming it to perf_evsel__compute_deltas. It will be used in the stat command in a later patch. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-11-24perf tools: Allow to force redirect pr_debug to stderr.Andi Kleen1-1/+3
When debugging the tui browser I find it useful to redirect the debug log into a file. Currently it's always forced to the message line. Add an option to force it to stderr. Then it can be easily redirected.

Example:

  [root@zoo ~]# perf --debug stderr report -vv 2> /tmp/debug
  [root@zoo ~]# tail /tmp/debug
  dso open failed, mmap: No such file or directory
  dso open failed, mmap: No such file or directory
  dso open failed, mmap: No such file or directory
  dso open failed, mmap: No such file or directory
  dso open failed, mmap: No such file or directory
  Using /root/.debug/.build-id/4e/841948927029fb650132253642d5dbb2c1fb93 for symbols
  Failed to open /tmp/perf-8831.map, continuing without symbols
  Failed to open /tmp/perf-12721.map, continuing without symbols
  Failed to open /tmp/perf-6966.map, continuing without symbols
  Failed to open /tmp/perf-8802.map, continuing without symbols
  [root@zoo ~]#

Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-11-24perf tools: Fix segfault due to invalid kernel dso accessNamhyung Kim1-2/+2
Jiri reported that commit 96d78059d6d9 ("perf tools: Make vmlinux short name more like kallsyms short name") causes a segfault in perf script. When processing a kernel mmap event, it should access the 'kernel' variable, as sometimes it cannot find a matching dso in the build-id table, so 'dso' might be invalid. Reported-by: Jiri Olsa <[email protected]> Tested-by: Jiri Olsa <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-11-24perf callchain: Make get_srcline fall back to sym+offsetAndi Kleen6-8/+20
When the source line is not found fall back to sym + offset. This is generally much more useful than a raw address. For this we need to pass in the symbol from the caller. For some callers it's awkward to compute, so we stay at the old behaviour. Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
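The fallback order described above can be sketched as a small formatter. The function name, parameters, and exact output format are hypothetical and are not taken from get_srcline(); the sketch only illustrates preferring a source line, then sym+offset, then the raw address.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter: prefer "file:line", then "sym+0xoff",
 * and only fall back to the raw address when neither is known.
 */
static void format_location(char *buf, size_t len,
                            const char *file, unsigned int line,
                            const char *sym, uint64_t sym_start, uint64_t addr)
{
    if (file)
        snprintf(buf, len, "%s:%u", file, line);
    else if (sym)
        snprintf(buf, len, "%s+0x%llx", sym,
                 (unsigned long long)(addr - sym_start));
    else
        snprintf(buf, len, "0x%llx", (unsigned long long)addr);
}
```

Callers that cannot cheaply supply a symbol would pass NULL for `sym` and get the old raw-address output.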
2014-11-24perf symbols: Move bfd_demangle stubbing to its only userArnaldo Carvalho de Melo2-21/+21
We need to define bfd_demangle() to either a wrapper for cplus_demangle() or to a stub when NO_DEMANGLE is defined. That is at odds with using bfd.h for some other reason, as it defines bfd_demangle(): code that wants to use both symbol.h, where the above stubbing/wrapping is done, and bfd.h ends up with a build error where bfd_demangle() is found to be redefined. Avoid that by moving the stubbing/wrapping to symbol-elf.c, which is the only user of that function. If we ever get to a point where there are more valid users, we can then introduce a header for that. Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Don Zickus <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>