aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-06-04perf probe: Fix a segfault if asked for variable it doesn't findMasami Hiramatsu1-2/+2
Fix a segfault bug by asking for variable it doesn't find. Since the convert_variable() didn't handle error code returned from convert_variable_location(), it just passed an incomplete variable field and then a segfault was occurred when formatting the field. This fixes that bug by handling success code correctly in convert_variable(). Other callers of convert_variable_location() are correctly checking the return code. This bug was introduced by following commit. But another hidden erroneous error handling has been there previously (-ENOMEM case). commit 3d918a12a1b3088ac16ff37fa52760639d6e2403 Signed-off-by: Masami Hiramatsu <[email protected]> Reported-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf tools: Fix 'make help' message errorJianyu Zhan2-3/+3
Currently 'make help' message has such hint: use "make prefix=<path> <install target>" to install to a particular path like make prefix=/usr/local install install-doc But this is misleading, when I specify "prefix=/usr/local", it has got no respect at all. This is because that, "DESTDIR" is considered first. In this case, "DESTDIR" has an empty value, so "prefix" is honored. However, "prefix" is unconditionally assigned to $HOME, regardless of what it is set to from command line. So our "prefix" setting got no respect and the actual destination falls back to $HOME. This patch fixes this issue and corrects the help message. Signed-off-by: Jianyu Zhan <[email protected]> Acked-by: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf record: Fix poll return value propagationJiri Olsa1-1/+5
If the perf record command is interrupted in record__mmap_read_all function, the 'done' is set and err has the latest poll return value, which is most likely positive number (= number of pollfds ready to read). This 'positive err' is then propagated to the exit code, resulting in not finishing the perf.data header properly, causing following error in report: # perf record -F 50000 -a --- make the system real busy, so there's more chance to interrupt perf in event writing code --- ^C[ perf record: Woken up 16 times to write data ] [ perf record: Captured and wrote 30.292 MB perf.data (~1323468 samples) ] # perf report --stdio > /dev/null WARNING: The perf.data file's data size field is 0 which is unexpected. Was the 'perf record' command properly terminated? Fixing this by checking for positive poll return value and setting err to 0. Acked-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf tools: Move elide bool into perf_hpp_fmt structJiri Olsa4-40/+68
After output/sort fields refactoring, it's expensive to check the elide bool in its current location inside the 'struct sort_entry'. The perf_hpp__should_skip function gets highly noticable in workloads with high number of output/sort fields, like for: $ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio Performance report: 9.70% perf [.] perf_hpp__should_skip Moving the elide bool into the 'struct perf_hpp_fmt', which makes the perf_hpp__should_skip just single struct read. Got speedup of around 22% for my test perf.data workload. The change should not harm any other workload types. Performance counter stats for (10 runs): before: 358,319,732,626 cycles ( +- 0.55% ) 467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% ) 150.943975206 seconds time elapsed ( +- 0.62% ) now: 278,785,972,990 cycles ( +- 0.12% ) 370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% ) 116.416670507 seconds time elapsed ( +- 0.31% ) Acked-by: Namhyung Kim <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf tools: Remove elide setup for SORT_MODE__MEMORY modeJiri Olsa1-13/+0
There's no need to setup elide of sort_dso sort entry again with symbol_conf.dso_list list. The only difference were list names of memory mode data, which does not make much sense to me. Acked-by: Namhyung Kim <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf tools: Fix "==" into "=" in ui_browser__warning assignmentzhangdianfang1-1/+1
Convert "==" into "=" in ui_browser__warning assignment. Bug description: https://bugzilla.kernel.org/show_bug.cgi?id=76751 Reported-by: David Binderman <[email protected]> Signed-off-by: Dianfang Zhang <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jean Delvare <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ changed the changelog a bit ] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf tools: Allow overriding sysfs and proc finding with env varCody P Schafer1-1/+42
SYSFS_PATH and PROC_PATH environment variables now let the user override the detection of sysfs and proc locations for testing purposes. Signed-off-by: Cody P Schafer <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf tools: Consider header files outside perf directory in tags targetSebastian Andrzej Siewior1-3/+6
This fixes lookups like "vi -t event_format" Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf tools: Add warning when disabling perl scripting support due to missing ↵Arnaldo Carvalho de Melo1-0/+1
devel files We were just showing "libperl: OFF", unlike other features where we present the user with a message helping have a feature built in. Fix it by adding the following message: config/Makefile:450: Missing perl devel files. Disabling perl scripting support, consider installing perl-ExtUtils-Embed Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Don Zickus <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03perf trace: Warn the user when not availableArnaldo Carvalho de Melo1-2/+6
When the audit-libs devel package is not found at build time we disable the 'trace' command, as we are not able to map syscall numbers to strings, but then the message the user is presented is cryptic: [root@zoo linux]# trace ls perf: 'ls' is not a perf-command. See 'perf --help'. Fix it by presenting a more helpful message: [root@zoo linux]# trace l trace command not available: missing audit-libs devel package at build time. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Don Zickus <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-03Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar28-323/+1768
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core Pull perf/core improvements and fixes from Jiri Olsa: * Add support to accumulate hist periods (Namhyung Kim) Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2014-06-02uprobes: Teach copy_insn() to support tmpfsOleg Nesterov1-5/+10
tmpfs is widely used but as Denys reports shmem_aops doesn't have ->readpage() and thus you can't probe a binary on this filesystem. As Hugh suggested we can use shmem_read_mapping_page() in this case, just we need to check shmem_mapping() if ->readpage == NULL. Reported-by: Denys Vlasenko <[email protected]> Suggested-by: Hugh Dickins <[email protected]> Acked-by: Srikar Dronamraju <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]>
2014-06-02uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()Oleg Nesterov1-3/+3
copy_insn() fails with -EIO if ->readpage == NULL, but this error is not propagated unless uprobe_register() path finds ->mm which already mmaps this file. In this case (say) "perf record" does not actually install the probe, but the user can't know about this. Move this check into uprobe_register() so that this problem can be detected earlier and reported to user. Note: this is still not perfect, - copy_insn() and arch_uprobe_analyze_insn() should be called by uprobe_register() but this is not simple, we need vm_file for read_mapping_page() (although perhaps we can pass NULL), and we need ->mm for is_64bit_mm() (although this logic is broken anyway). - uprobe_register() should be called by create_trace_uprobe(), not by probe_event_enable(), so that an error can be detected at "perf probe -x" time. This also needs more changes in the core uprobe code, uprobe register/unregister interface was poorly designed from the very beginning. Reported-by: Denys Vlasenko <[email protected]> Acked-by: Srikar Dronamraju <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]>
2014-06-01Linux 3.15-rc8Linus Torvalds1-1/+1
2014-06-01Merge branch 'merge' of ↵Linus Torvalds3-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fix from Ben Herrenschmidt: "Here's just one trivial patch to wire up sys_renameat2 which I seem to have completely missed so far. (My test build scripts fwd me warnings but miss the ones generated for missing syscalls)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Wire renameat2() syscall
2014-06-01Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds11-30/+28
Pull MIPS fixes from Ralf Baechle: "A fair number of fixes across the field. Nothing terribly complicated; the one liners in below changelog should be fairly descriptive. Noteworthy is the SB1 change which the result of changes to binutils resulting in one big gas warning for most files being assembled as well as the asid_cache and branch emulation fixes which fix corruption or possible uninteded behaviour of kernel or application code. The remainder of fixes are more platforms or subsystem specific" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: R46000: Fix Micro-assembler field overflow for R4600 V2 MIPS: ptrace: Avoid smp_processor_id() in preemptible code MIPS: Lemote 2F: cs5536: mfgpt: use raw locks MIPS: SB1: Fix excessive kernel warnings. MIPS: RC32434: fix broken PCI resource initialization MIPS: malta: memory.c: Initialize the 'memsize' variable MIPS: Fix typo when reporting cache and ftlb errors for ImgTec cores MIPS: Fix inconsistancy of __NR_Linux_syscalls value MIPS: Fix branch emulation of branch likely instructions. MIPS: Fix a typo error in AUDIT_ARCH definition MIPS: Change type of asid_cache to unsigned long
2014-06-01Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds6-31/+78
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Various fixlets, mostly related to the (root-only) SCHED_DEADLINE policy, but also a hotplug bug fix and a fix for a NR_CPUS related overallocation bug causing a suspend/resume regression" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix hotplug vs. set_cpus_allowed_ptr() sched/cpupri: Replace NR_CPUS arrays sched/deadline: Replace NR_CPUS arrays sched/deadline: Restrict user params max value to 2^63 ns sched/deadline: Change sched_getparam() behaviour vs SCHED_DEADLINE sched: Disallow sched_attr::sched_policy < 0 sched: Make sched_setattr() correctly return -EFBIG
2014-06-02powerpc: Wire renameat2() syscallBenjamin Herrenschmidt3-1/+3
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-06-01perf tests: Add a test case for cumulating callchainsNamhyung Kim5-2/+735
Now it adds a new testcase to verify --children option working correctly. Signed-off-by: Namhyung Kim <[email protected]> Cc: Arun Sharma <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tests: Define and use symbolic names for fake symbolsNamhyung Kim5-62/+92
In various histogram test cases, fake symbols are used as raw numbers. Define macros for each pid, map, symbols so that it can increase readability somewhat. Signed-off-by: Namhyung Kim <[email protected]> Cc: Arun Sharma <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Reset output/sort order to defaultNamhyung Kim1-0/+3
When reset_output_field() is called, also reset field/sort order to NULL so that it can have the default values. It's needed for testing. Signed-off-by: Namhyung Kim <[email protected]> CC: Arun Sharma <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf ui/gtk: Fix callchain displayNamhyung Kim1-1/+9
With current output field change, GTK browser cannot display callchain information correctly since it couldn't determine where the symbol column is. This is a problem - just for now I changed to use the last column since it'll work for most cases. Also it has a same problem of the percentage as stdio code. Signed-off-by: Namhyung Kim <[email protected]> Cc: Arun Sharma <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf ui/stdio: Fix invalid percentage value of cumulated hist entriesNamhyung Kim1-1/+3
On stdio, there's a problem that it shows invalid values for callchains in cumulated hist entries. It's because it only cares about the self period. But with --children behavior, we always add callchain info to the cumulated entries so it should use the value in that case. Before: # Children Self Command Shared Object Symbol # ........ ........ ....... ................. ................ # 61.22% 0.32% swapper [kernel.kallsyms] [k] cpu_idle | --- cpu_idle | |--16530.76%-- start_secondary | |--2758.70%-- rest_init | start_kernel | x86_64_start_reservations | x86_64_start_kernel --6837850969203030.00%-- [...] After: # Children Self Command Shared Object Symbol # ........ ........ ....... ................. ................ # 61.22% 0.32% swapper [kernel.kallsyms] [k] cpu_idle | --- cpu_idle | |--85.70%-- start_secondary | --14.30%-- rest_init start_kernel x86_64_start_reservations x86_64_start_kernel Signed-off-by: Namhyung Kim <[email protected]> Cc: Arun Sharma <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Enable --children option by defaultNamhyung Kim1-5/+6
Now perf top and perf report will show children column by default if it has callchain information. Requested-by: Ingo Molnar <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Tested-by: Arun Sharma <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf top: Add top.children config optionNamhyung Kim1-0/+4
Add top.children config option for setting default value of callchain accumulation. It affects the output only if one of -g or --call-graph option is given as well. A user can write .perfconfig file like below to enable accumulation by default: $ cat ~/.perfconfig [top] children = true And it can be disabled through command line: $ perf top --no-children Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf top: Add --children optionNamhyung Kim2-1/+14
The --children option is for showing accumulated overhead (period) value as well as self overhead. It should be used with one of -g or --call-graph option. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf top: Convert to hist_entry_iterNamhyung Kim1-35/+41
Reuse hist_entry_iter__add() function to share the similar code with perf report. Note that it needs to be called with hists.lock so tweak some internal functions not to deadlock or hold the lock too long. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Add callback function to hist_entry_iterNamhyung Kim5-53/+84
The new ->add_entry_cb() will be called after an entry was added to the histogram. It's used for code sharing between perf report and perf top. Note that ops->add_*_entry() should set iter->he properly in order to call the ->add_entry_cb. Also pass @arg to the callback function. It'll be used by perf top later. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Do not auto-remove Children column if --fields givenNamhyung Kim1-0/+3
Depending on the configuration perf inserts/removes the Children column in the output automatically. But it might not be what user wants if [s]he give --fields option explicitly. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Arun Sharma <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf report: Add report.children config optionNamhyung Kim1-0/+4
Add report.children config option for setting default value of callchain accumulation. It affects the report output only if perf.data contains callchain info. A user can write .perfconfig file like below to enable accumulation by default: $ cat ~/.perfconfig [report] children = true And it can be disabled through command line: $ perf report --no-children Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf report: Add --children optionNamhyung Kim2-2/+20
The --children option is for showing accumulated overhead (period) value as well as self overhead. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Add more hpp helper functionsNamhyung Kim2-0/+21
Sometimes it needs to disable some columns at runtime. Add help functions to support that. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Apply percent-limit to cumulative percentageNamhyung Kim4-36/+31
If -g cumulative option is given, it needs to show entries which don't have self overhead. So apply percent-limit to accumulated overhead percentage in this case. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf ui/gtk: Add support to accumulated hist statNamhyung Kim1-0/+17
Print accumulated stat of a hist entry if requested. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf ui/browser: Add support to accumulated hist statNamhyung Kim1-0/+25
Print accumulated stat of a hist entry if requested. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf ui/hist: Add support to accumulated hist statNamhyung Kim3-0/+104
Print accumulated stat of a hist entry if requested. To do that, add new HPP_PERCENT_ACC_FNS macro and generate a perf_hpp_fmt using it. The __hpp__sort_acc() function sorts entries by accumulated period value. When accumulated periods of two entries are same (i.e. single path callchain) put the caller above since accumulation tends to put callers on higher position for obvious reason. Also add "overhead_children" output field to be selected by user. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Save callchain info for each cumulative entryNamhyung Kim1-2/+14
When accumulating callchain entry, also save current snapshot of the chain so that it can show the rest of the chain. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf callchain: Add callchain_cursor_snapshot()Namhyung Kim1-0/+9
The callchain_cursor_snapshot() is for saving current status of the callchain. It'll be used to accumulate callchain information for each node. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf report: Cache cumulative callchainsNamhyung Kim1-0/+42
It is possble that a callchain has cycles or recursive calls. In that case it'll end up having entries more than 100% overhead in the output. In order to prevent such entries, cache each callchain node and skip if same entry already cumulated. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Update cpumode for each cumulative entryNamhyung Kim3-11/+46
The cpumode and level in struct addr_localtion was set for a sample and but updated as cumulative callchains were added. This led to have non-matching symbol and cpumode in the output. Update it accordingly based on the fact whether the map is a part of the kernel or not. This is a reverse of what thread__find_addr_map() does. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf hists: Accumulate hist entry stat based on the callchainNamhyung Kim4-1/+101
Call __hists__add_entry() for each callchain node to get an accumulated stat for an entry. Introduce new cumulative_iter ops to process them properly. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf hists: Check if accumulated when adding a hist entryNamhyung Kim6-17/+26
To support callchain accumulation, @entry should be recognized if it's accumulated or not when add_hist_entry() called. The period of an accumulated entry should be added to ->stat_acc but not ->stat. Add @sample_self arg for that. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf hists: Add support for accumulated stat of hist entryNamhyung Kim3-2/+28
Maintain accumulated stat information in hist_entry->stat_acc if symbol_conf.cumulate_callchain is set. Fields in ->stat_acc have same vaules initially, and will be updated as callchain is processed later. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Introduce struct hist_entry_iterNamhyung Kim5-179/+372
There're some duplicate code when adding hist entries. They are different in that some have branch info or mem info but generally do same thing. So introduce new struct hist_entry_iter and add callbacks to customize each case in general way. The new perf_evsel__add_entry() function will look like: iter->prepare_entry(); iter->add_single_entry(); while (iter->next_entry()) iter->add_next_entry(); iter->finish_entry(); This will help further work like the cumulative callchain patchset. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arun Sharma <[email protected]> Tested-by: Rodrigo Campos <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-01perf tools: Introduce hists__inc_nr_samples()Namhyung Kim7-12/+13
There're some duplicate code for counting number of samples. Add hists__inc_nr_samples() and reuse it. Suggested-by: Jiri Olsa <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-05-31Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2-17/+67
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core futex/rtmutex fixes from Thomas Gleixner: "Three fixlets for long standing issues in the futex/rtmutex code unearthed by Dave Jones syscall fuzzer: - Add missing early deadlock detection checks in the futex code - Prevent user space from attaching a futex to kernel threads - Make the deadlock detector of rtmutex work again Looks large, but is more comments than code change" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rtmutex: Fix deadlock detector for real futex: Prevent attaching to kernel threads futex: Add another early deadlock detection check
2014-05-31Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds12-304/+298
Pull drm fixes from Dave Airlie: "Mostly quiet now: i915: fixing userspace visiblie issues, all stable marked radeon: one more pll fix, two crashers, one suspend/resume regression" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon: Resume fbcon last drm/radeon: only allocate necessary size for vm bo list drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission drm/radeon: avoid crash if VM command submission isn't available drm/radeon: lower the ref * post PLL maximum once more drm/i915: Prevent negative relocation deltas from wrapping drm/i915: Only copy back the modified fields to userspace from execbuffer drm/i915: Fix dynamic allocation of physical handles
2014-05-31dcache: add missing lockdep annotationLinus Torvalds1-1/+1
lock_parent() very much on purpose does nested locking of dentries, and is careful to maintain the right order (lock parent first). But because it didn't annotate the nested locking order, lockdep thought it might be a deadlock on d_lock, and complained. Add the proper annotation for the inner locking of the child dentry to make lockdep happy. Introduced by commit 046b961b45f9 ("shrink_dentry_list(): take parent's ->d_lock earlier"). Reported-and-tested-by: Josh Boyer <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-05-31drm/radeon: Resume fbcon lastDaniel Vetter1-5/+6
So a few people complained that commit 177cf92de4aa97ec1435987e91696ed8b5023130 Author: Daniel Vetter <[email protected]> Date: Tue Apr 1 22:14:59 2014 +0200 drm/crtc-helpers: fix dpms on logic which was merged into 3.15-rc1, broke resume on radeons. Strangely git bisect lead everyone to commit 25f397a429dfa43f22c278d0119a60a343aa568f Author: Daniel Vetter <[email protected]> Date: Fri Jul 19 18:57:11 2013 +0200 drm/crtc-helper: explicit DPMS on after modeset which was merged long ago and actually part of 3.14. Digging deeper I've noticed (again) that the call to drm_helper_resume_force_mode in the radeon resume handlers was a no-op previously because everything gets shut down on suspend. radeon does this with explicit calls to drm_helper_connector_dpms with DPMS_OFF. But with 177c we now force the dpms state to ON, so suddenly resume_force_mode actually forced the crtcs back on. This is the intention of the change after all, the problem is that radeon resumes the fbdev console layer _before_ restoring the display, through calling fb_set_suspend. And fbcon does an immediate ->set_par, which in turn causes the same forced mode restore to happen. Two concurrent modeset operations didn't lead to happiness. Fix this by delaying the fbcon resume until the end of the readeon resum functions. v2: Fix up a bit of the spelling fail. References: https://lkml.org/lkml/2014/5/29/1043 References: https://lkml.org/lkml/2014/5/2/388 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=74751 Tested-by: Ken Moffat <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Ken Moffat <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2014-05-31Merge branch 'drm-fixes-3.15' of ↵Dave Airlie3-8/+21
git://people.freedesktop.org/~deathsimple/linux into drm-fixes this is the next pull request for stashed up radeon fixes for 3.15. This is finally calming down with only four patches in this pull request. * 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux: drm/radeon: only allocate necessary size for vm bo list drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission drm/radeon: avoid crash if VM command submission isn't available drm/radeon: lower the ref * post PLL maximum once more