aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/arch/powerpc/util
AgeCommit message (Collapse)AuthorFilesLines
2020-09-10perf metricgroup: Pass pmu_event structure as a parameter for ↵Kajol Jain1-2/+5
arch_get_runtimeparam() This patch adds passing of pmu_event as a parameter in function 'arch_get_runtimeparam' which can be used to get details like if the event is percore/perchip. Signed-off-by: Kajol Jain <[email protected]> Acked-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: John Garry <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-10Merge tag 'perf-tools-2020-08-10' of ↵Linus Torvalds3-8/+71
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools updates from Arnaldo Carvalho de Melo: "New features: - Introduce controlling how 'perf stat' and 'perf record' works via a control file descriptor, allowing starting with events configured but disabled until commands are received via the control file descriptor. This allows, for instance for tools such as Intel VTune to make further use of perf as its Linux platform driver. - Improve 'perf record' to to register in a perf.data file header the clockid used to help later correlate things like syslog files and perf events recorded. - Add basic syscall and find_next_bit benchmarks to 'perf bench'. - Allow using computed metrics in calculating other metrics. For instance: { .metric_expr = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit", .metric_name = "DCache_L2_All_Hits", }, { .metric_expr = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss", .metric_name = "DCache_L2_All_Miss", }, { .metric_expr = "dcache_l2_all_hits + dcache_l2_all_miss", .metric_name = "DCache_L2_All", } - Add suport for 'd_ratio', '>' and '<' operators to the expression resolver used in calculating metrics in 'perf stat'. Support for new kernel features: - Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope with things like ftrace, trampolines, i.e. changes in the kernel text that gets in the way of properly decoding Intel PT hardware traces, for instance. Intel PT: - Add various knobs to reduce the volume of Intel PT traces by reducing the level of details such as decoding just some types of packets (e.g., FUP/TIP, PSB+), also filtering by time range. - Add new itrace options (log flags to the 'd' option, error flags to the 'e' one, etc), controlling how Intel PT is transformed into perf events, document some missing options (e.g., how to synthesize callchains). BPF: - Properly report BPF errors when parsing events. - Do not setup side-band events if LIBBPF is not linked, fixing a segfault. Libraries: - Improvements to the libtraceevent plugin mechanism. - Improve libtracevent support for KVM trace events SVM exit reasons. - Add a libtracevent plugins for decoding syscalls/sys_enter_futex and for tlb_flush. - Ensure sample_period is set libpfm4 events in 'perf test'. - Fixup libperf namespacing, to make sure what is in libperf has the perf_ namespace while what is now only in tools/perf/ doesn't use that prefix. Arch specific: - Improve the testing of vendor events and metrics in 'perf test'. - Allow no ARM CoreSight hardware tracer sink to be specified on command line. - Fix arm_spe_x recording when mixed with other perf events. - Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to list of idle symbols. - List kernel supplied event aliases for arm64 in 'perf list'. - Add support for extended register capability in PowerPC 9 and 10. - Added nest IMC power9 metric events. Miscellaneous: - No need to setup sample_regs_intr/sample_regs_user for dummy events. - Update various copies of kernel headers, some causing perf to handle new syscalls, MSRs, etc. - Improve usage of flex and yacc, enabling warnings and addressing the fallout. - Add missing '--output' option to 'perf kmem' so that it can pass it along to 'perf record'. - 'perf probe' fixes related to adding multiple probes on the same address for the same event. - Make 'perf probe' warn if the target function is a GNU indirect function. - Remove //anon mmap events from 'perf inject jit' to fix supporting both using ELF files for generated functions and the perf-PID.map approaches" * tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (144 commits) perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set perf tools powerpc: Add support for extended regs in power10 perf tools powerpc: Add support for extended register capability tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools arch x86: Sync asm/cpufeatures.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers UAPI: update linux/in.h copy tools headers API: Update close_range affected files perf script: Add 'tod' field to display time of day perf script: Change the 'enum perf_output_field' enumerators to be 64 bits perf data: Add support to store time of day in CTF data conversion perf tools: Move clockid_res_ns under clock struct perf header: Store clock references for -k/--clockid option perf tools: Add clockid_name function perf clockid: Move parse_clockid() to new clockid object tools lib traceevent: Handle possible strdup() error in tep_add_plugin_path() API libtraceevent: Fixed description of tep_add_plugin_path() API libtraceevent: Fixed type in PRINT_FMT_STING libtraceevent: Fixed broken indentation in parse_ip4_print_args() libtraceevent: Improve error handling of tep_plugin_add_option() API ...
2020-08-07perf tools powerpc: Add support for extended regs in power10Athira Rajeev1-0/+6
Added support for supported regs which are new in power10 ( MMCR3, SIER2, SIER3 ) to sample_reg_mask in the tool side to use with `-I?` option. Also added PVR check to send extended mask for power10 at kernel while capturing extended regs in each sample. Signed-off-by: Athira Jajeev <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Reviewed-by: Ravi Bangoria <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-07perf tools powerpc: Add support for extended register capabilityAnju T Sudhakar3-8/+65
Add extended regs to sample_reg_mask in the tool side to use with `-I?` option. Perf tools side uses extended mask to display the platform supported register names (with -I? option) to the user and also send this mask to the kernel to capture the extended registers in each sample. Hence decide the mask value based on the processor version. Currently definitions for `mfspr`, `SPRN_PVR` are part of `arch/powerpc/util/header.c`. Move this to a header file so that these definitions can be re-used in other source files as well. Signed-off-by: Anju T Sudhakar <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Reviewed-by: Madhavan Srinivasan <[email protected]> Reviewed--by: Ravi Bangoria <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: [email protected] [Decide extended mask at run time based on platform] Signed-off-by: Athira Jajeev <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-07-23KVM: PPC: Fix typo on H_DISABLE_AND_GET hcallLeonardo Bras1-1/+1
On PAPR+ the hcall() on 0x1B0 is called H_DISABLE_AND_GET, but got defined as H_DISABLE_AND_GETC instead. This define was introduced with a typo in commit <b13a96cfb055> ("[PATCH] powerpc: Extends HCALL interface for InfiniBand usage"), and was later used without having the typo noticed. Signed-off-by: Leonardo Bras <[email protected]> Acked-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-01perf libdw: Fix off-by 1 relative directory includesIan Rogers1-3/+3
This is currently working due to extra include paths in the build. Before: $ cd tools/perf/arch/arm64/util $ ls -la ../../util/unwind-libdw.h ls: cannot access '../../util/unwind-libdw.h': No such file or directory After: $ ls -la ../../../util/unwind-libdw.h -rw-r----- 1 irogers irogers 553 Apr 17 14:31 ../../../util/unwind-libdw.h Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-05-28perf powerpc: Don't ignore sym-handling.c fileRavi Bangoria1-0/+1
Commit 7eec00a74720 ("perf symbols: Consolidate symbol fixup issue") removed powerpc specific sym-handling.c file from Build. This wasn't caught by build CI because all functions in this file are declared as __weak in common code. Fix it. Fixes: 7eec00a74720 ("perf symbols: Consolidate symbol fixup issue") Reported-by: Sandipan Das <[email protected]> Signed-off-by: Ravi Bangoria <[email protected]> Reviewed-by: Leo Yan <[email protected]> Reviewed-by: Naveen N. Rao <[email protected]> Acked-by: Sandipan Das <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-05-05perf evsel: Rename perf_evsel__{str,int}val() and other tracepoint field ↵Arnaldo Carvalho de Melo1-1/+1
metehods to to evsel__*() As those are not 'struct evsel' methods, not part of tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-04-30perf metricgroups: Enhance JSON/metric infrastructure to handle "?"Kajol Jain1-0/+8
Patch enhances current metric infrastructure to handle "?" in the metric expression. The "?" can be use for parameters whose value not known while creating metric events and which can be replace later at runtime to the proper value. It also add flexibility to create multiple events out of single metric event added in JSON file. Patch adds function 'arch_get_runtimeparam' which is a arch specific function, returns the count of metric events need to be created. By default it return 1. This infrastructure needed for hv_24x7 socket/chip level events. "hv_24x7" chip level events needs specific chip-id to which the data is requested. Function 'arch_get_runtimeparam' implemented in header.c which extract number of sockets from sysfs file "sockets" under "/sys/devices/hv_24x7/interface/". With this patch basically we are trying to create as many metric events as define by runtime_param. For that one loop is added in function 'metricgroup__add_metric', which create multiple events at run time depend on return value of 'arch_get_runtimeparam' and merge that event in 'group_list'. To achieve that we are actually passing this parameter value as part of `expr__find_other` function and changing "?" present in metric expression with this value. As in our JSON file, there gonna be single metric event, and out of which we are creating multiple events. To understand which data count belongs to which parameter value, we also printing param value in generic_metric function. For example, command:# ./perf stat -M PowerBUS_Frequency -C 0 -I 1000 1.000101867 9,356,933 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0 1.000101867 9,366,134 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1 2.000314878 9,365,868 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0 2.000314878 9,366,092 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1 So, here _0 and _1 after PowerBUS_Frequency specify parameter value. Signed-off-by: Kajol Jain <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Anju T Sudhakar <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jin Yao <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mamatha Inamdar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-03-23perf symbols: Consolidate symbol fixup issueLeo Yan2-11/+0
After copying Arm64's perf archive with object files and perf.data file to x86 laptop, the x86's perf kernel symbol resolution fails. It outputs 'unknown' for all symbols parsing. This issue is root caused by the function elf__needs_adjust_symbols(), x86 perf tool uses one weak version, Arm64 (and powerpc) has rewritten their own version. elf__needs_adjust_symbols() decides if need to parse symbols with the relative offset address; but x86 building uses the weak function which misses to check for the elf type 'ET_DYN', so that it cannot parse symbols in Arm DSOs due to the wrong result from elf__needs_adjust_symbols(). The DSO parsing should not depend on any specific architecture perf building; e.g. x86 perf tool can parse Arm and Arm64 DSOs, vice versa. And confirmed by Naveen N. Rao that powerpc64 kernels are not being built as ET_DYN anymore and change to ET_EXEC. This patch removes the arch specific functions for Arm64 and powerpc and changes elf__needs_adjust_symbols() as a common function. In the common elf__needs_adjust_symbols(), it checks an extra condition 'ET_DYN' for elf header type. With this fixing, the Arm64 DSO can be parsed properly with x86's perf tool. Before: # perf script main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c46670 [unknown] ([kernel.kallsyms]) => ffff800010c4eaec [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eaec [unknown] ([kernel.kallsyms]) => ffff800010c4eb00 [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eb08 [unknown] ([kernel.kallsyms]) => ffff800010c4e780 [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4e7a0 [unknown] ([kernel.kallsyms]) => ffff800010c4eeac [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eebc [unknown] ([kernel.kallsyms]) => ffff800010c4ed80 [unknown] ([kernel.kallsyms]) After: # perf script main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c coresight_timeout+0x54 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c46670 coresight_timeout+0x68 ([kernel.kallsyms]) => ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms]) => ffff800010c4eb00 etm4_enable_hw+0x3e0 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eb08 etm4_enable_hw+0x3e8 ([kernel.kallsyms]) => ffff800010c4e780 etm4_enable_hw+0x60 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4e7a0 etm4_enable_hw+0x80 ([kernel.kallsyms]) => ffff800010c4eeac etm4_enable+0x2d4 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eebc etm4_enable+0x2e4 ([kernel.kallsyms]) => ffff800010c4ed80 etm4_enable+0x1a8 ([kernel.kallsyms]) v3: Changed to check for ET_DYN across all architectures. v2: Fixed Arm64 and powerpc native building. Reported-by: Mike Leach <[email protected]> Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Naveen N. Rao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Allison Randal <[email protected]> Cc: Enrico Weigelt <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Hendrik Brueckner <[email protected]> Cc: John Garry <[email protected]> Cc: Kate Stewart <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-03-06tools: Fix off-by 1 relative directory includesIan Rogers1-2/+2
This is currently working due to extra include paths in the build. Committer testing: $ cd tools/include/uapi/asm/ Before this patch: $ ls -la ../../arch/x86/include/uapi/asm/errno.h ls: cannot access '../../arch/x86/include/uapi/asm/errno.h': No such file or directory $ After this patch; $ ls -la ../../../arch/x86/include/uapi/asm/errno.h -rw-rw-r--. 1 acme acme 31 Feb 20 12:42 ../../../arch/x86/include/uapi/asm/errno.h $ Check that that is still under tools/, i.e. hasn't escaped into the main kernel sources: $ cd ../../../arch/x86/include/uapi/asm/ $ pwd /home/acme/git/perf/tools/arch/x86/include/uapi/asm $ Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexios Zavras <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Igor Lubashev <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wei Li <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-11-18perf parse: Report initial event parsing errorIan Rogers1-2/+2
Record the first event parsing error and report. Implementing feedback from Jiri Olsa: https://lkml.org/lkml/2019/10/28/680 An example error is: $ tools/perf/perf stat -e c/c/ WARNING: multiple event parsing errors event syntax error: 'c/c/' \___ unknown term valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore Initial error: event syntax error: 'c/c/' \___ Cannot find PMU `c'. Missing kernel support? Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Allison Randal <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Anju T Sudhakar <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-30perf tools: Propagate get_cpuid() errorArnaldo Carvalho de Melo1-1/+2
For consistency, propagate the exact cause for get_cpuid() to have failed. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-20perf kvm stat: Set 'trace_cycles' as default event for 'perf kvm record' in ↵Anju T Sudhakar1-4/+12
powerpc Use 'trace_imc/trace_cycles' as the default event for 'perf kvm record' in powerpc. Signed-off-by: Anju T Sudhakar <[email protected]> Reviewed-by: Ravi Bangoria <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] [ Add missing pmu.h header, needed because this patch uses pmu_have_event() ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-20perf kvm: Add arch neutral function to choose event for perf kvm recordAnju T Sudhakar1-0/+37
'perf kvm record' uses 'cycles'(if the user did not specify any event) as the default event to profile the guest. This will not provide any proper samples from the guest incase of powerpc architecture, since in powerpc the PMUs are controlled by the guest rather than the host. Patch adds a function to pick an arch specific event for 'perf kvm record', instead of selecting 'cycles' as a default event for all architectures. For powerpc this function checks for any user specified event, and if there isn't any it returns invalid instead of proceeding with 'cycles' event. Signed-off-by: Anju T Sudhakar <[email protected]> Reviewed-by: Ravi Bangoria <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-20perf callchain: Remove needless event.h includeArnaldo Carvalho de Melo1-0/+1
All we need is a bunch of struct forward declarations and then add event.h to the only place that was getting it indirectly via callchain.h. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-20perf tools: Remove util.h from where it is not neededArnaldo Carvalho de Melo2-2/+0
Check that it is not needed and remove, fixing up some fallout for places where it was only serving to get something else. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-20perf tools: Remove debug.h from places where it is not neededArnaldo Carvalho de Melo1-1/+0
Pruning a bit more the includes dependency tree. Building this thing on lots of containers takes time, we better reduce the time per build, each container is doing 6 builds when clang and clang-devel are available, and the plan is to do a 'make -C tools/perf build-test' that have many more. Also helps when doing normal development, as touching some random file will have a much reduced chance of triggering lots of rebuilds. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf symbols: Move mem_info and branch_info out of symbol.hArnaldo Carvalho de Melo1-0/+1
The mem_info struct goes to mem-events.h and branch_info goes to branch.h, where they belong, this way we can remove several headers from symbols.h and trim the include dependency tree more. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf symbols: Move symsrc prototypes to a separate headerArnaldo Carvalho de Melo1-0/+1
So that we can remove dso.h from symbol.h and reduce the header dependency tree. Fixup cases where struct dso guts are needed but were obtained via symbol.h, indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf event: Remove needless include directives from event.hArnaldo Carvalho de Melo1-0/+1
bpf.h and build-id.h are not needed at all in event.h, remove them. And fixup the fallout of files that were getting needed stuff from this now pruned include. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-29perf tools: Remove perf.h from source files not needing itArnaldo Carvalho de Melo1-1/+0
With the movement of lots of stuff out of perf.h to other headers we ended up not needing it in lots of places, remove it from those places. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-07-29perf evlist: Rename perf_evlist__new() to evlist__new()Jiri Olsa1-1/+1
Rename perf_evlist__new() to evlist__new(), so we don't have a name clash when we add perf_evlist__new() in libperf. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-07-29perf evlist: Rename struct perf_evlist to struct evlistJiri Olsa1-3/+3
Rename struct perf_evlist to struct evlist, so we don't have a name clash when we add struct perf_evlist in libperf. Committer notes: Added fixes to build on arm64, from Jiri and from me (tools/perf/util/cs-etm.c) Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-07-29perf evsel: Rename struct perf_evsel to struct evselJiri Olsa1-3/+3
Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-07-09tools lib: Adopt zalloc()/zfree() from tools/perfArnaldo Carvalho de Melo1-1/+1
Eroding a bit more the tools/perf/util/util.h hodpodge header. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-07-09perf tools: Add missing headers, mostly stdlib.hArnaldo Carvalho de Melo1-0/+2
Part of the erosion of util/util.h, that will lose its include stdlib.h, we need to add it to places where it is needed but was getting it indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner1-3/+1
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Enrico Weigelt <[email protected]> Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner3-15/+3
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-02-14perf tools: Rename build libperf to perfJiri Olsa1-9/+9
Rename build libperf to perf, because it's used to build perf. The libperf build object name will be used for libperf library. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-02-09Merge tag 'perf-core-for-mingo-5.1-20190206' of ↵Ingo Molnar2-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Hardware tracing: Adrian Hunter: - Handle calls optimized into jumps to a different symbol in the thread stack routines used to process hardware traces (Adrian Hunter) Intel PT: Adrian Hunter: - Fix overlap calculation for padding. - Fix CYC timestamp calculation after OVF. - Packet splitting can only happen in 32-bit. - Add timestamp to auxtrace errors. ARM CoreSight: Leo Yan: - Add last instruction information in packet - Set sample flags for instruction range, exception and return packets and for a trace discontinuity. - Add exception number in exception packet - Change tuple from traceID-CPU# to traceID-metadata - Add traceID in packet Mathieu Poirier: - Add "sinks" group to PMU directory - Use event attributes to send sink information to kernel - Remove set_drv_config() API, no longer used. perf annotate: Jiri Olsa: - Delay symbol annotation to the resort phase, speeding up 'perf report' startup. perf record: Alexey Budankov: - Allow binding userspace buffers to NUMA nodes. Symbols: Adrian Hunter: - Fix calculating of symbol sizes when splitting kallsyms into maps for kcore processing. Vendor events: William Cohen: - Intel: Fix Load_Miss_Real_Latency on CLX Misc: Arnaldo Carvalho de Melo: - Streamline headers, removing includes when all that is needed are just forward declarations, fixup the fallout for cases where headers should have been explicitely included but were instead obtained indirectly, by sheer luck. - Add fallback versions for CPU_{OR,EQUAL}(), so that code using it continue to build on older systems where those were not yet introduced or in systems using some other libc than the GNU one where those helpers aren't present. Documentation: Changbin Du: - Add documentation for BPF event selection. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2019-02-06perf powerpc kvm-stat: Add missing evlist.h headerArnaldo Carvalho de Melo1-0/+1
This header was being obtained indirectly, by sheer luck, add it. Cc: Adrian Hunter <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-02-06perf kvm stat: Replace kvm-stat.h includes with forward declarationsArnaldo Carvalho de Melo1-0/+1
To reduce the include header dependency tree and speed up perf builds. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-02-06perf powerpc: Add missing headers to skip-callchain-idx.cArnaldo Carvalho de Melo1-0/+3
It uses several structs but don't explicitely includes the headers where they are defined, getting them by sheer luck from one of the headers it includes, since those are being streamlined to avoid unnecessary rebuilds when changes are made to a random header, they will break, fix them now so that they continue to build. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Maynard Johnson <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-02-04perf mem/c2c: Fix perf_mem_events to support powerpcRavi Bangoria2-0/+12
PowerPC hardware does not have a builtin latency filter (--ldlat) for the "mem-load" event and perf_mem_events by default includes "/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to support "perf mem/c2c" on PowerPC. This patch depends on kernel side changes done my Madhavan: https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Dick Fowles <[email protected]> Cc: Don Zickus <[email protected]> Cc: Joe Mario <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08powerpc/perf: Update perf_regs structure to include MMCRAMadhavan Srinivasan1-0/+1
On each sample, Monitor Mode Control Register A (MMCRA) content is saved in pt_regs. MMCRA does not have a entry as-is in the pt_regs but instead, MMCRA content is saved in the "dsisr" register of pt_regs. Patch adds another entry to the perf_regs structure to include the "MMCRA" printing which internally maps to the "dsisr" of pt_regs. It also check for the MMCRA availability in the platform and present value accordingly mpe: This was the 2nd patch in a series with commit 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") but I accidentally only merged the 1st patch, so merge this one now. Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-12-20powerpc/perf: Update perf_regs structure to include SIERMadhavan Srinivasan1-0/+1
On each sample, Sample Instruction Event Register (SIER) content is saved in pt_regs. SIER does not have a entry as-is in the pt_regs but instead, SIER content is saved in the "dar" register of pt_regs. Patch adds another entry to the perf_regs structure to include the "SIER" printing which internally maps to the "dar" of pt_regs. It also check for the SIER availability in the platform and present value accordingly Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-10-09Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-nextPaul Mackerras1-1/+0
This merges in the "ppc-kvm" topic branch of the powerpc tree to get a series of commits that touch both general arch/powerpc code and KVM code. These commits will be merged both via the KVM tree and the powerpc tree. Signed-off-by: Paul Mackerras <[email protected]>
2018-10-09KVM: PPC: Book3S: Simplify external interrupt handlingPaul Mackerras1-1/+0
Currently we use two bits in the vcpu pending_exceptions bitmap to indicate that an external interrupt is pending for the guest, one for "one-shot" interrupts that are cleared when delivered, and one for interrupts that persist until cleared by an explicit action of the OS (e.g. an acknowledge to an interrupt controller). The BOOK3S_IRQPRIO_EXTERNAL bit is used for one-shot interrupt requests and BOOK3S_IRQPRIO_EXTERNAL_LEVEL is used for persisting interrupts. In practice BOOK3S_IRQPRIO_EXTERNAL never gets used, because our Book3S platforms generally, and pseries in particular, expect external interrupt requests to persist until they are acknowledged at the interrupt controller. That combined with the confusion introduced by having two bits for what is essentially the same thing makes it attractive to simplify things by only using one bit. This patch does that. With this patch there is only BOOK3S_IRQPRIO_EXTERNAL, and by default it has the semantics of a persisting interrupt. In order to avoid breaking the ABI, we introduce a new "external_oneshot" flag which preserves the behaviour of the KVM_INTERRUPT ioctl with the KVM_INTERRUPT_SET argument. Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-08-30perf probe powerpc: Ignore SyS symbols irrespective of endiannessSandipan Das1-1/+3
This makes sure that the SyS symbols are ignored for any powerpc system, not just the big endian ones. Reported-by: Naveen N. Rao <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Acked-by: Naveen N. Rao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Ravi Bangoria <[email protected]> Fixes: fb6d59423115 ("perf probe ppc: Use the right prefix when ignoring SyS symbols on ppc") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-08-09perf probe powerpc: Fix trace event post-processingSandipan Das1-1/+3
In some cases, a symbol may have multiple aliases. Attempting to add an entry probe for such symbols results in a probe being added at an incorrect location while it fails altogether for return probes. This is only applicable for binaries with debug information. During the arch-dependent post-processing, the offset from the start of the symbol at which the probe is to be attached is determined and added to the start address of the symbol to get the probe's location. In case there are multiple aliases, this offset gets added multiple times for each alias of the symbol and we end up with an incorrect probe location. This can be verified on a powerpc64le system as shown below. $ nm /lib/modules/$(uname -r)/build/vmlinux | grep "sys_open$" ... c000000000414290 T __se_sys_open c000000000414290 T sys_open $ objdump -d /lib/modules/$(uname -r)/build/vmlinux | grep -A 10 "<__se_sys_open>:" c000000000414290 <__se_sys_open>: c000000000414290: 19 01 4c 3c addis r2,r12,281 c000000000414294: 70 c4 42 38 addi r2,r2,-15248 c000000000414298: a6 02 08 7c mflr r0 c00000000041429c: e8 ff a1 fb std r29,-24(r1) c0000000004142a0: f0 ff c1 fb std r30,-16(r1) c0000000004142a4: f8 ff e1 fb std r31,-8(r1) c0000000004142a8: 10 00 01 f8 std r0,16(r1) c0000000004142ac: c1 ff 21 f8 stdu r1,-64(r1) c0000000004142b0: 78 23 9f 7c mr r31,r4 c0000000004142b4: 78 1b 7e 7c mr r30,r3 For both the entry probe and the return probe, the probe location should be _text+4276888 (0xc000000000414298). Since another alias exists for 'sys_open', the post-processing code will end up adding the offset (8 for powerpc64le) twice and perf will attempt to add the probe at _text+4276896 (0xc0000000004142a0) instead. Before: # perf probe -v -a sys_open probe-definition(0): sys_open symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Writing event: p:probe/sys_open _text+4276896 Added new event: probe:sys_open (on sys_open) ... # perf probe -v -a sys_open%return $retval probe-definition(0): sys_open%return symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/README write=0 Opening /sys/kernel/debug/tracing/kprobe_events write=1 Parsing probe_events: p:probe/sys_open _text+4276896 Group:probe Event:sys_open probe:p Writing event: r:probe/sys_open__return _text+4276896 Failed to write event: Invalid argument Error: Failed to add events. Reason: Invalid argument (Code: -22) After: # perf probe -v -a sys_open probe-definition(0): sys_open symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Writing event: p:probe/sys_open _text+4276888 Added new event: probe:sys_open (on sys_open) ... # perf probe -v -a sys_open%return $retval probe-definition(0): sys_open%return symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null) 0 arguments Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_open address found : c000000000414290 Matched function: __se_sys_open [2ad03a0] Probe point found: __se_sys_open+0 Found 1 probe_trace_events. Opening /sys/kernel/debug/tracing/README write=0 Opening /sys/kernel/debug/tracing/kprobe_events write=1 Parsing probe_events: p:probe/sys_open _text+4276888 Group:probe Event:sys_open probe:p Writing event: r:probe/sys_open__return _text+4276888 Added new event: probe:sys_open__return (on sys_open%return) ... Reported-by: Aneesh Kumar <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Acked-by: Naveen N. Rao <[email protected]> Cc: Aneesh Kumar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Ravi Bangoria <[email protected]> Fixes: 99e608b5954c ("perf probe ppc64le: Fix probe location when using DWARF") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-07-24perf powerpc: Fix callchain ip filtering when return address is in a registerSandipan Das1-2/+6
For powerpc64, perf will filter out the second entry in the callchain, i.e. the LR value, if the return address of the function corresponding to the probed location has already been saved on its caller's stack. The state of the return address is determined using debug information. At any point within a function, if the return address is already saved somewhere, a DWARF expression can tell us about its location. If the return address in still in LR only, no DWARF expression would exist. Typically, the instructions in a function's prologue first copy the LR value to R0 and then pushes R0 on to the stack. If LR has already been copied to R0 but R0 is yet to be pushed to the stack, we can still get a DWARF expression that says that the return address is in R0. This is indicating that getting a DWARF expression for the return address does not guarantee the fact that it has already been saved on the stack. This can be observed on a powerpc64le system running Fedora 27 as shown below. # objdump -d /usr/lib64/libc-2.26.so | less ... 000000000015af20 <inet_pton>: 15af20: 0b 00 4c 3c addis r2,r12,11 15af24: e0 c1 42 38 addi r2,r2,-15904 15af28: a6 02 08 7c mflr r0 15af2c: f0 ff c1 fb std r30,-16(r1) 15af30: f8 ff e1 fb std r31,-8(r1) 15af34: 78 1b 7f 7c mr r31,r3 15af38: 78 23 83 7c mr r3,r4 15af3c: 78 2b be 7c mr r30,r5 15af40: 10 00 01 f8 std r0,16(r1) 15af44: c1 ff 21 f8 stdu r1,-64(r1) 15af48: 28 00 81 f8 std r4,40(r1) ... # readelf --debug-dump=frames-interp /usr/lib64/libc-2.26.so | less ... 00027024 0000000000000024 00027028 FDE cie=00000000 pc=000000000015af20..000000000015af88 LOC CFA r30 r31 ra 000000000015af20 r1+0 u u u 000000000015af34 r1+0 c-16 c-8 r0 000000000015af48 r1+64 c-16 c-8 c+16 000000000015af5c r1+0 c-16 c-8 c+16 000000000015af78 r1+0 u u ... # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x18 # perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1 # perf script Before: ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38) 7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so) 7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 12f152d70 _init+0xbfc (/usr/bin/ping) 7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) After: ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38) 7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so) 7fff7e26fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so) 7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 12f152d70 _init+0xbfc (/usr/bin/ping) 7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) Reported-by: Ravi Bangoria <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Maynard Johnson <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: http://lkml.kernel.org/r/66e848a7bdf2d43b39210a705ff6d828a0865661.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-07-24perf powerpc: Fix callchain ip filteringSandipan Das1-1/+1
For powerpc64, redundant entries in the callchain are filtered out by determining the state of the return address and the stack frame using DWARF debug information. For making these filtering decisions we must analyze the debug information for the location corresponding to the program counter value, i.e. the first entry in the callchain, and not the LR value; otherwise, perf may filter out either the second or the third entry in the callchain incorrectly. This can be observed on a powerpc64le system running Fedora 27 as shown below. Case 1 - Attaching a probe at inet_pton+0x8 (binary offset 0x15af28). Return address is still in LR and a new stack frame is not yet allocated. The LR value, i.e. the second entry, should not be filtered out. # objdump -d /usr/lib64/libc-2.26.so | less ... 000000000010eb10 <gaih_inet.constprop.7>: ... 10fa48: 78 bb e4 7e mr r4,r23 10fa4c: 0a 00 60 38 li r3,10 10fa50: d9 b4 04 48 bl 15af28 <inet_pton+0x8> 10fa54: 00 00 00 60 nop 10fa58: ac f4 ff 4b b 10ef04 <gaih_inet.constprop.7+0x3f4> ... 0000000000110450 <getaddrinfo>: ... 1105a8: 54 00 ff 38 addi r7,r31,84 1105ac: 58 00 df 38 addi r6,r31,88 1105b0: 69 e5 ff 4b bl 10eb18 <gaih_inet.constprop.7+0x8> 1105b4: 78 1b 71 7c mr r17,r3 1105b8: 50 01 7f e8 ld r3,336(r31) ... 000000000015af20 <inet_pton>: 15af20: 0b 00 4c 3c addis r2,r12,11 15af24: e0 c1 42 38 addi r2,r2,-15904 15af28: a6 02 08 7c mflr r0 15af2c: f0 ff c1 fb std r30,-16(r1) 15af30: f8 ff e1 fb std r31,-8(r1) ... # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x8 # perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1 # perf script Before: ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28) 7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) 7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 13fb52d70 _init+0xbfc (/usr/bin/ping) 7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) After: ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28) 7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so) 7fffa7d6fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so) 7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 13fb52d70 _init+0xbfc (/usr/bin/ping) 7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) Case 2 - Attaching a probe at _int_malloc+0x180 (binary offset 0x9cf10). Return address in still in LR and a new stack frame has already been allocated but not used. The caller's caller, i.e. the third entry, is invalid and should be filtered out and not the second one. # objdump -d /usr/lib64/libc-2.26.so | less ... 000000000009cd90 <_int_malloc>: 9cd90: 17 00 4c 3c addis r2,r12,23 9cd94: 70 a3 42 38 addi r2,r2,-23696 9cd98: 26 00 80 7d mfcr r12 9cd9c: f8 ff e1 fb std r31,-8(r1) 9cda0: 17 00 e4 3b addi r31,r4,23 9cda4: d8 ff 61 fb std r27,-40(r1) 9cda8: 78 23 9b 7c mr r27,r4 9cdac: 1f 00 bf 2b cmpldi cr7,r31,31 9cdb0: f0 ff c1 fb std r30,-16(r1) 9cdb4: b0 ff c1 fa std r22,-80(r1) 9cdb8: 78 1b 7e 7c mr r30,r3 9cdbc: 08 00 81 91 stw r12,8(r1) 9cdc0: 11 ff 21 f8 stdu r1,-240(r1) 9cdc4: 4c 01 9d 41 bgt cr7,9cf10 <_int_malloc+0x180> 9cdc8: 20 00 a4 2b cmpldi cr7,r4,32 ... 9cf08: 00 00 00 60 nop 9cf0c: 00 00 42 60 ori r2,r2,0 9cf10: e4 06 ff 7b rldicr r31,r31,0,59 9cf14: 40 f8 a4 7f cmpld cr7,r4,r31 9cf18: 68 05 9d 41 bgt cr7,9d480 <_int_malloc+0x6f0> ... 000000000009e3c0 <tcache_init.part.4>: ... 9e420: 40 02 80 38 li r4,576 9e424: 78 fb e3 7f mr r3,r31 9e428: 71 e9 ff 4b bl 9cd98 <_int_malloc+0x8> 9e42c: 00 00 a3 2f cmpdi cr7,r3,0 9e430: 78 1b 7e 7c mr r30,r3 ... 000000000009f7a0 <__libc_malloc>: ... 9f8f8: 00 00 89 2f cmpwi cr7,r9,0 9f8fc: 1c ff 9e 40 bne cr7,9f818 <__libc_malloc+0x78> 9f900: c9 ea ff 4b bl 9e3c8 <tcache_init.part.4+0x8> 9f904: 00 00 00 60 nop 9f908: e8 90 22 e9 ld r9,-28440(r2) ... # perf probe -x /usr/lib64/libc-2.26.so -a _int_malloc+0x180 # perf record -e probe_libc:_int_malloc -g ./test-malloc # perf script Before: test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10) 7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so) 7fffa6dd0000 [unknown] (/usr/lib64/libc-2.26.so) 7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so) 7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so) 100006b4 main+0x38 (/home/testuser/test-malloc) 7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) After: test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10) 7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so) 7fffa6e6e42c tcache_init.part.4+0x6c (/usr/lib64/libc-2.26.so) 7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so) 7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so) 100006b4 main+0x38 (/home/sandipan/test-malloc) 7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) Signed-off-by: Sandipan Das <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Maynard Johnson <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Fixes: a60335ba3298 ("perf tools powerpc: Adjust callchain based on DWARF debug info") Link: http://lkml.kernel.org/r/24bb726d91ed173aebc972ec3f41a2ef2249434e.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-06-25perf report powerpc: Fix crash if callchain is emptySandipan Das1-1/+1
For some cases, the callchain provided by the kernel may be empty. So, the callchain ip filtering code will cause a crash if we do not check whether the struct ip_callchain pointer is NULL before accessing any members. This can be observed on a powerpc64le system running Fedora 27 as shown below. # perf record -b -e cycles:u ls Before: # perf report --branch-history perf: Segmentation fault -------- backtrace -------- perf[0x1027615c] linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x7fff856304d8] perf(arch_skip_callchain_idx+0x44)[0x10257c58] perf[0x1017f2e4] perf(thread__resolve_callchain+0x124)[0x1017ff5c] perf(sample__resolve_callchain+0xf0)[0x10172788] ... After: # perf report --branch-history Samples: 25 of event 'cycles:u', Event count (approx.): 2306870 Overhead Source:Line Symbol Shared Object + 11.60% _init+35736 [.] _init ls + 9.84% strcoll_l.c:137 [.] __strcoll_l libc-2.26.so + 9.16% memcpy.S:175 [.] __memcpy_power7 libc-2.26.so + 9.01% gconv_charset.h:54 [.] _nl_find_locale libc-2.26.so + 8.87% dl-addr.c:52 [.] _dl_addr libc-2.26.so + 8.83% _init+236 [.] _init ls ... Reported-by: Ravi Bangoria <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Acked-by: Ravi Bangoria <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-04-26perf thread: Introduce thread__find_symbol()Arnaldo Carvalho de Melo1-2/+1
Out of thread__find_addr_location(..., MAP__FUNCTION, ...), idea here is to continue removing references to MAP__{FUNCTION,VARIABLE} ahead of getting both types of symbols in the same rbtree, as various places do two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE. So thread__find_symbol() will eventually do just that, and 'struct symbol' will have the symbol type, for code that cares about that. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-12-27perf probe: Find versioned symbols from mapMasami Hiramatsu1-0/+8
Commit d80406453ad4 ("perf symbols: Allow user probes on versioned symbols") allows user to find default versioned symbols (with "@@") in map. However, it did not enable normal versioned symbol (with "@") for perf-probe. E.g. ===== # ./perf probe -x /lib64/libc-2.25.so malloc_get_state Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so Error: Failed to add events. ===== This solves above issue by improving perf-probe symbol search function, as below. ===== # ./perf probe -x /lib64/libc-2.25.so malloc_get_state Added new event: probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so) You can now use it in all perf tools, such as: perf record -e probe_libc:malloc_get_state -aR sleep 1 # ./perf probe -l probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so) ===== Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Thomas Richter <[email protected]> Acked-by: Ravi Bangoria <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Clarke <[email protected]> Cc: bhargavb <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-12-05perf pmu: Pass pmu as a parameter to get_cpuid_str()Ganapatrao Kulkarni1-1/+1
The cpuid string will not be same on all CPUs on heterogeneous platforms like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid string from associated CPUs of PMU CORE device. Also optimise arguments to function pmu_add_cpu_aliases. Signed-off-by: Ganapatrao Kulkarni <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Jayachandran C <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Robert Richter <[email protected]> Cc: Shaokun Zhang <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman6-0/+6
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Philippe Ombredanne <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-07-18perf probe: Allow placing uprobes in alternate namespaces.Krister Johansen1-1/+1
Teaches perf how to place a uprobe on a file that's in a different mount namespace. The user must add the probe using the --target-ns argument to perf probe. Once it has been placed, it may be recorded against without further namespace-specific commands. Signed-off-by: Krister Johansen <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> [ PPC build fixed by Ravi: ] Link: http://lkml.kernel.org/r/1500287542-6219-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Cc: Thomas-Mich Richter <[email protected]> [ Fix !HAVE_DWARF_SUPPORT build ] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-06-21perf unwind: Support for powerpcPaolo Bonzini2-0/+75
Porting PPC to libdw only needs an architecture-specific hook to move the register state from perf to libdw. The ARM and x86 architectures already use libdw, and it is useful to have as much common code for the unwinder as possible. Mark Wielaard has contributed a frame-based unwinder to libdw, so that unwinding works even for binaries that do not have CFI information. In addition, libunwind is always preferred to libdw by the build machinery so this cannot introduce regressions on machines that have both libunwind and libdw installed. Signed-off-by: Paolo Bonzini <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Milian Wolff <[email protected]> Acked-by: Ravi Bangoria <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>