Commit graph

11588 commits

Author SHA1 Message Date
Jiri Olsa
9d88a1a170 perf tools: Move clockid_res_ns under clock struct
Move the clockid_res_ns struct member to the clock struct, so we have
the clock related stuff in one place.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:42:20 -03:00
Jiri Olsa
d1e325cf40 perf header: Store clock references for -k/--clockid option
Add a new CLOCK_DATA feature that stores reference times when
-k/--clockid option is specified.

It contains the clock id and its reference time together with wall clock
time taken at the 'same time', both values are in nanoseconds.

The format of data is as below:

  struct {
       u32 version;  /* version = 1 */
       u32 clockid;
       u64 wall_clock_ns;
       u64 clockid_time_ns;
  };

This clock reference times will be used in following changes to display
wall clock for perf events.

It's available only for recording with clockid specified, because it's
the only case where we can get reference time to wallclock time. It's
can't do that with perf clock yet.

Committer testing:

  $ perf record -h -k

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -k, --clockid <clockid>
                            clockid to use for events, see clock_gettime()

  $ perf record -k monotonic sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.017 MB perf.data (8 samples) ]
  $ perf report --header-only | grep clockid -A1
  # event : name = cycles:u, , id = { 88815, 88816, 88817, 88818, 88819, 88820, 88821, 88822 }, size = 120, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, exclude_kernel = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, ksymbol = 1, bpf_event = 1, clockid = 1
  # CPU_TOPOLOGY info available, use -I to display
  --
  # clockid frequency: 1000 MHz
  # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
  # clockid: monotonic (1)
  # reference time: 2020-08-06 09:40:21.619290 = 1596717621.619290 (TOD) = 21931.077673635 (monotonic)
  $

Original-patch-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:35:06 -03:00
Jiri Olsa
cc3365bbd0 perf tools: Add clockid_name function
Add the clockid_name() function to get the clock name based on its
clockid.  It will be used in the following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:33:57 -03:00
Jiri Olsa
6953beb4dd perf clockid: Move parse_clockid() to new clockid object
Move parse_clockid and all needed clcckid related stuff into clockid
object. We are going to add clockid_name function in following change,
so it's better it's placed in separated object and not in
builtin-record.c.

No functional change is intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Geneviève Bastien <gbastien@versatic.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200805093444.314999-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:30:52 -03:00
Alexey Budankov
4b0297ef8a perf evsel: Extend message to mention CAP_SYS_PTRACE and perf security doc link
Adjust limited access message to mention CAP_SYS_PTRACE capability for
processes of unprivileged users. Add link to perf security document in
the end of the section about capabilities.

The change has been inspired by this discussion:
https://lore.kernel.org/lkml/20200722113007.GI77866@kernel.org/

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/6f8a7425-6e7d-19aa-1605-e59836b9e2a6@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:04:23 -03:00
Adrian Hunter
347a7389a7 perf intel-pt: Add support for decoding PSB+ only
A single q option decodes ip from only FUP/TIP packets. Make it so that
repeating the q option (i.e. qq) decodes only PSB+, getting ip if there
is a FUP packet within PSB+ (i.e. between PSB and PSBEND).

Example:

 $ perf record -e intel_pt//u grep -rI pudding drivers
 [ perf record: Woken up 52 times to write data ]
 [ perf record: Captured and wrote 57.870 MB perf.data ]
 $ time perf script --itrace=bi | wc -l
 58948289

 real    1m23.863s
 user    1m23.251s
 sys     0m7.452s
 $ time perf script --itrace=biq | wc -l
 3385694

 real    0m4.453s
 user    0m4.455s
 sys     0m0.328s
 $ time perf script --itrace=biqq | wc -l
 1883

 real    0m0.047s
 user    0m0.043s
 sys     0m0.009s

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:02:43 -03:00
Adrian Hunter
7c1b16ba0e perf intel-pt: Add support for decoding FUP/TIP only
Use the new itrace 'q' option to add support for a mode of decoding that
ignores TNT, does not walk object code, but gets the ip from FUP and TIP
packets.

Example:

 $ perf record -e intel_pt//u grep -rI pudding drivers
 [ perf record: Woken up 52 times to write data ]
 [ perf record: Captured and wrote 57.870 MB perf.data ]
 $ time perf script --itrace=bi | wc -l
 58948289

 real    1m23.863s
 user    1m23.251s
 sys     0m7.452s
 $ time perf script --itrace=biq | wc -l
 3385694

 real    0m4.453s
 user    0m4.455s
 sys     0m0.328s

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 09:02:14 -03:00
Adrian Hunter
51971536ef perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding
The 'q' option is for modes of decoding that are quicker because they
skip or omit decoding some aspects of trace data.

If supported, the 'q' option may be repeated to increase the effect.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:24:03 -03:00
Adrian Hunter
d4575f5fce perf intel-pt: Time filter logged perf events
Change the debug logging (when used with the --time option) to time
filter logged perf events, but allow that to be overridden by using
"d+a" instead of plain "d".

That can reduce the size of the log file.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:23:19 -03:00
Adrian Hunter
8b83fccdd2 perf intel-pt: Use itrace debug log flags to suppress some messages
The "d" option may be followed by flags which affect what debug messages
will or will not be logged. Each flag must be preceded by either '+' or
'-'. The flags support by Intel PT are:

		-a	Suppress logging of perf events

Suppressing perf events is useful for decreasing the size of the log.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:23:00 -03:00
Adrian Hunter
935aac2d2d perf auxtrace: Add optional log flags to the itrace 'd' option
Allow the 'd' option to be followed by flags which will affect what debug
messages will or will not be reported. Each flag must be preceded by either
'+' or '-'. The flags are:
	a	all perf events

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:22:38 -03:00
Adrian Hunter
1d846aeb86 perf intel-pt: Use itrace error flags to suppress some errors
The itrace "e" option may be followed by flags which affect what errors
will or will not be reported.  Each flag must be preceded by either '+' or '-'.
The flags supported by Intel PT are:

		-o	Suppress overflow errors
		-l	Suppress trace data lost errors
For example, for errors but not overflow or data lost errors:

	--itrace=e-o-l

Suppressing those errors can be useful for testing and debugging because
they are not due to decoding.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:22:07 -03:00
Adrian Hunter
cb971438b7 perf auxtrace: Add optional error flags to the itrace 'e' option
Allow the 'e' option to be followed by flags which will affect what errors
will or will not be reported. Each flag must be preceded by either '+' or
'-'. The flags are:
	o	overflow
	l	trace data lost

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:21:31 -03:00
Adrian Hunter
1e8f786944 perf auxtrace: Add missing itrace options to help text
Add missing itrace options o, G and L.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:21:06 -03:00
Adrian Hunter
2c9a11af84 perf tools: Improve aux_output not supported error
For example:

 Before:
   $ perf record -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -- ls -l
   Error:
   branch-loads: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

 After:
   $ perf record -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -- ls -l
   Error:
   branch-loads: PMU Hardware doesn't support 'aux_output' feature

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:20:17 -03:00
Adrian Hunter
a58a057ce6 perf intel-pt: Fix duplicate branch after CBR
CBR events can result in a duplicate branch event, because the state
type defaults to a branch. Fix by clearing the state type.

Example: trace 'sleep' and hope for a frequency change

 Before:

   $ perf record -e intel_pt//u sleep 0.1
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.034 MB perf.data ]
   $ perf script --itrace=bpe > before.txt

 After:

   $ perf script --itrace=bpe > after.txt
   $ diff -u before.txt after.txt
   --- before.txt  2020-07-07 14:42:18.191508098 +0300
   +++ after.txt   2020-07-07 14:42:36.587891753 +0300
   @@ -29673,7 +29673,6 @@
               sleep 93431 [007] 15411.619905:          1  branches:u:                 0 [unknown] ([unknown]) =>     7f0818abb2e0 clock_nanosleep@@GLIBC_2.17+0x0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.619905:          1  branches:u:      7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>                0 [unknown] ([unknown])
               sleep 93431 [007] 15411.720069:         cbr:  cbr: 15 freq: 1507 MHz ( 56%)         7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
   -           sleep 93431 [007] 15411.720069:          1  branches:u:      7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>                0 [unknown] ([unknown])
               sleep 93431 [007] 15411.720076:          1  branches:u:                 0 [unknown] ([unknown]) =>     7f0818abb30e clock_nanosleep@@GLIBC_2.17+0x2e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.720077:          1  branches:u:      7f0818abb323 clock_nanosleep@@GLIBC_2.17+0x43 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>     7f0818ac0eb7 __nanosleep+0x17 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.720077:          1  branches:u:      7f0818ac0ebf __nanosleep+0x1f (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>     55cb7e4c2827 rpl_nanosleep+0x97 (/usr/bin/sleep)

Fixes: 91de8684f1 ("perf intel-pt: Cater for CBR change in PSB+")
Fixes: abe5a1d3e4 ("perf intel-pt: Decoder to output CBR changes immediately")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200710151104.15137-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:18:32 -03:00
Adrian Hunter
401136bb08 perf intel-pt: Fix FUP packet state
While walking code towards a FUP ip, the packet state is
INTEL_PT_STATE_FUP or INTEL_PT_STATE_FUP_NO_TIP. That was mishandled
resulting in the state becoming INTEL_PT_STATE_IN_SYNC prematurely.  The
result was an occasional lost EXSTOP event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200710151104.15137-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:17:24 -03:00
Arnaldo Carvalho de Melo
94fb1afb14 Mgerge remote-tracking branch 'torvalds/master' into perf/core
To sync headers, for instance, in this case tools/perf was ahead of
upstream till Linus merged tip/perf/core to get the
PERF_RECORD_TEXT_POKE changes:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
  diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-06 08:15:47 -03:00
Jin Yao
c4735d9902 perf evsel: Don't set sample_regs_intr/sample_regs_user for dummy event
Since commit 0a892c1c94 ("perf record: Add dummy event during system wide synthesis"),
a dummy event is added to capture mmaps.

But if we run perf-record as,

 # perf record -e cycles:p -IXMM0 -a -- sleep 1
 Error:
 dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'

The issue is, if we enable the extended regs (-IXMM0), but the
pmu->capabilities is not set with PERF_PMU_CAP_EXTENDED_REGS, the kernel
will return -EOPNOTSUPP error.

See following code:

/* in kernel/events/core.c */
static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)

{
        ....
        if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) &&
            has_extended_regs(event))
                ret = -EOPNOTSUPP;
        ....
}

For software dummy event, the PMU should not be set with
PERF_PMU_CAP_EXTENDED_REGS. But unfortunately now, the dummy
event has possibility to be set with PERF_REG_EXTENDED_MASK bit.

In evsel__config, /* tools/perf/util/evsel.c */

if (opts->sample_intr_regs) {
        attr->sample_regs_intr = opts->sample_intr_regs;
}

If we use -IXMM0, the attr>sample_regs_intr will be set with
PERF_REG_EXTENDED_MASK bit.

It doesn't make sense to set attr->sample_regs_intr for a
software dummy event.

This patch adds dummy event checking before setting
attr->sample_regs_intr and attr->sample_regs_user.

After:
  # ./perf record -e cycles:p -IXMM0 -a -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]

Committer notes:

Adrian said this when providing his Acked-by:

"
This is fine.  It will not break PT.

no_aux_samples is useful for evsels that have been added by the code rather
than requested by the user.  For old kernels PT adds sched_switch tracepoint
to track context switches (before the current context switch event was
added) and having auxiliary sample information unnecessarily uses up space
in the perf buffer.
"

Fixes: 0a892c1c94 ("perf record: Add dummy event during system wide synthesis")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200720010013.18238-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 09:04:31 -03:00
Alexey Budankov
1d078ccb33 perf record: Introduce --control fd:ctl-fd[,ack-fd] options
Introduce --control fd:ctl-fd[,ack-fd] options to pass open file
descriptors numbers from command line.

Extend perf-record.txt file with --control fd:ctl-fd[,ack-fd] options
description.

Document possible usage model introduced by --control fd:ctl-fd[,ack-fd]
options by providing example bash shell script.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/8dc01e1a-3a80-3f67-5385-4bc7112b0dd3@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 08:50:52 -03:00
Alexey Budankov
acce022394 perf record: Implement control commands handling
Implement handling of 'enable' and 'disable' control commands coming
from control file descriptor.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/f0fde590-1320-dca1-39ff-da3322704d3b@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 08:50:28 -03:00
Alexey Budankov
68cd3b45b9 perf record: Extend -D,--delay option with -1 value
Extend -D,--delay option with -1 to start collection with events
disabled to be enabled later by 'enable' command provided via control
file descriptor.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/3e7d362c-7973-ee5d-e81e-c60ea22432c3@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 08:50:04 -03:00
Alexey Budankov
27e9769aad perf stat: Introduce --control fd:ctl-fd[,ack-fd] options
Introduce --control fd:ctl-fd[,ack-fd] options to pass open file
descriptors numbers from command line. Extend perf-stat.txt file with
--control fd:ctl-fd[,ack-fd] options description. Document possible
usage model introduced by --control fd:ctl-fd[,ack-fd] options by
providing example bash shell script.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/feabd5cf-0155-fb0a-4587-c71571f2d517@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-04 08:48:58 -03:00
Arnaldo Carvalho de Melo
b1aa3db2c1 Merge remote-tracking branch 'torvalds/master' into perf/core
Minor conflict in tools/perf/arch/arm/util/auxtrace.c as one fix there
was cherry-picked for the last perf/urgent pull req to Linus, so was
already there.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-03 09:37:31 -03:00
David S. Miller
bd0b33b248 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Resolved kernel/bpf/btf.c using instructions from merge commit
69138b34a7

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-02 01:02:12 -07:00
Ian Rogers
7c43b0c1d4 perf bench: Add benchmark of find_next_bit
for_each_set_bit, or similar functions like for_each_cpu, may be hot
within the kernel. If many bits were set then one could imagine on Intel
a "bt" instruction with every bit may be faster than the function call
and word length find_next_bit logic. Add a benchmark to measure this.

This benchmark on AMD rome and Intel skylakex shows "bt" is not a good
option except for very small bitmaps.

Committer testing:

  # perf bench
  Usage:
  	perf bench [<common options>] <collection> <benchmark> [<options>]

          # List of all available benchmark collections:

           sched: Scheduler and IPC benchmarks
         syscall: System call benchmarks
             mem: Memory access benchmarks
            numa: NUMA scheduling and MM benchmarks
           futex: Futex stressing benchmarks
           epoll: Epoll stressing benchmarks
       internals: Perf-internals benchmarks
             all: All benchmarks

  # perf bench mem

          # List of available benchmarks for collection 'mem':

          memcpy: Benchmark for memcpy() functions
          memset: Benchmark for memset() functions
        find_bit: Benchmark for find_bit() functions
             all: Run all memory access benchmarks

  # perf bench mem find_bit
  # Running 'mem/find_bit' benchmark:
  100000 operations 1 bits set of 1 bits
    Average for_each_set_bit took: 730.200 usec (+- 6.468 usec)
    Average test_bit loop took:    366.200 usec (+- 4.652 usec)
  100000 operations 1 bits set of 2 bits
    Average for_each_set_bit took: 781.000 usec (+- 24.247 usec)
    Average test_bit loop took:    550.200 usec (+- 4.152 usec)
  100000 operations 2 bits set of 2 bits
    Average for_each_set_bit took: 1113.400 usec (+- 112.340 usec)
    Average test_bit loop took:    1098.500 usec (+- 182.834 usec)
  100000 operations 1 bits set of 4 bits
    Average for_each_set_bit took: 843.800 usec (+- 8.772 usec)
    Average test_bit loop took:    948.800 usec (+- 10.278 usec)
  100000 operations 2 bits set of 4 bits
    Average for_each_set_bit took: 1185.800 usec (+- 114.345 usec)
    Average test_bit loop took:    1473.200 usec (+- 175.498 usec)
  100000 operations 4 bits set of 4 bits
    Average for_each_set_bit took: 1769.667 usec (+- 233.177 usec)
    Average test_bit loop took:    1864.933 usec (+- 187.470 usec)
  100000 operations 1 bits set of 8 bits
    Average for_each_set_bit took: 898.000 usec (+- 21.755 usec)
    Average test_bit loop took:    1768.400 usec (+- 23.672 usec)
  100000 operations 2 bits set of 8 bits
    Average for_each_set_bit took: 1244.900 usec (+- 116.396 usec)
    Average test_bit loop took:    2201.800 usec (+- 145.398 usec)
  100000 operations 4 bits set of 8 bits
    Average for_each_set_bit took: 1822.533 usec (+- 231.554 usec)
    Average test_bit loop took:    2569.467 usec (+- 168.453 usec)
  100000 operations 8 bits set of 8 bits
    Average for_each_set_bit took: 2845.100 usec (+- 441.365 usec)
    Average test_bit loop took:    3023.300 usec (+- 219.575 usec)
  100000 operations 1 bits set of 16 bits
    Average for_each_set_bit took: 923.400 usec (+- 17.560 usec)
    Average test_bit loop took:    3240.000 usec (+- 16.492 usec)
  100000 operations 2 bits set of 16 bits
    Average for_each_set_bit took: 1264.300 usec (+- 114.034 usec)
    Average test_bit loop took:    3714.400 usec (+- 158.898 usec)
  100000 operations 4 bits set of 16 bits
    Average for_each_set_bit took: 1817.867 usec (+- 222.199 usec)
    Average test_bit loop took:    4015.333 usec (+- 154.162 usec)
  100000 operations 8 bits set of 16 bits
    Average for_each_set_bit took: 2826.350 usec (+- 433.457 usec)
    Average test_bit loop took:    4460.350 usec (+- 210.762 usec)
  100000 operations 16 bits set of 16 bits
    Average for_each_set_bit took: 4615.600 usec (+- 809.350 usec)
    Average test_bit loop took:    5129.960 usec (+- 320.821 usec)
  100000 operations 1 bits set of 32 bits
    Average for_each_set_bit took: 904.400 usec (+- 14.250 usec)
    Average test_bit loop took:    6194.000 usec (+- 29.254 usec)
  100000 operations 2 bits set of 32 bits
    Average for_each_set_bit took: 1252.700 usec (+- 116.432 usec)
    Average test_bit loop took:    6652.400 usec (+- 154.352 usec)
  100000 operations 4 bits set of 32 bits
    Average for_each_set_bit took: 1824.200 usec (+- 229.133 usec)
    Average test_bit loop took:    6961.733 usec (+- 154.682 usec)
  100000 operations 8 bits set of 32 bits
    Average for_each_set_bit took: 2823.950 usec (+- 432.296 usec)
    Average test_bit loop took:    7351.900 usec (+- 193.626 usec)
  100000 operations 16 bits set of 32 bits
    Average for_each_set_bit took: 4552.560 usec (+- 785.141 usec)
    Average test_bit loop took:    7998.360 usec (+- 305.629 usec)
  100000 operations 32 bits set of 32 bits
    Average for_each_set_bit took: 7557.067 usec (+- 1407.702 usec)
    Average test_bit loop took:    9072.400 usec (+- 513.209 usec)
  100000 operations 1 bits set of 64 bits
    Average for_each_set_bit took: 896.800 usec (+- 14.389 usec)
    Average test_bit loop took:    11927.200 usec (+- 68.862 usec)
  100000 operations 2 bits set of 64 bits
    Average for_each_set_bit took: 1230.400 usec (+- 111.731 usec)
    Average test_bit loop took:    12478.600 usec (+- 189.382 usec)
  100000 operations 4 bits set of 64 bits
    Average for_each_set_bit took: 1844.733 usec (+- 244.826 usec)
    Average test_bit loop took:    12911.467 usec (+- 206.246 usec)
  100000 operations 8 bits set of 64 bits
    Average for_each_set_bit took: 2779.300 usec (+- 413.612 usec)
    Average test_bit loop took:    13372.650 usec (+- 239.623 usec)
  100000 operations 16 bits set of 64 bits
    Average for_each_set_bit took: 4423.920 usec (+- 748.240 usec)
    Average test_bit loop took:    13995.800 usec (+- 318.427 usec)
  100000 operations 32 bits set of 64 bits
    Average for_each_set_bit took: 7580.600 usec (+- 1462.407 usec)
    Average test_bit loop took:    15063.067 usec (+- 516.477 usec)
  100000 operations 64 bits set of 64 bits
    Average for_each_set_bit took: 13391.514 usec (+- 2765.371 usec)
    Average test_bit loop took:    16974.914 usec (+- 916.936 usec)
  100000 operations 1 bits set of 128 bits
    Average for_each_set_bit took: 1153.800 usec (+- 124.245 usec)
    Average test_bit loop took:    26959.000 usec (+- 714.047 usec)
  100000 operations 2 bits set of 128 bits
    Average for_each_set_bit took: 1445.200 usec (+- 113.587 usec)
    Average test_bit loop took:    25798.800 usec (+- 512.908 usec)
  100000 operations 4 bits set of 128 bits
    Average for_each_set_bit took: 1990.933 usec (+- 219.362 usec)
    Average test_bit loop took:    25589.400 usec (+- 348.288 usec)
  100000 operations 8 bits set of 128 bits
    Average for_each_set_bit took: 2963.000 usec (+- 419.487 usec)
    Average test_bit loop took:    25690.050 usec (+- 262.025 usec)
  100000 operations 16 bits set of 128 bits
    Average for_each_set_bit took: 4585.200 usec (+- 741.734 usec)
    Average test_bit loop took:    26125.040 usec (+- 274.127 usec)
  100000 operations 32 bits set of 128 bits
    Average for_each_set_bit took: 7626.200 usec (+- 1404.950 usec)
    Average test_bit loop took:    27038.867 usec (+- 442.554 usec)
  100000 operations 64 bits set of 128 bits
    Average for_each_set_bit took: 13343.371 usec (+- 2686.460 usec)
    Average test_bit loop took:    28936.543 usec (+- 883.257 usec)
  100000 operations 128 bits set of 128 bits
    Average for_each_set_bit took: 23442.950 usec (+- 4880.541 usec)
    Average test_bit loop took:    32484.125 usec (+- 1691.931 usec)
  100000 operations 1 bits set of 256 bits
    Average for_each_set_bit took: 1183.000 usec (+- 32.073 usec)
    Average test_bit loop took:    50114.600 usec (+- 198.880 usec)
  100000 operations 2 bits set of 256 bits
    Average for_each_set_bit took: 1550.000 usec (+- 124.550 usec)
    Average test_bit loop took:    50334.200 usec (+- 128.425 usec)
  100000 operations 4 bits set of 256 bits
    Average for_each_set_bit took: 2164.333 usec (+- 246.359 usec)
    Average test_bit loop took:    49959.867 usec (+- 188.035 usec)
  100000 operations 8 bits set of 256 bits
    Average for_each_set_bit took: 3211.200 usec (+- 454.829 usec)
    Average test_bit loop took:    50140.850 usec (+- 176.046 usec)
  100000 operations 16 bits set of 256 bits
    Average for_each_set_bit took: 5181.640 usec (+- 882.726 usec)
    Average test_bit loop took:    51003.160 usec (+- 419.601 usec)
  100000 operations 32 bits set of 256 bits
    Average for_each_set_bit took: 8369.333 usec (+- 1513.150 usec)
    Average test_bit loop took:    52096.700 usec (+- 573.022 usec)
  100000 operations 64 bits set of 256 bits
    Average for_each_set_bit took: 13866.857 usec (+- 2649.393 usec)
    Average test_bit loop took:    53989.600 usec (+- 938.808 usec)
  100000 operations 128 bits set of 256 bits
    Average for_each_set_bit took: 23588.350 usec (+- 4724.222 usec)
    Average test_bit loop took:    57300.625 usec (+- 1625.962 usec)
  100000 operations 256 bits set of 256 bits
    Average for_each_set_bit took: 42752.200 usec (+- 9202.084 usec)
    Average test_bit loop took:    64426.933 usec (+- 3402.326 usec)
  100000 operations 1 bits set of 512 bits
    Average for_each_set_bit took: 1632.000 usec (+- 229.954 usec)
    Average test_bit loop took:    98090.000 usec (+- 1120.435 usec)
  100000 operations 2 bits set of 512 bits
    Average for_each_set_bit took: 1937.700 usec (+- 148.902 usec)
    Average test_bit loop took:    100364.100 usec (+- 1433.219 usec)
  100000 operations 4 bits set of 512 bits
    Average for_each_set_bit took: 2528.000 usec (+- 243.654 usec)
    Average test_bit loop took:    99932.067 usec (+- 955.868 usec)
  100000 operations 8 bits set of 512 bits
    Average for_each_set_bit took: 3734.100 usec (+- 512.359 usec)
    Average test_bit loop took:    98944.750 usec (+- 812.070 usec)
  100000 operations 16 bits set of 512 bits
    Average for_each_set_bit took: 5551.400 usec (+- 846.605 usec)
    Average test_bit loop took:    98691.600 usec (+- 654.753 usec)
  100000 operations 32 bits set of 512 bits
    Average for_each_set_bit took: 8594.500 usec (+- 1446.072 usec)
    Average test_bit loop took:    99176.867 usec (+- 579.990 usec)
  100000 operations 64 bits set of 512 bits
    Average for_each_set_bit took: 13840.743 usec (+- 2527.055 usec)
    Average test_bit loop took:    100758.743 usec (+- 833.865 usec)
  100000 operations 128 bits set of 512 bits
    Average for_each_set_bit took: 23185.925 usec (+- 4532.910 usec)
    Average test_bit loop took:    103786.700 usec (+- 1475.276 usec)
  100000 operations 256 bits set of 512 bits
    Average for_each_set_bit took: 40322.400 usec (+- 8341.802 usec)
    Average test_bit loop took:    109433.378 usec (+- 2742.615 usec)
  100000 operations 512 bits set of 512 bits
    Average for_each_set_bit took: 71804.540 usec (+- 15436.546 usec)
    Average test_bit loop took:    120255.440 usec (+- 5252.777 usec)
  100000 operations 1 bits set of 1024 bits
    Average for_each_set_bit took: 1859.600 usec (+- 27.969 usec)
    Average test_bit loop took:    187676.000 usec (+- 1337.770 usec)
  100000 operations 2 bits set of 1024 bits
    Average for_each_set_bit took: 2273.600 usec (+- 139.420 usec)
    Average test_bit loop took:    188176.000 usec (+- 684.357 usec)
  100000 operations 4 bits set of 1024 bits
    Average for_each_set_bit took: 2940.400 usec (+- 268.213 usec)
    Average test_bit loop took:    189172.600 usec (+- 593.295 usec)
  100000 operations 8 bits set of 1024 bits
    Average for_each_set_bit took: 4224.200 usec (+- 547.933 usec)
    Average test_bit loop took:    190257.250 usec (+- 621.021 usec)
  100000 operations 16 bits set of 1024 bits
    Average for_each_set_bit took: 6090.560 usec (+- 877.975 usec)
    Average test_bit loop took:    190143.880 usec (+- 503.753 usec)
  100000 operations 32 bits set of 1024 bits
    Average for_each_set_bit took: 9178.800 usec (+- 1475.136 usec)
    Average test_bit loop took:    190757.100 usec (+- 494.757 usec)
  100000 operations 64 bits set of 1024 bits
    Average for_each_set_bit took: 14441.457 usec (+- 2545.497 usec)
    Average test_bit loop took:    192299.486 usec (+- 795.251 usec)
  100000 operations 128 bits set of 1024 bits
    Average for_each_set_bit took: 23623.825 usec (+- 4481.182 usec)
    Average test_bit loop took:    194885.550 usec (+- 1300.817 usec)
  100000 operations 256 bits set of 1024 bits
    Average for_each_set_bit took: 40194.956 usec (+- 8109.056 usec)
    Average test_bit loop took:    200259.311 usec (+- 2566.085 usec)
  100000 operations 512 bits set of 1024 bits
    Average for_each_set_bit took: 70983.560 usec (+- 15074.982 usec)
    Average test_bit loop took:    210527.460 usec (+- 4968.980 usec)
  100000 operations 1024 bits set of 1024 bits
    Average for_each_set_bit took: 136530.345 usec (+- 31584.400 usec)
    Average test_bit loop took:    233329.691 usec (+- 10814.036 usec)
  100000 operations 1 bits set of 2048 bits
    Average for_each_set_bit took: 3077.600 usec (+- 76.376 usec)
    Average test_bit loop took:    402154.400 usec (+- 518.571 usec)
  100000 operations 2 bits set of 2048 bits
    Average for_each_set_bit took: 3508.600 usec (+- 148.350 usec)
    Average test_bit loop took:    403814.500 usec (+- 1133.027 usec)
  100000 operations 4 bits set of 2048 bits
    Average for_each_set_bit took: 4219.333 usec (+- 285.844 usec)
    Average test_bit loop took:    404312.533 usec (+- 985.751 usec)
  100000 operations 8 bits set of 2048 bits
    Average for_each_set_bit took: 5670.550 usec (+- 615.238 usec)
    Average test_bit loop took:    405321.800 usec (+- 1038.487 usec)
  100000 operations 16 bits set of 2048 bits
    Average for_each_set_bit took: 7785.080 usec (+- 992.522 usec)
    Average test_bit loop took:    406746.160 usec (+- 1015.478 usec)
  100000 operations 32 bits set of 2048 bits
    Average for_each_set_bit took: 11163.800 usec (+- 1627.320 usec)
    Average test_bit loop took:    406124.267 usec (+- 898.785 usec)
  100000 operations 64 bits set of 2048 bits
    Average for_each_set_bit took: 16964.629 usec (+- 2806.130 usec)
    Average test_bit loop took:    406618.514 usec (+- 798.356 usec)
  100000 operations 128 bits set of 2048 bits
    Average for_each_set_bit took: 27219.625 usec (+- 4988.458 usec)
    Average test_bit loop took:    410149.325 usec (+- 1705.641 usec)
  100000 operations 256 bits set of 2048 bits
    Average for_each_set_bit took: 45138.578 usec (+- 8831.021 usec)
    Average test_bit loop took:    415462.467 usec (+- 2725.418 usec)
  100000 operations 512 bits set of 2048 bits
    Average for_each_set_bit took: 77450.540 usec (+- 15962.238 usec)
    Average test_bit loop took:    426089.180 usec (+- 5171.788 usec)
  100000 operations 1024 bits set of 2048 bits
    Average for_each_set_bit took: 138023.636 usec (+- 29826.959 usec)
    Average test_bit loop took:    446346.636 usec (+- 9904.417 usec)
  100000 operations 2048 bits set of 2048 bits
    Average for_each_set_bit took: 251072.600 usec (+- 55947.692 usec)
    Average test_bit loop took:    484855.983 usec (+- 18970.431 usec)
  #

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200729220034.1337168-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-31 09:32:11 -03:00
Wei Li
bd3c628f8f perf tools: Fix record failure when mixed with ARM SPE event
When recording with cache-misses and arm_spe_x event, I found that it
will just fail without showing any error info if i put cache-misses
after 'arm_spe_x' event.

  [root@localhost 0620]# perf record -e cache-misses \
				-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.067 MB perf.data ]
  [root@localhost 0620]#
  [root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
				     -e  cache-misses sleep 1
  [root@localhost 0620]#

The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().

We don't support concurrent multiple arm_spe_x events currently, that
is checked in arm_spe_recording_options(), and it will show the relevant
info. So add the check and record of the first found 'arm_spe_pmu' to
fix this issue here.

Fixes: ffd3d18c20 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-31 09:29:01 -03:00
Thomas Richter
463538a383 perf tests: Fix test 68 zstd compression for s390
Commit 5aa98879ef ("s390/cpum_sf: prohibit callchain data collection")
prohibits call graph sampling for hardware events on s390. The
information recorded is out of context and does not match.

On s390 this commit now breaks test case 68 Zstd perf.data
compression/decompression.

Therefore omit call graph sampling on s390 in this test.

Output before:
  [root@t35lp46 perf]# ./perf test -Fv 68
  68: Zstd perf.data compression/decompression              :
  --- start ---
  Collecting compressed record file:
  Error:
  cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
                                Try 'perf stat'
  ---- end ----
  Zstd perf.data compression/decompression: FAILED!
  [root@t35lp46 perf]#

Output after:
[root@t35lp46 perf]# ./perf test -Fv 68
  68: Zstd perf.data compression/decompression              :
  --- start ---
  Collecting compressed record file:
  500+0 records in
  500+0 records out
  256000 bytes (256 kB, 250 KiB) copied, 0.00615638 s, 41.6 MB/s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.004 MB /tmp/perf.data.X3M,
                        compressed (original 0.002 MB, ratio is 3.609) ]
  Checking compressed events stats:
  # compressed : Zstd, level = 1, ratio = 4
        COMPRESSED events:          1
  2ELIFREPh---- end ----
  Zstd perf.data compression/decompression: Ok
  [root@t35lp46 perf]#

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200729135314.91281-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-31 09:27:32 -03:00
Jiri Olsa
119e521a96 perf metric: Rename group_list to metric_list
Following the previous change that rename egroup to metric, there's no
reason to call the list 'group_list' anymore, renaming it to
metric_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:50 -03:00
Jiri Olsa
a0c05b3638 perf metric: Rename struct egroup to metric
Renaming struct egroup to metric, because it seems to make more sense.
Plus renaming all the variables that hold egroup to appropriate names.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-19-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:50 -03:00
Jiri Olsa
dfce77c580 perf metric: Add metric group test
Adding test for metric group plus compute_metric_group function to get
metrics values within the group.

Committer notes:

Fixed this;

  tests/parse-metric.c:327:7: error: missing field 'val' initializer [-Werror,-Wmissing-field-initializers]
                  { 0 },
                      ^

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-18-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:50 -03:00
Jiri Olsa
b81ef466ac perf metric: Make compute_single function more precise
So far compute_single function relies on the fact, that there's only
single metric defined within evlist in all tests. In following patch we
will add test for metric group, so we need to be able to compute metric
by given name.

Adding the name argument to compute_single and iterating evlist and
evsel's expression to find the given metric.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:50 -03:00
Jiri Olsa
f6fb0960f9 perf metric: Add recursion check when processing nested metrics
Keeping the stack of nested metrics via 'struct expr_id' objects
and checking if we are in recursion via already processed metric.

The stack is implemented as static array within the struct egroup
with 100 entries, which should be enough nesting depth for any
metric we have or plan to have at the moment.

Adding test that simulates the recursion and checks we can
detect it.

Committer notes:

Bumped RECURSION_ID_MAX to 1000 as per Jiri's reply to Paul Clark on the
patch series e-mail discussion.

Fixed these:

  tests/parse-metric.c:308:7: error: missing field 'val' initializer [-Werror,-Wmissing-field-initializers]
                  { 0 },
                      ^

  util/metricgroup.c:924:28: error: missing field 'parent' initializer [-Werror,-Wmissing-field-initializers]
          struct expr_ids ids = { 0 };
                                    ^
  util/metricgroup.c:924:26: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
          struct expr_ids ids = { 0 };
                                  ^
                                  {}
  util/metricgroup.c:924:26: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
          struct expr_ids ids = { 0 };
                                  ^
                                  {}
  util/metricgroup.c:924:28: error: missing field 'cnt' initializer [-Werror,-Wmissing-field-initializers]
          struct expr_ids ids = { 0 };
                                    ^

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
5a606f3b9c perf metric: Add DCache_L2 to metric parse test
Adding test that compute DCache_L2 metrics with other related metrics in it.

Committer notes:

Fixed up this:

  tests/parse-metric.c:285:7: error: missing field 'val' initializer [-Werror,-Wmissing-field-initializers]
                  { 0 },
                      ^

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-15-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
55f30d6839 perf metric: Add cache_miss_cycles to metric parse test
Adding test that compute metric with other metrics in it.

  cache_miss_cycles = metric:dcache_miss_cpi + metric:icache_miss_cycles

Committer notes:

Fixed up initializer to cope with:

  tests/parse-metric.c:242:7: error: missing field 'val' initializer [-Werror,-Wmissing-field-initializers]
                  { 0 },

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-14-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
98461d9dc1 perf metric: Add events for the current list
There's no need to iterate the whole list of groups, when adding new
events. The currently created groups are the ones we want to add.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
acf71b05d1 perf metric: Compute referenced metrics
Adding computation (expr__parse call) of referenced metric at
the point when it needs to be resolved during the parent metric
computation.

Once the inner metric is computed, the result is stored and
used if there's another usage of that metric.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
fc393839c1 perf metric: Add referenced metrics to hash data
Adding referenced metrics to the parsing context so they can be resolved
during the metric processing.

Adding expr__add_ref function to store referenced metrics into parse
context.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
4ea2896715 perf metric: Collect referenced metrics in struct metric_expr
Add referenced metrics into struct metric_expr object, so they are
accessible when computing the metric.

Storing just name and expression itself, so the metric can be resolved
and computed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
83de0b7d53 perf metric: Collect referenced metrics in struct metric_ref_node
Collecting referenced metrics in struct metric_ref_node object,
so we can process them later on.

The change will parse nested metric names out of expression and
'resolve' them.

All referenced metrics are dissolved into one context, meaning all
nested metrics events and added to the parent context.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
e7e1badd80 perf metric: Rename __metricgroup__add_metric to __add_metric
Renaming __metricgroup__add_metric to __add_metric to fit in the current
function names.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
a29c164aa3 perf metric: Add add_metric function
Decouple metric adding logging into add_metric function,
so it can be used from other places in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
ce39194034 perf metric: Add macros for iterating map events
Adding following macros to iterate events and metric:

  map_for_each_event(__pe, __idx, __map)
    - iterates over all pmu_events_map events

  map_for_each_metric(__pe, __idx, __map, __metric)
    - iterates over all metrics that match __metric argument

and use it in metricgroup__add_metric function. Macros will be be used
from other places in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
3fd29fa6c1 perf metric: Add expr__del_id function
Adding expr__del_id function to remove ID from hashmap.  It will save us
few lines in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
5c5f5e835f perf metric: Change expr__get_id to return struct expr_id_data
Changing expr__get_id to use and return struct expr_id_data
pointer as value for the ID. This way we can access data other
than value for given ID in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
332603c2aa perf metric: Add expr__add_id function
Add the expr__add_id() function to data for ID with zero value, which is
used when scanning the expression for IDs.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
60e10c0037 perf metric: Fix memory leak in expr__add_id function
Arnaldo found that we don't release value data in case the hashmap__set
fails. Releasing it in case of an error.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200719181320.785305-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Ian Rogers
1b98c6e3ba perf test: Ensure sample_period is set libpfm4 events
Test that a command line option doesn't override the period set on a
libpfm4 event.

Without libpfm4 test passes as unsupported.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200728085734.609930-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:49 -03:00
Jiri Olsa
4929e95a14 perf tools: Fix term parsing for raw syntax
Jin Yao reported issue with possible conflict between raw events and
term values in pmu event syntax.

Currently following syntax is resolved as raw event with 0xead value:

  uncore_imc_free_running/read/

instead of using 'read' term from uncore_imc_free_running pmu, because
'read' is correct raw event syntax with 0xead value.

To solve this issue we do following:

  - check existing terms during rXXXX syntax processing
    and make them priority in case of conflict

  - allow pmu/r0x1234/ syntax to be able to specify conflicting
    raw event (implemented in previous patch)

Also add automated tests for this and perf_pmu__parse_cleanup call to
parse_events_terms, so the test gets properly cleaned up.

Fixes: 3a6c51e4d6 ("perf parser: Add support to specify rXXX event with pmu")
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200726075244.1191481-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:48 -03:00
Jiri Olsa
c33cdf5411 perf tools: Allow r0x<HEX> event syntax
Add support to specify raw event with 'r0<HEX>' syntax within pmu term
syntax like:

  -e cpu/r0xdead/

It will be used to specify raw events in cases where they conflict with
real pmu terms, like 'read', which is valid raw event syntax, but also a
possible pmu term name as reported by Jin Yao.

Reported-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200725121959.1181869-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-30 07:01:48 -03:00