Age | Commit message (Collapse) | Author | Files | Lines |
|
Disable single-step by setting debug.control to KVM_GUESTDBG_ENABLE,
not to SINGLE_STEP_DISABLE. The latter is an arbitrary test enum that
just happens to have the same value as KVM_GUESTDBG_ENABLE, and so
effectively disables single-step debug.
No functional change intended.
Cc: Reiji Watanabe <[email protected]>
Fixes: b18e4d4aebdd ("KVM: arm64: selftests: Add a test case for KVM_GUESTDBG_SINGLESTEP")
Signed-off-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Oliver Upton <[email protected]>
|
|
This tests PTRACE_SETREGSET with NT_X86_XSTATE modifying PKRU directly and
removing the PKRU bit from XSTATE_BV.
Signed-off-by: Kyle Huey <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/all/20221115230932.7126-7-khuey%40kylehuey.com
|
|
The perf/cpumap.h header is getting the 'struct perf_cpu_map' forward
declaration by luck, add it.
Acked-by: Ian Rogers <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It just hits the header guard, becoming a no-op, ditch it.
Acked-by: Ian Rogers <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Switch -I from tools/lib to the install path for the tools/lib
libraries. Add the include_headers build targets to prepare target, as
well as pmu-events.c compilation that dependes on libperf.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use public API when possible, don't include internal API in header
files in evsel.h. Fix any related breakages.
Committer note:
There was one missing case, when building for arm64:
arch/arm64/util/pmu.c: In function 'pmu_events_table__find':
arch/arm64/util/pmu.c:18:30: error: invalid use of undefined type 'struct perf_cpu_map'
18 | if (pmu->cpus->nr != cpu__max_cpu().cpu)
| ^~
Fix it by adding one more exception, including <internal/cpumap.h>
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Replace the perf_test_ prefix on symbol names with memstress_ to match
the new file name.
"memstress" better describes the functionality proveded by this library,
which is to provide functionality for creating and running a VM that
stresses VM memory by reading and writing to guest memory on all vCPUs
in parallel.
"memstress" also contains the same number of chracters as "perf_test",
making it a drop-in replacement in symbols, e.g. function names, without
impacting line lengths. Also the lack of underscore between "mem" and
"stress" makes it clear "memstress" is a noun.
Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Rename the local variables "pta" (which is short for perf_test_args) for
args. "pta" is not an obvious acronym and using "args" mirrors
"vcpu_args".
Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Rename the perf_test_util.[ch] files to memstress.[ch]. Symbols are
renamed in the following commit to reduce the amount of churn here in
hopes of playiing nice with git's file rename detection.
The name "memstress" was chosen to better describe the functionality
proveded by this library, which is to create and run a VM that
reads/writes to guest memory on all vCPUs in parallel.
"memstress" also contains the same number of chracters as "perf_test",
making it a drop-in replacement in symbols, e.g. function names, without
impacting line lengths. Also the lack of underscore between "mem" and
"stress" makes it clear "memstress" is a noun.
Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Create the ability to randomize page access order with the -a
argument. This includes the possibility that the same pages may be hit
multiple times during an iteration or not at all.
Population has random access as false to ensure all pages will be
touched by population and avoid page faults in late dirty memory that
would pollute the test results.
Signed-off-by: Colton Lewis <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Randomize which pages are written vs read using the random number
generator.
Change the variable wr_fract and associated function calls to
write_percent that now operates as a percentage from 0 to 100 where X
means each page has an X% chance of being written. Change the -f
argument to -w to reflect the new variable semantics. Keep the same
default of 100% writes.
Population always uses 100% writes to ensure all memory is actually
populated and not just mapped to the zero page. The prevents expensive
copy-on-write faults from occurring during the dirty memory iterations
below, which would pollute the performance results.
Each vCPU calculates its own random seed by adding its index to the
seed provided.
Signed-off-by: Colton Lewis <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Create a -r argument to specify a random seed. If no argument is
provided, the seed defaults to 1. The random seed is set with
perf_test_set_random_seed() and must be set before guest_code runs to
apply.
Signed-off-by: Colton Lewis <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Implement random number generator for guest code to randomize parts
of the test, making it less predictable and a more accurate reflection
of reality.
The random number generator chosen is the Park-Miller Linear
Congruential Generator, a fancy name for a basic and well-understood
random number generator entirely sufficient for this purpose.
Signed-off-by: Colton Lewis <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Add a command line option, -c, to pin vCPUs to physical CPUs (pCPUs),
i.e. to force vCPUs to run on specific pCPUs.
Requirement to implement this feature came in discussion on the patch
"Make page tables for eager page splitting NUMA aware"
https://lore.kernel.org/lkml/[email protected]/
This feature is useful as it provides a way to analyze performance based
on the vCPUs and dirty log worker locations, like on the different NUMA
nodes or on the same NUMA nodes.
To keep things simple, implementation is intentionally very limited,
either all of the vCPUs will be pinned followed by an optional main
thread or nothing will be pinned.
Signed-off-by: Vipin Sharma <[email protected]>
Suggested-by: David Matlack <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Many KVM selftests take command line arguments which are supposed to be
positive (>0) or non-negative (>=0). Some tests do these validation and
some missed adding the check.
Add atoi_positive() and atoi_non_negative() to validate inputs in
selftests before proceeding to use those values.
Signed-off-by: Vipin Sharma <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Change test args memslot_modification_delay and nr_memslot_modifications
to delay and nr_iterations for simplicity.
Signed-off-by: Vipin Sharma <[email protected]>
Suggested-by: Sean Christopherson <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Replace size_1gb defined in max_guest_memory_test.c with the SZ_1G,
SZ_2G and SZ_4G from linux/sizes.h header file.
Signed-off-by: Vipin Sharma <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
atoi() doesn't detect errors. There is no way to know that a 0 return
is correct conversion or due to an error.
Introduce atoi_paranoid() to detect errors and provide correct
conversion. Replace all atoi() calls with atoi_paranoid().
Signed-off-by: Vipin Sharma <[email protected]>
Suggested-by: David Matlack <[email protected]>
Suggested-by: Sean Christopherson <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
dirty_log_perf_test
There are 13 command line options and they are not in any order. Put
them in alphabetical order to make it easy to add new options.
No functional change intended.
Signed-off-by: Vipin Sharma <[email protected]>
Reviewed-by: Wei Wang <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
dirty_log_perf_test
Passing -e option (Run VCPUs while dirty logging is being disabled) in
dirty_log_perf_test also unintentionally enables -g (Do not enable
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2). Add break between two switch case
logic.
Fixes: cfe12e64b065 ("KVM: selftests: Add an option to run vCPUs while disabling dirty logging")
Signed-off-by: Vipin Sharma <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Remove unnecessary include of internal threadmap.h and refcount.h in
thread_map.h. Switch to using public APIs when possible or including
the internal header file in the C file. Fix a transitive dependency in
openat-syscall.c broken by the clean up.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
hashmap.h comes from libbpf but isn't installed with its
headers. Always use the header file of the code in util. Change the
hashmap.h dependency in expr.h to a forward declaration, add the
necessary header file includes in the C files.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf build currently has a '-Itools/lib' on the CC command
line. This causes issues as the libapi, libsubcmd, libtraceevent,
libbpf and libsymbol headers are all found via this path, making it
impossible to override include behavior.
Change the libsymbol build mirroring the libbpf, libsubcmd, libapi,
libperf and libtraceevent build, so that it is installed in a directory
along with its headers.
A later change will modify the include behavior. Don't build kallsyms.o
as part of util as this will lead to duplicate definitions. Add
kallsym's directory to the MANIFEST rather than individual files, so
that the Build and Makefile are added to a source tar ball.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add sufficient Makefile for libsymbol to be built as a dependency and
header files installed.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Headers necessary for the perf build. Note, internal headers are also
installed as these are necessary for the build.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Headers necessary for the perf build.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf build currently has a '-Itools/lib' on the CC command
line. This causes issues as the libapi, libsubcmd, libtraceevent,
libbpf headers are all found via this path, making it impossible to
override include behavior.
Change the libtraceevent build mirroring the libbpf, libsubcmd, libapi
and libperf build, so that it is installed in a directory along with its
headers. A later change will modify the include behavior.
Similarly, the plugins are now installed into libtraceevent_plugins
except they have no header files.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf build currently has a '-Itools/lib' on the CC command
line. This causes issues as the libapi, libsubcmd, libtraceevent,
libbpf headers are all found via this path, making it impossible to
override include behavior.
Change the libperf build mirroring the libbpf, libsubcmd and libapi
build, so that it is installed in a directory along with its headers. A
later change will modify the include behavior.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf build currently has a '-Itools/lib' on the CC command line.
This causes issues as the libapi, libsubcmd, libtraceevent, libbpf
headers are all found via this path, making it impossible to override
include behavior.
Change the libapi build mirroring the libbpf and libsubcmd build, so
that it is installed in a directory along with its headers. A later
change will modify the include behavior.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf build currently has a '-Itools/lib' on the CC command
line. This causes issues as the libapi, libsubcmd, libtraceevent,
libbpf headers are all found via this path, making it impossible to
override include behavior. Change the libsubcmd build mirroring the
libbpf build, so that it is installed in a directory along with its
headers. A later change will modify the include behavior.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This allows libsubcmd to be installed as a dependency.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This allows libapi to be installed as a dependency.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cmc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Cc: nicolas schier <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This initial code does a simple sample transfer tests. By default,
all PCM devices are detected and tested with short and long
buffering parameters for 4 seconds. If the sample transfer timing
is not in a +-100ms boundary, the test fails. Only the interleaved
buffering scheme is supported in this version.
The configuration may be modified with the configuration files.
A specific hardware configuration is detected and activated
using the sysfs regex matching. This allows to use the DMI string
(/sys/class/dmi/id/* tree) or any other system parameters
exposed in sysfs for the matching for the CI automation.
The configuration file may also specify the PCM device list to detect
the missing PCM devices.
v1..v2:
- added missing alsa-local.h header file
Cc: Mark Brown <[email protected]>
Cc: Pierre-Louis Bossart <[email protected]>
Cc: Liam Girdwood <[email protected]>
Cc: Jesse Barnes <[email protected]>
Cc: Jimmy Cheng-Yi Chiang <[email protected]>
Cc: Curtis Malainey <[email protected]>
Cc: Brian Norris <[email protected]>
Signed-off-by: Jaroslav Kysela <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Normally, --for-each-cgroup only works with AGGR_GLOBAL. However
the --topdown on some cpu (e.g. Intel Skylake) converts it to the
AGGR_CORE internally.
To support those machines, add print_aggr_cgroup and handle the events
like in print_cgroup_events().
$ perf stat -a --for-each-cgroup system.slice,user.slice --topdown sleep 1
nmi_watchdog enabled with topdown. May give wrong results.
Disable with echo 0 > /proc/sys/kernel/nmi_watchdog
Performance counter stats for 'system wide':
retiring bad speculation frontend bound backend bound
S0-D0-C0 2 system.slice 49.0% -46.6% 31.4%
S0-D0-C1 2 system.slice 55.5% 8.0% 45.5% -9.0%
S0-D0-C2 2 system.slice 87.8% 22.1% 30.3% -40.3%
S0-D0-C3 2 system.slice 53.3% -11.9% 45.2% 13.4%
S0-D0-C0 2 user.slice 123.5% 4.0% 48.5% -75.9%
S0-D0-C1 2 user.slice 19.9% 6.5% 89.9% -16.3%
S0-D0-C2 2 user.slice 29.9% 7.9% 71.3% -9.1
S0-D0-C3 2 user.slice 28.0% 7.2% 43.3% 21.5%
1.004136937 seconds time elapsed
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When we have events for each cgroup, the metric should be printed for
each cgroup separately. Add print_cgroup_counter() to handle that
situation properly.
Also change print_metric_headers() not to print duplicate headers
by checking cgroups.
$ perf stat -a --for-each-cgroup system.slice,user.slice --metric-only sleep 1
Performance counter stats for 'system wide':
GHz insn per cycle branch-misses of all branches
system.slice 3.792 0.61 3.24%
user.slice 3.661 2.32 0.37%
1.016111516 seconds time elapsed
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For the metric-only case, add new functions to handle the start and the
end of each metric display.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The prefix is needed for interval mode to print timestamp at the
beginning of each line. But the it's tricky for the metric only
mode since it doesn't print every evsel and combines the metrics
into a single line.
So it needed to pass 'first' argument to print_counter_aggrdata()
to determine if the current event is being printed at first. This
makes the code hard to read.
Let's move the logic out of the function and do it in the outer
print loop. This would enable further cleanups later.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Likewise, I think it'd better to have the control inside the function, and keep
the higher level function clearer.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There are print_header() and print_interval() to print header lines before
actual counter values. Also print_metric_headers() needs to be called for
the metric-only case.
Let's move all these logics to a single place including num_print_iv to
refresh the headers for interval mode.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The print would run only if metric_only is not set, but it's already in a
block that says it's in metric_only case. And there's no place to change
the setting.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Instead of using magic values, define symbolic constants and use them.
Also add aggr_header_std[] array to simplify aggr_mode handling.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This logic does not print the time directly, but it just puts the
timestamp in the buffer as a prefix. To reduce the confusion, factor
out the code into a separate function.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The print_metric_headers() shows metric headers a little bit for each
mode. Split it out to make the code clearer.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We don't know how long cgroup name is, but at least we can align short
ones like below.
$ perf stat -a --for-each-cgroup system.slice,user.slice true
Performance counter stats for 'system wide':
0.13 msec cpu-clock system.slice # 0.010 CPUs utilized
4 context-switches system.slice # 31.989 K/sec
1 cpu-migrations system.slice # 7.997 K/sec
0 page-faults system.slice # 0.000 /sec
450,673 cycles system.slice # 3.604 GHz (92.41%)
161,216 instructions system.slice # 0.36 insn per cycle (92.41%)
32,678 branches system.slice # 261.332 M/sec (92.41%)
2,628 branch-misses system.slice # 8.04% of all branches (92.41%)
14.29 msec cpu-clock user.slice # 1.163 CPUs utilized
35 context-switches user.slice # 2.449 K/sec
12 cpu-migrations user.slice # 839.691 /sec
57 page-faults user.slice # 3.989 K/sec
49,683,026 cycles user.slice # 3.477 GHz (99.38%)
110,790,266 instructions user.slice # 2.23 insn per cycle (99.38%)
24,552,255 branches user.slice # 1.718 G/sec (99.38%)
127,779 branch-misses user.slice # 0.52% of all branches (99.38%)
0.012289431 seconds time elapsed
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Unfortunately, event running time, percentage and noise data are printed
in different positions in normal output than CSV/JSON. I think it's
better to put such details in where it actually prints.
So add before_metric argument to print_noise() and print_running() and
call them twice before and after the metric.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In the printout() function, it checks if the event is bad (i.e. not
counted or not supported) and print the result. But it does the same
what abs_printout() is doing. So add an argument to indicate the value
is ok or not and use the same function in both cases.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
And split it for each output mode like others. I believe it makes the
code simpler and more intuitive. Now abs_printout() becomes just to
call sub-functions.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The aggr_printout() function is to print aggr_id and count (nr).
Split it for each output mode to simplify the code.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Likewise, split print_cgroup() for each output mode.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Likewise, split print_noise_pct() for each output mode. Although it's
a tiny function, more logic will be added soon so it'd be better split
it and treat it in the same way.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|