Age  Commit message  Author  Files  Lines
2019-09-03perf/x86: Make more stuff staticValdis Klētnieks4-5/+5
When building with C=2, sparse makes note of a number of things:

  arch/x86/events/intel/rapl.c:637:30: warning: symbol 'rapl_attr_update' was not declared. Should it be static?
  arch/x86/events/intel/cstate.c:449:30: warning: symbol 'core_attr_update' was not declared. Should it be static?
  arch/x86/events/intel/cstate.c:457:30: warning: symbol 'pkg_attr_update' was not declared. Should it be static?
  arch/x86/events/msr.c:170:30: warning: symbol 'attr_update' was not declared. Should it be static?
  arch/x86/events/intel/lbr.c:276:1: warning: symbol 'lbr_from_quirk_key' was not declared. Should it be static?

And they can all indeed be static.

Signed-off-by: Valdis Kletnieks <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/128059.1565286242@turing-police
Signed-off-by: Ingo Molnar <[email protected]>
2019-09-02x86, perf: Fix the dependency of the x86 insn decoder selftestMasami Hiramatsu1-1/+1
Since x86 instruction decoder is not only for kprobes, it should be tested when the insn.c is compiled. (e.g. perf is enabled but kprobes is disabled) Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: cbe5c34c8c1f ("x86: Compile insn.c and inat.c only for KPROBES") Signed-off-by: Ingo Molnar <[email protected]>
2019-09-02Merge tag 'perf-core-for-mingo-5.4-20190901' of ↵Ingo Molnar267-3578/+1319
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

objtool:

  Josh Poimboeuf:

  - Move x86 insn decoder to a common location.

  Arnaldo Carvalho de Melo:

  - Ignore intentional differences for the x86 insn decoder.

build:

  Arnaldo Carvalho de Melo:

  - Ignore intentional differences for the x86 insn decoder.

Intel PT:

  Josh Poimboeuf:

  - Use shared x86 insn decoder.

metric groups:

  Jin Yao:

  - Scale the metric result.

  - Support multiple events.

perf c2c:

  Jiri Olsa:

  - Display proper cpu count in nodes column.

Miscellaneous:

  Kyle Meyer:

  - Replace MAX_NR_CPUS with perf_env::nr_cpus_online, i.e. with the
    number of online CPUs as detected at tool start and/or recorded in
    the perf.data file.

libtraceevent:

  Tzvetomir Stoyanov:

  - Simplify the tep_print_event_* APIs.

  - Remove tep_register_trace_clock().

  - Change users plugin directory.

Cleanups:

  Arnaldo Carvalho de Melo:

  - Continue taming the includes hell: remove needless include
    directives, fix the fallout, rinse, repeat.

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
2019-09-02Merge branch 'linus' into perf/core, to pick up fixesIngo Molnar214-916/+1823
Signed-off-by: Ingo Molnar <[email protected]>
2019-09-01Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds7-39/+43
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for x86:

   - Fix the bogus detection of 32bit user mode for uretprobes which
     caused corruption of the user return address resulting in
     application crashes. In the uprobes handler in_ia32_syscall() is
     obviously always returning false on a 64bit kernel. Use
     user_64bit_mode() instead which works correctly.

   - Prevent large page splitting when ftrace flips RW/RO on the kernel
     text which caused iTLB performance issues. Ftrace wants to be
     converted to text_poke() which avoids the problem, but for now
     allow large page preservation in the static protections check when
     the change request spans a full large page.

   - Prevent arch_dynirq_lower_bound() from returning 0 when the IOAPIC
     is configured via device tree. In the device tree case the GSI 1:1
     mapping is meaningless, therefore the lower bound which protects
     the GSI range on ACPI machines is irrelevant. Return the lower
     bound which the core hands to the function instead of blindly
     returning 0, which causes the core to allocate the invalid virtual
     interrupt number 0, which in turn prevents all drivers from
     allocating and requesting an interrupt.

   - Remove the bogus initialization of LDR and DFR in the 32bit bigsmp
     APIC driver. That uses physical destination mode where LDR/DFR are
     ignored, but the initialization and the missing clear of LDR caused
     the APIC to be left in an inconsistent state on kexec/reboot.

   - Clear LDR when clearing the APIC registers so the APIC is in a
     well defined state.

   - Initialize variables properly in the find_trampoline_placement()
     code.

   - Silence GCC9 build warning for the real mode part of the build"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text
  x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence GCC9 build warning
  x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement()
  x86/apic: Include the LDR when clearing out APIC registers
  x86/apic: Do not initialize LDR and DFR for bigsmp
  uprobes/x86: Fix detection of 32-bit user mode
  x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines
2019-09-01Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds3-7/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "Two fixes for perf x86 hardware implementations:

   - Restrict the period on Nehalem machines to prevent perf from
     hogging the CPU

   - Prevent the AMD IBS driver from overwriting the hardware
     controlled and pre-seeded reserved bits (0-6) in the count
     register which caused a sample bias for dispatched micro-ops"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
  perf/x86/intel: Restrict period on Nehalem
2019-09-01Merge branch 'turbostat' of ↵Linus Torvalds5-47/+90
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:
 "User-space turbostat (and x86_energy_perf_policy) patches.

  They are primarily bug fixes from users"

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: update version number
  tools/power turbostat: Add support for Hygon Fam 18h (Dhyana) RAPL
  tools/power turbostat: Fix caller parameter of get_tdp_amd()
  tools/power turbostat: Fix CPU%C1 display value
  tools/power turbostat: do not enforce 1ms
  tools/power turbostat: read from pipes too
  tools/power turbostat: Add Ice Lake NNPI support
  tools/power turbostat: rename has_hsw_msrs()
  tools/power turbostat: Fix Haswell Core systems
  tools/power turbostat: add Jacobsville support
  tools/power turbostat: fix buffer overrun
  tools/power turbostat: fix file descriptor leaks
  tools/power turbostat: fix leak of file descriptor on error return path
  tools/power turbostat: Make interval calculation per thread to reduce jitter
  tools/power turbostat: remove duplicate pc10 column
  tools/power x86_energy_perf_policy: Fix argument parsing
  tools/power: Fix typo in man page
  tools/power/x86: Enable compiler optimisations and Fortify by default
  tools/power x86_energy_perf_policy: Fix "uninitialized variable" warnings at -O2
2019-08-31objtool: Ignore intentional differences for the x86 insn decoderArnaldo Carvalho de Melo1-4/+5
Since we need to build this in !x86, we need to explicitly use the x86 files, not things like asm/insn.h, so the tools copy intentionally differs from the master copy in the kernel sources. Add -I diff directives to ignore just these differences when checking for drift.

Acked-by: Josh Poimboeuf <[email protected]>
Link: http://lore.kernel.org/lkml/20190830193109.p7jagidsrahoa4pn@treble
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
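The idea of ignoring known, intentional differences when checking for drift can be sketched as follows. This is only an illustration in Python, not the sync-check.sh or check-headers.sh logic: the function name and the sample lines are hypothetical, and the filter is coarser than `diff -I`, which ignores a hunk only when every changed line in it matches the pattern.

```python
import re


def differs_ignoring(a_lines, b_lines, ignore_patterns):
    """Return True if the two line lists still differ after dropping
    every line that matches one of the ignore patterns (simplified
    stand-in for what `diff -I <regex>` is used for)."""
    regexes = [re.compile(p) for p in ignore_patterns]

    def keep(line):
        return not any(r.search(line) for r in regexes)

    return [x for x in a_lines if keep(x)] != [x for x in b_lines if keep(x)]


# Hypothetical file contents: only the include line differs on purpose.
kernel_copy = ['#include <asm/insn.h>', 'int decode(void);']
tools_copy = ['#include "../include/asm/insn.h"', 'int decode(void);']

if __name__ == "__main__":
    print(differs_ignoring(kernel_copy, tools_copy, []))
    print(differs_ignoring(kernel_copy, tools_copy, [r'^#include']))
```

Without an ignore pattern the two copies are reported as different; once the include lines are filtered out, only unintended drift would be flagged.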
2019-08-31objtool: Update sync-check.sh from perf's check-headers.shArnaldo Carvalho de Melo1-5/+26
To allow using the -I trick that will be needed for checking the x86 insn decoder files.

Without the specific -I lines we still get the same warnings as before:

  $ make -C tools/objtool/ clean ; make -C tools/objtool/
  make: Entering directory '/home/acme/git/perf/tools/objtool'
    CLEAN    objtool
  find -name '*.o' -delete -o -name '\.*.cmd' -delete -o -name '\.*.d' -delete
  rm -f arch/x86/inat-tables.c fixdep
  <SNIP>
    LD       objtool-in.o
  make[1]: Leaving directory '/home/acme/git/perf/tools/objtool'
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h'
  diff -u tools/arch/x86/include/asm/inat.h arch/x86/include/asm/inat.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/insn.h' differs from latest version at 'arch/x86/include/asm/insn.h'
  diff -u tools/arch/x86/include/asm/insn.h arch/x86/include/asm/insn.h
  Warning: Kernel ABI header at 'tools/arch/x86/lib/inat.c' differs from latest version at 'arch/x86/lib/inat.c'
  diff -u tools/arch/x86/lib/inat.c arch/x86/lib/inat.c
  Warning: Kernel ABI header at 'tools/arch/x86/lib/insn.c' differs from latest version at 'arch/x86/lib/insn.c'
  diff -u tools/arch/x86/lib/insn.c arch/x86/lib/insn.c
  /home/acme/git/perf/tools/objtool
    LINK     objtool
  make: Leaving directory '/home/acme/git/perf/tools/objtool'
  $

The next patch will add the -I lines for those files.

Acked-by: Josh Poimboeuf <[email protected]>
Link: http://lore.kernel.org/lkml/20190830193109.p7jagidsrahoa4pn@treble
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf build: Ignore intentional differences for the x86 insn decoderArnaldo Carvalho de Melo1-4/+4
Since we need to build this in !x86, we need to explicitly use the x86 files, not things like asm/insn.h, so the tools copy intentionally differs from the master copy in the kernel sources. Add -I diff directives to ignore just these differences when checking for drift.

Acked-by: Josh Poimboeuf <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf intel-pt: Use shared x86 insn decoderJosh Poimboeuf16-2632/+23
Now that there's a common version of the decoder for all tools, use it instead of the local copy.

Also use perf's check-headers.sh script to diff the decoder files to make sure they remain in sync with the kernel version. Objtool has a similar check.

Committer notes:

Had to keep this all pointing explicitly to x86 headers/files, i.e. instead of asm/insn.h we had to use ../include/asm/insn.h when the files were in different dirs, or just replace "<asm/foo.h>" with "foo.h".

This way we continue to be able to process perf.data files with Intel PT traces on architectures other than x86.

Also fixed up the awk script paths to use $(srcdir)/tools/arch instead of relative directories so that we keep detached tarballs (make help | grep perf) working.

For now the include lines in these headers are being ignored so as not to flag false reports of kernel/tools out of sync.

Signed-off-by: Josh Poimboeuf <[email protected]>
Reviewed-by: Masami Hiramatsu <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/8a37e615d2880f039505d693d1e068a009358a2b.1567118001.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf intel-pt: Remove inat.c from build dependency listJosh Poimboeuf1-1/+1
intel-pt-insn-decoder.c includes inat.c directly, so it already has an implicit dependency on inat.c. The Build file dependency is redundant. Signed-off-by: Josh Poimboeuf <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/53776d6d29bc9eceb571d52df8fa32250c58a0f3.1567118001.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf: Update .gitignore fileJosh Poimboeuf1-0/+3
After a "make tools/perf", git reports the following untracked files: tools/perf/feature/ tools/perf/fixdep tools/perf/libtraceevent-dynamic-list Add these generated files to perf's .gitignore file. Signed-off-by: Josh Poimboeuf <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/03acbc6c2fbc72054861f6c301875db75db33030.1567118001.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31objtool: Move x86 insn decoder to a common locationJosh Poimboeuf12-12/+12
The kernel tree has three identical copies of the x86 instruction decoder. Two of them are in the tools subdir. The tools subdir is supposed to be completely standalone and separate from the kernel. So having at least one copy of the kernel decoder in the tools subdir is unavoidable. However, we don't need *two* of them. Move objtool's copy of the decoder to a shared location, so that perf will also be able to use it. Signed-off-by: Josh Poimboeuf <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/55b486b88f6bcd0c9a2a04b34f964860c8390ca8.1567118001.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf metricgroup: Support multiple events for metricgroupJin Yao3-44/+68
Some uncore metrics don't work as expected. For example, on cascadelakex:

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1

   Performance counter stats for 'system wide':

             1841092      unc_m_pmm_rpq_inserts
             3680816      unc_m_pmm_wpq_inserts

         1.001775055 seconds time elapsed

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1

   Performance counter stats for 'system wide':

           860649746      unc_m_pmm_rpq_occupancy.all
             1840557      unc_m_pmm_rpq_inserts
         12790627455      unc_m_clockticks

         1.001773348 seconds time elapsed

No metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' or 'UNC_M_PMM_READ_LATENCY' are reported.

The issue is that the case of an alias expanding to multiple events, typically the uncore events, is not supported (see comments in find_evsel_group()).

For UNC_M_PMM_BANDWIDTH.TOTAL in the above example, the expanded event group is '{unc_m_pmm_rpq_inserts,unc_m_pmm_wpq_inserts}:W', but the actual events passed to find_evsel_group are:

  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts

This multiple events case is not supported well.

This patch introduces a new field 'metric_leader' in struct evsel. The first event is considered the metric leader. The remaining events with the same name point to the first event via their metric_leader field in struct evsel.

This design allows the counting results of all same-named events to be added to the first event in the group (the metric_leader).

With this patch,

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1

   Performance counter stats for 'system wide':

             1842108      unc_m_pmm_rpq_inserts     #    337.2 MB/sec  UNC_M_PMM_BANDWIDTH.TOTAL
             3682209      unc_m_pmm_wpq_inserts

         1.001819706 seconds time elapsed

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1

   Performance counter stats for 'system wide':

           861970685      unc_m_pmm_rpq_occupancy.all #    219.4 ns  UNC_M_PMM_READ_LATENCY
             1842772      unc_m_pmm_rpq_inserts
         12790196356      unc_m_clockticks

         1.001749103 seconds time elapsed

Now we can see the correct metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' and 'UNC_M_PMM_READ_LATENCY'.

Signed-off-by: Jin Yao <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
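The metric_leader idea described in this commit message can be pictured with a small model. The sketch below is Python rather than perf's C, the `Evsel` class and both helper functions are hypothetical stand-ins, the per-instance counts are made up, and it assumes the first event of each name acts as its own leader.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(eq=False)
class Evsel:
    """Toy stand-in for perf's struct evsel (illustrative only)."""
    name: str
    count: int
    metric_leader: Optional["Evsel"] = None


def assign_metric_leaders(evsels: List[Evsel]) -> None:
    """Point every event at the first event that carries the same name."""
    first_by_name: Dict[str, Evsel] = {}
    for ev in evsels:
        # setdefault() keeps the first event seen for each name.
        ev.metric_leader = first_by_name.setdefault(ev.name, ev)


def totals_per_leader(evsels: List[Evsel]) -> Dict[str, int]:
    """Add the count of each event onto its metric leader."""
    totals: Dict[str, int] = {}
    for ev in evsels:
        leader = ev.metric_leader
        totals[leader.name] = totals.get(leader.name, 0) + ev.count
    return totals


# Six instances of each event, as in the list passed to find_evsel_group();
# the counts 10 and 20 are invented for the example.
events = ([Evsel("unc_m_pmm_rpq_inserts", 10) for _ in range(6)]
          + [Evsel("unc_m_pmm_wpq_inserts", 20) for _ in range(6)])
assign_metric_leaders(events)

if __name__ == "__main__":
    print(totals_per_leader(events))
```

The metric expression can then be evaluated once per leader on the summed counts instead of once per expanded event.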
2019-08-31perf metricgroup: Scale the metric resultJin Yao3-11/+31
Some metrics define the scale unit, such as

    {
        "BriefDescription": "Intel Optane DC persistent memory read latency (ns). Derived from unc_m_pmm_rpq_occupancy.all",
        "Counter": "0,1,2,3",
        "EventCode": "0xE0",
        "EventName": "UNC_M_PMM_READ_LATENCY",
        "MetricExpr": "UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS",
        "MetricName": "UNC_M_PMM_READ_LATENCY",
        "PerPkg": "1",
        "ScaleUnit": "6000000000ns",
        "UMask": "0x1",
        "Unit": "iMC"
    },

For the above example, the ratio should be,

  ratio = (UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS) * 6000000000

But in the current code, the ratio is not scaled ( * 6000000000).

With this patch, the ratio is scaled and the unit (ns) is printed.

For example,

  #    219.4 ns  UNC_M_PMM_READ_LATENCY

Signed-off-by: Jin Yao <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
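As a quick check of the formula above, the scaled ratio can be recomputed from the counter values shown in the "with this patch" perf stat output quoted earlier in this log (861970685, 1842772 and 12790196356). The function name below is hypothetical and the snippet is a Python illustration of the arithmetic only, not perf code.

```python
def scaled_metric(occupancy, inserts, clockticks, scale):
    """Evaluate occupancy / inserts / clockticks and apply the ScaleUnit factor."""
    return occupancy / inserts / clockticks * scale


# Counter values taken from the perf stat output quoted in this log.
latency_ns = scaled_metric(861970685, 1842772, 12790196356, 6000000000)

if __name__ == "__main__":
    print(f"{latency_ns:.1f} ns")  # prints "219.4 ns"
```

The result agrees with the "219.4 ns" that perf prints for UNC_M_PMM_READ_LATENCY.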
2019-08-31perf pmu: Change convert_scale from static to globalJin Yao2-3/+5
The function convert_scale() can be used to convert a string to a unit and a scale. For example,

  s = "6000000000ns";
  convert_scale(s, &unit, &scale);

  unit = "ns", scale = 6000000000.

Currently this function is static. This patch renames the function to perf_pmu__convert_scale and makes it global.

No functional change.

Signed-off-by: Jin Yao <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
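The split of a string such as "6000000000ns" into a numeric scale and a unit can be mimicked as below. This is a Python sketch of the behaviour described in the message, not the perf_pmu__convert_scale() implementation; `parse_scale` and its regular expression are assumptions made for the example.

```python
import re

# Leading decimal number (optional sign, fraction, exponent), then the unit.
_SCALE_RE = re.compile(
    r"\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(.*)", re.S)


def parse_scale(text):
    """Split e.g. '6000000000ns' into (6000000000.0, 'ns')."""
    m = _SCALE_RE.match(text)
    if m is None:
        raise ValueError(f"no leading number in {text!r}")
    return float(m.group(1)), m.group(2).strip()


if __name__ == "__main__":
    print(parse_scale("6000000000ns"))
```

Returning the scale and the unit separately is what lets a caller multiply the metric ratio by the scale and print the unit next to the result.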
2019-08-31perf symbols: Move mem_info and branch_info out of symbol.hArnaldo Carvalho de Melo22-19/+47
The mem_info struct goes to mem-events.h and branch_info goes to branch.h, where they belong, this way we can remove several headers from symbols.h and trim the include dependency tree more. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf auxtrace: Uninline functions that touch perf_sessionArnaldo Carvalho de Melo41-38/+98
So that we don't carry the session.h include directive in auxtrace.h, which in turn opens a can of worms of files that were getting all sorts of things via that include, fix them all. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Remove needless evlist.h include directivesArnaldo Carvalho de Melo32-32/+33
Remove the last unneeded use of cache.h in a header, we can check where it is really needed, i.e. we can remove it and be sure that it isn't being obtained indirectly. This is an old file, used by now incorrectly in many places, so it was providing includes needed indirectly, fixup this fallout. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Remove needless evlist.h include directivesArnaldo Carvalho de Melo6-6/+6
Now that evlist.h isn't included by any other header, we can check where it is really needed, i.e. we can remove it and be sure that it isn't being obtained indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Remove needless thread_map.h include directivesArnaldo Carvalho de Melo3-3/+0
Now that thread_map.h isn't included by any other header, we can check where it is really needed, i.e. we can remove it and be sure that it isn't being obtained indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Remove needless thread.h include directivesArnaldo Carvalho de Melo9-9/+0
Now that thread.h isn't included by any other header, we can check where it is really needed, i.e. we can remove it and be sure that it isn't being obtained indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Remove needless map.h include directivesArnaldo Carvalho de Melo3-7/+3
Now that map.h isn't included by any other header, we can check where it is really needed, i.e. we can remove it and be sure that it isn't being obtained indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf probe: No need for symbol.h, symbol_conf is enoughArnaldo Carvalho de Melo1-1/+1
Remove one more unneeded use of symbol.h Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Remove needless sort.h include directivesArnaldo Carvalho de Melo5-3/+2
Now that sort.h isn't included by any other header, we can check where it is really needed, i.e. we can remove it and be sure that it isn't being obtained indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Move 'struct events_stats' and prototypes to separate headerArnaldo Carvalho de Melo6-47/+58
This will allow us to untangle the header dependency a bit more, as some places will not need event.h anymore. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf hist: Remove needless ui/progress.h from hist.hArnaldo Carvalho de Melo4-1/+4
We only need a forward declaration, add it and fixup all the files that need ui_progress definitions but were wrongly getting it from hist.h. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf dsos: Move the dsos struct and its methods to separate source filesArnaldo Carvalho de Melo49-257/+331
So that we can reduce the header dependency tree further, in the process noticed that lots of places were getting even things like build-id routines and 'struct perf_tool' definition indirectly, so fix all those too. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf symbols: Move symsrc prototypes to a separate headerArnaldo Carvalho de Melo10-33/+58
So that we can remove dso.h from symbol.h and reduce the header dependency tree. Fixup cases where struct dso guts are needed but were obtained via symbol.h, indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf symbols: Add missing linux/refcount.h to symbol.hArnaldo Carvalho de Melo1-0/+1
We use refcount_t there, so we need that header or else it'll break when we remove dso.h, that is from where it is getting that definition now... Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf symbol: Move C++ demangle defines to the only file using itArnaldo Carvalho de Melo2-6/+6
No need to have it generally available in such a critical header as symbol.h. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf dso: Adopt DSO related macros from symbol.hArnaldo Carvalho de Melo6-3/+7
Reducing the size of symbol.h by removing things that are better placed somewhere else. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31libtraceevent: Change users plugin directoryTzvetomir Stoyanov2-4/+4
To be compliant with XDG user directory layout, the user's plugin directory is changed from ~/.traceevent/plugins to ~/.local/lib/traceevent/plugins/ Suggested-by: Patrick McLean <[email protected]> Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Patrick McLean <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/linux-trace-devel/20190313144206.41e75cf8@patrickm/ Link: http://lore.kernel.org/linux-trace-devel/[email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31libtraceevent: Remove tep_register_trace_clock()Tzvetomir Stoyanov3-14/+0
The tep_register_trace_clock() API is used to instruct the traceevent library how to print the event time stamps. As the event print interface is redesigned, this API is not needed any more. The new event print API is flexible and the user can specify how the time stamps are printed.

Signed-off-by: Tzvetomir Stoyanov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Patrick McLean <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31libtraceevent, perf tools: Changes in tep_print_event_* APIsTzvetomir Stoyanov7-200/+203
The libtraceevent APIs for printing trace event information are complicated and take complex extra parameters. To control the way event information is printed, the user has to call a set of functions in a specific sequence.

These APIs are reimplemented to provide a simpler interface for printing event information.

Removed APIs:

  tep_print_event_task()
  tep_print_event_time()
  tep_print_event_data()
  tep_event_info()
  tep_is_latency_format()
  tep_set_latency_format()
  tep_data_latency_format()
  tep_set_print_raw()

A new API for printing event information is introduced:

  void tep_print_event(struct tep_handle *tep, struct trace_seq *s,
                       struct tep_record *record, const char *fmt, ...);

where "fmt" is a printf-like format string, followed by the event fields to be printed. Supported fields:

  TEP_PRINT_PID, "%d" - event PID
  TEP_PRINT_CPU, "%d" - event CPU
  TEP_PRINT_COMM, "%s" - event command string
  TEP_PRINT_NAME, "%s" - event name
  TEP_PRINT_LATENCY, "%s" - event latency
  TEP_PRINT_TIME, "%d" - event time stamp. A divisor and precision can be
                  specified as part of this format string:
                  "%precision.divisord". Example:
                  "%3.1000d" - divide the time by 1000 and print the first
                  3 digits before the dot. Thus, the time stamp
                  "123456000" will be printed as "123.456"
  TEP_PRINT_INFO, "%s" - event information.
  TEP_PRINT_INFO_RAW, "%s" - event information, in raw format.

Example:

  tep_print_event(tep, s, record, "%16s-%-5d [%03d] %s %6.1000d %s %s",
                  TEP_PRINT_COMM, TEP_PRINT_PID, TEP_PRINT_CPU,
                  TEP_PRINT_LATENCY, TEP_PRINT_TIME, TEP_PRINT_NAME,
                  TEP_PRINT_INFO);

Output:

  ls-11314 [005] d.h. 185207.366383 function __wake_up

Signed-off-by: Tzvetomir Stoyanov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Cc: Patrick McLean <[email protected]>
Link: http://lore.kernel.org/linux-trace-devel/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
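One reading of the "%precision.divisord" time stamp format that reproduces both examples in this message ("123.456" and "185207.366383") is: divide the raw time stamp by the divisor, then place the dot so that `precision` digits follow it. The Python sketch below models only that reading. `format_timestamp` is a hypothetical helper, not a libtraceevent function, and the library's actual rounding and padding may differ.

```python
def format_timestamp(ts, precision, divisor):
    """Integer-divide ts by divisor and print the result with
    `precision` digits after the dot."""
    scaled = ts // divisor
    unit = 10 ** precision
    return f"{scaled // unit}.{scaled % unit:0{precision}d}"


if __name__ == "__main__":
    # "%3.1000d" applied to the time stamp from the commit message.
    print(format_timestamp(123456000, 3, 1000))  # prints 123.456
```

Under the same reading, "%6.1000d" applied to a raw value of 185207366383000 (a value assumed here, since the message shows only the formatted output) gives "185207.366383".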
2019-08-31perf event: Remove needless include directives from event.hArnaldo Carvalho de Melo8-6/+20
bpf.h and build-id.h are not needed at all in event.h, remove them. And fixup the fallout of files that were getting needed stuff from this now pruned include. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf env: Remove env.h from other headers where just a fwd decl is neededArnaldo Carvalho de Melo5-3/+9
And fixup the fallout of c files not building due to now missing headers. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf debug: Remove needless include directives from debug.hArnaldo Carvalho de Melo75-6/+104
All we need there is a forward declaration for 'union perf_event', so remove it from there and add missing header directives in places using things from this indirect include. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31tools/power turbostat: update version numberLen Brown1-1/+1
Today is 19.08.31, at least in some parts of the world. Signed-off-by: Len Brown <[email protected]>
2019-08-31tools/power turbostat: Add support for Hygon Fam 18h (Dhyana) RAPLPu Wen1-2/+7
Commit 9392bd98bba760be96ee ("tools/power turbostat: Add support for AMD Fam 17h (Zen) RAPL") and commit 3316f99a9f1b68c578c5 ("tools/power turbostat: Also read package power on AMD F17h (Zen)") added AMD Fam 17h RAPL support. Hygon Family 18h (Dhyana) reports RAPL support in bit 14 of CPUID 0x80000007 EDX, and has the MSRs RAPL_PWR_UNIT/CORE_ENERGY_STAT/PKG_ENERGY_STAT. So add Hygon Dhyana Family 18h support for RAPL. Already tested on Hygon multi-node systems, where it shows correct per-core energy usage and total package power. Signed-off-by: Pu Wen <[email protected]> Reviewed-by: Calvin Walton <[email protected]> Signed-off-by: Len Brown <[email protected]>
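The feature check mentioned above comes down to testing a single bit. A minimal sketch, assuming the caller has already obtained the EDX value of CPUID leaf 0x80000007; edx_reports_rapl() and the two macro names are hypothetical and do not come from turbostat:

```c
#include <stdbool.h>
#include <stdint.h>

#define RAPL_CPUID_LEAF	0x80000007u	/* leaf named in the commit message */
#define RAPL_EDX_BIT	14		/* bit named in the commit message */

/*
 * Hypothetical helper: "edx" is the EDX register value returned by
 * CPUID leaf RAPL_CPUID_LEAF. Returns true when bit 14 is set.
 */
static bool edx_reports_rapl(uint32_t edx)
{
	return (edx >> RAPL_EDX_BIT) & 1u;
}
```

Taking the register value as a parameter keeps the sketch independent of how CPUID is executed on a given toolchain.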
2019-08-31tools/power turbostat: Fix caller parameter of get_tdp_amd()Pu Wen1-1/+1
Commit 9392bd98bba760be96ee ("tools/power turbostat: Add support for AMD Fam 17h (Zen) RAPL") added a function get_tdp_amd(), whose parameter is the CPU family. But the rapl_probe_amd() function passes the model instead. Fix the caller of get_tdp_amd() to pass the family. Cc: <[email protected]> # v5.1+ Signed-off-by: Pu Wen <[email protected]> Reviewed-by: Calvin Walton <[email protected]> Signed-off-by: Len Brown <[email protected]>
2019-08-31tools/power turbostat: Fix CPU%C1 display valueSrinivas Pandruvada1-6/+17
In some cases C1% will show a wrong value when the platform doesn't have an MSR for C1 residency. For example:

  Core CPU CPU%c1
  -    -   100.00
  0    0   100.00
  0    2   100.00
  1    1   100.00
  1    3   100.00

But adding Busy% will fix this:

  Core CPU Busy% CPU%c1
  -    -   99.77 0.23
  0    0   99.77 0.23
  0    2   99.77 0.23
  1    1   99.77 0.23
  1    3   99.77 0.23

This issue can be reproduced on most recent systems, including Broadwell, Skylake and later. This is because if we don't select Busy% or Avg_MHz or Bzy_MHz, then the mperf value will not be read from the MSR, so it will be 0. But mperf is required for the C1% calculation when the MSR for C1 residency is not present. The same is true for C3, C6 and C7 column selection. So add another define, DO_BIC_READ(), which doesn't depend on user column selection, and use it for the mperf, C3, C6 and C7 related counters. So when there is no platform support for C1 residency counters, we still read these counters if the CPU supports them and the user selected display of CPU%c1.

Signed-off-by: Srinivas Pandruvada <[email protected]> Signed-off-by: Len Brown <[email protected]>
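The dependency on mperf can be illustrated with a small calculation. This is a simplified sketch, not turbostat's code: struct c_sample and derived_c1_percent() are hypothetical, and the formula assumes that derived C1 time is whatever remains of the elapsed TSC after subtracting the busy time (mperf) and the deeper C-state residencies.

```c
/*
 * Hypothetical, simplified counter deltas for one CPU over one interval.
 * All values are in TSC ticks.
 */
struct c_sample {
	unsigned long long tsc;		/* elapsed TSC ticks */
	unsigned long long mperf;	/* ticks spent executing (busy) */
	unsigned long long c3;		/* residency in deeper C-states */
	unsigned long long c6;
	unsigned long long c7;
};

/*
 * Assumed derivation when no C1 residency MSR exists:
 * C1 = TSC - mperf - C3 - C6 - C7, reported as a percentage of TSC.
 */
static double derived_c1_percent(const struct c_sample *s)
{
	unsigned long long accounted = s->mperf + s->c3 + s->c6 + s->c7;

	/* Guard against an empty interval or inconsistent counters. */
	if (!s->tsc || accounted > s->tsc)
		return 0.0;

	return 100.0 * (double)(s->tsc - accounted) / (double)s->tsc;
}
```

Under this assumption, an mperf delta that was never read and therefore stays 0 on an otherwise busy CPU leaves the whole interval attributed to C1, which is the 100.00 shown in the first table; reading mperf regardless of the selected columns avoids that.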
2019-08-31tools/power turbostat: do not enforce 1msArtem Bityutskiy1-5/+0
Turbostat works by taking a snapshot of counters, sleeping, taking another snapshot, calculating deltas, and printing out the table. The sleep time is controlled via the -i option or by the user sending a signal or a character to stdin. In the latter case, turbostat always adds a 1 ms sleep before it reads the counters, in order to avoid larger imprecisions in the results it prints. While the 1 ms delay may be a good idea for a "dumb" user, it is a problem for an "aware" user. I do thousands and thousands of measurements over a short period of time (like 2 ms), and turbostat unconditionally adds 1 ms to my interval, so I cannot get what I really need. This patch removes the unconditional 1 ms sleep. This is an expert user tool, after all, and non-experts are unlikely ever to use it in the non-fixed interval mode anyway, so I think it is OK to remove the 1 ms delay. Signed-off-by: Artem Bityutskiy <[email protected]> Signed-off-by: Len Brown <[email protected]>
2019-08-31tools/power turbostat: read from pipes tooArtem Bityutskiy1-4/+16
Commit '47936f944e78 tools/power turbostat: fix printing on input' made a valid fix, but it completely disabled piped stdin support, which is a valuable use-case. Indeed, if stdin is a pipe, turbostat won't read anything from it, so it becomes impossible to get turbostat output at user-defined moments instead of at regular intervals. There is no reason why this should work for terminals but not for pipes. This patch improves the situation. Instead of ignoring pipes, we read data from them but gracefully handle the EOF case. Signed-off-by: Artem Bityutskiy <[email protected]> Signed-off-by: Len Brown <[email protected]>
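The approach described above, reading from a pipe while treating EOF as a distinct condition, can be sketched with select() and read(). This is an illustration under assumptions, not the patch itself: wait_for_input() and enum input_state are hypothetical names.

```c
#include <sys/select.h>
#include <unistd.h>

enum input_state { INPUT_NONE, INPUT_DATA, INPUT_EOF };

/*
 * Hypothetical helper: wait up to "timeout" for input on "fd".
 * A closed pipe is always reported readable by select(), so the
 * read() result is what tells real input apart from end-of-file.
 */
static enum input_state wait_for_input(int fd, struct timeval *timeout)
{
	fd_set readfds;
	ssize_t n;
	char c;

	FD_ZERO(&readfds);
	FD_SET(fd, &readfds);

	/* 0 means the timeout expired, -1 means select() failed. */
	if (select(fd + 1, &readfds, NULL, NULL, timeout) <= 0)
		return INPUT_NONE;

	n = read(fd, &c, 1);
	if (n > 0)
		return INPUT_DATA;	/* the user asked for a new sample */
	if (n == 0)
		return INPUT_EOF;	/* writer closed the pipe */
	return INPUT_NONE;		/* read error: treat as no input */
}
```

A caller would typically stop polling the descriptor once INPUT_EOF is returned and fall back to interval-based sampling; otherwise select() would keep returning immediately on the closed pipe.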
2019-08-31tools/power turbostat: Add Ice Lake NNPI supportRajneesh Bhardwaj1-0/+1
This enables the turbostat utility on the Ice Lake NNPI SoC. Link: https://lkml.org/lkml/2019/6/5/1034 Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Len Brown <[email protected]>
2019-08-31tools/power turbostat: rename has_hsw_msrs()Len Brown1-4/+4
Perhaps if this more descriptive name had been used, then we wouldn't have had the HSW ULT vs HSW CORE bug, fixed by the previous commit. Signed-off-by: Len Brown <[email protected]>
2019-08-31tools/power turbostat: Fix Haswell Core systemsLen Brown1-4/+6
  turbostat: cpu0: msr offset 0x630 read failed: Input/output error
because Haswell Core does not have C8-C10. Output C8-C10 only on Haswell ULT. Fixes: f5a4c76ad7de ("tools/power turbostat: consolidate duplicate model numbers") Reported-by: Prarit Bhargava <[email protected]> Suggested-by: Kosuke Tatsukawa <[email protected]> Signed-off-by: Len Brown <[email protected]>
2019-08-31tools/power turbostat: add Jacobsville supportZhang Rui1-0/+3
Jacobsville behaves like Denverton. Signed-off-by: Zhang Rui <[email protected]> Signed-off-by: Len Brown <[email protected]>
2019-08-31tools/power turbostat: fix buffer overrunNaoya Horiguchi1-1/+1
turbostat could be terminated by a general protection fault on some recent hardware which (for example) supports 9 levels of C-states and shows 18 "tADDED" lines. That bloats the total output and finally causes a buffer overrun. So let's extend the buffer to avoid this. Signed-off-by: Naoya Horiguchi <[email protected]> Signed-off-by: Len Brown <[email protected]>