aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/arch/riscv
AgeCommit message (Collapse)AuthorFilesLines
2024-07-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds4-0/+115
Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
2024-06-26perf util: Make util its own libraryIan Rogers2-5/+5
Make the util directory into its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. For convenience: arch/common.c scripts/perl/Perf-Trace-Util/Context.c scripts/python/Perf-Trace-Util/Context.c are made part of this library. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Kees Cook <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Gary Guo <[email protected]> Cc: Alex Gaynor <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Wedson Almeida Filho <[email protected]> Cc: Ze Gao <[email protected]> Cc: Alice Ryhl <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Yicong Yang <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Guo Ren <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: Oliver Upton <[email protected]> Cc: John Garry <[email protected]> Cc: Benno Lossin <[email protected]> Cc: Björn Roy Baron <[email protected]> Cc: Andreas Hindborg <[email protected]> Cc: Paul Walmsley <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-26perf kvm/riscv: Port perf kvm stat to RISC-VShenlin Liang4-0/+115
'perf kvm stat report/record' generates a statistical analysis of KVM events and can be used to analyze guest exit reasons. "report" reports statistical analysis of guest exit events. To record kvm events on the host: # perf kvm stat record -a To report kvm VM EXIT events: # perf kvm stat report --event=vmexit Signed-off-by: Shenlin Liang <[email protected]> Reviewed-by: Atish Patra <[email protected]> Tested-by: Atish Patra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Anup Patel <[email protected]>
2024-04-26perf riscv: Fix the warning due to the incompatible typeBen Zong-You Xie1-1/+1
In the 32-bit platform, the second argument of getline is expectd to be 'size_t *'(aka 'unsigned int *'), but line_sz is of type 'unsigned long *'. Therefore, declare line_sz as size_t. Signed-off-by: Ben Zong-You Xie <[email protected]> Reviewed-by: Alexandre Ghiti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2024-02-15perf parse-regs: Introduce a weak function arch__sample_reg_masks()Leo Yan1-1/+6
Every architecture can provide a register list for sampling. If an architecture doesn't support register sampling, it won't define the data structure 'sample_reg_masks'. Consequently, any code using this structure must be protected by the macro 'HAVE_PERF_REGS_SUPPORT'. This patch defines a weak function, arch__sample_reg_masks(), which will be replaced by an architecture-defined function for returning the architecture's register list. With this refactoring, the function always exists, the condition checking for 'HAVE_PERF_REGS_SUPPORT' is not needed anymore, so remove it. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Guo Ren <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kan Liang <[email protected]> Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-08-16perf parse-regs: Move out arch specific header from util/perf_regs.hLeo Yan2-0/+2
util/perf_regs.h includes another perf_regs.h: #include <perf_regs.h> Here it includes architecture specific header, for example, if we build arm64 target, the header tools/perf/arch/arm64/include/perf_regs.h is included. We use this implicit way to include architecture specific header, which is not directive; furthermore, util/perf_regs.c is coupled with the architecture specific definitions. This patch moves out arch specific header from util/perf_regs.h for generalizing the 'util' folder, as a result, the source files in 'arch' folder explicitly include architecture's perf_regs.h. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16perf parse-regs: Remove PERF_REGS_{MAX|MASK} from common codeLeo Yan1-0/+10
The macros PERF_REGS_MAX and PERF_REGS_MASK are architecture specific, let's remove them from the common file util/perf_regs.c. As a side effect, the weak functions arch__intr_reg_mask() and arch__user_reg_mask() just return zeros, every arch defines its own functions in the 'arch' folder for returning right values. Note, we don't need to return intr/user register masks dynamically, this is because these two functions are invoked during recording phase but not decoding phase, they are always invoked on the native environment, thus we don't need to parse them dynamically. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16perf parse-regs: Remove unused macros PERF_REG_{IP|SP}Leo Yan1-3/+0
The macros PERF_REG_{IP|SP} have been replaced by using functions perf_arch_reg_{ip|sp}(), remove them! Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-01-02perf tools riscv: Fix build error on riscv due to missing header for 'struct ↵Eric Lin1-1/+1
perf_sample' Since the definition of 'struct perf_sample' has been moved to sample.h, we need to include this header file to fix the build error as follows: arch/riscv/util/unwind-libdw.c: In function 'libdw__arch_set_initial_registers': arch/riscv/util/unwind-libdw.c:12:50: error: invalid use of undefined type 'struct perf_sample' 12 | struct regs_dump *user_regs = &ui->sample->user_regs; | ^~ Fixes: 9823147da6c893d9 ("perf tools: Move 'struct perf_sample' to a separate header file to disentangle headers") Signed-off-by: Eric Lin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: [email protected] Cc: Jiri Olsa <[email protected]> Cc: [email protected] Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Vincent Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-27perf tools riscv: Add support for get_cpuid_str functionNikita Shubin2-0/+105
The get_cpuid_str function returns the string that contains values of MVENDORID, MARCHID and MIMPID in hex format separated by coma. The values themselves are taken from first cpu entry in "/proc/cpuid" that contains "mvendorid", "marchid" and "mimpid". Signed-off-by: Nikita Shubin <[email protected]> Tested-by: Kautuk Consul <[email protected]> Acked-by: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anup Patel <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-04-11perf jitdump: Add riscv64 supportEric Lin1-0/+1
This patch enables perf jitdump for riscv64 and was tested with V8 on qemu rv64. Qemu rv64: $ perf record -e cpu-clock -c 1000 -g -k mono ./d8_rv64 --perf-prof --no-write-protect-code-memory test.js $ perf inject -j -i perf.data -o perf.data.jitted $ perf report -i perf.data.jitted Output: To display the perf.data header info, please use --header/--header-only options. Total Lost Samples: 0 Samples: 87K of event 'cpu-clock' Event count (approx.): 87974000 Children Self Command Shared Object Symbol .... 0.28% 0.06% d8_rv64 d8_rv64 [.] _ZN2v88internal6WasmJs7InstallEPNS0_7IsolateEb 0.28% 0.00% d8_rv64 d8_rv64 [.] _ZN2v88internal10ParserBaseINS0_6ParserEE22ParseLogicalExpressionEv 0.28% 0.03% d8_rv64 jitted-112-76.so [.] Builtin:InterpreterEntryTrampoline 0.12% 0.00% d8_rv64 d8_rv64 [.] _ZN2v88internal19ContextDeserializer11DeserializeEPNS0_7IsolateENS0_6HandleINS0_13JSGlobalProxyEEENS_33DeserializeInternalFieldsCallbackE 0.12% 0.01% d8_rv64 jitted-112-651.so [.] Builtin:CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit .... Signed-off-by: Eric Lin <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-16perf arch: Support register names from all archsGerman Gomez1-74/+0
When reading a perf.data file with register values, there is a mismatch between the names and the values of the registers because the tool is built using only the register names from the local architecture. Reading a perf.data file that was recorded on ARM64, gives the following erroneous output on an X86 machine: # perf report -i perf_arm64.data -D [...] 24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0 ... user regs: mask 0x1ffffffff ABI 64-bit .... AX 0x0000ffffd1515817 .... BX 0x0000ffffd1515480 .... CX 0x0000aaaadabf6c80 .... DX 0x000000000000002e .... SI 0x0000000040100401 .... DI 0x0040600200000080 .... BP 0x0000ffffd1510e10 .... SP 0x0000000000000000 .... IP 0x00000000000000dd .... FLAGS 0x0000ffffd1510cd0 .... CS 0x0000000000000000 .... SS 0x0000000000000030 .... DS 0x0000ffffa569a208 .... ES 0x0000000000000000 .... FS 0x0000000000000000 .... GS 0x0000000000000000 .... R8 0x0000aaaad3de9650 .... R9 0x0000ffffa57397f0 .... R10 0x0000000000000001 .... R11 0x0000ffffa57fd000 .... R12 0x0000ffffd1515817 .... R13 0x0000ffffd1515480 .... R14 0x0000aaaadabf6c80 .... R15 0x0000000000000000 .... unknown 0x0000000000000001 .... unknown 0x0000000000000000 .... unknown 0x0000000000000000 .... unknown 0x0000000000000000 .... unknown 0x0000000000000000 .... unknown 0x0000ffffd1510d90 .... unknown 0x0000ffffa5739b90 .... unknown 0x0000ffffd1510d80 .... XMM0 0x0000ffffa57392c8 ... thread: perf-exec:43239 ...... dso: [kernel.kallsyms] As can be seen, the register names correspond to X86 registers, even though the perf.data file was recorded on an ARM64 system. After this patch, the output of the command displays the correct register names: # perf report -i perf_arm64.data -D [...] 24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0 ... user regs: mask 0x1ffffffff ABI 64-bit .... x0 0x0000ffffd1515817 .... x1 0x0000ffffd1515480 .... x2 0x0000aaaadabf6c80 .... x3 0x000000000000002e .... x4 0x0000000040100401 .... x5 0x0040600200000080 .... x6 0x0000ffffd1510e10 .... x7 0x0000000000000000 .... x8 0x00000000000000dd .... x9 0x0000ffffd1510cd0 .... x10 0x0000000000000000 .... x11 0x0000000000000030 .... x12 0x0000ffffa569a208 .... x13 0x0000000000000000 .... x14 0x0000000000000000 .... x15 0x0000000000000000 .... x16 0x0000aaaad3de9650 .... x17 0x0000ffffa57397f0 .... x18 0x0000000000000001 .... x19 0x0000ffffa57fd000 .... x20 0x0000ffffd1515817 .... x21 0x0000ffffd1515480 .... x22 0x0000aaaadabf6c80 .... x23 0x0000000000000000 .... x24 0x0000000000000001 .... x25 0x0000000000000000 .... x26 0x0000000000000000 .... x27 0x0000000000000000 .... x28 0x0000000000000000 .... x29 0x0000ffffd1510d90 .... lr 0x0000ffffa5739b90 .... sp 0x0000ffffd1510d80 .... pc 0x0000ffffa57392c8 ... thread: perf-exec:43239 ...... dso: [kernel.kallsyms] Tester comments: Athira reports: "Looks good to me. Tested this patchset in powerpc by capturing regs in powerpc and doing perf report to read the data from x86." Reported-by: Alexandre Truong <[email protected]> Reviewed-by: Athira Jajeev <[email protected]> Signed-off-by: German Gomez <[email protected]> Tested-by: Athira Jajeev <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-02-18perf tools: Fix arm64 build error with gcc-11Jianlin Lv1-1/+1
gcc version: 11.0.0 20210208 (experimental) (GCC) Following build error on arm64: ....... In function ‘printf’, inlined from ‘regs_dump__printf’ at util/session.c:1141:3, inlined from ‘regs__printf’ at util/session.c:1169:2: /usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \ error: ‘%-5s’ directive argument is null [-Werror=format-overflow=] 107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \ __va_arg_pack ()); ...... In function ‘fprintf’, inlined from ‘perf_sample__fprintf_regs.isra’ at \ builtin-script.c:622:14: /usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \ error: ‘%5s’ directive argument is null [-Werror=format-overflow=] 100 | return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt, 101 | __va_arg_pack ()); cc1: all warnings being treated as errors ....... This patch fixes Wformat-overflow warnings. Add helper function to convert NULL to "unknown". Signed-off-by: Jianlin Lv <[email protected]> Reviewed-by: John Garry <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anju T Sudhakar <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Guo Ren <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10perf tools: Avoid 'sample_reg_masks' being const + weakIan Rogers2-0/+8
Being const + weak breaks with some compilers that constant-propagate from the weak symbol. This behavior is outside of the specification, but in LLVM is chosen to match GCC's behavior. LLVM's implementation was set in this patch: https://github.com/llvm/llvm-project/commit/f49573d1eedcf1e44893d5a062ac1b72c8419646 A const + weak symbol is set to be weak_odr: https://llvm.org/docs/LangRef.html ODR is one definition rule, and given there is one constant definition constant-propagation is possible. It is possible to get this code to miscompile with LLVM when applying link time optimization. As compilers become more aggressive, this is likely to break in more instances. Move the definition of sample_reg_masks to the conditional part of perf_regs.h and guard usage with HAVE_PERF_REGS_SUPPORT. This avoids the weak symbol. Fix an issue when HAVE_PERF_REGS_SUPPORT isn't defined from patch v1. In v3, add perf_regs.c for architectures that HAVE_PERF_REGS_SUPPORT but don't declare sample_regs_masks. Further notes: Jiri asked: "Is this just a precaution or you actualy saw some breakage?" Ian answered: "We saw a breakage with clang with thinlto enabled for linking. Our compiler team had recently seen, and were surprised by, a similar issue and were able to dig out the weak ODR issue." Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: Guo Ren <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Cc: Mao Han <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-05riscv: Add support for libdwMao Han6-0/+232
This patch adds support for DWARF register mappings and libdw registers initialization, which is used by perf callchain analyzing when --call-graph=dwarf is given. Signed-off-by: Mao Han <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: linux-riscv <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Guo Ren <[email protected]> Tested-by: Greentime Hu <[email protected]> Signed-off-by: Paul Walmsley <[email protected]>