aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/events
AgeCommit message (Collapse)AuthorFilesLines
2017-07-21Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Half of the fixes are for various build time warnings triggered by randconfig builds. Most (but not all...) were harmless. There's also: - ACPI boundary condition fixes - UV platform fixes - defconfig updates - an AMD K6 CPU init fix - a %pOF printk format related preparatory change - .. and a warning fix related to the tlb/PCID changes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/devicetree: Convert to using %pOF instead of ->full_name x86/platform/uv/BAU: Disable BAU on single hub configurations x86/platform/intel-mid: Fix a format string overflow warning x86/platform: Add PCI dependency for PUNIT_ATOM_DEBUG x86/build: Silence the build with "make -s" x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl x86/fpu/math-emu: Avoid bogus -Wint-in-bool-context warning x86/fpu/math-emu: Fix possible uninitialized variable use perf/x86: Shut up false-positive -Wmaybe-uninitialized warning x86/defconfig: Remove stale, old Kconfig options x86/ioapic: Pass the correct data to unmask_ioapic_irq() x86/acpi: Prevent out of bound access caused by broken ACPI tables x86/mm, KVM: Fix warning when !CONFIG_PREEMPT_COUNT x86/platform/uv/BAU: Fix congested_response_us not taking effect x86/cpu: Use indirect call to measure performance in init_amd_k6()
2017-07-21perf/x86/intel: Add proper condition to run sched_task callbacksJiri Olsa3-10/+14
We have 2 functions using the same sched_task callback: - PEBS drain for free running counters - LBR save/store Both of them are called from intel_pmu_sched_task() and either of them can be unwillingly triggered when the other one is configured to run. Let's say there's PEBS drain configured in sched_task callback for the event, but in the callback itself (intel_pmu_sched_task()) we will also run the code for LBR save/restore, which we did not ask for, but the code in intel_pmu_sched_task() does not check for that. This can lead to extra cycles in some perf monitoring, like when we monitor PEBS event without LBR data. # perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000 (We need PEBS, non freq/non timestamp event to enable the sched_task callback) The perf stat of cycles and msr:write_msr for above command before the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,519,557,441 cycles:k 91,195,527 msr:write_msr 29.334476406 seconds time elapsed And after the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,704,973,540 cycles:k 27,184,720 msr:write_msr 16.977875900 seconds time elapsed There's no affect on cycles:k because the sched_task happens with events switched off, however the msr:write_msr tracepoint counter together with almost 50% of time speedup show the improvement. Monitoring LBR event and having extra PEBS drain processing in sched_task callback showed just a little speedup, because the drain function does not do much extra work in case there is no PEBS data. Adding conditions to recognize the configured work that needs to be done in the x86_pmu's sched_task callback. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170719075247.GA27506@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20perf/x86: Shut up false-positive -Wmaybe-uninitialized warningArnd Bergmann1-2/+2
The intialization function checks for various failure scenarios, but unfortunately the compiler gets a little confused about the possible combinations, leading to a false-positive build warning when -Wmaybe-uninitialized is set: arch/x86/events/core.c: In function ‘init_hw_perf_events’: arch/x86/events/core.c:264:3: warning: ‘reg_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] arch/x86/events/core.c:264:3: warning: ‘val_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] pr_err(FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", We can't actually run into this case, so this shuts up the warning by initializing the variables to a known-invalid state. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-2-arnd@arndb.de Link: https://patchwork.kernel.org/patch/9392595/ Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-18perf/x86/intel: Fix debug_store reset field for freq eventsJiri Olsa1-0/+2
There's a bug in PEBs event enabling code, that prevents PEBS freq events to work properly after non freq PEBS event was run. freq events - perf_event_attr::freq set -F <freq> option of perf record PEBS events - perf_event_attr::precise_ip > 0 default for perf record Like in following example with CPU 0 busy, we expect ~10000 samples for following perf tool run: # perf record -F 10000 -C 0 sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.640 MB perf.data (10031 samples) ] Everything's fine, but once we run non freq PEBS event like: # perf record -c 10000 -C 0 sleep 1 [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 1.053 MB perf.data (20061 samples) ] the freq events start to fail like this: # perf record -F 10000 -C 0 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.185 MB perf.data (40 samples) ] The issue is in non freq PEBs event initialization of debug_store reset field, which value is used to auto-reload the counter value after PEBS event drain. This value is not being used for PEBS freq events, but once we run non freq event it stays in debug_store data and screws the sample_freq counting for PEBS freq events. Setting the reset field to 0 for freq events. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170714163551.19459-1-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-18perf/x86/intel: Add Goldmont Plus CPU PMU supportKan Liang3-0/+166
Add perf core PMU support for Intel Goldmont Plus CPU cores: - The init code is based on Goldmont. - There is a new cache event list, based on the Goldmont cache event list. - All four general-purpose performance counters support PEBS. - The first general-purpose performance counter is for reduced skid PEBS mechanism. Using :ppp to indicate the event which want to do reduced skid PEBS. - Goldmont Plus has 4-wide pipeline for Topdown Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/20170712134423.17766-1-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-18perf/x86/intel: Enable C-state residency events for Apollo LakeHarry Pan1-6/+20
Goldmont microarchitecture supports C1/C3/C6, PC2/PC3/PC6/PC10 state residency counters, the patch enables them for Apollo Lake platform. The MSR information is based on Intel Software Developers' Manual, Vol. 4, Order No. 335592, Table 2-6 and 2-12. Signed-off-by: Harry Pan <harry.pan@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@suse.de Cc: davidcc@google.com Cc: gs0622@gmail.com Cc: lukasz.odzioba@intel.com Cc: piotr.luc@intel.com Cc: srinivas.pandruvada@linux.intel.com Link: http://lkml.kernel.org/r/20170717103749.24337-1-harry.pan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-03Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds3-15/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug updates from Thomas Gleixner: "This update is primarily a cleanup of the CPU hotplug locking code. The hotplug locking mechanism is an open coded RWSEM, which allows recursive locking. The main problem with that is the recursive nature as it evades the full lockdep coverage and hides potential deadlocks. The rework replaces the open coded RWSEM with a percpu RWSEM and establishes full lockdep coverage that way. The bulk of the changes fix up recursive locking issues and address the now fully reported potential deadlocks all over the place. Some of these deadlocks have been observed in the RT tree, but on mainline the probability was low enough to hide them away." * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) cpu/hotplug: Constify attribute_group structures powerpc: Only obtain cpu_hotplug_lock if called by rtasd ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init cpu/hotplug: Remove unused check_for_tasks() function perf/core: Don't release cred_guard_mutex if not taken cpuhotplug: Link lock stacks for hotplug callbacks acpi/processor: Prevent cpu hotplug deadlock sched: Provide is_percpu_thread() helper cpu/hotplug: Convert hotplug locking to percpu rwsem s390: Prevent hotplug rwsem recursion arm: Prevent hotplug rwsem recursion arm64: Prevent cpu hotplug rwsem recursion kprobes: Cure hotplug lock ordering issues jump_label: Reorder hotplug lock and jump_label_lock perf/tracing/cpuhotplug: Fix locking order ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() PCI: Replace the racy recursion prevention PCI: Use cpu_hotplug_disable() instead of get_online_cpus() perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() x86/perf: Drop EXPORT of perf_check_microcode ...
2017-07-03Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The main changes in this cycle were: - Continued work to add support for 5-level paging provided by future Intel CPUs. In particular we switch the x86 GUP code to the generic implementation. (Kirill A. Shutemov) - Continued work to add PCID CPU support to native kernels as well. In this round most of the focus is on reworking/refreshing the TLB flush infrastructure for the upcoming PCID changes. (Andy Lutomirski)" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) x86/mm: Delete a big outdated comment about TLB flushing x86/mm: Don't reenter flush_tlb_func_common() x86/KASLR: Fix detection 32/64 bit bootloaders for 5-level paging x86/ftrace: Exclude functions in head64.c from function-tracing x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap x86/mm: Remove reset_lazy_tlbstate() x86/ldt: Simplify the LDT switching logic x86/boot/64: Put __startup_64() into .head.text x86/mm: Add support for 5-level paging for KASLR x86/mm: Make kernel_physical_mapping_init() support 5-level paging x86/mm: Add sync_global_pgds() for configuration with 5-level paging x86/boot/64: Add support of additional page table level during early boot x86/boot/64: Rename init_level4_pgt and early_level4_pgt x86/boot/64: Rewrite startup_64() in C x86/boot/compressed: Enable 5-level paging during decompression stage x86/boot/efi: Define __KERNEL32_CS GDT on 64-bit configurations x86/boot/efi: Fix __KERNEL_CS definition of GDT entry on 64-bit configurations x86/boot/efi: Cleanup initialization of GDT entries x86/asm: Fix comment in return_from_SYSCALL_64() x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation ...
2017-07-03Merge branch 'sched-core-for-linus' of ↵Linus Torvalds1-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Add the SYSTEM_SCHEDULING bootup state to move various scheduler debug checks earlier into the bootup. This turns silent and sporadically deadly bugs into nice, deterministic splats. Fix some of the splats that triggered. (Thomas Gleixner) - A round of restructuring and refactoring of the load-balancing and topology code (Peter Zijlstra) - Another round of consolidating ~20 of incremental scheduler code history: this time in terms of wait-queue nomenclature. (I didn't get much feedback on these renaming patches, and we can still easily change any names I might have misplaced, so if anyone hates a new name, please holler and I'll fix it.) (Ingo Molnar) - sched/numa improvements, fixes and updates (Rik van Riel) - Another round of x86/tsc scheduler clock code improvements, in hope of making it more robust (Peter Zijlstra) - Improve NOHZ behavior (Frederic Weisbecker) - Deadline scheduler improvements and fixes (Luca Abeni, Daniel Bristot de Oliveira) - Simplify and optimize the topology setup code (Lauro Ramos Venancio) - Debloat and decouple scheduler code some more (Nicolas Pitre) - Simplify code by making better use of llist primitives (Byungchul Park) - ... plus other fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) sched/cputime: Refactor the cputime_adjust() code sched/debug: Expose the number of RT/DL tasks that can migrate sched/numa: Hide numa_wake_affine() from UP build sched/fair: Remove effective_load() sched/numa: Implement NUMA node level wake_affine() sched/fair: Simplify wake_affine() for the single socket case sched/numa: Override part of migrate_degrades_locality() when idle balancing sched/rt: Move RT related code from sched/core.c to sched/rt.c sched/deadline: Move DL related code from sched/core.c to sched/deadline.c sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled sched/fair: Spare idle load balancing on nohz_full CPUs nohz: Move idle balancer registration to the idle path sched/loadavg: Generalize "_idle" naming to "_nohz" sched/core: Drop the unused try_get_task_struct() helper function sched/fair: WARN() and refuse to set buddy when !se->on_rq sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h> ...
2017-07-03Merge branch 'perf-core-for-linus' of ↵Linus Torvalds4-2/+78
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Most of the changes are for tooling, the main changes in this cycle were: - Improve Intel-PT hardware tracing support, both on the kernel and on the tooling side: PTWRITE instruction support, power events for C-state tracing, etc. (Adrian Hunter) - Add support to measure SMI cost to the x86 architecture, with tooling support in 'perf stat' (Kan Liang) - Support function filtering in 'perf ftrace', plus related improvements (Namhyung Kim) - Allow adding and removing fields to the default 'perf script' columns, using + or - as field prefixes to do so (Andi Kleen) - Allow resolving the DSO name with 'perf script -F brstack{sym,off},dso' (Mark Santaniello) - Add perf tooling unwind support for PowerPC (Paolo Bonzini) - ... and various other improvements as well" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits) perf auxtrace: Add CPU filter support perf intel-pt: Do not use TSC packets for calculating CPU cycles to TSC perf intel-pt: Update documentation to include new ptwrite and power events perf intel-pt: Add example script for power events and PTWRITE perf intel-pt: Synthesize new power and "ptwrite" events perf intel-pt: Move code in intel_pt_synth_events() to simplify attr setting perf intel-pt: Factor out intel_pt_set_event_name() perf intel-pt: Tidy messages into called function intel_pt_synth_event() perf intel-pt: Tidy Intel PT evsel lookup into separate function perf intel-pt: Join needlessly wrapped lines perf intel-pt: Remove unused instructions_sample_period perf intel-pt: Factor out common code synthesizing event samples perf script: Add synthesized Intel PT power and ptwrite events perf/x86/intel: Constify the 'lbr_desc[]' array and make a function static perf script: Add 'synth' field for synthesized event payloads perf auxtrace: Add itrace option to output power events perf auxtrace: Add itrace option to output ptwrite events tools include: Add byte-swapping macros to kernel.h perf script: Add 'synth' event type for synthesized events x86/insn: perf tools: Add new ptwrite instruction ...
2017-06-30perf/x86/intel: Constify the 'lbr_desc[]' array and make a function staticColin Ian King1-2/+2
A few minor clean-ups: constify the lbr_desc[] array and make local function lbr_from_signext_quirk_rd() static to fix a sparse warning: "symbol 'lbr_from_signext_quirk_rd' was not declared. Should it be static?" Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/20170629091406.9870-1-colin.king@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-29perf/x86/intel/uncore: Fix wrong box pointer checkKan Liang1-1/+1
Should not init a NULL box. It will cause system crash. The issue looks like caused by a typo. This was not noticed because there is no NULL box. Also, for most boxes, they are enabled by default. The init code is not critical. Fixes: fff4b87e594a ("perf/x86/intel/uncore: Make package handling more robust") Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170629190926.2456-1-kan.liang@intel.com
2017-06-22perf/x86/intel: Add 1G DTLB load/store miss support for SKLKan Liang1-2/+2
Current DTLB load/store miss events (0x608/0x649) only counts 4K,2M and 4M page size. Need to extend the events to support any page size (4K/2M/4M/1G). The complete DTLB load/store miss events are: DTLB_LOAD_MISSES.WALK_COMPLETED 0xe08 DTLB_STORE_MISSES.WALK_COMPLETED 0xe49 Signed-off-by: Kan Liang <Kan.liang@intel.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20170619142609.11058-1-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-08x86/ldt: Rename ldt_struct::size to ::nr_entriesBorislav Petkov1-1/+1
... because this is exactly what it is: the number of entries in the LDT. Calling it "size" is simply confusing and it is actually begging to be called "nr_entries" or somesuch, especially if you see constructs like: alloc_size = size * LDT_ENTRY_SIZE; since LDT_ENTRY_SIZE is the size of a single entry. There should be no functionality change resulting from this patch, as the before/after output from tools/testing/selftests/x86/ldt_gdt.c shows. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170606173116.13977-1-bp@alien8.de [ Renamed 'n_entries' to 'nr_entries' ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-05x86/mm: Rework lazy TLB to track the actual loaded mmAndy Lutomirski1-2/+1
Lazy TLB state is currently managed in a rather baroque manner. AFAICT, there are three possible states: - Non-lazy. This means that we're running a user thread or a kernel thread that has called use_mm(). current->mm == current->active_mm == cpu_tlbstate.active_mm and cpu_tlbstate.state == TLBSTATE_OK. - Lazy with user mm. We're running a kernel thread without an mm and we're borrowing an mm_struct. We have current->mm == NULL, current->active_mm == cpu_tlbstate.active_mm, cpu_tlbstate.state != TLBSTATE_OK (i.e. TLBSTATE_LAZY or 0). The current cpu is set in mm_cpumask(current->active_mm). CR3 points to current->active_mm->pgd. The TLB is up to date. - Lazy with init_mm. This happens when we call leave_mm(). We have current->mm == NULL, current->active_mm == cpu_tlbstate.active_mm, but that mm is only relelvant insofar as the scheduler is tracking it for refcounting. cpu_tlbstate.state != TLBSTATE_OK. The current cpu is clear in mm_cpumask(current->active_mm). CR3 points to swapper_pg_dir, i.e. init_mm->pgd. This patch simplifies the situation. Other than perf, x86 stops caring about current->active_mm at all. We have cpu_tlbstate.loaded_mm pointing to the mm that CR3 references. The TLB is always up to date for that mm. leave_mm() just switches us to init_mm. There are no longer any special cases for mm_cpumask, and switch_mm() switches mms without worrying about laziness. After this patch, cpu_tlbstate.state serves only to tell the TLB flush code whether it may switch to init_mm instead of doing a normal flush. This makes fairly extensive changes to xen_exit_mmap(), which used to look a bit like black magic. Perf is unchanged. With or without this change, perf may behave a bit erratically if it tries to read user memory in kernel thread context. We should build on this patch to teach perf to never look at user memory when cpu_tlbstate.loaded_mm != current->mm. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-26perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode()Sebastian Andrzej Siewior1-6/+5
If intel_snb_check_microcode() is invoked via microcode_init -> perf_check_microcode -> intel_snb_check_microcode then get_online_cpus() is invoked nested. This works with the current implementation of get_online_cpus() but prevents converting it to a percpu rwsem. intel_snb_check_microcode() is also invoked from intel_sandybridge_quirk() unprotected. Drop get_online_cpus() from intel_snb_check_microcode() and add it to intel_sandybridge_quirk() so both call sites are protected. Convert *_online_cpus() to the new interfaces while at it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov <bp@suse.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20170524081548.594862191@linutronix.de
2017-05-26x86/perf: Drop EXPORT of perf_check_microcodeThomas Gleixner1-1/+0
The only caller is the microcode update, which cannot be modular. Drop the export. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Borislav Petkov <bp@suse.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20170524081548.515204988@linutronix.de
2017-05-26perf/x86/intel/cqm: Use cpuhp_setup_state_cpuslocked()Sebastian Andrzej Siewior1-8/+8
intel_cqm_init() holds get_online_cpus() while registerring the hotplug callbacks. cpuhp_setup_state() invokes get_online_cpus() as well. This is correct, but prevents the conversion of the hotplug locking to a percpu rwsem. Use cpuhp_setup_state_cpuslocked() to avoid the nested call. Convert *_online_cpus() to the new interfaces while at it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170524081548.075604046@linutronix.de
2017-05-23perf/x86: Add sysfs entry to freeze counters on SMIKan Liang3-0/+76
Currently, the SMIs are visible to all performance counters, because many users want to measure everything including SMIs. But in some cases, the SMI cycles should not be counted - for example, to calculate the cost of an SMI itself. So a knob is needed. When setting FREEZE_WHILE_SMM bit in IA32_DEBUGCTL, all performance counters will be effected. There is no way to do per-counter freeze on SMI. So it should not use the per-event interface (e.g. ioctl or event attribute) to set FREEZE_WHILE_SMM bit. Adds sysfs entry /sys/device/cpu/freeze_on_smi to set FREEZE_WHILE_SMM bit in IA32_DEBUGCTL. When set, freezes perfmon and trace messages while in SMM. Value has to be 0 or 1. It will be applied to all processors. Also serialize the entire setting so we don't get multiple concurrent threads trying to update to different values. Signed-off-by: Kan Liang <Kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: bp@alien8.de Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1494600673-244667-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15x86/tsc: Remodel cyc2ns to use seqcount_latch()Peter Zijlstra1-6/+6
Replace the custom multi-value scheme with the more regular seqcount_latch() scheme. Along with scrapping a lot of lines, the latch scheme is better documented and used in more places. The immediate benefit however is not being limited on the update side. The current code has a limit where the writers block which is hit by future changes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-03Merge tag 'perf-core-for-mingo-4.12-20170503' of ↵Ingo Molnar9-261/+333
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Fixes: - Support setting probes in versioned user space symbols, such as pthread_create@@GLIBC_2.1, picking the default one, more work needed to make it possible to set it on the other versions, as the 'perf probe' syntax already uses @ for other purposes. (Paul Clarke) - Do not special case address zero as an error for routines that return addresses (symbol lookup), instead use the return as the success/error indication and pass a pointer to return the address, fixing 'perf test vmlinux' (the one that compares address between vmlinux and kallsyms) on s/390, where the '_text' address is equal to zero (Arnaldo Carvalho de Melo) Infrastructure changes: - More header sanitization, moving stuff out of util.h into more appropriate headers and objects and sometimes creating new ones (Arnaldo Carvalho de Melo) - Refactor a duplicated code for obtaining config file name (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-03perf/x86: Fix Broadwell-EP DRAM RAPL eventsVince Weaver1-1/+1
It appears as though the Broadwell-EP DRAM units share the special units quirk with Haswell-EP/KNL. Without this patch, you get really high results (a single DRAM using 20W of power). The powercap driver in drivers/powercap/intel_rapl.c already has this change. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-14perf/x86: Fix spurious NMI with PEBS Load Latency eventKan Liang3-2/+3
Spurious NMIs will be observed with the following command: while :; do perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp" -e "cpu/umask=0x03,event=0x0/" -e "cpu/umask=0x02,event=0x0/" -e cycles,branches,cache-misses -e cache-references -- sleep 10 done The bug was introduced by commit: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+") That commit clears the status bits for the counters used for PEBS events, by masking the whole 64 bits pebs_enabled. However, only the low 32 bits of both status and pebs_enabled are reserved for PEBS-able counters. For status bits 32-34 are fixed counter overflow bits. For pebs_enabled bits 32-34 are for PEBS Load Latency. In the test case, the PEBS Load Latency event and fixed counter event could overflow at the same time. The fixed counter overflow bit will be cleared by mistake. Once it is cleared, the fixed counter overflow never be processed, which finally trigger spurious NMI. Correct the PEBS enabled mask by ignoring the non-PEBS bits. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+") Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-14Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar1-0/+3
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-14perf/x86: Avoid exposing wrong/stale data in intel_pmu_lbr_read_32()Peter Zijlstra1-0/+3
When the perf_branch_entry::{in_tx,abort,cycles} fields were added, intel_pmu_lbr_read_32() wasn't updated to initialize them. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> Fixes: 135c5612c460 ("perf/x86/intel: Support Haswell/v4 LBR format") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11perf/amd/uncore: Fix pr_fmt() prefixBorislav Petkov1-2/+5
Make it "perf/amd/uncore: ", i.e., something more specific than "perf: ". Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170410122047.3026-4-bp@alien8.de [ Changed it to perf/amd/uncore/ ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11perf/amd/uncore: Clean up per-family setupBorislav Petkov1-38/+21
Fam16h is the same as the default one, remove it. Turn the switch-case into a simple if-else. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170410122047.3026-3-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11perf/amd/uncore: Do feature check first, before assignmentsBorislav Petkov1-5/+4
... and save some unnecessary work. Remove now unused label while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170410122047.3026-2-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11Merge tag 'v4.11-rc6' into perf/core, to pick up fixesIngo Molnar1-3/+6
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events/amd/iommu: Enable support for multiple IOMMUsSuravee Suthikulpanit1-41/+71
Add support for multiple IOMMUs to perf by exposing an AMD IOMMU PMU for each IOMMU found in the system via: /bus/event_source/devices/amd_iommu_x where x is the IOMMU index. This allows users to specify different events to be programmed into the performance counters of each IOMMU. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [ Improve readability, shorten names. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1490166162-10002-11-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events/amd/iommu: Add IOMMU-specific hw_perf_event structSuravee Suthikulpanit1-63/+50
Current AMD IOMMU perf PMU inappropriately uses the hardware struct inside the union in struct hw_perf_event, extra_reg in particular. Instead, introduce an AMD IOMMU-specific struct with required parameters to be programmed into the IOMMU performance counter control register. Update the pasid field from 16 to 20 bits while at it. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [ Fixup macros, shorten get_next_avail_iommu_bnk_cntr() local vars, massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-10-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events/amd/iommu: Fix sysfs perf attribute groupsSuravee Suthikulpanit1-49/+32
Introduce static amd_iommu_attr_groups to simplify the sysfs attributes initialization code. Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-9-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events, drivers/amd/iommu: Prepare for multiple IOMMUs supportSuravee Suthikulpanit2-23/+24
Currently, amd_iommu_pc_get_set_reg_val() cannot support multiple IOMMUs. Modify it to allow callers to specify an IOMMU. This is in preparation for supporting multiple IOMMUs. Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-8-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events/amd/iommu.c: Modify functions to query max banks and countersSuravee Suthikulpanit2-15/+11
Currently, amd_iommu_pc_get_max_[banks|counters]() use end-point device ID to locate an IOMMU and check the reported max banks/counters. The logic assumes that the IOMMU_BASE_DEVID belongs to the first IOMMU, and uses it to acquire a reference to the first IOMMU, which does not work on certain systems. Instead, modify the function to take an IOMMU index, and use it to query the corresponding AMD IOMMU instance. Currently, hardcode the IOMMU index to 0 since the current AMD IOMMU perf implementation supports only a single IOMMU. A subsequent patch will add support for multiple IOMMUs, and will use a proper IOMMU index. Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-7-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events, drivers/iommu/amd: Introduce amd_iommu_get_num_iommus()Suravee Suthikulpanit1-0/+2
Introduce amd_iommu_get_num_iommus(), which returns the value of amd_iommus_present. The function is used to replace direct access to the variable, which is now declared as static. This function will also be used by AMD IOMMU perf driver. Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-6-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events/amd/iommu: Clean up perf_iommu_read()Suravee Suthikulpanit1-10/+6
Fix coding style and use GENMASK_ULL(). Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-4-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events/amd/iommu: Clean up bitwise operationsSuravee Suthikulpanit1-9/+9
Clean up register initialization and make use of BIT_ULL(x) where appropriate. This should not affect logic and functionality. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-3-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30x86/events/amd/iommu: Declare pr_fmt() formatSuravee Suthikulpanit1-13/+9
Declare pr_fmt() format for perf/amd_iommu and remove unnecessary pr_debug() calls. Also check return value when _init_events_attrs() fails and issue an error message. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jörg Rödel <joro@8bytes.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/1487926102-13073-2-git-send-email-Suravee.Suthikulpanit@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30perf/x86/intel/pt: Allow the disabling of branch tracingAlexander Shishkin2-2/+73
Now that Intel PT supports more types of trace content than just branch tracing, it may be useful to allow the user to disable branch tracing when it is not needed. The special case is BDW, where not setting BranchEn is not supported. This is slightly trickier than necessary, because up to this moment the driver has been setting BranchEn automatically and the userspace assumes as much. Instead of reversing the semantics of BranchEn, we introduce a 'passthrough' bit, which will forego the default and allow the user to set BranchEn to their heart's content. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/20170206144140.14402-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-28Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar1-2/+14
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-23sched/clock, x86/perf: Fix "perf test tsc"Peter Zijlstra1-3/+6
People reported that commit: 5680d8094ffa ("sched/clock: Provide better clock continuity") broke "perf test tsc". That commit added another offset to the reported clock value; so take that into account when computing the provided offset values. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 5680d8094ffa ("sched/clock: Provide better clock continuity") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-17Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-2/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of perf related fixes: - fix a CR4.PCE propagation issue caused by usage of mm instead of active_mm and therefore propagated the wrong value. - perf core fixes, which plug a use-after-free issue and make the event inheritance on fork more robust. - a tooling fix for symbol handling" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf symbols: Fix symbols__fixup_end heuristic for corner cases x86/perf: Clarify why x86_pmu_event_mapped() isn't racy x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm perf/core: Better explain the inherit magic perf/core: Simplify perf_event_free_task() perf/core: Fix event inheritance on fork() perf/core: Fix use-after-free in perf_release()
2017-03-17x86/perf: Clarify why x86_pmu_event_mapped() isn't racyAndy Lutomirski1-0/+12
Naively, it looks racy, but ->mmap_sem saves it. Add a comment and a lockdep assertion. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/03a1e629063899168dfc4707f3bb6e581e21f5c6.1489694270.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-17x86/perf: Fix CR4.PCE propagation to use active_mm instead of mmAndy Lutomirski1-2/+2
If one thread mmaps a perf event while another thread in the same mm is in some context where active_mm != mm (which can happen in the scheduler, for example), refresh_pce() would write the wrong value to CR4.PCE. This broke some PAPI tests. Reported-and-tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 7911d3f7af14 ("perf/x86: Only allow rdpmc if a perf_event is mapped") Link: http://lkml.kernel.org/r/0c5b38a76ea50e405f9abe07a13dfaef87c173a1.1489694270.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-16perf/x86/intel/pt: Handle VMX betterAlexander Shishkin1-17/+21
Since commit: 1c5ac21a0e ("perf/x86/intel/pt: Don't die on VMXON") ... PT events depend on re-scheduling to get enabled after a VMX session has taken place. This is, in particular, a problem for CPU context events, which don't normally get re-scheduled, unless there is a reason. This patch changes the VMX handling so that PT event gets re-enabled when VMX root mode exits. Also, notify the user when there's a gap in PT data due to VMX root mode by flagging AUX records as partial. In combination with vmm_exclusive=0 parameter of the kvm_intel driver, this will result in trace gaps only for the duration of the guest's timeslices. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/20170220133352.17995-5-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-16perf/core: Keep AUX flags in the output handleWill Deacon3-18/+16
In preparation for adding more flags to perf AUX records, introduce a separate API for setting the flags for a session, rather than appending more bool arguments to perf_aux_output_end. This allows to set each flag at the time a corresponding condition is detected, instead of tracking it in each driver's private state. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/20170220133352.17995-3-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-16perf/x86: Add Top Down events to Intel GoldmontKan Liang1-0/+22
Goldmont supports full Top Down level 1 metrics (FrontendBound, Retiring, Backend Bound and Bad Speculation). It has 3 wide pipeline. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1486711438-80058-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-07Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds4-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes and minor updates all over the place: - an SGI/UV fix - a defconfig update - a build warning fix - move the boot_params file to the arch location in debugfs - a pkeys fix - selftests fix - boot message fixes - sparse fixes - a resume warning fix - ioapic hotplug fixes - reboot quirks ... plus various minor cleanups" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build/x86_64_defconfig: Enable CONFIG_R8169 x86/reboot/quirks: Add ASUS EeeBook X205TA/W reboot quirk x86/hpet: Prevent might sleep splat on resume x86/boot: Correct setup_header.start_sys name x86/purgatory: Fix sparse warning, symbol not declared x86/purgatory: Make functions and variables static x86/events: Remove last remnants of old filenames x86/pkeys: Check against max pkey to avoid overflows x86/ioapic: Split IOAPIC hot-removal into two steps x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC x86/intel_rdt: Remove duplicate inclusion of linux/cpu.h x86/vmware: Remove duplicate inclusion of asm/timer.h x86/hyperv: Hide unused label x86/reboot/quirks: Add ASUS EeeBook X205TA reboot quirk x86/platform/uv/BAU: Fix HUB errors by remove initial write to sw-ack register x86/selftests: Add clobbers for int80 on x86_64 x86/apic: Simplify enable_IR_x2apic(), remove try_to_enable_IR() x86/apic: Fix a warning message in logical CPU IDs allocation x86/kdebugfs: Move boot params hierarchy under (debugfs)/x86/
2017-03-02sched/headers: Prepare to remove the <linux/mm_types.h> dependency from ↵Ingo Molnar1-1/+1
<linux/sched.h> Update code that relied on sched.h including various MM types for them. This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar2-0/+2
<linux/sched/clock.h> We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up from other headers and .c files. Create a trivial placeholder <linux/sched/clock.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>