path: root/arch
2015-08-04  perf/x86/intel/pebs: Robustify PEBS buffer drain  (Peter Zijlstra; 1 file, -17/+17)
Vince Weaver and Stephane Eranian reported warnings in the PEBS code when running the perf fuzzer. Stephane wrote:

  > I can reproduce the problem on my HSW running the fuzzer.
  >
  > I can see why this could be happening if you are mixing PEBS and non PEBS
  > events in the bottom 4 counters. I suspect:
  >
  >        for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
  >                if ((counts[bit] == 0) && (error[bit] == 0))
  >                        continue;
  >
  > This test is not correct when you have non-PEBS events mixed with
  > PEBS events and they overflow at the same time. They will have
  > counts[i] != 0 but error[i] == 0, and thus you fall thru the loop
  > and hit the assert. Or it is something along those lines.

The only way I can make this work is if ->status only has !PEBS events set, because if it has both set we'll take that slow path which masks out the !PEBS bits. After masking there are three options:

  - there is one bit set, and it's @bit: we increment counts[bit];
  - there are multiple bits set: we increment error[] for each set bit, and do not increment counts[];
  - there are no bits set: we do nothing.

The intent was to never increment counts[] for !PEBS events. Now if we start out with only a single !PEBS event set, we'll pass the test and increment counts[] for a !PEBS event and hit the warn.

Reported-by: Vince Weaver <[email protected]> Reported-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
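To make the three cases concrete, here is a small self-contained sketch of the intended accounting. The names counts[], error[] and the 4-counter limit follow the snippet quoted above; the function as a whole is an illustration, not the kernel's drain loop.

    #include <stdint.h>

    #define MAX_PEBS_EVENTS 4  /* bottom 4 counters, as in the quote above */

    /* After masking ->status down to PEBS-capable bits: exactly one set bit
     * credits counts[], several set bits credit error[], zero bits do
     * nothing. counts[] therefore never sees a !PEBS event. */
    static void account_pebs_record(uint64_t status, uint64_t pebs_enabled,
                                    int counts[], int error[])
    {
        uint64_t pebs_status = status & pebs_enabled; /* drop !PEBS bits */
        int bit, nset = 0;

        for (bit = 0; bit < MAX_PEBS_EVENTS; bit++)
            nset += (int)((pebs_status >> bit) & 1);

        for (bit = 0; bit < MAX_PEBS_EVENTS; bit++) {
            if (!((pebs_status >> bit) & 1))
                continue;
            if (nset == 1)
                counts[bit]++;  /* unambiguous: credit the event */
            else
                error[bit]++;   /* ambiguous record: flag it, don't credit */
        }
    }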
2015-08-04  perf/x86/intel/pebs: Fix event disable PEBS buffer drain  (Liang, Kan; 1 file, -6/+7)
When disabling a PEBS event, we need to drain the buffer. Doing so requires a correct cpuc->pebs_active mask. The current code clears the pebs_active bit before draining the buffer. Fix that. Signed-off-by: "Liang, Kan" <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/37D7C6CF3E00A74B8858931C1DB2F07701885A65@SHSMSX103.ccr.corp.intel.com [ Fixed the SOB. ] Signed-off-by: Ingo Molnar <[email protected]>
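A minimal sketch of the corrected ordering, with stand-in types and a stubbed drain function; the only point illustrated is that the pebs_active bit is cleared after the drain, not before.

    #include <stdint.h>

    struct cpu_ctx { uint32_t pebs_active; };  /* stand-in for cpuc */

    static void drain_pebs_buffer(struct cpu_ctx *cpuc) { (void)cpuc; /* stub */ }

    static void pebs_disable_event(struct cpu_ctx *cpuc, int idx)
    {
        if (cpuc->pebs_active & (1u << idx))
            drain_pebs_buffer(cpuc);          /* drain with the mask intact */

        cpuc->pebs_active &= ~(1u << idx);    /* only now drop the bit */
    }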
2015-08-04  perf/x86: Add an MSR PMU driver  (Andy Lutomirski; 2 files, -0/+244)
This patch adds an MSR PMU to support free-running MSR counters: time- and frequency-related counters such as TSC, IA32_APERF, IA32_MPERF and IA32_PPERF, but also SMI_COUNT. The events are exposed in sysfs for use by perf stat and other tools; the files are under /sys/devices/msr/events/. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Kan Liang <[email protected]> [ s/freq/msr/, added SMI_COUNT, fixed bugs. ] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/uncore: Add Broadwell-DE uncore support  (Kan Liang; 3 files, -0/+172)
The uncore subsystem for Broadwell-DE is similar to Haswell-EP. There are some differences in PCI device IDs, box numbers and constraints. Please refer to the public document: http://www.intel.com/content/www/us/en/processors/xeon/xeon-d-1500-uncore-performance-monitoring.html Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel: Use 0x11 as extra reg test value  (Andi Kleen; 1 file, -1/+1)
The next patch adds a new perf extra register where 0x1ff is not a valid value. Use 0x11 instead. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86: Make merge_attr() global to use from perf_event_intel  (Andi Kleen; 2 files, -1/+4)
merge_attr() allows merging two sysfs attribute tables. Export it so it can be used by other files too. The next patch is going to use it to extend the sysfs format attributes for a CPU. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode  (Andi Kleen; 1 file, -3/+7)
In callstack mode the LBR is not a ring buffer, but a stack that grows up and down. This means in this case we don't need to access all LBRs, only the ones up to TOS. Do this optimization for the normal LBR read, and the context switch save/restore code. For save/restore it can be done unconditionally, as it only runs when call stack mode is active. This recovers some of the cost of going to 32 LBRs on Skylake. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
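A rough sketch of the read-side optimization, assuming the TOS register indexes the most recent valid entry, so callstack mode only needs entries 0 through TOS; the rdmsr callback and consecutive MSR numbering are simplifications of the driver's rdmsrl() loop, not its actual interface.

    #include <stdint.h>

    struct lbr_entry { uint64_t from, to; };

    /* Returns the number of entries read. In ring-buffer mode all nr_lbrs
     * entries may hold data; in callstack (stack) mode only 0..TOS do. */
    static int read_lbr_stack(struct lbr_entry *dst, int nr_lbrs, int callstack,
                              uint64_t (*rdmsr)(uint32_t msr),
                              uint32_t msr_from0, uint32_t msr_to0,
                              uint32_t msr_tos)
    {
        int i, num = callstack ? (int)rdmsr(msr_tos) + 1 : nr_lbrs;

        for (i = 0; i < num; i++) {
            dst[i].from = rdmsr(msr_from0 + (uint32_t)i);
            dst[i].to   = rdmsr(msr_to0 + (uint32_t)i);
        }
        return num;  /* stack mode: fewer MSR reads, same information */
    }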
2015-08-04  perf/x86/intel/lbr: Use correct index to save/restore LBR_INFO with call stack  (Andi Kleen; 1 file, -2/+2)
Use the correct index to save/restore the LBR_INFO_x MSR in callstack mode. This is more of a cleanup, as even with the wrong index the register was correctly saved/restored, and LBR callgraph mode in perf tools does not really need anything in LBR_INFO anyway. But it is still better to use the right index. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel: Add Intel Skylake PMU support  (Andi Kleen; 4 files, -1/+279)
Add perf core PMU support for future Intel Skylake CPU cores. The code is based on Haswell/Broadwell. There is a new cache event list, based on the updated Haswell event list. Skylake has removed most counter constraints on basic events, so the basic constraints table now only has a single entry (plus the fixed counters). TSX support and various other setups are all shared with Haswell. Skylake has 32 LBR entries. Add a new LBR init function to set this up. The filters are all the same as Haswell. It also has a new LBR format with a separate LBR_INFO_* MSR, but that has been already added earlier. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/lbr: Optimize v4 LBR unfreezing  (Andi Kleen; 1 file, -0/+7)
In Arch perfmon v4 the GLOBAL_STATUS reset automatically unfreezes LBRs. So no need to do it manually in the LBR code. Add a check to skip it. v2: Move test up to beginning of function. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel: Move PMU ACK to after LBR read  (Andi Kleen; 1 file, -1/+1)
With Arch Perfmon v4 the PMU ack unfreezes the LBRs. So we need to do the PMU ack after the LBR reading, otherwise the LBRs would be polluted by the PMI handler. This is a minimal change. In principle the ACK could be moved much later. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel: Handle new arch perfmon v4 status bits  (Andi Kleen; 1 file, -6/+7)
ArchPerfmon v4 has some new status bits in GLOBAL_STATUS. These need to be ignored when deciding whether an NMI was a PMU NMI, to avoid eating all NMIs when they stay set; see: b292d7a10487 ("perf/x86/intel: ignore CondChgd bit to avoid false NMI handling") This patch ignores the new ASIF bit, which indicates that SGX interfered with the PMU, and also the new LBR freezing bits, which are set when the LBRs get frozen, plus the existing CondChgd bit (set by JTAG debuggers and some buggy BIOSes). Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
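The test this implies, sketched as a standalone predicate. The bit positions (CondChgd 63, ASIF 60, the freeze bits 59/58) follow my reading of the SDM's GLOBAL_STATUS layout and should be treated as this sketch's assumptions, not the driver's actual defines.

    #include <stdbool.h>
    #include <stdint.h>

    #define GLOBAL_STATUS_COND_CHGD    (1ULL << 63)  /* JTAG/BIOS residue */
    #define GLOBAL_STATUS_ASIF         (1ULL << 60)  /* SGX interference */
    #define GLOBAL_STATUS_CTRS_FROZEN  (1ULL << 59)
    #define GLOBAL_STATUS_LBRS_FROZEN  (1ULL << 58)

    /* Claim the NMI only if a real overflow bit survives the masking;
     * otherwise the sticky informational bits would make us eat every NMI. */
    static bool status_claims_nmi(uint64_t global_status)
    {
        uint64_t ignore = GLOBAL_STATUS_COND_CHGD | GLOBAL_STATUS_ASIF |
                          GLOBAL_STATUS_CTRS_FROZEN | GLOBAL_STATUS_LBRS_FROZEN;

        return (global_status & ~ignore) != 0;
    }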
2015-08-04  perf/x86/intel/lbr: Add support for LBRv5  (Andi Kleen; 2 files, -1/+21)
Add support for the new LBRv5 format used on Intel Skylake CPUs. The flags for mispredict, abort, in_tx etc. moved to a range of separate LBR_INFO_* MSRs. Teach the LBR code to read those. The original LBR registers stay the same, except that they now have full sign extension. LBR_INFO also reports a cycle count to the last branch; report the cycle information using the new "cycles" branch_info output field. In addition, we have to context-switch and clear the new INFO MSRs to avoid any information leaks. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
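A rough decode of one LBR_INFO_x value as described above. The field positions (mispredict bit 63, in_tx bit 62, abort bit 61, cycle count in bits 15:0) match the SDM's layout to the best of my knowledge, but treat them as assumptions of this sketch rather than the driver's definitions.

    #include <stdint.h>

    struct lbr_info_fields {
        unsigned mispred : 1;
        unsigned in_tx   : 1;
        unsigned abort   : 1;
        uint16_t cycles;        /* cycles since the last branch */
    };

    static struct lbr_info_fields decode_lbr_info(uint64_t info)
    {
        struct lbr_info_fields d;

        d.mispred = (unsigned)((info >> 63) & 1);
        d.in_tx   = (unsigned)((info >> 62) & 1);
        d.abort   = (unsigned)((info >> 61) & 1);
        d.cycles  = (uint16_t)(info & 0xffff);
        return d;
    }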
2015-08-04  x86: Add new MSRs and MSR bits used for Intel Skylake PMU support  (Andi Kleen; 2 files, -0/+13)
Add new MSRs (LBR_INFO) and some new MSR bits used by the Intel Skylake PMU driver. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/lbr: Allow time stamp for free running PEBSv3  (Andi Kleen; 3 files, -1/+16)
With PEBSv3 the PEBS record contains a time stamp. That means we can allow free-running PEBS without a PMI even if the user program requested a time stamp. This avoids the need to use -T to get free running PEBS, and also avoids any problems with mis-identifying MMAPs later. Move the free_running_flags state into a variable in x86_pmu and use it. This only works when no explicit clock_id is set. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel: Add support for PEBSv3 profiling  (Andi Kleen; 1 file, -3/+33)
PEBSv3 is the same as the existing PEBSv2 used on Haswell, but it adds a new TSC field. Add support to the generic PEBS handler to handle the new format, and overwrite the perf time stamp using the new native_sched_clock_from_tsc(). Right now the time stamp is just slightly more accurate, as it is nearer the actual event trigger point. With the PEBS threshold > 1 patchkit it will be much more accurate, avoiding the problems with MMAP mismatches earlier. The accurate time stamping is only implemented for the default trace clock for now. v2: Use _skl prefix. Check for default clock_id. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86: Add a native_perf_sched_clock_from_tsc()  (Andi Kleen; 2 files, -0/+9)
PEBSv3 has a raw TSC time stamp in its memory buffer that later needs to be converted to perf_clock. Add a native_sched_clock_from_tsc() that works the same as native_sched_clock(), but starts with an already given TSC value. Paravirt is ignored; it will just get the native clock. But there isn't a paravirtualized PEBS anyway. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
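A minimal model of the cycles-to-nanoseconds step such a helper performs. The real kernel reuses its per-CPU cyc2ns mult/shift plus an offset; the conversion below is the same fixed-point idea with purely illustrative constants.

    #include <stdint.h>

    /* ns = cycles * mult >> shift (+ offset); the 128-bit intermediate
     * avoids overflowing cycles * mult (GCC/Clang extension). */
    static uint64_t sched_clock_from_tsc(uint64_t tsc, uint32_t mult,
                                         uint32_t shift, uint64_t offset)
    {
        return (uint64_t)(((unsigned __int128)tsc * mult) >> shift) + offset;
    }

    /* e.g. for a hypothetical 2.0 GHz TSC: one cycle is 0.5 ns, so
     * mult = 1u << 31 with shift = 32 gives ns == tsc / 2. */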
2015-08-04  perf/x86/intel/pt: Add new timing packet enables  (Alexander Shishkin; 3 files, -1/+81)
The Intel PT chapter in the new Intel Architecture SDM adds several packets, with corresponding enable bits and registers that control packet generation. Also, additional bits in the Intel PT CPUID leaf were added to enumerate the presence and parameters of these new packets and features. The packets and enables are:

  * CYC: cycle-accurate mode, provides the number of cycles elapsed since the previous CYC packet; its presence and available threshold values are enumerated via CPUID;
  * MTC: mini time counter packets, used for tracking TSC time between full TSC packets; its presence and available resolution options are enumerated via CPUID;
  * PSB: the PSB packet period is now configurable; available period values are enumerated via CPUID.

This patch adds the corresponding bit and register definitions, PMU driver capabilities based on CPUID enumeration, and new attribute format bits for the new features, and extends the event configuration validation function to take these into account. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1438262131-12725-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/pt: Do not force sync packets on every schedule-in  (Alexander Shishkin; 1 file, -2/+5)
Currently, the PT driver zeroes out the status register every time before starting the event. However, all the writable bits are already taken care of in the pt_handle_status() function, except the new PacketByteCnt field, which in new versions of PT contains the number of packet bytes written since the last sync (PSB) packet. Zeroing it out before enabling PT forces a sync packet to be written. This means that, with the existing code, a sync packet (PSB and PSBEND, 18 bytes in total) will be generated every time a PT event is scheduled in. To avoid these unnecessary syncs and save a WRMSR in the fast path, this patch changes the default behavior to not clear the PacketByteCnt field, so that sync packets will be generated with the period specified by the "psb_period" attribute config field. This has little impact on the trace data, as the other packets that are normally sent within PSB+ (between PSB and PSBEND) have their own generation scenarios which do not depend on the sync packets. One exception is when tracing starts: there we do need to force a PSB like this, so that the decoder has a clear sync point in the trace. For this purpose we already have the hw::itrace_started flag, which we are currently using to output PERF_RECORD_ITRACE_START. This patch moves the setting of itrace_started from the perf core to pmu::start, where it should still be 0 on the very first run. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1438264104-16189-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/hw_breakpoints: Fix check for kernel-space breakpoints  (Andy Lutomirski; 1 file, -1/+5)
The check looked wrong, although I think it was actually safe. TASK_SIZE is unnecessarily small for compat tasks, and it wasn't possible to make a range breakpoint so large it started in user space and ended in kernel space. Nonetheless, let's fix up the check for the benefit of future readers. A breakpoint is in the kernel if either end is in the kernel. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/136be387950e78f18cea60e9d1bef74465d0ee8f.1438312874.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
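The stated rule as a standalone predicate, a sketch only: TASK_SIZE_MAX is given its usual x86-64 value here purely for illustration, and the wrap check becomes redundant once the address has been validated as aligned to len.

    #include <stdbool.h>

    #define TASK_SIZE_MAX 0x00007ffffffff000UL  /* illustrative x86-64 value */

    /* Kernel breakpoint iff either end of [addr, addr + len - 1] lies in
     * kernel space (or the range wraps around the address space). */
    static bool bp_in_kernelspace(unsigned long addr, unsigned long len)
    {
        unsigned long end = addr + len - 1;

        return addr >= TASK_SIZE_MAX || end >= TASK_SIZE_MAX || end < addr;
    }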
2015-08-04  perf/x86/hw_breakpoints: Improve range breakpoint validation  (Andy Lutomirski; 1 file, -0/+10)
Range breakpoints will do the wrong thing if the address isn't aligned. While we're there, add comments about why it's safe for instruction breakpoints. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/ae25d14d61f2f43b78e0a247e469f3072df7e201.1438312874.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
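A sketch of the alignment rule being enforced, under the assumption that the debug hardware masks the low address bits of a range breakpoint, so an unaligned address would silently watch the wrong bytes; the 1/2/4/8 sizes mirror the HW_BREAKPOINT_LEN_* values.

    #include <stdbool.h>

    static bool bp_range_is_valid(unsigned long addr, unsigned long len)
    {
        if (len != 1 && len != 2 && len != 4 && len != 8)
            return false;                /* unsupported range size */

        return (addr & (len - 1)) == 0;  /* must be naturally aligned */
    }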
2015-08-04  perf/x86/hw_breakpoints: Disallow kernel breakpoints unless kprobe-safe  (Andy Lutomirski; 1 file, -0/+15)
Code on the kprobe blacklist doesn't want unexpected int3 exceptions. It probably doesn't want unexpected debug exceptions either. Be safe: disallow breakpoints in nokprobes code. On non-CONFIG_KPROBES kernels, there is no kprobe blacklist. In that case, disallow kernel breakpoints entirely. It will be particularly important to keep hw breakpoints out of the entry and NMI code once we move debug exceptions off the IST stack. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/e14b152af99640448d895e3c2a8c2d5ee19a1325.1438312874.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
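The shape of that policy, sketched. within_kprobe_blacklist() is the kernel helper this assumes (declared here as an extern stand-in), and the #else arm is the "no blacklist, no kernel breakpoints" fallback described above.

    #include <errno.h>
    #include <stdbool.h>

    /* Assumed stand-in for the kernel's blacklist query. */
    extern bool within_kprobe_blacklist(unsigned long addr);

    static int validate_kernel_bp_addr(unsigned long addr)
    {
    #ifdef CONFIG_KPROBES
        if (within_kprobe_blacklist(addr))
            return -EINVAL;  /* nokprobes code must not take debug traps */
        return 0;
    #else
        (void)addr;
        return -EINVAL;      /* no blacklist: disallow kernel breakpoints */
    #endif
    }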
2015-08-04  perf/x86/intel: Fix SLM MSR_OFFCORE_RSP1 valid_mask  (Kan Liang; 1 file, -6/+10)
AVG_LATENCY (bit 38) is only available on MSR_OFFCORE_RSP0, so the bit should be removed from the RSP1 valid_mask. Since RSP0 and RSP1 may have different valid_masks, intel_alt_er() should validate the config on the alternate offcore reg before replacing it. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
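A sketch of the fixed swap logic: before steering an event to the alternate OFFCORE_RSP register, check its config against that register's own valid_mask. The masks below are illustrative placeholders; only the relationship (AVG_LATENCY, bit 38, valid on RSP0 but not RSP1) comes from the commit.

    #include <stdint.h>

    #define AVG_LATENCY_BIT      (1ULL << 38)
    #define RSP0_VALID_MASK_DEMO ((1ULL << 39) - 1)  /* placeholder mask */
    #define RSP1_VALID_MASK_DEMO (RSP0_VALID_MASK_DEMO & ~AVG_LATENCY_BIT)

    /* Return the reg index to use: swap to the alternate only if the
     * config is representable there, else stay on the current reg. */
    static int alt_offcore_reg(int cur_idx, uint64_t config)
    {
        int alt = (cur_idx == 0) ? 1 : 0;
        uint64_t alt_mask = alt ? RSP1_VALID_MASK_DEMO : RSP0_VALID_MASK_DEMO;

        return (config & ~alt_mask) ? cur_idx : alt;
    }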
2015-08-04  perf/x86/intel/lbr: Kill off intel_pmu_needs_lbr_smpl for good  (Alexander Shishkin; 1 file, -14/+0)
The x86_lbr_exclusive commit (4807034248be "perf/x86: Mark Intel PT and LBR/BTS as mutually exclusive") mistakenly moved intel_pmu_needs_lbr_smpl() to perf_event.h, while another commit (a46a2300019 "perf: Simplify the branch stack check") removed it in favor of needs_branch_stack(). This patch gets rid of intel_pmu_needs_lbr_smpl() for good. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1435140349-32588-3-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/bts: Drop redundant declarations  (Alexander Shishkin; 1 file, -3/+0)
Both intel_pmu_enable_bts() and intel_pmu_disable_bts() are declared in the perf_event.h header file; there is no need to declare them again in the driver. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1435140349-32588-2-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/uncore: Use Sandy Bridge client PMU on Haswell/Broadwell  (Andi Kleen; 1 file, -0/+5)
Haswell and Broadwell have the same uncore CBOX/ARB PMU as Sandy Bridge. Add the respective model numbers to enable the SNB uncore PMU. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/uncore: Add support for ARB uncore PMU on Sandy/IvyBridge  (Andi Kleen; 1 file, -2/+21)
Add a new "ARB" uncore PMU that is used to monitor the uncore queue arbiter. This is useful to measure uncore queue occupancy and similar statistics. The registers all have the same format as the existing CBOX PMU. Also move the event constraints from the CBOX to ARB. The 0x80+ events are ARB events and cannot be scheduled on a CBOX PMU. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/uncore: Remove use of macro DEFINE_PCI_DEVICE_TABLE()  (Vaishali Thakkar; 1 file, -1/+1)
The DEFINE_PCI_DEVICE_TABLE() macro is deprecated. Use 'struct pci_device_id' instead of DEFINE_PCI_DEVICE_TABLE(), with the goal of getting rid of this macro completely. This Coccinelle semantic patch performs this transformation:

  @@
  identifier a;
  declarer name DEFINE_PCI_DEVICE_TABLE;
  initializer i;
  @@
  - DEFINE_PCI_DEVICE_TABLE(a)
  + const struct pci_device_id a[] = i;

Signed-off-by: Vaishali Thakkar <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/20150717052759.GA6265@vaishali-Ideapad-Z570 Signed-off-by: Ingo Molnar <[email protected]>
2015-08-04  perf/x86/intel/rapl: Add support for Knights Landing (KNL)  (Dasaratharaman Chandramouli; 1 file, -0/+20)
Knights Landing supports the PKG and DRAM RAPL domains. Its DRAM RAPL has a different fixed energy unit (2^-16 J), similar to that of HSW. Signed-off-by: Dasaratharaman Chandramouli <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Stephane Eranian <[email protected]> Acked-by: Jacob Pan <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jacob Pan Jun <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nikhil Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/aa63b4a3af3160152fea1a10c807f4200527280c.1432665809.git.dasaratharaman.chandramouli@intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-07-31  uprobes/x86: Make arch_uretprobe_is_alive(RP_CHECK_CALL) more clever  (Oleg Nesterov; 1 file, -1/+4)
The previous change documents that cleanup_return_instances() can't always detect dead frames, since the stack can grow. But there is one special case which imho is worth fixing: arch_uretprobe_is_alive() can return true when the stack didn't actually grow, but the next "call" insn uses the already invalidated frame. Test-case:

    #include <stdio.h>
    #include <setjmp.h>

    jmp_buf jmp;
    int nr = 1024;

    void func_2(void)
    {
        if (--nr == 0)
            return;
        longjmp(jmp, 1);
    }

    void func_1(void)
    {
        setjmp(jmp);
        func_2();
    }

    int main(void)
    {
        func_1();
        return 0;
    }

If you ret-probe func_1() and func_2(), prepare_uretprobe() hits the MAX_URETPROBE_DEPTH limit and the "return" from func_2() is not reported. When we know that the new call is not chained, we can do a stricter check. In this case "sp" points to the new ret-addr, so every frame which uses the same "sp" must be dead. The only complication is that arch_uretprobe_is_alive() needs to know whether it was chained or not, so we add the new RP_CHECK_CHAIN_CALL enum value and change prepare_uretprobe() to pass RP_CHECK_CALL only if !chained. Note: arch_uretprobe_is_alive() could also re-read *sp and check if this word is still trampoline_vaddr. This could obviously improve the logic, but I would like to avoid another copy_from_user(), especially in the case when we can't avoid the false "alive == T" positives. Tested-by: Pratyush Anand <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Srikar Dronamraju <[email protected]> Acked-by: Anton Arapov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-31  uprobes: Add the "enum rp_check ctx" arg to arch_uretprobe_is_alive()  (Oleg Nesterov; 1 file, -1/+2)
arch/x86 doesn't care (so far), but as Pratyush Anand pointed out, other architectures might want to know why arch_uretprobe_is_alive() was called and use different checks depending on the context. Add the new argument to distinguish the two callers. Tested-by: Pratyush Anand <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Srikar Dronamraju <[email protected]> Acked-by: Anton Arapov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-31  uprobes/x86: Reimplement arch_uretprobe_is_alive()  (Oleg Nesterov; 1 file, -0/+5)
Add the x86 specific version of arch_uretprobe_is_alive() helper. It returns true if the stack frame mangled by prepare_uretprobe() is still on stack. So if it returns false, we know that the probed function has already returned. We add the new return_instance->stack member and change the generic code to initialize it in prepare_uretprobe, but it should be equally useful for other architectures. TODO: this assumes that the probed application can't use multiple stacks (say sigaltstack). We will try to improve this logic later. Tested-by: Pratyush Anand <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Srikar Dronamraju <[email protected]> Acked-by: Anton Arapov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
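A minimal model of the x86 check this describes: prepare_uretprobe() records the user stack pointer when it hijacks the return address, and the frame is considered live only while the current stack pointer has not popped past it (x86 stacks grow down). The struct below is a stand-in for the new return_instance member.

    #include <stdbool.h>

    struct return_instance_sketch {
        unsigned long stack;  /* sp recorded when the frame was mangled */
    };

    static bool uretprobe_frame_alive(const struct return_instance_sketch *ri,
                                      unsigned long current_sp)
    {
        /* still at or below the saved sp: the frame has not been popped */
        return current_sp <= ri->stack;
    }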
2015-07-26  Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 1 file, -0/+8)
Pull perf fix from Thomas Gleixner: "A single fix for the intel cqm perf facility to prevent IPIs from interrupt context"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/cqm: Return cached counter value from IRQ context
2015-07-26  Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 6 files, -44/+32)
Pull x86 fixes from Thomas Gleixner: "This update contains:

  - the manual revert of the SYSCALL32 changes which caused a regression
  - a fix for the MPX vma handling
  - three fixes for the ioremap 'is ram' checks
  - PAT warning fixes
  - a trivial fix for the size calculation of TLB tracepoints
  - handle old EFI structures gracefully

This also contains a PAT fix from Jan plus a revert thereof. Toshi explained why the code is correct."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/pat: Revert 'Adjust default caching mode translation tables'
  x86/asm/entry/32: Revert 'Do not use R9 in SYSCALL32' commit
  x86/mm: Fix newly introduced printk format warnings
  mm: Fix bugs in region_is_ram()
  x86/mm: Remove region_is_ram() call from ioremap
  x86/mm: Move warning from __ioremap_check_ram() to the call site
  x86/mm/pat, drivers/media/ivtv: Move the PAT warning and replace WARN() with pr_warn()
  x86/mm/pat, drivers/infiniband/ipath: Replace WARN() with pr_warn()
  x86/mm/pat: Adjust default caching mode translation tables
  x86/fpu: Disable dependent CPU features on "noxsave"
  x86/mpx: Do not set ->vm_ops on MPX VMAs
  x86/mm: Add parenthesis for TLB tracepoint size calculation
  efi: Handle memory error structures produced based on old versions of standard
2015-07-26  x86/mm/pat: Revert 'Adjust default caching mode translation tables'  (Thomas Gleixner; 1 file, -3/+3)
Toshi explains: "No, the default values need to be set to the fallback types, i.e. minimal supported mode. For WC and WT, UC is the fallback type. When PAT is disabled, pat_init() does update the tables below to enable WT per the default BIOS setup. However, when PAT is enabled but the CPU has the PAT errata, WT falls back to UC per the default values." Revert: ca1fec58bc6a 'x86/mm/pat: Adjust default caching mode translation tables' Requested-by: Toshi Kani <[email protected]> Cc: Jan Beulich <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-07-26  perf/x86/intel/cqm: Return cached counter value from IRQ context  (Matt Fleming; 1 file, -0/+8)
Peter reported the following potential crash, which I was able to reproduce with his test program:

  [ 148.765788] ------------[ cut here ]------------
  [ 148.765796] WARNING: CPU: 34 PID: 2840 at kernel/smp.c:417 smp_call_function_many+0xb6/0x260()
  [ 148.765797] Modules linked in:
  [ 148.765800] CPU: 34 PID: 2840 Comm: perf Not tainted 4.2.0-rc1+ #4
  [ 148.765803] ffffffff81cdc398 ffff88085f105950 ffffffff818bdfd5 0000000000000007
  [ 148.765805] 0000000000000000 ffff88085f105990 ffffffff810e413a 0000000000000000
  [ 148.765807] ffffffff82301080 0000000000000022 ffffffff8107f640 ffffffff8107f640
  [ 148.765809] Call Trace:
  [ 148.765810] <NMI> [<ffffffff818bdfd5>] dump_stack+0x45/0x57
  [ 148.765818] [<ffffffff810e413a>] warn_slowpath_common+0x8a/0xc0
  [ 148.765822] [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
  [ 148.765824] [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
  [ 148.765825] [<ffffffff810e422a>] warn_slowpath_null+0x1a/0x20
  [ 148.765827] [<ffffffff811613f6>] smp_call_function_many+0xb6/0x260
  [ 148.765829] [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
  [ 148.765831] [<ffffffff81161748>] on_each_cpu_mask+0x28/0x60
  [ 148.765832] [<ffffffff8107f6ef>] intel_cqm_event_count+0x7f/0xe0
  [ 148.765836] [<ffffffff811cdd35>] perf_output_read+0x2a5/0x400
  [ 148.765839] [<ffffffff811d2e5a>] perf_output_sample+0x31a/0x590
  [ 148.765840] [<ffffffff811d333d>] ? perf_prepare_sample+0x26d/0x380
  [ 148.765841] [<ffffffff811d3497>] perf_event_output+0x47/0x60
  [ 148.765843] [<ffffffff811d36c5>] __perf_event_overflow+0x215/0x240
  [ 148.765844] [<ffffffff811d4124>] perf_event_overflow+0x14/0x20
  [ 148.765847] [<ffffffff8107e7f4>] intel_pmu_handle_irq+0x1d4/0x440
  [ 148.765849] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
  [ 148.765853] [<ffffffff81219bad>] ? vunmap_page_range+0x19d/0x2f0
  [ 148.765854] [<ffffffff81219d11>] ? unmap_kernel_range_noflush+0x11/0x20
  [ 148.765859] [<ffffffff814ce6fe>] ? ghes_copy_tofrom_phys+0x11e/0x2a0
  [ 148.765863] [<ffffffff8109e5db>] ? native_apic_msr_write+0x2b/0x30
  [ 148.765865] [<ffffffff8109e44d>] ? x2apic_send_IPI_self+0x1d/0x20
  [ 148.765869] [<ffffffff81065135>] ? arch_irq_work_raise+0x35/0x40
  [ 148.765872] [<ffffffff811c8d86>] ? irq_work_queue+0x66/0x80
  [ 148.765875] [<ffffffff81075306>] perf_event_nmi_handler+0x26/0x40
  [ 148.765877] [<ffffffff81063ed9>] nmi_handle+0x79/0x100
  [ 148.765879] [<ffffffff81064422>] default_do_nmi+0x42/0x100
  [ 148.765880] [<ffffffff81064563>] do_nmi+0x83/0xb0
  [ 148.765884] [<ffffffff818c7c0f>] end_repeat_nmi+0x1e/0x2e
  [ 148.765886] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
  [ 148.765888] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
  [ 148.765890] [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
  [ 148.765891] <<EOE>> [<ffffffff8110ab66>] finish_task_switch+0x156/0x210
  [ 148.765898] [<ffffffff818c1671>] __schedule+0x341/0x920
  [ 148.765899] [<ffffffff818c1c87>] schedule+0x37/0x80
  [ 148.765903] [<ffffffff810ae1af>] ? do_page_fault+0x2f/0x80
  [ 148.765905] [<ffffffff818c1f4a>] schedule_user+0x1a/0x50
  [ 148.765907] [<ffffffff818c666c>] retint_careful+0x14/0x32
  [ 148.765908] ---[ end trace e33ff2be78e14901 ]---

The CQM task events are not safe to be called from within interrupt context, because they require performing an IPI to read the counter value on all sockets. And performing IPIs from within IRQ context is a "no-no". Make do with the last read counter value currently in event->count when we're invoked in this context.

Reported-by: Peter Zijlstra <[email protected]> Signed-off-by: Matt Fleming <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vikas Shivappa <[email protected]> Cc: Kanaka Juvva <[email protected]> Cc: Will Auld <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-07-25  Merge tag 'tty-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty  (Linus Torvalds; 1 file, -0/+5)
Pull tty/serial driver fixes from Greg KH: "Here are a number of small serial and tty fixes for reported issues. All have been in linux-next successfully"

* tag 'tty-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: vt: Fix !TASK_RUNNING diagnostic warning from paste_selection()
  serial: core: Fix crashes while echoing when closing
  m32r: Add ioreadXX/iowriteXX big-endian mmio accessors
  Revert "serial: imx: initialized DMA w/o HW flow enabled"
  sc16is7xx: fix FIFO address of secondary UART
  sc16is7xx: fix Kconfig dependencies
  serial: etraxfs-uart: Fix release etraxfs_uart_ports
  tty/vt: Fix the memory leak in visual_init
  serial: amba-pl011: Fix devm_ioremap_resource return value check
  n_tty: signal and flush atomically
2015-07-24  Merge tag 'mmc-4.2-rc3' of git://git.linaro.org/people/ulf.hansson/mmc  (Linus Torvalds; 33 files, -68/+75)
Pull MMC fixes from Ulf Hansson: "Here are some mmc fixes intended for v4.2 rc4. Note, most of the changes are for the sdhci-esdhc-imx controller, which also required us to modify some related DTS files. Those changes have been acked by the SoC maintainer.

MMC core:
  - Fix a reference imbalance issue for the power_ro_lock_show() sysfs handler

MMC host:
  - omap_hsmmc: Fix IRQ error handling for CD, DTO, and CRC
  - sdhci: Prevent a kernel panic while using DMA
  - mtk-sd: Let it depend on HAS_DMA to prevent build errors
  - sdhci-esdhc: Make 8BIT bus work
  - sdhci-esdhc-imx: Fix some regressions for DT based platforms
  - sdhci-pxav3: Fix a regression for DT based platforms"

* tag 'mmc-4.2-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: sdhci-pxav3: fix platform_data is not initialized
  dts: mmc: fsl-imx-esdhc: remove fsl,cd-controller support
  mmc: sdhci-esdhc-imx: clear f_max in boarddata
  mmc: sdhci-esdhc-imx: remove duplicated dts parsing
  mmc: sdhci: make max-frequency property in device tree work
  mmc: sdhci-esdhc-imx: move all non dt probe code into one function
  mmc: sdhci-esdhc-imx: fix cd regression for dt platform
  dts: imx7: fix sd card gpio polarity specified in device tree
  dts: imx25: fix sd card gpio polarity specified in device tree
  dts: imx6: fix sd card gpio polarity specified in device tree
  dts: imx53: fix sd card gpio polarity specified in device tree
  dts: imx51: fix sd card gpio polarity specified in device tree
  mmc: sdhci-esdhc: Make 8BIT bus work
  mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()
  mmc: MMC_MTK should depend on HAS_DMA
  mmc: sdhci check parameters before call dma_free_coherent
  mmc: omap_hsmmc: Handle BADA, DEB and CEB interrupts
  mmc: omap_hsmmc: Fix DTO and DCRC handling
2015-07-24  Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile  (Linus Torvalds; 1 file, -1/+1)
Pull arch/tile bugfix from Chris Metcalf: "This fixes a bug in freeing the initramfs memory"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use free_bootmem_late() for initrd
2015-07-24  x86/asm/entry/32: Revert 'Do not use R9 in SYSCALL32' commit  (Denys Vlasenko; 1 file, -5/+9)
This change reverts most of commit 53e9accf0f 'Do not use R9 in SYSCALL32'. I don't yet understand how, but code in that commit sometimes fails to preserve EBP. See https://bugzilla.kernel.org/show_bug.cgi?id=101061 "Problems while executing 32-bit code on AMD64" Reported-and-tested-by: Krzysztof A. Sobiecki <[email protected]> Signed-off-by: Denys Vlasenko <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Will Drewry <[email protected]> Cc: Kees Cook <[email protected]> CC: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-07-24  x86/mm: Fix newly introduced printk format warnings  (Thomas Gleixner; 1 file, -2/+2)
Signed-off-by: Thomas Gleixner <[email protected]>
2015-07-24  dts: imx7: fix sd card gpio polarity specified in device tree  (Dong Aisheng; 1 file, -2/+2)
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD card may not work properly due to the wrong polarity inversion specified in DT after the switch to the common parsing function mmc_of_parse(). Signed-off-by: Dong Aisheng <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2015-07-24  dts: imx25: fix sd card gpio polarity specified in device tree  (Dong Aisheng; 1 file, -2/+3)
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD card may not work properly due to the wrong polarity inversion specified in DT after the switch to the common parsing function mmc_of_parse(). Signed-off-by: Dong Aisheng <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2015-07-24  dts: imx6: fix sd card gpio polarity specified in device tree  (Dong Aisheng; 23 files, -49/+55)
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD card may not work properly due to the wrong polarity inversion specified in DT after the switch to the common parsing function mmc_of_parse(). Signed-off-by: Dong Aisheng <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2015-07-24  dts: imx53: fix sd card gpio polarity specified in device tree  (Dong Aisheng; 7 files, -14/+14)
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD card may not work properly due to the wrong polarity inversion specified in DT after the switch to the common parsing function mmc_of_parse(). Signed-off-by: Dong Aisheng <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2015-07-24  dts: imx51: fix sd card gpio polarity specified in device tree  (Dong Aisheng; 1 file, -1/+1)
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios should be changed to GPIO_ACTIVE_HIGH. Otherwise, the SD card may not work properly due to the wrong polarity inversion specified in DT after the switch to the common parsing function mmc_of_parse(). Signed-off-by: Dong Aisheng <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2015-07-23  m32r: Add ioreadXX/iowriteXX big-endian mmio accessors  (Peter Hurley; 1 file, -0/+5)
commit c627f2ceb692 ("serial: 8250: Add support for big-endian MMIO accesses") added support for 32-bit big-endian MMIO to the 8250 driver. Support for the ioreadXXbe/iowriteXXbe accessors was missing from the m32r arch, which caused build errors. Add trivial mmio accessor macros. Reported-by: Fengguang Wu <[email protected]> Cc: Kevin Cernekee <[email protected]> Signed-off-by: Peter Hurley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
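The accessors in question, modeled on the generic big-endian pattern (read raw, then convert byte order) rather than copied from the m32r tree; be16_to_cpu()/cpu_to_be16() and friends are the kernel's usual byte-order helpers, no-ops on a big-endian CPU.

    /* Trivial big-endian mmio accessors built on the plain ones (sketch): */
    #define ioread16be(addr)      be16_to_cpu(ioread16(addr))
    #define ioread32be(addr)      be32_to_cpu(ioread32(addr))
    #define iowrite16be(v, addr)  iowrite16(cpu_to_be16(v), (addr))
    #define iowrite32be(v, addr)  iowrite32(cpu_to_be32(v), (addr))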
2015-07-23  tile: use free_bootmem_late() for initrd  (Chris Metcalf; 1 file, -1/+1)
We were previously using free_bootmem() and just getting lucky that nothing too bad happened. Signed-off-by: Chris Metcalf <[email protected]> Cc: [email protected]
2015-07-23  KVM: x86: rename quirk constants to KVM_X86_QUIRK_*  (Paolo Bonzini; 4 files, -5/+5)
Make them clearly architecture-dependent; the capability is valid for all architectures, but the argument is not. Signed-off-by: Paolo Bonzini <[email protected]>
2015-07-23  KVM: vmx: obey KVM_QUIRK_CD_NW_CLEARED  (Xiao Guangrong; 1 file, -1/+4)
OVMF depends on WB to boot fast, because it only clears caches after it has set up the MTRRs, which is too late. Let's do writeback if CR0.CD is set, to make it happy, similar to what SVM is already doing. Signed-off-by: Xiao Guangrong <[email protected]> Tested-by: Alex Williamson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
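The decision being made, reduced to a sketch; the MTRR type encodings are the architectural values, but the function shape is an illustration of the quirk's effect, not vmx_get_mt_mask() itself.

    #include <stdbool.h>
    #include <stdint.h>

    #define MTRR_TYPE_UNCACHABLE 0  /* architectural memory type encodings */
    #define MTRR_TYPE_WRBACK     6

    /* With the quirk enabled (the default), pretend caches stay usable so
     * OVMF runs from WB memory before its MTRRs are programmed; without
     * it, honor CR0.CD literally and force UC. */
    static uint8_t guest_cache_type_when_cr0_cd(bool quirk_cd_nw_cleared)
    {
        return quirk_cd_nw_cleared ? MTRR_TYPE_WRBACK : MTRR_TYPE_UNCACHABLE;
    }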