aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)AuthorFilesLines
2016-01-06perf/x86: Use INST_RETIRED.TOTAL_CYCLES_PS for cycles:pp for SkylakeAndi Kleen2-4/+3
I added UOPS_RETIRED.ALL by mistake to the Skylake PEBS event list for cycles:pp. But the event is not documented for Skylake, and has some issues. The recommended replacement for cycles:pp is to use INST_RETIRED.ANY+pebs as a base, similar to what CPUs before Sandy Bridge did. This new event is called INST_RETIRED.TOTAL_CYCLES_PS. The event is not really new, but has been already used by perf before Sandy Bridge for the original cycles:p Note the SDM doesn't document that event either, but it's being documented in the latest version of the event list on: https://download.01.org/perfmon/SKL This patch does: - Remove UOPS_RETIRED.ALL from the Skylake PEBS event list - Add INST_RETIRED.ANY to the Skylake PEBS event list, and an table entry to allow cmask=16,inv=1 for cycles:pp - We don't need an extra entry for the base INST_RETIRED event, because it is already covered by the catch-all PEBS table entry. - Switch Skylake to use the Core2 PEBS alias (which is INST_RETIRED.TOTAL_CYCLES_PS) Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-01-06perf/x86: Allow zero PEBS status with only single active eventAndi Kleen1-0/+12
Normally we drop PEBS events with a zero status field. But when there is only a single PEBS event active we can assume the PEBS record is for that event. The PEBS buffer is always flushed when PEBS events are disabled, so there is no risk of mishandling state PEBS records this way. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-01-06perf/x86: Remove warning for zero PEBS statusAndi Kleen1-4/+1
The recent commit: 75f80859b130 ("perf/x86/intel/pebs: Robustify PEBS buffer drain") causes lots of warnings on different CPUs before Skylake when running PEBS intensive workloads. They can have a zero status field in the PEBS record when PEBS is racing with clearing of GLOBAl_STATUS. This also can cause hangs (it seems there are still problems with printk in NMI). Disable the warning, but still ignore the record. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-01-06x86/fpu: Properly align size in CHECK_MEMBER_AT_END_OF() macroJiri Olsa1-2/+11
The CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) checks whether MEMBER is last member of TYPE by evaluating: offsetof(TYPE::MEMBER) + sizeof(TYPE::MEMBER) == sizeof(TYPE) and ensuring TYPE::MEMBER is the last member of the TYPE. This condition breaks on structs that are padded to be aligned. This patch ensures the TYPE alignment is taken into account. This bug was revealed after adding cacheline alignment into struct sched_entity, which broke task_struct::thread check: CHECK_MEMBER_AT_END_OF(struct task_struct, thread); Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-01-04x86: ftrace: Fix the comments for ftrace_modify_code_direct()Li Bin1-7/+5
There is no need to worry about module and __init text disappearing case, because that ftrace has a module notifier that is called when a module is being unloaded and before the text goes away and this code grabs the ftrace_lock mutex and removes the module functions from the ftrace list, such that it will no longer do any modifications to that module's text, the update to make functions be traced or not is done under the ftrace_lock mutex as well. And by now, __init section codes should not been modified by ftrace, because it is black listed in recordmcount.c and ignored by ftrace. Link: http://lkml.kernel.org/r/[email protected] Reviewed-by: Thomas Gleixner <[email protected]> Suggested-by: Steven Rostedt <[email protected]> Signed-off-by: Li Bin <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2016-01-04Merge branch 'memdup_user_nul' into work.miscAl Viro6-7/+10
2015-12-30x86/numachip: Fix NumaConnect2 MMCFG PCI accessDaniel J Blueman1-4/+1
The MMCFG PCI accessors weren't being setup for NumacConnect2 correctly due to over-early assignment; this would create the potential for the wrong PCI domain to be accessed. Fix this by using the correct arch-specific PCI init function. Signed-off-by: Daniel J Blueman <[email protected]> Acked-by: Steffen Persvold <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-29x86/LDT: Print the real LDT base addressJan Beulich1-1/+1
This was meant to print base address and entry count; make it do so again. Fixes: 37868fe113ff "x86/ldt: Make modify_ldt synchronous" Signed-off-by: Jan Beulich <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-29arch/x86/kernel/ptrace.c: Remove unused arg_offs_table[email protected]1-15/+0
The related warning from gcc 6.0: arch/x86/kernel/ptrace.c:127:18: warning: ‘arg_offs_table’ defined but not used [-Wunused-const-variable] static const int arg_offs_table[] = { ^~~~~~~~~~~~~~ Signed-off-by: Chen Gang <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-23new helpers: no_seek_end_llseek{,_size}()Al Viro2-46/+2
Signed-off-by: Al Viro <[email protected]>
2015-12-23Merge branch 'for-linus' into for-nextTakashi Iwai6-7/+10
Conflicts: drivers/gpu/drm/i915/intel_pm.c
2015-12-20x86/irq: Export functions to allow MSI domains in modulesJake Oshins2-3/+7
The Linux kernel already has the concept of IRQ domain, wherein a component can expose a set of IRQs which are managed by a particular interrupt controller chip or other subsystem. The PCI driver exposes the notion of an IRQ domain for Message-Signaled Interrupts (MSI) from PCI Express devices. This patch exposes the functions which are necessary for creating a MSI IRQ domain within a module. [ tglx: Split it into x86 and core irq parts ] Signed-off-by: Jake Oshins <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19x86/paravirt: Prevent rtc_cmos platform device init on PV guestsDavid Vrabel1-0/+3
Adding the rtc platform device in non-privileged Xen PV guests causes an IRQ conflict because these guests do not have legacy PIC and may allocate irqs in the legacy range. In a single VCPU Xen PV guest we should have: /proc/interrupts: CPU0 0: 4934 xen-percpu-virq timer0 1: 0 xen-percpu-ipi spinlock0 2: 0 xen-percpu-ipi resched0 3: 0 xen-percpu-ipi callfunc0 4: 0 xen-percpu-virq debug0 5: 0 xen-percpu-ipi callfuncsingle0 6: 0 xen-percpu-ipi irqwork0 7: 321 xen-dyn-event xenbus 8: 90 xen-dyn-event hvc_console ... But hvc_console cannot get its interrupt because it is already in use by rtc0 and the console does not work. genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0) We can avoid this problem by realizing that unprivileged PV guests (both Xen and lguests) are not supposed to have rtc_cmos device and so adding it is not necessary. Privileged guests (i.e. Xen's dom0) do use it but they should not have irq conflicts since they allocate irqs above legacy range (above gsi_top, in fact). Instead of explicitly testing whether the guest is privileged we can extend pv_info structure to include information about guest's RTC support. Reported-and-tested-by: Sander Eikelenboom <[email protected]> Signed-off-by: David Vrabel <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] # 4.2+ Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19x86/cpufeature: Remove unused and seldomly used cpu_has_xx macrosBorislav Petkov12-22/+30
Those are stupid and code should use static_cpu_has_safe() or boot_cpu_has() instead. Kill the least used and unused ones. The remaining ones need more careful inspection before a conversion can happen. On the TODO. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Cc: David Sterba <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Matt Mackall <[email protected]> Cc: Chris Mason <[email protected]> Cc: Josef Bacik <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19x86/cpufeature: Cleanup get_cpu_cap()Borislav Petkov3-28/+25
Add an enum for the ->x86_capability array indices and cleanup get_cpu_cap() by killing some redundant local vars. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19x86/cpufeature: Move some of the scattered feature bits to x86_capabilityBorislav Petkov2-20/+5
Turn the CPUID leafs which are proper CPUID feature bit leafs into separate ->x86_capability words. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19Merge branch 'linus' into x86/cleanupsThomas Gleixner28-74/+172
Pull in upstream changes so we can apply depending patches.
2015-12-19x86/nmi: Save regs in crash dump on external NMIHidehiro Kawai2-9/+31
Now, multiple CPUs can receive an external NMI simultaneously by specifying the "apic_extnmi=all" command line parameter. When we take a crash dump by using external NMI with this option, we fail to save registers into the crash dump. This happens as follows: CPU 0 CPU 1 ================================ ============================= receive an external NMI default_do_nmi() receive an external NMI spin_lock(&nmi_reason_lock) default_do_nmi() io_check_error() spin_lock(&nmi_reason_lock) panic() busy loop ... kdump_nmi_shootdown_cpus() issue NMI IPI -----------> blocked until IRET busy loop... Here, since CPU 1 is in NMI context, an additional NMI from CPU 0 remains unhandled until CPU 1 IRETs. However, CPU 1 will never execute IRET so the NMI is not handled and the callback function to save registers is never called. To solve this issue, we check if the IPI for crash dumping was issued while waiting for nmi_reason_lock to be released, and if so, call its callback function directly. If the IPI is not issued (e.g. kdump is disabled), the actual behavior doesn't change. Signed-off-by: Hidehiro Kawai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Baoquan He <[email protected]> Cc: Dave Young <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stefan Lippers-Hollmann <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: x86-ml <[email protected]> Link: http://lkml.kernel.org/r/20151210065245.4587.39316.stgit@softrs Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19x86/apic: Introduce apic_extnmi command line parameterHidehiro Kawai1-2/+32
This patch introduces a command line parameter apic_extnmi: apic_extnmi=( bsp|all|none ) The default value is "bsp" and this is the current behavior: only the Boot-Strapping Processor receives an external NMI. "all" allows external NMIs to be broadcast to all CPUs. This would raise the success rate of panic on NMI when BSP hangs in NMI context or the external NMI is swallowed by other NMI handlers on the BSP. If you specify "none", no CPUs receive external NMIs. This is useful for the dump capture kernel so that it cannot be shot down by accidentally pressing the external NMI button (on platforms which have it) while saving a crash dump. Signed-off-by: Hidehiro Kawai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Bandan Das <[email protected]> Cc: Baoquan He <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "Maciej W. Rozycki" <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ricardo Ribalda Delgado <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: x86-ml <[email protected]> Link: http://lkml.kernel.org/r/20151210014632.25437.43778.stgit@softrs Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19panic, x86: Allow CPUs to save registers even if looping in NMI contextHidehiro Kawai2-3/+23
Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(), sends an NMI IPI to CPUs which haven't called panic() to stop them, save their register information and do some cleanups for crash dumping. However, if such a CPU is infinitely looping in NMI context, we fail to save its register information into the crash dump. For example, this can happen when unknown NMIs are broadcast to all CPUs as follows: CPU 0 CPU 1 =========================== ========================== receive an unknown NMI unknown_nmi_error() panic() receive an unknown NMI spin_trylock(&panic_lock) unknown_nmi_error() crash_kexec() panic() spin_trylock(&panic_lock) panic_smp_self_stop() infinite loop kdump_nmi_shootdown_cpus() issue NMI IPI -----------> blocked until IRET infinite loop... Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET, so the NMI is not handled and the callback function to save registers is never called. In practice, this can happen on some servers which broadcast NMIs to all CPUs when the NMI button is pushed. To save registers in this case, we need to: a) Return from NMI handler instead of looping infinitely or b) Call the callback function directly from the infinite loop Inherently, a) is risky because NMI is also used to prevent corrupted data from being propagated to devices. So, we chose b). This patch does the following: 1. Move the infinite looping of CPUs which haven't called panic() in NMI context (actually done by panic_smp_self_stop()) outside of panic() to enable us to refer pt_regs. Please note that panic_smp_self_stop() is still used for normal context. 2. Call a callback of kdump_nmi_shootdown_cpus() directly to save registers and do some cleanups after setting waiting_for_crash_ipi which is used for counting down the number of CPUs which handled the callback Signed-off-by: Hidehiro Kawai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Aaron Tomlin <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Baoquan He <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Dave Young <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Don Zickus <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Gobinda Charan Maji <[email protected]> Cc: HATAYAMA Daisuke <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Javi Merino <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Cc: [email protected] Cc: lkml <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Nicolas Iooss <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Seth Jennings <[email protected]> Cc: Stefan Lippers-Hollmann <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs [ Cleanup comments, fixup formatting. ] Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19panic, x86: Fix re-entrance problem due to panic on NMIHidehiro Kawai1-4/+12
If panic on NMI happens just after panic() on the same CPU, panic() is recursively called. Kernel stalls, as a result, after failing to acquire panic_lock. To avoid this problem, don't call panic() in NMI context if we've already entered panic(). For that, introduce nmi_panic() macro to reduce code duplication. In the case of panic on NMI, don't return from NMI handlers if another CPU already panicked. Signed-off-by: Hidehiro Kawai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Aaron Tomlin <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Baoquan He <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Don Zickus <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Gobinda Charan Maji <[email protected]> Cc: HATAYAMA Daisuke <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Javi Merino <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Cc: [email protected] Cc: lkml <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Nicolas Iooss <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Seth Jennings <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Vivek Goyal <[email protected]> Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs [ Cleanup comments, fixup formatting. ] Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-19Merge branch 'linus' into x86/apicThomas Gleixner27-70/+167
Pull in update changes so we can apply conflicting patches
2015-12-19x86/mce: Ensure offline CPUs don't participate in rendezvous processAshok Raj1-0/+11
Intel's MCA implementation broadcasts MCEs to all CPUs on the node. This poses a problem for offlined CPUs which cannot participate in the rendezvous process: Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler Kernel Offset: disabled Rebooting in 100 seconds.. More specifically, Linux does a soft offline of a CPU when writing a 0 to /sys/devices/system/cpu/cpuX/online, which doesn't prevent the #MC exception from being broadcasted to that CPU. Ensure that offline CPUs don't participate in the MCE rendezvous and clear the RIP valid status bit so that a second MCE won't cause a shutdown. Without the patch, mce_start() will increment mce_callin and wait for all CPUs. Offlined CPUs should avoid participating in the rendezvous process altogether. Signed-off-by: Ashok Raj <[email protected]> [ Massage commit message. ] Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tony Luck <[email protected]> Cc: <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: linux-edac <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-14Merge tag 'v4.4-rc5' into perf/core, to pick up fixesIngo Molnar5-13/+28
Signed-off-by: Ingo Molnar <[email protected]>
2015-12-11x86/vdso: Remove pvclock fixmap machineryAndy Lutomirski2-30/+0
Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/4933029991103ae44672c82b97a20035f5c1fe4f.1449702533.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-12-11x86/vdso: Get pvclock data from the vvar VMA instead of the fixmapAndy Lutomirski1-0/+5
Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/9d37826fdc7e2d2809efe31d5345f97186859284.1449702533.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-12-08Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds6-7/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "This tree includes four core perf fixes for misc bugs, three fixes to x86 PMU drivers, and two updates to old email addresses" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Do not send exit event twice perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell perf: Fix PERF_EVENT_IOC_PERIOD deadlock treewide: Remove old email address perf/x86: Fix LBR call stack save/restore perf: Update email address in MAINTAINERS perf/core: Robustify the perf_cgroup_from_task() RCU checks perf/core: Fix RCU problem with cgroup context switching code
2015-12-08Back merge tag 'v4.4-rc4' into drm-nextDave Airlie5-13/+28
We've picked up a few conflicts and it would be nice to resolve them before we move onwards.
2015-12-06Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds4-13/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thoma Gleixner: "Another round of fixes for x86: - Move the initialization of the microcode driver to late_initcall to make sure everything that init function needs is available. - Make sure that lockdep knows about interrupts being off in the entry code before calling into c-code. - Undo the cpu hotplug init delay regression. - Use the proper conditionals in the mpx instruction decoder. - Fixup restart_syscall for x32 tasks. - Fix the hugepage regression on PAE kernels which was introduced with the latest PAT changes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/signal: Fix restart_syscall number for x32 tasks x86/mpx: Fix instruction decoder condition x86/mm: Fix regression with huge pages on PAE x86 smpboot: Re-enable init_udelay=0 by default on modern CPUs x86/entry/64: Fix irqflag tracing wrt context tracking x86/microcode: Initialize the driver late when facilities are up
2015-12-06perf/x86: Remove old MSR perf tracing codeAndi Kleen1-11/+1
Now that we have generic MSR trace points we can remove the old hackish perf MSR read tracing code. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-06perf/x86/intel: Fix __initconst declaration in the RAPL perf driverAndi Kleen1-1/+1
Fix a definition in the perf rapl driver. __initconst must be applied to a const object, but to declare a const pointer you need to use * const ..., not const ... * This fixes a section attribute conflict with LTO builds. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-06Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar2-2/+2
Signed-off-by: Ingo Molnar <[email protected]>
2015-12-06perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macroJiri Olsa1-1/+1
We need to add rest of the flags to the constraint mask instead of another INTEL_ARCH_EVENT_MASK, fixing a typo. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-06perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on HaswellYuanfang Chen1-1/+1
There was a mistake in the Haswell constraints table. Signed-off-by: Yuanfang Chen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-06x86/mm/64: Enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFNIgor Mammedov1-1/+1
when memory hotplug enabled system is booted with less than 4GB of RAM and then later more RAM is hotplugged 32-bit devices stop functioning with following error: nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff the reason for this is that if x86_64 system were booted with RAM less than 4GB, it doesn't enable SWIOTLB and when memory is hotplugged beyond MAX_DMA32_PFN, devices that expect 32-bit addresses can't handle 64-bit addresses. Fix it by tracking max possible PFN when parsing memory affinity structures from SRAT ACPI table and enable SWIOTLB if there is hotpluggable memory regions beyond MAX_DMA32_PFN. It fixes KVM guests when they use emulated devices (reproduces with ata_piix, e1000 and usb devices, RHBZ: 1275941, 1275977, 1271527) It also fixes the HyperV, VMWare with emulated devices which are affected by this issue as well. Signed-off-by: Igor Mammedov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-06x86/mm: Introduce max_possible_pfnIgor Mammedov1-0/+2
max_possible_pfn will be used for tracking max possible PFN for memory that isn't present in E820 table and could be hotplugged later. By default max_possible_pfn is initialized with max_pfn, but later it could be updated with highest PFN of hotpluggable memory ranges declared in ACPI SRAT table if any present. Signed-off-by: Igor Mammedov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-05x86/signal: Fix restart_syscall number for x32 tasksDmitry V. Levin1-7/+10
When restarting a syscall with regs->ax == -ERESTART_RESTARTBLOCK, regs->ax is assigned to a restart_syscall number. For x32 tasks, this syscall number must have __X32_SYSCALL_BIT set, otherwise it will be an x86_64 syscall number instead of a valid x32 syscall number. This issue has been there since the introduction of x32. Reported-by: strace/tests/restart_syscall.test Reported-and-tested-by: Elvira Khabirova <[email protected]> Signed-off-by: Dmitry V. Levin <[email protected]> Cc: Elvira Khabirova <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-12-04livepatch: Cleanup module page permission changesJosh Poimboeuf1-23/+2
Calling set_memory_rw() and set_memory_ro() for every iteration of the loop in klp_write_object_relocations() is messy, inefficient, and error-prone. Change all the read-only pages to read-write before the loop and convert them back to read-only again afterwards. Suggested-by: Miroslav Benes <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2015-12-04module: use a structure to encapsulate layout.Rusty Russell1-3/+3
Makes it easier to handle init vs core cleanly, though the change is fairly invasive across random architectures. It simplifies the rbtree code immediately, however, while keeping the core data together in the same cachline (now iff the rbtree code is enabled). Acked-by: Peter Zijlstra <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Signed-off-by: Rusty Russell <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2015-12-04x86/mm/mtrr: Mark the 'range_new' static variable in mtrr_calc_range_state() ↵Rasmus Villemoes1-2/+9
as __initdata 'range_new' doesn't seem to be used after init. It is only passed to memset(), sum_ranges(), memcmp() and x86_get_mtrr_mem_range(), the latter of which also only passes it on to various *range* library functions. So mark it __initdata to free up an extra page after init. Its contents are wiped at every call to mtrr_calc_range_state(), so it being static is not about preserving state between calls, but simply to avoid a 4k+ stack frame. While there, add a comment explaining this and why it's safe. We could also mark nr_range_new as __initdata, but since it's just a single int and also doesn't carry state between calls (it is unconditionally assigned to before it is read), we might as well make it an ordinary automatic variable. Signed-off-by: Rasmus Villemoes <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-11-30libnvdimm, e820: skip module loading when no type-12Dan Williams1-0/+12
If there are no persistent memory ranges present then don't bother creating the platform device. Otherwise, it loads the full libnvdimm sub-system only to discover no resources present. Reported-by: Andy Lutomirski <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Signed-off-by: Dan Williams <[email protected]>
2015-11-29x86/mm: Page align the '_end' symbol to avoid pfn conversion bugsMatt Fleming1-0/+1
Ingo noted that if we can guarantee _end is aligned to PAGE_SIZE we can automatically avoid bugs along the lines of, size = _end - _text >> PAGE_SHIFT which is missing a call to PFN_ALIGN(). The EFI mixed mode contains this bug, for example. _text is already aligned to PAGE_SIZE through the use of LOAD_PHYSICAL_ADDR, and the BSS and BRK sections are explicitly aligned in the linker script, so it makes sense to align _end to match. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Matt Fleming <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sai Praneeth Prakhya <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Toshi Kani <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-11-29x86/platform/calgary: Constify cal_chipset_ops structuresJulia Lawall1-2/+2
The cal_chipset_ops structures are never modified, so declare them as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jon D. Mason <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Muli Ben-Yehuda <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-11-27x86/fpu: Put a few variables in .init.dataRasmus Villemoes2-4/+4
These are clearly just used during init. Signed-off-by: Rasmus Villemoes <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Quentin Casasnovas <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-11-25x86 smpboot: Re-enable init_udelay=0 by default on modern CPUsLen Brown1-4/+5
commit f1ccd249319e allowed the cmdline "cpu_init_udelay=" to work with all values, including the default of 10000. But in setting the default of 10000, it over-rode the code that sets the delay 0 on modern processors. Also, tidy up use of INT/UINT. Fixes: f1ccd249319e "x86/smpboot: Fix cpu_init_udelay=10000 corner case boot parameter misbehavior" Reported-by: Shane <[email protected]> Signed-off-by: Len Brown <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/9082eb809ef40dad02db714759c7aaf618c518d4.1448232494.git.len.brown@intel.com Signed-off-by: Thomas Gleixner <[email protected]>
2015-11-25x86/paravirt: Remove paravirt ops pmd_update[_defer] and pte_update_deferJuergen Gross1-3/+0
pte_update_defer can be removed as it is always set to the same function as pte_update. So any usage of pte_update_defer() can be replaced by pte_update(). pmd_update and pmd_update_defer are always set to paravirt_nop, so they can just be nuked. Signed-off-by: Juergen Gross <[email protected]> Acked-by: Rusty Russell <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-11-25x86: Replace RDRAND forced-reseed with simple sanity checkLen Brown1-13/+12
x86_init_rdrand() was added with 2 goals: 1. Sanity check that the built-in-self-test circuit on the Digital Random Number Generator (DRNG) is not complaining. As RDRAND HW self-checks on every invocation, this goal is achieved by simply invoking RDRAND and checking its return code. 2. Force a full re-seed of the random number generator. This was done out of paranoia to benefit the most un-sophisticated DRNG implementation conceivable in the architecture, an implementation that does not exist, and unlikely ever will. This worst-case full-re-seed is achieved by invoking a 64-bit RDRAND 8192 times. Unfortunately, this worst-case re-seed costs O(1,000us). Magnifying this cost, it is done from identify_cpu(), which is the synchronous critical path to bring a processor on-line -- repeated for every logical processor in the system at boot and resume from S3. As it is very expensive, and of highly dubious value, we delete the worst-case re-seed from the kernel. We keep the 1st goal -- sanity check the hardware, and mark it absent if it complains. This change reduces the cost of x86_init_rdrand() by a factor of 1,000x, to O(1us) from O(1,000us). Signed-off-by: Len Brown <[email protected]> Link: http://lkml.kernel.org/r/058618cc56ec6611171427ad7205e37e377aa8d4.1439738240.git.len.brown@intel.com Signed-off-by: Thomas Gleixner <[email protected]>
2015-11-25ftrace: Add variable ftrace_expected for archs to show expected codeSteven Rostedt (Red Hat)1-0/+9
When an anomaly is found while modifying function code, ftrace_bug() is called which disables the function tracing infrastructure and reports information about what failed. If the code that is to be replaced does not match what is expected, then actual code is shown. Currently there is no arch generic way to show what was expected. Add a new variable pointer calld ftrace_expected that the arch code can set to point to what it expected so that ftrace_bug() can report the actual text as well as the text that was expected to be there. Signed-off-by: Steven Rostedt <[email protected]>
2015-11-24Merge branch 'x86/urgent' into x86/asm, to pick up dependent fixesIngo Molnar2-2/+1
Signed-off-by: Ingo Molnar <[email protected]>
2015-11-24x86/apic: Fix the saving and restoring of lapic vectors during suspend/resumeJuergen Gross1-1/+10
Saving and restoring lapic vectors in lapic_suspend() and lapic_resume() is not consistent: the thmr vector saving is guarded by a different config option than the restore part. The cmci vector isn't handled at all. Those inconsistencies are not very critical, as the missing cmci vector will be set via mce resume handling, the wrong config option used for restoring the thmr vector can't be configured differently than the one which should be used. Nevertheless correct the thmr vector restore and add cmci vector handling. Signed-off-by: Juergen Gross <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Minor code edits. ] Signed-off-by: Ingo Molnar <[email protected]>