aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)AuthorFilesLines
2013-04-10cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand ↵Jacob Shin1-0/+1
governor Future AMD processors, starting with Family 16h, can provide software with feedback on how the workload may respond to frequency change -- memory-bound workloads will not benefit from higher frequency, where as compute-bound workloads will. This patch enables this "frequency sensitivity feedback" to aid the ondemand governor to make better frequency change decisions by hooking into the powersave bias. Signed-off-by: Jacob Shin <[email protected]> Acked-by: Thomas Renninger <[email protected]> Acked-by: Borislav Petkov <[email protected]> Acked-by: Viresh Kumar <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2013-04-08arch: Consolidate tsk_is_polling()Thomas Gleixner1-2/+0
Move it to a common place. Preparatory patch for implementing set/clear for the idle need_resched poll implementation. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Paul McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Reviewed-by: Cc: Srivatsa S. Bhat <[email protected]> Cc: Magnus Damm <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-08KVM: Move kvm_rebooting declaration out of x86Geoff Levand1-1/+0
The variable kvm_rebooting is a common kvm variable, so move its declaration from arch/x86/include/asm/kvm_host.h to include/asm/kvm_host.h. Fixes this sparse warning when building on arm64: virt/kvm/kvm_main.c:warning: symbol 'kvm_rebooting' was not declared. Should it be static? Signed-off-by: Geoff Levand <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-04-08KVM: Move vm_list kvm_lock declarations out of x86Geoff Levand1-3/+0
The variables vm_list and kvm_lock are common to all architectures, so move the declarations from arch/x86/include/asm/kvm_host.h to include/linux/kvm_host.h. Fixes sparse warnings like these when building for arm64: virt/kvm/kvm_main.c: warning: symbol 'kvm_lock' was not declared. Should it be static? virt/kvm/kvm_main.c: warning: symbol 'vm_list' was not declared. Should it be static? Signed-off-by: Geoff Levand <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-04-08Merge branch 'for-tip' of ↵Ingo Molnar3-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/core Pull IBM zEnterprise EC12 support patchlet from Robert Richter. Signed-off-by: Ingo Molnar <[email protected]>
2013-04-02x86, msr: Unify variable namesBorislav Petkov1-7/+7
Make sure all MSR-accessing primitives which split MSR values in two 32-bit parts have their variables called 'low' and 'high' for consistence with the rest of the code and for ease of staring. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86: Drop KERNEL_IMAGE_STARTBorislav Petkov1-1/+0
We have KERNEL_IMAGE_START and __START_KERNEL_map which both contain the start of the kernel text mapping's virtual address. Remove the prior one which has been replicated a lot less times around the tree. No functionality change. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86: remove the x32 syscall bitmask from syscall_get_nr()Paul Moore1-2/+2
Commit fca460f95e928bae373daa8295877b6905bc62b8 simplified the x32 implementation by creating a syscall bitmask, equal to 0x40000000, that could be applied to x32 syscalls such that the masked syscall number would be the same as a x86_64 syscall. While that patch was a nice way to simplify the code, it went a bit too far by adding the mask to syscall_get_nr(); returning the masked syscall numbers can cause confusion with callers that expect syscall numbers matching the x32 ABI, e.g. unmasked syscall numbers. This patch fixes this by simply removing the mask from syscall_get_nr() while preserving the other changes from the original commit. While there are several syscall_get_nr() callers in the kernel, most simply check that the syscall number is greater than zero, in this case this patch will have no effect. Of those remaining callers, they appear to be few, seccomp and ftrace, and from my testing of seccomp without this patch the original commit definitely breaks things; the seccomp filter does not correctly filter the syscalls due to the difference in syscall numbers in the BPF filter and the value from syscall_get_nr(). Applying this patch restores the seccomp BPF filter functionality on x32. I've tested this patch with the seccomp BPF filters as well as ftrace and everything looks reasonable to me; needless to say general usage seemed fine as well. Signed-off-by: Paul Moore <[email protected]> Link: http://lkml.kernel.org/r/20130215172143.12549.10292.stgit@localhost Cc: <[email protected]> Cc: Will Drewry <[email protected]> Cc: H. Peter Anvin <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86/mce: Rework cmci_rediscover() to play well with CPU hotplugSrivatsa S. Bhat1-2/+2
Dave Jones reports that offlining a CPU leads to this trace: numa_remove_cpu cpu 1 node 0: mask now 0,2-3 smpboot: CPU 1 is now offline BUG: using smp_processor_id() in preemptible [00000000] code: cpu-offline.sh/10591 caller is cmci_rediscover+0x6a/0xe0 Pid: 10591, comm: cpu-offline.sh Not tainted 3.9.0-rc3+ #2 Call Trace: [<ffffffff81333bbd>] debug_smp_processor_id+0xdd/0x100 [<ffffffff8101edba>] cmci_rediscover+0x6a/0xe0 [<ffffffff815f5b9f>] mce_cpu_callback+0x19d/0x1ae [<ffffffff8160ea66>] notifier_call_chain+0x66/0x150 [<ffffffff8107ad7e>] __raw_notifier_call_chain+0xe/0x10 [<ffffffff8104c2e3>] cpu_notify+0x23/0x50 [<ffffffff8104c31e>] cpu_notify_nofail+0xe/0x20 [<ffffffff815ef082>] _cpu_down+0x302/0x350 [<ffffffff815ef106>] cpu_down+0x36/0x50 [<ffffffff815f1c9d>] store_online+0x8d/0xd0 [<ffffffff813edc48>] dev_attr_store+0x18/0x30 [<ffffffff81226eeb>] sysfs_write_file+0xdb/0x150 [<ffffffff811adfb2>] vfs_write+0xa2/0x170 [<ffffffff811ae16c>] sys_write+0x4c/0xa0 [<ffffffff81613019>] system_call_fastpath+0x16/0x1b However, a look at cmci_rediscover shows that it can be simplified quite a bit, apart from solving the above issue. It invokes functions that take spin locks with interrupts disabled, and hence it can run in atomic context. Also, it is run in the CPU_POST_DEAD phase, so the dying CPU is already dead and out of the cpu_online_mask. So take these points into account and simplify the code, and thereby also fix the above issue. Reported-by: Dave Jones <[email protected]> Signed-off-by: Srivatsa S. Bhat <[email protected]> Signed-off-by: Tony Luck <[email protected]>
2013-04-02x86, cpu: Convert AMD Erratum 400Borislav Petkov2-19/+1
Convert AMD erratum 400 to the bug infrastructure. Then, retract all exports for modules since they're not needed now and make the AMD erratum checking machinery local to amd.c. Use forward declarations to avoid shuffling too much code around needlessly. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86, cpu: Convert AMD Erratum 383Borislav Petkov2-1/+1
Convert the AMD erratum 383 testing code to the bug infrastructure. This allows keeping the AMD-specific erratum testing machinery private to amd.c and not export symbols to modules needlessly. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86, cpu: Convert Cyrix coma bug detectionBorislav Petkov2-1/+1
... to the new facility. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86, cpu: Convert FDIV bug detectionBorislav Petkov2-1/+1
... to the new facility. Add a reference to the wikipedia article explaining the FDIV test we're doing here. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86, cpu: Convert F00F bug detectionBorislav Petkov2-1/+2
... to using the new facility and drop the cpuinfo_x86 member. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02x86, cpu: Expand cpufeature facility to include cpu bugsBorislav Petkov2-1/+14
We add another 32-bit vector at the end of the ->x86_capability bitvector which collects bugs present in CPUs. After all, a CPU bug is a kind of a capability, albeit a strange one. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-04-02pmu: prepare for migration supportPaolo Bonzini1-1/+1
In order to migrate the PMU state correctly, we need to restore the values of MSR_CORE_PERF_GLOBAL_STATUS (a read-only register) and MSR_CORE_PERF_GLOBAL_OVF_CTRL (which has side effects when written). We also need to write the full 40-bit value of the performance counter, which would only be possible with a v3 architectural PMU's full-width counter MSRs. To distinguish host-initiated writes from the guest's, pass the full struct msr_data to kvm_pmu_set_msr. Signed-off-by: Paolo Bonzini <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-04-01perf/x86: Add memory profiling via PEBS Load LatencyStephane Eranian1-0/+1
This patch adds support for memory profiling using the PEBS Load Latency facility. Load accesses are sampled by HW and the instruction address, data address, load latency, data source, tlb, locked information can be saved in the sampling buffer if using the PERF_SAMPLE_COST (for latency), PERF_SAMPLE_ADDR, PERF_SAMPLE_DATA_SRC types. To enable PEBS Load Latency, users have to use the model specific event: - on NHM/WSM: MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD - on SNB/IVB: MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD To make things easier, this patch also exports a generic alias via sysfs: mem-loads. It export the right event encoding based on the host CPU and can be used directly by the perf tool. Loosely based on Intel's Lin Ming patch posted on LKML in July 2011. Signed-off-by: Stephane Eranian <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-03-28Merge tag 'pm+acpi-3.9-rc5' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from Rafael J Wysocki: - Fix for a recent cpufreq regression related to acpi-cpufreq and suspend/resume from Viresh Kumar. - cpufreq stats reference counting fix from Viresh Kumar. - intel_pstate driver fixes from Dirk Brandewie and Konrad Rzeszutek Wilk. - New ACPI suspend blacklist entry for Sony Vaio VGN-FW21M from Fabio Valentini. - ACPI Platform Error Interface (APEI) fix from Chen Gong. - PCI root bridge hotplug locking fix from Yinghai Lu. * tag 'pm+acpi-3.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PCI / ACPI: hold acpi_scan_lock during root bus hotplug ACPI / APEI: fix error status check condition for CPER ACPI / PM: fix suspend and resume on Sony Vaio VGN-FW21M cpufreq: acpi-cpufreq: Don't set policy->related_cpus from .init() cpufreq: stats: do cpufreq_cpu_put() corresponding to cpufreq_cpu_get() intel-pstate: Use #defines instead of hard-coded values. cpufreq / intel_pstate: Fix calculation of current frequency cpufreq / intel_pstate: Add function to check that all MSRs are valid
2013-03-27Merge tag 'stable/for-linus-3.9-rc4-tag' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen bug-fixes from Konrad Rzeszutek Wilk: "This is mostly just the last stragglers of the regression bugs that this merge window had. There are also two bug-fixes: one that adds an extra layer of security, and a regression fix for a change that was added in v3.7 (the v1 was faulty, the v2 works). - Regression fixes for C-and-P states not being parsed properly. - Fix possible security issue with guests triggering DoS via non-assigned MSI-Xs. - Fix regression (introduced in v3.7) with raising an event (v2). - Fix hastily introduced band-aid during c0 for the CR3 blowup." * tag 'stable/for-linus-3.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/events: avoid race with raising an event in unmask_evtchn() xen/mmu: Move the setting of pvops.write_cr3 to later phase in bootup. xen/acpi-stub: Disable it b/c the acpi_processor_add is no longer called. xen-pciback: notify hypervisor about devices intended to be assigned to guests xen/acpi-processor: Don't dereference struct acpi_processor on all CPUs.
2013-03-25intel-pstate: Use #defines instead of hard-coded values.Konrad Rzeszutek Wilk1-0/+1
They are defined in coreboot (MSR_PLATFORM) and the other one is already defined in msr-index.h. Let's use those. Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Acked-by: Viresh Kumar <[email protected]> Acked-by: Dirk Brandewie <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2013-03-22xen-pciback: notify hypervisor about devices intended to be assigned to guestsJan Beulich1-2/+2
For MSI-X capable devices the hypervisor wants to write protect the MSI-X table and PBA, yet it can't assume that resources have been assigned to their final values at device enumeration time. Thus have pciback do that notification, as having the device controlled by it is a prerequisite to assigning the device to guests anyway. This is the kernel part of hypervisor side commit 4245d33 ("x86/MSI: add mechanism to fully protect MSI-X table from PV guest accesses") on the master branch of git://xenbits.xen.org/xen.git. CC: [email protected] Signed-off-by: Jan Beulich <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2013-03-21Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A fair chunk of the linecount comes from a fix for a tracing bug that corrupts latency tracing buffers when the overwrite mode is changed on the fly - the rest is mostly assorted fewliner fixlets." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Add SNB/SNB-EP scheduling constraints for cycle_activity event kprobes/x86: Check Interrupt Flag modifier when registering probe kprobes: Make hash_64() as always inlined perf: Generate EXIT event only once per task context perf: Reset hwc->last_period on sw clock events tracing: Prevent buffer overwrite disabled for latency tracers tracing: Keep overwrite in sync between regular and snapshot buffers tracing: Protect tracer flags with trace_types_lock perf tools: Fix LIBNUMA build with glibc 2.12 and older. tracing: Fix free of probe entry by calling call_rcu_sched() perf/POWER7: Create a sysfs format entry for Power7 events perf probe: Fix segfault libtraceevent: Remove hard coded include to /usr/local/include in Makefile perf record: Fix -C option perf tools: check if -DFORTIFY_SOURCE=2 is allowed perf report: Fix build with NO_NEWT=1 perf annotate: Fix build with NO_NEWT=1 tracing: Fix race in snapshot swapping
2013-03-21Merge remote-tracking branch 'upstream/master' into queueMarcelo Tosatti2-4/+20
Merge reason: From: Alexander Graf <[email protected]> "Just recently this really important patch got pulled into Linus' tree for 3.9: commit 1674400aaee5b466c595a8fc310488263ce888c7 Author: Anton Blanchard <anton <at> samba.org> Date: Tue Mar 12 01:51:51 2013 +0000 Without that commit, I can not boot my G5, thus I can't run automated tests on it against my queue. Could you please merge kvm/next against linus/master, so that I can base my trees against that?" * upstream/master: (653 commits) PCI: Use ROM images from firmware only if no other ROM source available sparc: remove unused "config BITS" sparc: delete "if !ULTRA_HAS_POPULATION_COUNT" KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798) KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797) KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED inet: limit length of fragment queue hash table bucket lists qeth: Fix scatter-gather regression qeth: Fix invalid router settings handling qeth: delay feature trace sgy-cts1000: Remove __dev* attributes KVM: x86: fix deadlock in clock-in-progress request handling KVM: allow host header to be included even for !CONFIG_KVM hwmon: (lm75) Fix tcn75 prefix hwmon: (lm75.h) Update header inclusion MAINTAINERS: Remove Mark M. Hoffman xfs: ensure we capture IO errors correctly xfs: fix xfs_iomap_eof_prealloc_initial_size type ... Signed-off-by: Marcelo Tosatti <[email protected]>
2013-03-19Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-2/+2
Pull kvm fixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798) KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797) KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) KVM: x86: fix deadlock in clock-in-progress request handling KVM: allow host header to be included even for !CONFIG_KVM
2013-03-19KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions ↵Andy Honig1-2/+2
(CVE-2013-1797) There is a potential use after free issue with the handling of MSR_KVM_SYSTEM_TIME. If the guest specifies a GPA in a movable or removable memory such as frame buffers then KVM might continue to write to that address even after it's removed via KVM_SET_USER_MEMORY_REGION. KVM pins the page in memory so it's unlikely to cause an issue, but if the user space component re-purposes the memory previously used for the guest, then the guest will be able to corrupt that memory. Tested: Tested against kvmclock unit test Signed-off-by: Andrew Honig <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-03-18kprobes/x86: Check Interrupt Flag modifier when registering probeMasami Hiramatsu1-0/+1
Currently kprobes check whether the copied instruction modifies IF (interrupt flag) on each probe hit. This results not only in introducing overhead but also involving inat_get_opcode_attribute into the kprobes hot path, and it can cause an infinite recursive call (and kernel panic in the end). Actually, since the copied instruction itself can never be modified on the buffer, it is needless to analyze the instruction on every probe hit. To fix this issue, we check it only once when registering probe and store the result on ainsn->if_modifier. Reported-by: Timo Juhani Lindfors <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Acked-by: Ananth N Mavinakayanahalli <[email protected]> Cc: [email protected] Cc: Steven Rostedt <[email protected]> Cc: David S. Miller <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/20130314115242.19690.33573.stgit@mhiramat-M0-7522 Signed-off-by: Ingo Molnar <[email protected]>
2013-03-15x86: Add cpu capability flag X86_FEATURE_NONSTOP_TSC_S3Feng Tang1-0/+1
On some new Intel Atom processors (Penwell and Cloverview), there is a feature that the TSC won't stop in S3 state, say the TSC value won't be reset to 0 after resume. This feature makes TSC a more reliable clocksource and could benefit the timekeeping code during system suspend/resume cycle, so add a flag for it. Signed-off-by: Feng Tang <[email protected]> [jstultz: Fix checkpatch warning] Signed-off-by: John Stultz <[email protected]>
2013-03-14KVM: x86: Optimize mmio spte zapping when creating/moving memslotTakuya Yoshikawa1-0/+1
When we create or move a memory slot, we need to zap mmio sptes. Currently, zap_all() is used for this and this is causing two problems: - extra page faults after zapping mmu pages - long mmu_lock hold time during zapping mmu pages For the latter, Marcelo reported a disastrous mmu_lock hold time during hot-plug, which made the guest unresponsive for a long time. This patch takes a simple way to fix these problems: do not zap mmu pages unless they are marked mmio cached. On our test box, this took only 50us for the 4GB guest and we did not see ms of mmu_lock hold time any more. Note that we still need to do zap_all() for other cases. So another work is also needed: Xiao's work may be the one. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-03-14KVM: MMU: Mark sp mmio cached when creating mmio spteTakuya Yoshikawa1-0/+1
This will be used not to zap unrelated mmu pages when creating/moving a memory slot later. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-03-14KVM: nVMX: Add preemption timer supportJan Kiszka2-2/+6
Provided the host has this feature, it's straightforward to offer it to the guest as well. We just need to load to timer value on L2 entry if the feature was enabled by L1 and watch out for the corresponding exit reason. Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Jan Kiszka <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-03-14KVM: nVMX: Provide EFER.LMA saving supportJan Kiszka1-0/+2
We will need EFER.LMA saving to provide unrestricted guest mode. All what is missing for this is picking up EFER.LMA from VM_ENTRY_CONTROLS on L2->L1 switches. If the host does not support EFER.LMA saving, no change is performed, otherwise we properly emulate for L1 what the hardware does for L0. Advertise the support, depending on the host feature. Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Jan Kiszka <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-03-13KVM: nVMX: Clean up and fix pin-based execution controlsJan Kiszka1-0/+2
Only interrupt and NMI exiting are mandatory for KVM to work, thus can be exposed to the guest unconditionally, virtual NMI exiting is optional. So we must not advertise it unless the host supports it. Introduce the symbolic constant PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR at this chance. Reviewed-by:: Paolo Bonzini <[email protected]> Signed-off-by: Jan Kiszka <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-03-13KVM: x86: Rework INIT and SIPI handlingJan Kiszka1-1/+2
A VCPU sending INIT or SIPI to some other VCPU races for setting the remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED was overwritten by kvm_emulate_halt and, thus, got lost. This introduces APIC events for those two signals, keeping them in kvm_apic until kvm_apic_accept_events is run over the target vcpu context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable if there are pending events, thus if vcpu blocking should end. The patch comes with the side effect of effectively obsoleting KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI. The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED state. That also means we no longer exit to user space after receiving a SIPI event. Furthermore, we already reset the VCPU on INIT, only fixing up the code segment later on when SIPI arrives. Moreover, we fix INIT handling for the BSP: it never enter wait-for-SIPI but directly starts over on INIT. Tested-by: Paolo Bonzini <[email protected]> Signed-off-by: Jan Kiszka <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-03-12KVM: x86: Drop unused return code from VCPU reset callbackJan Kiszka1-1/+1
Neither vmx nor svm nor the common part may generate an error on kvm_vcpu_reset. So drop the return code. Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Jan Kiszka <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-03-07KVM: nVMX: Fix content of MSR_IA32_VMX_ENTRY/EXIT_CTLSJan Kiszka1-0/+4
Properly set those bits to 1 that the spec demands in case bit 55 of VMX_BASIC is 0 - like in our case. Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Jan Kiszka <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-03-07context_tracking: Move exception handling to generic codeFrederic Weisbecker1-21/+0
Exceptions handling on context tracking should share common treatment: on entry we exit user mode if the exception triggered in that context. Then on exception exit we return to that previous context. Generalize this to avoid duplication across archs. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Li Zhong <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Mats Liljegren <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Paul E. McKenney <[email protected]>
2013-03-06x86, doc: Be explicit about what the x86 struct boot_params requiresPeter Jones1-1/+15
If the sentinel triggers, we do not want the boot loader authors to just poke it and make the error go away, we want them to actually fix the problem. This should help avoid making the incorrect change in non-compliant bootloaders. [ hpa: dropped the Documentation/x86/boot.txt hunk pending clarifications ] Signed-off-by: Peter Jones <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-03-06x86: Don't clear efi_info even if the sentinel hitsJosh Boyer1-1/+3
When boot_params->sentinel is set, all we really know is that some undefined set of fields in struct boot_params contain garbage. In the particular case of efi_info, however, there is a private magic for that substructure, so it is generally safe to leave it even if the bootloader is broken. kexec (for which we did the initial analysis) did not initialize this field, but of course all the EFI bootloaders do, and most EFI bootloaders are broken in this respect (and should be fixed.) Reported-by: Robin Holt <[email protected]> Link: http://lkml.kernel.org/r/CA%2B5PVA51-FT14p4CRYKbicykugVb=PiaEycdQ57CK2km_OQuRQ@mail.gmail.com Tested-by: Josh Boyer <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2013-03-04x86: Make Linux guest support optionalBorislav Petkov1-5/+11
Put all config options needed to run Linux as a guest behind a CONFIG_HYPERVISOR_GUEST menu so that they don't get built-in by default but be selectable by the user. Also, make all units which depend on x86_hyper, depend on this new symbol so that compilation doesn't fail when CONFIG_HYPERVISOR_GUEST is disabled but those units assume its presence. Sort options in the new HYPERVISOR_GUEST menu, adapt config text and drop redundant select. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Cc: Dmitry Torokhov <[email protected]> Cc: K. Y. Srinivasan <[email protected]> Cc: Haiyang Zhang <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2013-03-03x86: trim sys_ia32.hAl Viro1-5/+0
remove the externs for functions that don't exist anymore Signed-off-by: Al Viro <[email protected]>
2013-03-03x86: sys32_kill and sys32_mprotect are pointlessAl Viro1-2/+0
their argument types are identical to those of sys_kill and sys_mprotect resp., so we are not doing any kind of argument validation, etc. in those - they turn into unconditional branches to corresponding syscalls. Signed-off-by: Al Viro <[email protected]>
2013-03-03merge compat sys_ipc instancesAl Viro1-3/+0
Signed-off-by: Al Viro <[email protected]>
2013-03-03consolidate compat lookup_dcookie()Al Viro1-1/+0
Signed-off-by: Al Viro <[email protected]>
2013-03-03convert sendfile{,64} to COMPAT_SYSCALL_DEFINEAl Viro1-1/+0
Signed-off-by: Al Viro <[email protected]>
2013-03-03make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protectAl Viro1-2/+2
... and switch i386 to HAVE_SYSCALL_WRAPPERS, killing open-coded uses of asmlinkage_protect() in a bunch of syscalls. Signed-off-by: Al Viro <[email protected]>
2013-03-03consolidate cond_syscall and SYSCALL_ALIAS declarationsAl Viro1-8/+0
take them to asm/linkage.h, with default in linux/linkage.h Signed-off-by: Al Viro <[email protected]>
2013-03-02Merge branch 'for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal/compat fixes from Al Viro: "Fixes for several regressions introduced in the last signal.git pile, along with fixing bugs in truncate and ftruncate compat (on just about anything biarch at least one of those two had been done wrong)." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: compat: restore timerfd settime and gettime compat syscalls [regression] braino in "sparc: convert to ksignal" fix compat truncate/ftruncate switch lseek to COMPAT_SYSCALL_DEFINE lseek() and truncate() on sparc really need sign extension
2013-02-27Merge branch 'x86-efi-for-linus' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/EFI changes from Peter Anvin: - Improve the initrd handling in the EFI boot stub by allowing forward slashes in the pathname - from Chun-Yi Lee. - Cleanup code duplication in the EFI mixed kernel/firmware code - from Satoru Takeuchi. - efivarfs bug fixes for more strict filename validation, with lots of input from Al Viro. * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: remove duplicate code in setup_arch() by using, efi_is_native() efivarfs: guid part of filenames are case-insensitive efivarfs: Validate filenames much more aggressively efivarfs: Use sizeof() instead of magic number x86, efi: Allow slash in file path of initrd
2013-02-26Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2-1/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Add Intel IvyBridge event scheduling constraints ftrace: Call ftrace cleanup module notifier after all other notifiers tracing/syscalls: Allow archs to ignore tracing compat syscalls
2013-02-25Merge tag 'pci-v3.9-changes' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Host bridge hotplug - Major overhaul of ACPI host bridge add/start (Rafael Wysocki, Yinghai Lu) - Major overhaul of PCI/ACPI binding (Rafael Wysocki, Yinghai Lu) - Split out ACPI host bridge and ACPI PCI device hotplug (Yinghai Lu) - Stop caching _PRT and make independent of bus numbers (Yinghai Lu) PCI device hotplug - Clean up cpqphp dead code (Sasha Levin) - Disable ARI unless device and upstream bridge support it (Yijing Wang) - Initialize all hot-added devices (not functions 0-7) (Yijing Wang) Power management - Don't touch ASPM if disabled (Joe Lawrence) - Fix ASPM link state management (Myron Stowe) Miscellaneous - Fix PCI_EXP_FLAGS accessor (Alex Williamson) - Disable Bus Master in pci_device_shutdown (Konstantin Khlebnikov) - Document hotplug resource and MPS parameters (Yijing Wang) - Add accessor for PCIe capabilities (Myron Stowe) - Drop pciehp suspend/resume messages (Paul Bolle) - Make pci_slot built-in only (not a module) (Jiang Liu) - Remove unused PCI/ACPI bind ops (Jiang Liu) - Removed used pci_root_bus (Bjorn Helgaas)" * tag 'pci-v3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (51 commits) PCI/ACPI: Don't cache _PRT, and don't associate them with bus numbers PCI: Fix PCI Express Capability accessors for PCI_EXP_FLAGS ACPI / PCI: Make pci_slot built-in only, not a module PCI/PM: Clear state_saved during suspend PCI: Use atomic_inc_return() rather than atomic_add_return() PCI: Catch attempts to disable already-disabled devices PCI: Disable Bus Master unconditionally in pci_device_shutdown() PCI: acpiphp: Remove dead code for PCI host bridge hotplug PCI: acpiphp: Create companion ACPI devices before creating PCI devices PCI: Remove unused "rc" in virtfn_add_bus() PCI: pciehp: Drop suspend/resume ENTRY messages PCI/ASPM: Don't touch ASPM if forcibly disabled PCI/ASPM: Deallocate upstream link state even if device is not PCIe PCI: Document MPS parameters pci=pcie_bus_safe, pci=pcie_bus_perf, etc PCI: Document hpiosize= and hpmemsize= resource reservation parameters PCI: Use PCI Express Capability accessor PCI: Introduce accessor to retrieve PCIe Capabilities Register PCI: Put pci_dev in device tree as early as possible PCI: Skip attaching driver in device_add() PCI: acpiphp: Keep driver loaded even if no slots found ...