|
This patch improves the error handling in nmi_setup(). Most of the
code is moved to allocate_msrs(). On error, allocate_msrs() now also
frees any memory it has already allocated. This makes nmi_setup()
simpler and easier to extend.
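A minimal sketch of the resulting shape; the buffer names and sizes are
illustrative stand-ins for the real per-CPU MSR arrays:

	/* Allocation and its error unwinding live together: on any failure,
	 * free_msrs() releases whatever has been allocated so far, so
	 * nmi_setup() only has to check a single result.
	 */
	static int allocate_msrs(void)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			counters[cpu] = kzalloc(counters_size, GFP_KERNEL);
			if (!counters[cpu])
				goto fail;
			controls[cpu] = kzalloc(controls_size, GFP_KERNEL);
			if (!controls[cpu])
				goto fail;
		}
		return 1;

	fail:
		free_msrs();	/* frees everything allocated so far */
		return 0;
	}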
Signed-off-by: Robert Richter <[email protected]>
|
|
APERF/MPERF can be handled via the table like all the other scattered
CPU flags.
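For illustration, the table-driven detection amounts to one entry of
this shape; the struct layout, the CR_ECX constant and the CPUID leaf
are shown as assumptions rather than the exact kernel definition:

	/* Each scattered-feature entry names a synthetic flag and the CPUID
	 * register/bit/leaf that reports it.
	 */
	struct cpuid_bit {
		u16 feature;	/* X86_FEATURE_* value */
		u8  reg;	/* which of EAX..EDX */
		u8  bit;	/* bit position in that register */
		u32 level;	/* CPUID leaf to query */
	};

	static const struct cpuid_bit cpuid_bits[] = {
		/* APERF/MPERF: CPUID leaf 0x6, ECX bit 0 (assumed encoding) */
		{ X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x00000006 },
	};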
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Thomas Renninger <[email protected]>
Cc: Borislav Petkov <[email protected]>
LKML-Reference: <[email protected]>
|
|
K8_NB depends on PCI, and when the latter is disabled (allnoconfig) we
fail at the final linking stage due to the missing exported
num_k8_northbridges. Add a header stub for that.
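A sketch of the kind of stub meant here; the exact guard and header
layout may differ:

	/* k8.h-style stub: when the K8 northbridge code is not built
	 * (e.g. PCI disabled), provide a dummy so callers still link.
	 */
	#ifdef CONFIG_K8_NB
	extern int num_k8_northbridges;
	#else
	#define num_k8_northbridges 0
	#endif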
Signed-off-by: Borislav Petkov <[email protected]>
LKML-Reference: <20100503183036.GJ26107@aftab>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
Any processor that supports SIMD will have an internal FPU, so the
irq13 handler will not be enabled.
Signed-off-by: Brian Gerst <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
Clean up the kernel exception handling and make it more similar to
the other traps.
Signed-off-by: Brian Gerst <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
The only difference between FPU and SIMD exceptions is where the
status bits are read from (cwd/swd vs. mxcsr). This also fixes
the discrepancy introduced by commit adf77bac, which fixed FPU
but not SIMD.
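A condensed sketch of the unified handler; get_fpu_cwd()/get_fpu_swd()/
get_fpu_mxcsr() are the usual i387 helpers, and the SIGFPE delivery is
elided:

	/* Only the source of the status bits differs between the two traps. */
	static void math_error(struct pt_regs *regs, int error_code, int trapnr)
	{
		struct task_struct *task = current;
		unsigned short err;

		if (trapnr == 19) {			/* SIMD (#XM) */
			unsigned short mxcsr = get_fpu_mxcsr(task);
			err = ~(mxcsr >> 7) & mxcsr;	/* unmasked, raised flags */
		} else {				/* FPU (#MF) */
			unsigned short cwd = get_fpu_cwd(task);
			unsigned short swd = get_fpu_swd(task);
			err = swd & ~cwd;
		}

		/* ... map err to an FPE_FLT* code and deliver SIGFPE as before ... */
	}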
Signed-off-by: Brian Gerst <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
The cache flush denied error is an erratum on some AMD 486 clones. If an invd
instruction is executed in userspace, the processor raises exception 19 (0x13)
instead of #GP (13 decimal). On CPUs where XMM is not supported, redirect
exception 19 to do_general_protection(). Also, remove die_if_kernel(), since
this was its last user.
Signed-off-by: Brian Gerst <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
With F10, model 10, all valid frequencies are in the ACPI _PST table.
Cc: <[email protected]> # 33.x 32.x
Signed-off-by: Mark Langsdorf <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Thomas Renninger <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Commit e67a807 ("x86: Fix 'reservetop=' functionality") added a
fixup_early_ioremap() call to parse_reservetop() and declared it
in io.h.
But asm/io.h was only included indirectly - and on some configs
not at all, causing a build failure on those configs.
Cc: Liang Li <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Jeremy Fitzhardinge <[email protected]>
Cc: Wang Chen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andrew Morton <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
|
|
|
|
The breakpoint generic layer assumes that archs always know in advance
the static number of address registers available to host breakpoints,
through the HBP_NUM macro.
However, this is not true for every arch. ARM, for example, needs to get
this information dynamically to handle compatibility between different
versions.
To solve this, this patch drops the static HBP_NUM macro and lets the
arch provide the number of available slots through a new
hw_breakpoint_slots() function. For archs that select
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS, it will be called once, since the
same set of registers hosts instruction and data breakpoints together.
For the others it will be called twice: once to get the number of
instruction breakpoint registers and once to get the number of data
breakpoint registers, with the targeted type passed as a parameter to
hw_breakpoint_slots().
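A sketch of both sides of the new interface; the slot count, the
nr_*_slots names and init_breakpoint_slots() are illustrative, while
TYPE_INST/TYPE_DATA are the generic breakpoint type indices:

	/* Arch side (x86-like): the same four debug address registers serve
	 * instruction and data breakpoints, so the answer ignores the type.
	 */
	int hw_breakpoint_slots(int type)
	{
		return 4;	/* what the static HBP_NUM used to say */
	}

	/* Generic side: ask the arch instead of relying on HBP_NUM. */
	static unsigned int nr_inst_slots, nr_data_slots;	/* names illustrative */

	static void __init init_breakpoint_slots(void)
	{
		nr_inst_slots = hw_breakpoint_slots(TYPE_INST);
		nr_data_slots = hw_breakpoint_slots(TYPE_DATA);
	}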
Reported-by: Will Deacon <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Acked-by: Paul Mundt <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: K. Prasad <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Jason Wessel <[email protected]>
Cc: Ingo Molnar <[email protected]>
|
|
There are two prevailing ways for archs to implement hardware
breakpoints.
The first separates the breakpoint address definition space between
data and instruction breakpoints. We then typically have distinct
instruction address breakpoint registers and data address breakpoint
registers, with separate control registers for data and instruction
breakpoints as well. This is the case of PowerPC and ARM, for example.
The second merges the breakpoint address space between data and
instruction breakpoints. Address registers can host either an
instruction or a data address, and the access mode for the breakpoint
is defined in a control register. This is the case of x86 and SuperH.
This patch adds a new CONFIG_HAVE_MIXED_BREAKPOINTS_REGS config
that archs can select if they belong to the second case. Those
will have their slot allocation merged for instruction and data
breakpoints.
The others will have separate slot tracking for data and instruction
breakpoints.
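Roughly how the generic layer can index its slot-tracking arrays under
the two schemes (a sketch, not the verbatim header):

	#ifdef CONFIG_HAVE_MIXED_BREAKPOINTS_REGS
	/* x86/SuperH style: one shared pool, both types map to the same slot. */
	enum bp_type_idx {
		TYPE_INST	= 0,
		TYPE_DATA	= 0,
		TYPE_MAX	= 1,
	};
	#else
	/* PowerPC/ARM style: instruction and data slots tracked separately. */
	enum bp_type_idx {
		TYPE_INST	= 0,
		TYPE_DATA	= 1,
		TYPE_MAX	= 2,
	};
	#endif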
Signed-off-by: Frederic Weisbecker <[email protected]>
Acked-by: Paul Mundt <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: K. Prasad <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
|
|
The current policies of breakpoints in x86 and SH are the following:
- task bound breakpoints can only break on userspace addresses
- cpu wide breakpoints can only break on kernel addresses
The former rule prevents ptrace breakpoints from being set to trigger
on kernel addresses, which is good. But as a side effect, we can't
breakpoint on kernel addresses at all for task bound breakpoints.
The latter rule simply makes no sense; there is no reason why we
can't set breakpoints on userspace while performing cpu bound
profiles.
We want the following new policies:
- task bound breakpoints can set userspace address breakpoints, with
no particular privilege required.
- task bound breakpoints can set kernelspace address breakpoints but
must be privileged to do so.
- cpu bound breakpoints can do what they want, as they are privileged
already.
To implement these new policies, this patch checks if we are dealing
with a kernel address breakpoint, if so and if the exclude_kernel
parameter is set, we tell the user that the breakpoint is invalid,
which makes a good generic ptrace protection.
If we don't have exclude_kernel, ensure the user has the right
privileges as kernel breakpoints are quite sensitive (risk of
trap recursion attacks and global performance impacts).
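A sketch of the resulting generic check; arch_check_bp_in_kernelspace()
is the assumed per-arch helper that reports whether bp_addr targets the
kernel:

	static int validate_hw_breakpoint(struct perf_event *bp)
	{
		if (arch_check_bp_in_kernelspace(bp)) {
			if (bp->attr.exclude_kernel)
				return -EINVAL;	/* e.g. ptrace: kernel addresses rejected */
			if (!capable(CAP_SYS_ADMIN))
				return -EPERM;	/* kernel breakpoints need privilege */
		}
		return 0;
	}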
[ Paul Mundt: keep addr space check for sh signal delivery and fix
double function declaration]
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: K. Prasad <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Jason Wessel <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Paul Mundt <[email protected]>
|
|
Tag ptrace breakpoints with the exclude_kernel attribute set. This
will make it easier to set generic policies on breakpoints when it
comes to ensuring that nobody unprivileged tries to breakpoint on the
kernel.
Signed-off-by: Frederic Weisbecker <[email protected]>
Acked-by: Paul Mundt <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: K. Prasad <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
|
|
Upstream PV guests fail to boot because of a NULL pointer dereference
in irq_force_complete_move(). It is possible for Xen guests to have
irq_desc->chip_data == NULL.
Test for a NULL chip_data pointer before attempting to complete an irq move.
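A sketch of the guard; the rest of the function body is abbreviated:

	void irq_force_complete_move(int irq)
	{
		struct irq_desc *desc = irq_to_desc(irq);
		struct irq_cfg *cfg = desc->chip_data;

		if (!cfg)	/* Xen PV guests may leave chip_data NULL */
			return;

		/* ... proceed with completing the vector move as before ... */
	}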
Signed-off-by: Prarit Bhargava <[email protected]>
LKML-Reference: <[email protected]>
Acked-by: Suresh Siddha <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: <[email protected]> [2.6.33]
|
|
When specifying the 'reservetop=0xbadc0de' kernel parameter,
the kernel stops booting due to an early_ioremap bug that
relates to commit 8827247ff.
The root cause of the boot failure is that 'slot_virt[i]' is
initialized in setup_arch()->early_ioremap_init(), but later in
setup_arch(), parse_early_param() modifies 'FIXADDR_TOP' when
'reservetop=0xbadc0de' is specified.
The simplest fix might be to use __fix_to_virt(idx0) to get the
updated value of 'FIXADDR_TOP' in '__early_ioremap', instead of
referencing the old value from slot_virt[slot] directly.
Changelog since v0:
-v1: When reservetop is handled, FIXADDR_TOP gets adjusted; hence
check prev_map, then re-initialize slot_virt and the PMD based on
the new FIXADDR_TOP.
-v2: Place fixup_early_ioremap(), which calls early_ioremap_init(),
in reserve_top_address() to re-initialize slot_virt and the
corresponding PMD when parsing reservetop.
-v3: Move fixup_early_ioremap() out of reserve_top_address() to make
sure other clients of reserve_top_address(), like xen/lguest, won't
break.
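Roughly what the fixup amounts to, assuming prev_map[] tracks the slots
handed out by __early_ioremap():

	/* If any early ioremap slot is already in use when FIXADDR_TOP moves,
	 * we cannot safely relocate it; otherwise simply recompute slot_virt[]
	 * and the fixmap PMD from the new FIXADDR_TOP.
	 */
	void __init fixup_early_ioremap(void)
	{
		int i;

		for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
			if (prev_map[i]) {
				WARN_ON(1);
				break;
			}
		}

		early_ioremap_init();
	}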
Signed-off-by: Liang Li <[email protected]>
Tested-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: Yinghai Lu <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
Cc: Wang Chen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andrew Morton <[email protected]>
LKML-Reference: <[email protected]>
[ fixed three small cleanliness details in fixup_early_ioremap() ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Merge reason: update to the latest -rc.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Merge reason:
Conflict between LOCK_PREFIX_HERE and relative alternatives
pointers
Resolved Conflicts:
arch/x86/include/asm/alternative.h
arch/x86/kernel/alternative.c
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
Checkin b3ac891b67bd4b1fc728d1c784cad1212dea433d:
x86: Add support for lock prefix in alternatives
... did not define LOCK_PREFIX_HERE in the case of a uniprocessor
build. As a result, any usage of this macro would fail on a
uniprocessor build. Fix this by defining LOCK_PREFIX_HERE as a null
string.
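In outline (the SMP-side definition of LOCK_PREFIX_HERE is elided):

	#ifdef CONFIG_SMP
	/* LOCK_PREFIX_HERE records the instruction's location in .smp_locks
	 * (definition elided); LOCK_PREFIX then emits the actual prefix. */
	#define LOCK_PREFIX	LOCK_PREFIX_HERE "\n\tlock; "
	#else
	/* UP build: nothing to record and no prefix needed, but users of
	 * LOCK_PREFIX_HERE must still compile. */
	#define LOCK_PREFIX_HERE ""
	#define LOCK_PREFIX ""
	#endif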
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Luca Barbieri <[email protected]>
LKML-Reference: <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
x86: Disable large pages on CPUs with Atom erratum AAE44
x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero
x86, mrst: Conditionally register cpu hotplug notifier for apbt
|
|
After programming the HPET, we do a readback as a workaround for
ATI/SBx00 chipsets as a synchronization. Unfortunately this triggers
an erratum in newer ICH chipsets (ICH9+) where reading the comparator
immediately after the write returns the old value. Furthermore, as
always, I/O reads are bad for performance.
Therefore, restrict the readback to the chipsets that need it, or, for
debugging purposes, when we are running with hpet=verbose.
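An illustrative shape of the restricted readback; hpet_writel_cmp(),
hpet_cmp_readback and hpet_verbose are placeholders for whatever the
quirk logic actually sets:

	static void hpet_writel_cmp(unsigned int cmp, int timer)
	{
		hpet_writel(cmp, HPET_Tn_CMP(timer));

		/* Read back only where needed: ATI/SBx00 sync workaround,
		 * or when debugging with hpet=verbose. */
		if (hpet_cmp_readback || hpet_verbose)
			hpet_readl(HPET_Tn_CMP(timer));
	}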
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Venkatesh Pallipadi <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
No functional change intended.
Signed-off-by: Jan Beulich <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
It's not used by any module, and i386 (as well as some other arches)
also doesn't export its equivalent (swapper_pg_dir).
Signed-off-by: Jan Beulich <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
Reduce the SMP locks table size by using relative pointers instead of
absolute ones, thus cutting the table size by half.
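A condensed sketch: .smp_locks now holds s32 offsets relative to each
table entry rather than absolute pointers, and the patching loop
reconstructs the address before poking the lock prefix (0xf0):

	static void alternatives_smp_lock(const s32 *start, const s32 *end,
					  u8 *text, u8 *text_end)
	{
		const s32 *poke;

		for (poke = start; poke < end; poke++) {
			u8 *ptr = (u8 *)poke + *poke;	/* offset -> absolute address */

			if (ptr >= text && ptr < text_end)
				text_poke(ptr, ((unsigned char []){0xf0}), 1);
		}
	}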
Signed-off-by: Jan Beulich <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
... i.e. when the hole between two regions isn't occupied by memory on
another node. This reduces the memory->node table size, and thus the
cache footprint of lookups, which had grown significantly some time
ago; on the systems I looked at, things go back to how they were
before that change.
Signed-off-by: Jan Beulich <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
... generating slightly smaller code.
Signed-off-by: Jan Beulich <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
|
|
ACPI _CRS Address Space Descriptors have _MIN, _MAX, and _LEN. Linux has
been computing Address Spaces as [_MIN to _MIN + _LEN - 1]. Based on the
tests in the bug reports below, Windows apparently uses [_MIN to _MAX].
Per spec (ACPI 4.0, Table 6-40), for _CRS fixed-size, fixed location
descriptors, "_LEN must be (_MAX - _MIN + 1)", and when that's true, it
doesn't matter which way we compute the end. But of course, there are
BIOSes that don't follow this rule, and we're better off if Linux handles
those exceptions the same way as Windows.
This patch makes Linux use [_MIN to _MAX], as Windows seems to do. This
effectively reverts d558b483d5 and 03db42adfe and replaces them with
simpler code.
https://bugzilla.kernel.org/show_bug.cgi?id=14337 (round)
https://bugzilla.kernel.org/show_bug.cgi?id=15480 (truncate)
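A sketch of the change in the resource setup; the field names follow
the ACPI address descriptor and should be read as illustrative:

	/* Take the end straight from _MAX instead of computing it from _LEN. */
	res->start = addr.minimum;
	res->end   = addr.maximum;	/* was: addr.minimum + addr.address_length - 1 */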
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Jesse Barnes <[email protected]>
|
|
When we move a PCI device or assign resources to a device not configured
by the BIOS, we want to avoid the BIOS region below 1MB. Note that if the
BIOS places devices below 1MB, we leave them there.
See https://bugzilla.kernel.org/show_bug.cgi?id=15744
and https://bugzilla.kernel.org/show_bug.cgi?id=15841
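A sketch of the constraint in the x86 resource-alignment path; BIOS_END
is the 1 MB boundary described above:

	#define BIOS_END	0x00100000

	/* Never hand out memory space below 1 MB when we are the ones
	 * assigning the address; BIOS-assigned addresses are left alone. */
	if (res->flags & IORESOURCE_MEM) {
		if (start < BIOS_END)
			start = BIOS_END;
	}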
Tested-by: Andy Isaacson <[email protected]>
Tested-by: Andy Bailey <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Jesse Barnes <[email protected]>
|
|
Implement the jmp far opcode ff/5. It is used by the multiboot loader.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Add decoding of the Ep argument type used by callf/jmpf.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
segment_base() is used only by vmx so move it there.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Fix segment_base() to properly check for a null segment selector and
avoid dereferencing a NULL pointer if the LDT selector is null.
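A condensed sketch of the hardened lookup; current_gdt_base() is an
illustrative helper and the descriptor walk is abbreviated:

	static u64 segment_base(u16 selector)
	{
		unsigned long table_base = current_gdt_base();	/* illustrative */
		struct desc_struct *d;

		if (!(selector & ~3))			/* null selector (RPL ignored) */
			return 0;

		if (selector & 4) {			/* TI set: selector is in the LDT */
			u16 ldt_selector = kvm_read_ldt();

			if (!(ldt_selector & ~3))	/* null LDT selector: bail out */
				return 0;
			table_base = segment_base(ldt_selector);
		}

		d = (struct desc_struct *)(table_base + (selector & ~7));
		return get_desc_base(d);
	}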
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Linux now has native_store_gdt() to do the same. Use it instead of
the KVM-local version.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
When injecting a vmexit.intr into the nested hypervisor,
there might be leftover values in the exit_info fields.
Clear them so as not to confuse nested hypervisors.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
If we have the following situation with nested svm:
1. Host KVM intercepts cr0 writes
2. Guest hypervisor intercepts only selective cr0 writes
Then we get a cr0 write intercept which is handled on the
host. But that intercept may actually be a selective cr0
intercept for the guest. This patch checks for this
condition and injects a selective cr0 intercept if needed.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
The vcpu->arch.cr0 variable is already set in the
architecture specific set_cr0 callbacks. There is no need to
set it in the common code.
This allows the architecture code to keep the old arch.cr0
value if it wants. This is required for nested svm to decide
if a selective_cr0 exit needs to be injected.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Hyper-V as a guest wants to write this bit. This patch
ignores it.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
This patch implements the emulation of the vm_cr msr for
nested svm.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
This patch adds a tracepoint to get information about the
most important intercept bitmasks from the nested vmcb.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
A recent change broke tracing of the nested vmcb address. It
was reported as 0 all the time. This patch fixes it.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
This patch implements the NMI intercept checking for nested
svm.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Without resetting the MMU, the gva_to_gpa function will not
work reliably when the vcpu is running in nested context.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
This patch removes whitespace errors, fixes comment formats
and most of checkpatch warnings. Now vim does not show
c-space-errors anymore.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Call directly into the vendor services for getting/setting rflags in
emulate_instruction to ensure injected TF survives the emulation.
Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
RF is not required for injecting TF as the latter will trigger only
after an instruction execution anyway. So do not touch RF when arming or
disarming guest single-step mode.
Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
When in guest debugging mode, we have to reinject those #BP software
exceptions that are caused by guest-injected INT3. As older AMD
processors do not support the required nRIP VMCB field, try to emulate
it by moving RIP past the instruction on exception injection. Fix it up
again in case the injection failed and we were able to catch this. This
does not work for unintercepted faults, but it is better than doing
nothing.
Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Based on Gleb's suggestion: Add a helper kvm_is_linear_rip that matches
a given linear RIP against the current one. Use this for guest
single-stepping; more users will follow.
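The helper itself is small; roughly, with kvm_rip_read() and
get_segment_base() being the existing KVM accessors:

	/* Compare the vcpu's current linear RIP (CS base + RIP) against a
	 * previously recorded linear RIP.
	 */
	bool kvm_is_linear_rip(struct kvm_vcpu *vcpu, unsigned long linear_rip)
	{
		unsigned long current_rip = kvm_rip_read(vcpu) +
					    get_segment_base(vcpu, VCPU_SREG_CS);

		return current_rip == linear_rip;
	}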
Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Move svm_queue_exception past skip_emulated_instruction to allow calling
it later on.
Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
This restores the deferred VCPU kicking that existed before 956f97cf.
We need this on -rt, as wake_up* requires non-atomic context in that
configuration.
Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|