Age | Commit message (Collapse) | Author | Files | Lines |
|
We no longer rely on paging_tmpl.h defines; so we can move the function
to mmu.c.
Rely on zero extension to 64 bits to get the correct nx behaviour.
Reviewed-by: Xiao Guangrong <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
If nx is disabled, then is gpte[63] is set we will hit a reserved
bit set fault before checking permissions; so we can ignore the
setting of efer.nxe.
Reviewed-by: Xiao Guangrong <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
gpte_access() computes the access permissions of a guest pte and also
write-protects clean gptes. This is wrong when we are servicing a
write fault (since we'll be setting the dirty bit momentarily) but
correct when instantiating a speculative spte, or when servicing a
read fault (since we'll want to trap a following write in order to
set the dirty bit).
It doesn't seem to hurt in practice, but in order to make the code
readable, push the write protection out of gpte_access() and into
a new protect_clean_gpte() which is called explicitly when needed.
Reviewed-by: Xiao Guangrong <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
- mention that system time needs to be added to wallclock time
- positive tsc_shift means left shift, not right
- mention additional 32bit right shift
Signed-off-by: Stefan Fritsch <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
vcpu mutex can be held for unlimited time so
taking it with mutex_lock on an ioctl is wrong:
one process could be passed a vcpu fd and
call this ioctl on the vcpu used by another process,
it will then be unkillable until the owner exits.
Call mutex_lock_killable instead and return status.
Note: mutex_lock_interruptible would be even nicer,
but I am not sure all users are prepared to handle EINTR
from these ioctls. They might misinterpret it as an error.
Cleanup paths expect a vcpu that can't be used by
any userspace so this will always succeed - catch bugs
by calling BUG_ON.
Catch callers that don't check return state by adding
__must_check.
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Use macros for bitness-insensitive register names, instead of
rolling our own.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Use macros for bitness-insensitive register names, instead of
rolling our own.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
LTO (link-time optimization) doesn't like local labels to be referred to
from a different function, since the two functions may be built in separate
compilation units. Use an external variable instead.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
find_highest_vector() and count_vectors():
- Instead of using magic values, define and use proper macros.
find_highest_vector():
- Remove likely() which is there only for historical reasons and not
doing correct branch predictions anymore. Using such heuristics
to optimize this function is not worth it now. Let CPUs predict
things instead.
- Stop checking word[0] separately. This was only needed for doing
likely() optimization.
- Use for loop, not while, to iterate over the register array to make
the code clearer.
Note that we actually confirmed that the likely() did wrong predictions
by inserting debug code.
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Takuya Yoshikawa <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Checking the return of kvm_mmu_get_page is unnecessary since it is
guaranteed by memory cache
Signed-off-by: Xiao Guangrong <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
KVM lapic timer and tsc deadline timer based on hrtimer,
setting a leftmost node to rb tree and then do hrtimer reprogram.
If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram
do nothing and then make kvm lapic timer and tsc deadline timer fail.
Signed-off-by: Liu, Jinsong <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Signed-off-by: Jan Kiszka <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
interrupt_bitmap is KVM_NR_INTERRUPTS bits in size,
so just use that instead of hard-coded constants
and math.
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Optimize "rep ins" by allowing emulator to write back more than one
datum at a time. Introduce new operand type OP_MEM_STR which tells
writeback() that dst contains pointer to an array that should be written
back as opposite to just one data element.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Remove unneeded segment argument. Address structure already has correct
segment which was put there during decode.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Current code assumes that IO exit was due to instruction emulation
and handles execution back to emulator directly. This patch adds new
userspace IO exit completion callback that can be set by any other code
that caused IO exit to userspace.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Other arches do not need this.
Signed-off-by: Marcelo Tosatti <[email protected]>
v2: fix incorrect deletion of mmio sptes on gpa move (noticed by Takuya)
Signed-off-by: Avi Kivity <[email protected]>
|
|
PPC must flush all translations before the new memory slot
is visible.
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Introducing kvm_arch_flush_shadow_memslot, to invalidate the
translations of a single memory slot.
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
We never modify direct_access_msrs[], msrpm_ranges[],
svm_exit_handlers[] or x86_intercept_map[] at runtime.
Mark them r/o.
Signed-off-by: Mathias Krause <[email protected]>
Cc: Joerg Roedel <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
We use vmcs_field_to_offset_table[], kvm_vmx_segment_fields[] and
kvm_vmx_exit_handlers[] as lookup tables only -- make them r/o.
Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
We never change those, make them r/o.
Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
We never change emulate_ops[] at runtime so it should be r/o.
Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
The opcode tables never change at runtime, therefor mark them const.
Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
As the the compiler ensures that the memory operand is always aligned
to a 16 byte memory location, use the aligned variant of MOVDQ for
read_sse_reg() and write_sse_reg().
Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Some fields can be constified and/or made static to reduce code and data
size.
Numbers for a 32 bit build:
text data bss dec hex filename
before: 3351 80 0 3431 d67 cpuid.o
after: 3391 0 0 3391 d3f cpuid.o
Signed-off-by: Mathias Krause <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
kvm_pic_reset() is not used anywhere. Move reset logic from
pic_ioport_write() there.
Signed-off-by: Gleb Natapov <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
We will enter the guest with G and D cleared; as real hardware ignores D in
real mode, and G is taken care of by the limit test, we allow more code to
run in vm86 mode.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
While this is undocumented, real processors do not reload the segment
limit and access rights when loading a segment register in real mode.
Real programs rely on it so we need to comply with this behaviour.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
emulate_invalid_guest_state=1
emulate_invalid_guest_state=1 doesn't mean we don't munge the segments in the
vmcs; we do. So we need to return the real ones (maintained by vmx_set_segment).
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
We want the segment selector, nor segment number.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Segment limits are verified in real mode, not just protected mode.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
When loading a segment in real mode, only the base and selector must
be modified. The limit needs to be left alone, otherwise big real mode
users will hit a #GP due to limit checking (currently this is suppressed
because we don't check limits in real mode).
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Usually, big real mode uses large (4GB) segments. Currently we don't
virtualize this; if any segment has a limit other than 0xffff, we emulate.
But if we set the vmx-visible limit to 0xffff, we can use vm86 to virtualize
real mode; if an access overruns the segment limit, the guest will #GP, which
we will trap and forward to the emulator. This results in significantly
faster execution, and less risk of hitting an unemulated instruction.
If the limit is less than 0xffff, we retain the existing behaviour.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Real mode is always entered from protected mode with dpl=0. Since
the dpl doesn't affect execution, and we already override it to 3
in the vmcs (as vmx requires), we can allow execution in that state.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Real processors don't change segment limits and attributes while in
real mode. Mimic that behaviour.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Instead of using struct kvm_save_segment, use struct kvm_segment, which is what
the other APIs use. This leads to some simplification.
We replace save_rmode_seg() with a call to vmx_save_segment(). Since this depends
on rmode.vm86_active, we move the call to before setting the flag.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
fix_pmode_dataseg() looks up S in ->base instead of ->ar_bytes.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Commit b246dd5df139 ("KVM: VMX: Fix KVM_SET_SREGS with big real mode
segments") moved fix_rmode_seg() to vmx_set_segment(), so that it is
applied not just on transitions to real mode, but also on KVM_SET_SREGS
(migration). However fix_rmode_seg() not only munges the vmcs segments,
it also sets up the save area for us to restore when returning to
protected mode or to return in vmx_get_segment().
Move saving the segment into a new function, save_rmode_seg(), and
call it just during the transition.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Instead of populating the entire register file, read in registers
as they are accessed, and write back only the modified ones. This
saves a VMREAD and VMWRITE on Intel (for rsp, since it is not usually
used during emulation), and a two 128-byte copies for the registers.
Signed-off-by: Avi Kivity <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
The build error was caused by that builtin functions are calling
the functions implemented in modules. This error was introduced by
commit 4d8b81abc4 ("KVM: introduce readonly memslot").
The patch fixes the build error by moving function __gfn_to_hva_memslot()
from kvm_main.c to kvm_host.h and making that "inline" so that the
builtin function (kvmppc_h_enter) can use that.
Acked-by: Paul Mackerras <[email protected]>
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
Merging critical fixes from upstream required for development.
* upstream/master: (809 commits)
libata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry
Revert "powerpc: Update g5_defconfig"
powerpc/perf: Use pmc_overflow() to detect rolled back events
powerpc: Fix VMX in interrupt check in POWER7 copy loops
powerpc: POWER7 copy_to_user/copy_from_user patch applied twice
powerpc: Fix personality handling in ppc64_personality()
powerpc/dma-iommu: Fix IOMMU window check
powerpc: Remove unnecessary ifdefs
powerpc/kgdb: Restore current_thread_info properly
powerpc/kgdb: Bail out of KGDB when we've been triggered
powerpc/kgdb: Do not set kgdb_single_step on ppc
powerpc/mpic_msgr: Add missing includes
powerpc: Fix null pointer deref in perf hardware breakpoints
powerpc: Fixup whitespace in xmon
powerpc: Fix xmon dl command for new printk implementation
xfs: check for possible overflow in xfs_ioc_trim
xfs: unlock the AGI buffer when looping in xfs_dialloc
xfs: fix uninitialised variable in xfs_rtbuf_get()
powerpc/fsl: fix "Failed to mount /dev: No such device" errors
powerpc/fsl: update defconfigs
...
Signed-off-by: Marcelo Tosatti <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull arm-soc fixes from Arnd Bergmann:
"Bug fixes for various ARM platforms. About half of these are for OMAP
and submitted before but did not make it into v3.6-rc2."
* tag 'fixes-3.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
ARM: ux500: don't select LEDS_GPIO for snowball
ARM: imx: build i.MX6 functions only when needed
ARM: imx: select CPU_FREQ_TABLE when needed
ARM: imx: fix ksz9021rn_phy_fixup
ARM: imx: build pm-imx5 code only when PM is enabled
ARM: omap: allow building omap44xx without SMP
ARM: dts: imx51-babbage: fix esdhc cd/wp properties
ARM: imx6: spin the cpu until hardware takes it down
ARM: ux500: Ensure probing of Audio devices when Device Tree is enabled
ARM: ux500: Fix merge error, no matching driver name for 'snd_soc_u8500'
ARM i.MX6q: Add virtual 1/3.5 dividers in the LDB clock path
ARM: Kirkwood: fix Makefile.boot
ARM: Kirkwood: Fix iconnect leds
ARM: Orion: Set eth packet size csum offload limit
ARM: mv78xx0: fix win_cfg_base prototype
ARM: OMAP: dmtimers: Fix locking issue in omap_dm_timer_request*()
ARM: mmp: fix potential NULL dereference
ARM: OMAP4: Register the OPP table only for 4430 device
cpufreq: OMAP: Handle missing frequency table on SMP systems
ARM: OMAP4: sleep: Save the complete used register stack frame
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull three xen bug-fixes from Konrad Rzeszutek Wilk:
- Revert the kexec fix which caused on non-kexec shutdowns a race.
- Reuse existing P2M leafs - instead of requiring to allocate a large
area of bootup virtual address estate.
- Fix a one-off error when adding PFNs for balloon pages.
* tag 'stable/for-linus-3.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M.
xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID.
Revert "xen PVonHVM: move shared_info to MMIO before kexec"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"I meant to sent that earlier but got swamped with other things, so
here are some powerpc fixes for 3.6. A few regression fixes and some
bug fixes that I deemed should still make it.
There's a FSL update from Kumar with a bunch of defconfig updates
along with a few embedded fixes.
I also reverted my g5_defconfig update that I merged earlier as it was
completely busted, not too sure what happened there, I'll do a new one
later."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
Revert "powerpc: Update g5_defconfig"
powerpc/perf: Use pmc_overflow() to detect rolled back events
powerpc: Fix VMX in interrupt check in POWER7 copy loops
powerpc: POWER7 copy_to_user/copy_from_user patch applied twice
powerpc: Fix personality handling in ppc64_personality()
powerpc/dma-iommu: Fix IOMMU window check
powerpc: Remove unnecessary ifdefs
powerpc/kgdb: Restore current_thread_info properly
powerpc/kgdb: Bail out of KGDB when we've been triggered
powerpc/kgdb: Do not set kgdb_single_step on ppc
powerpc/mpic_msgr: Add missing includes
powerpc: Fix null pointer deref in perf hardware breakpoints
powerpc: Fixup whitespace in xmon
powerpc: Fix xmon dl command for new printk implementation
powerpc/fsl: fix "Failed to mount /dev: No such device" errors
powerpc/fsl: update defconfigs
booke/wdt: some ioctls do not return values properly
powerpc/p4080ds: dts - add usb controller version info and port0
powerpc/85xx: mpc85xx_defconfig - add VIA PATA support for MPC85xxCDS
powerpc/fsl-pci: Only scan PCI bus if configured as a host
|
|
Pull kvm fixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86 emulator: use stack size attribute to mask rsp in stack ops
KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended
ppc: e500_tlb memset clears nothing
KVM: PPC: Add cache flush on page map
KVM: PPC: Book3S HV: Fix incorrect branch in H_CEDE code
KVM: x86: update KVM_SAVE_MSRS_BEGIN to correct value
|