| Age | Commit message | Author | Files | Lines |
|
There's still a race in kvm_vcpu_block(), if a wake_up_interruptible()
call happens before the task state is set to TASK_INTERRUPTIBLE:
CPU0                                    CPU1
kvm_vcpu_block
add_wait_queue
kvm_cpu_has_interrupt = 0
                                        set interrupt
                                        if (waitqueue_active())
                                            wake_up_interruptible()
kvm_cpu_has_pending_timer
kvm_arch_vcpu_runnable
signal_pending
set_current_state(TASK_INTERRUPTIBLE)
schedule()
Can be fixed by using prepare_to_wait() which sets the task state before
testing for the wait condition.
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
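The lost wake-up described above can be illustrated with a small single-threaded model. This is only a sketch, not kernel code: the struct, the enum and the function names are hypothetical, and the waker is replayed at the worst possible point (right after the sleeper tests the wait condition) instead of running on a second CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one sleeper and one waker; not kernel code. */
enum task_state { RUNNING, INTERRUPTIBLE };

struct model {
    enum task_state state;  /* task state of the sleeper */
    bool on_queue;          /* sleeper has been added to the wait queue */
    bool irq_pending;       /* the wait condition */
};

/* Waker side: post the interrupt, then wake a queued sleeper. */
static void waker(struct model *m)
{
    m->irq_pending = true;
    if (m->on_queue)
        m->state = RUNNING; /* a wake-up of a RUNNING task changes nothing */
}

/*
 * Returns true if the sleeper ends up blocked although an interrupt was
 * posted. state_before_test selects the prepare_to_wait()-style ordering
 * (task state set before the condition is tested).
 */
static bool sleeper_hangs(bool state_before_test)
{
    struct model m = { RUNNING, false, false };
    bool cond;

    m.on_queue = true;
    if (state_before_test)
        m.state = INTERRUPTIBLE;

    cond = m.irq_pending;   /* test the wait condition: still false */
    waker(&m);              /* the wake-up lands in the race window */

    if (!state_before_test)
        m.state = INTERRUPTIBLE; /* old ordering: overwrites the wake-up */

    /* schedule() only blocks a task that is still INTERRUPTIBLE */
    return !cond && m.state == INTERRUPTIBLE;
}
```

In this model sleeper_hangs(false) reports the hang, while sleeper_hangs(true) does not, because the wake-up resets the state to RUNNING before schedule() is reached.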
|
|
Signed-off-by: Sheng Yang <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
a) none of the callers even looks at inode or file returned by anon_inode_getfd()
b) any caller that would try to look at those would be racy, since by the time
it returns we might have raced with close() from another thread and that
file would be pining for fjords.
Signed-off-by: Al Viro <[email protected]>
|
|
Use kvm's own refcounting instead of playing with ->filp->f_count.
That will let us get rid of a lot of crap in anon_inode_getfd() and
kill a race in kvm_dev_ioctl_create_vm() (file might have been closed
immediately by another thread, so ->filp might point to already freed
struct file when we get around to setting it).
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
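As a rough userspace illustration of an object that carries its own reference count instead of borrowing the file's, consider the sketch below. The struct, its field and the vm_create()/vm_get()/vm_put() helpers are hypothetical names chosen for this example, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VM object; names are illustrative only. */
struct vm {
    atomic_int users;
};

static int vm_destroyed;    /* counts teardowns so the example can be checked */

static struct vm *vm_create(void)
{
    struct vm *vm = calloc(1, sizeof(*vm));

    if (vm)
        atomic_init(&vm->users, 1); /* the creator holds the first reference */
    return vm;
}

static void vm_get(struct vm *vm)
{
    atomic_fetch_add(&vm->users, 1);
}

static void vm_put(struct vm *vm)
{
    /* fetch_sub returns the old value; 1 means the last reference is gone */
    if (atomic_fetch_sub(&vm->users, 1) == 1) {
        vm_destroyed++;
        free(vm);
    }
}
```

With this shape, the object's lifetime no longer depends on whether another thread has already closed the file descriptor: whoever drops the last reference frees it.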
|
|
It's a globally exported symbol now.
Signed-off-by: Hollis Blanchard <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
So userspace can save/restore the mpstate during migration.
[avi: export the #define constants describing the value]
[christian: add s390 stubs]
[avi: ditto for ia64]
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Signed-off-by: Carsten Otte <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Timers that fire between guest hlt and vcpu_block's add_wait_queue() are
ignored, possibly resulting in hangs.
Also make sure that atomic_inc and waitqueue_active tests happen in the
specified order, otherwise the following race is open:
CPU0                                        CPU1
                                            if (waitqueue_active(wq))
add_wait_queue()
if (!atomic_read(pit_timer->pending))
    schedule()
                                            atomic_inc(pit_timer->pending)
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
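The ordering requirement on the timer side can be checked with a tiny deterministic model. This is a sketch under stated assumptions: the function and variable names are hypothetical, and the vcpu's steps are replayed between the timer's two steps, which is the interleaving shown in the diagram.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if the vcpu goes to sleep and is never woken.
 * inc_before_check selects the timer-side order: increment the pending
 * count first, then test whether anyone is on the wait queue.
 */
static bool vcpu_sleeps_forever(bool inc_before_check)
{
    int pending = 0;
    bool queued = false, woken = false, asleep;

    /* timer, first step */
    if (inc_before_check)
        pending++;
    else
        woken = queued;         /* queue is still empty: nobody to wake */

    /* vcpu runs in between: add to the queue, test pending, maybe sleep */
    queued = true;
    asleep = (pending == 0);

    /* timer, second step */
    if (inc_before_check)
        woken = queued;
    else
        pending++;              /* too late: the vcpu already saw zero */

    return asleep && !woken;
}
```

In this model, only the increment-then-check order guarantees that the vcpu either sees the pending timer or is found on the wait queue.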
|
|
This interface allows a userspace application to read the trace of
kvm-related events through relayfs.
Signed-off-by: Feng (Eric) Liu <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
This patch introduces a gfn_to_pfn() function and corresponding functions like
kvm_release_pfn_dirty(). Using these new functions, we can modify the x86
MMU to no longer assume that it can always get a struct page for any given gfn.
We don't want to eliminate gfn_to_page() entirely because a number of places
assume they can do gfn_to_page() and then kmap() the results. When we support
IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will
succeed.
This does not implement support for avoiding reference counting for reserved
RAM or for IO memory. However, it should make those things pretty
straightforward.
Since we're only introducing new common symbols, I don't think it will break
the non-x86 architectures but I haven't tested those. I've tested Intel,
AMD, NPT, and hugetlbfs with Windows and Linux guests.
[avi: fix overflow when shifting left pfns by adding casts]
Signed-off-by: Anthony Liguori <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
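The overflow mentioned in the [avi: ...] note can be reproduced in userspace. The sketch below assumes a 32-bit frame number type (pfn32_t is a hypothetical name modelling a 32-bit unsigned long) and 4 KiB pages; the helper names are illustrative and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12       /* 4 KiB pages, as on x86 */

typedef uint32_t pfn32_t;   /* hypothetical: a pfn stored in 32 bits */

/* Buggy variant: the shift is evaluated in 32 bits, so high bits are lost. */
static uint64_t pfn_to_addr_truncating(pfn32_t pfn)
{
    return (uint32_t)(pfn << PAGE_SHIFT);
}

/* Fixed variant: widen first, then shift. */
static uint64_t pfn_to_addr(pfn32_t pfn)
{
    return (uint64_t)pfn << PAGE_SHIFT;
}
```

For the frame at the 4 GiB boundary (pfn 0x100000), the truncating variant yields 0 while the widened one yields 0x100000000.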
|
|
The main purpose of adding these functions is the ability to release the
spinlock that protects the kvm list while still being able to do operations
on a specific kvm in a safe way.
Signed-off-by: Izik Eidus <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Since the size of kvm_regs is too big to allocate from kernel stack on ia64,
use kzalloc to allocate it.
Signed-off-by: Xiantao Zhang <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Create large page mappings if the guest PTEs are marked as such and
the underlying memory is hugetlbfs backed. If the largepage contains
write-protected pages, a large pte is not used.
Gives a consistent 2% improvement for data copies on a RAM-mounted
filesystem, without NPT/EPT.
Anthony measures a 4% improvement on 4-way kernbench, with NPT.
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Mark zapped root pagetables as invalid and ignore such pages during lookup.
This is a problem with the cr3-target feature, where a zapped root table fools
the faulting code into creating a read-only mapping. The result is a lockup
if the instruction can't be emulated.
Signed-off-by: Marcelo Tosatti <[email protected]>
Cc: Anthony Liguori <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
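The "mark as invalid, skip during lookup" idea can be sketched with a simplified list of shadow pages. Everything here is hypothetical: the struct is a stripped-down stand-in rather than the kernel's shadow page structure, and a plain linked list replaces the real hash lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified shadow page. */
struct shadow_page {
    uint64_t gfn;
    bool invalid;               /* set when a root table has been zapped */
    struct shadow_page *next;
};

/* Zapping a root only marks it; it stays linked until it can be freed. */
static void zap_root(struct shadow_page *sp)
{
    sp->invalid = true;
}

/* Lookup ignores invalid pages, so a zapped root is never handed out again. */
static struct shadow_page *lookup_page(struct shadow_page *head, uint64_t gfn)
{
    for (struct shadow_page *sp = head; sp; sp = sp->next)
        if (sp->gfn == gfn && !sp->invalid)
            return sp;
    return NULL;
}
```

A caller that gets NULL back would build a fresh page instead of reusing the zapped one.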
|
|
With CONFIG_PREEMPT=n, this is needed in order to prevent the fault-in
code from sleeping.
Signed-off-by: Andrea Arcangeli <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
The second page is only needed on archs that support pio.
Noted by Carsten Otte.
Signed-off-by: Avi Kivity <[email protected]>
|
|
Signed-off-by: Avi Kivity <[email protected]>
|
|
Signed-off-by: Jan Engelhardt <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Some Linux versions allow the timer interrupt to be processed by more than
one cpu, leading to hangs due to tsc instability. Work around the issue
by only dispatching the interrupt to vcpu 0.
Problem analyzed (and patch tested) by Sheng Yang.
Signed-off-by: Avi Kivity <[email protected]>
|
|
This patch replaces the mmap_sem lock for the memory slots with a new
kvm private lock. It is needed because until now there were cases where
kvm accessed user memory while holding the mmap semaphore.
Signed-off-by: Izik Eidus <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Sometimes simple attributes might need to return an error, e.g. for
acquiring a mutex interruptibly. In fact we have that situation in
spufs already, which is the original user of the simple attributes. This
patch merges the temporarily forked attributes in spufs back into the
main ones and allows them to return errors.
[[email protected]: build fix]
Signed-off-by: Christoph Hellwig <[email protected]>
Cc: <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
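One way to let an attribute getter report failure is to return a status code and pass the value through an out-parameter. The userspace sketch below illustrates that shape only; the typedef names, struct counter and the helpers are hypothetical, and -EINTR is used as an example of what an aborted interruptible lock acquisition might return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Value-returning getter: there is no room left to signal an error. */
typedef uint64_t (*attr_get_old)(void *data);

/* Status-returning getter: the value travels through an out-parameter. */
typedef int (*attr_get)(void *data, uint64_t *val);

/* Hypothetical attribute backing store. */
struct counter {
    bool locked;    /* models a lock that could not be taken */
    uint64_t value;
};

static int counter_get(void *data, uint64_t *val)
{
    struct counter *c = data;

    if (c->locked)
        return -EINTR;  /* report the failure instead of a bogus value */
    *val = c->value;
    return 0;
}

/* Generic read path: forwards the getter's status to the caller. */
static int attr_read(attr_get get, void *data, uint64_t *out)
{
    return get(data, out);
}
```

The read path can then propagate the error to the caller instead of formatting a meaningless number.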
|
|
Convert the synchronization of the shadow handling to a separate mmu_lock
spinlock.
Also guard fetch() by mmap_sem in read-mode to protect against alias
and memslot changes.
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
In preparation for an mmu spinlock, add kvm_read_guest_atomic()
and use it in fetch() and prefetch_page().
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Do not hold kvm->lock mutex across the entire pagefault code,
only acquire it in places where it is necessary, such as mmu
hash list, active list, rmap and parent pte handling.
Allow concurrent guest walkers by switching walk_addr() to use
mmap_sem in read-mode.
And get rid of the lockless __gfn_to_page.
[avi: move kvm_mmu_pte_write() locking inside the function]
[avi: add locking for real mode]
[avi: fix cmpxchg locking]
Signed-off-by: Marcelo Tosatti <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Move ioapic code to common, since IA64 also needs it.
Signed-off-by: Zhang Xiantao <[email protected]>
Signed-off-by: Avi Kivity <[email protected]>
|
|
Signed-off-by: Avi Kivity <[email protected]>
|