aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/kvm/powerpc.c
AgeCommit message (Collapse)AuthorFilesLines
2014-07-28KVM: PPC: Remove 440 supportAlexander Graf1-1/+0
The 440 target hasn't been properly functioning for a few releases and before I was the only one who fixes a very serious bug that indicates to me that nobody used it before either. Furthermore KVM on 440 is slow to the extent of unusable. We don't have to carry along completely unused code. Remove 440 and give us one less thing to worry about. Signed-off-by: Alexander Graf <[email protected]>
2014-07-28KVM: PPC: Allow kvmppc_get_last_inst() to failMihai Caraman1-2/+9
On book3e, guest last instruction is read on the exit path using load external pid (lwepx) dedicated instruction. This load operation may fail due to TLB eviction and execute-but-not-read entries. This patch lay down the path for an alternative solution to read the guest last instruction, by allowing kvmppc_get_lat_inst() function to fail. Architecture specific implmentations of kvmppc_load_last_inst() may read last guest instruction and instruct the emulation layer to re-execute the guest in case of failure. Make kvmppc_get_last_inst() definition common between architectures. Signed-off-by: Mihai Caraman <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2014-07-28KVM: PPC: Book3S: Make magic page properly 4k mappableAlexander Graf1-0/+19
The magic page is defined as a 4k page of per-vCPU data that is shared between the guest and the host to accelerate accesses to privileged registers. However, when the host is using 64k page size granularity we weren't quite as strict about that rule anymore. Instead, we partially treated all of the upper 64k as magic page and mapped only the uppermost 4k with the actual magic contents. This works well enough for Linux which doesn't use any memory in kernel space in the upper 64k, but Mac OS X got upset. So this patch makes magic page actually stay in a 4k range even on 64k page size hosts. This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE hosts for me. Signed-off-by: Alexander Graf <[email protected]>
2014-07-28KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabledPaul Mackerras1-0/+2
This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, that the hcall is actually implemented by the kernel. If not an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2014-07-28KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handlingPaul Mackerras1-0/+45
This provides a way for userspace controls which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls are enabled for in-kernel handling. The set that is enabled is the set that have an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2014-06-10Merge branch 'next' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "Here is the bulk of the powerpc changes for this merge window. It got a bit delayed in part because I wasn't paying attention, and in part because I discovered I had a core PCI change without a PCI maintainer ack in it. Bjorn eventually agreed it was ok to merge it though we'll probably improve it later and I didn't want to rebase to add his ack. There is going to be a bit more next week, essentially fixes that I still want to sort through and test. The biggest item this time is the support to build the ppc64 LE kernel with our new v2 ABI. We previously supported v2 userspace but the kernel itself was a tougher nut to crack. This is now sorted mostly thanks to Anton and Rusty. We also have a fairly big series from Cedric that add support for 64-bit LE zImage boot wrapper. This was made harder by the fact that traditionally our zImage wrapper was always 32-bit, but our new LE toolchains don't really support 32-bit anymore (it's somewhat there but not really "supported") so we didn't want to rely on it. This meant more churn that just endian fixes. This brings some more LE bits as well, such as the ability to run in LE mode without a hypervisor (ie. under OPAL firmware) by doing the right OPAL call to reinitialize the CPU to take HV interrupts in the right mode and the usual pile of endian fixes. There's another series from Gavin adding EEH improvements (one day we *will* have a release with less than 20 EEH patches, I promise!). Another highlight is the support for the "Split core" functionality on P8 by Michael. This allows a P8 core to be split into "sub cores" of 4 threads which allows the subcores to run different guests under KVM (the HW still doesn't support a partition per thread). And then the usual misc bits and fixes ..." [ Further delayed by gmail deciding that BenH is a dirty spammer. Google knows. ] * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (155 commits) powerpc/powernv: Add missing include to LPC code selftests/powerpc: Test the THP bug we fixed in the previous commit powerpc/mm: Check paca psize is up to date for huge mappings powerpc/powernv: Pass buffer size to OPAL validate flash call powerpc/pseries: hcall functions are exported to modules, need _GLOBAL_TOC() powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC() powerpc/powernv: Set memory_block_size_bytes to 256MB powerpc: Allow ppc_md platform hook to override memory_block_size_bytes powerpc/powernv: Fix endian issues in memory error handling code powerpc/eeh: Skip eeh sysfs when eeh is disabled powerpc: 64bit sendfile is capped at 2GB powerpc/powernv: Provide debugfs access to the LPC bus via OPAL powerpc/serial: Use saner flags when creating legacy ports powerpc: Add cpu family documentation powerpc/xmon: Fix up xmon format strings powerpc/powernv: Add calls to support little endian host powerpc: Document sysfs DSCR interface powerpc: Fix regression of per-CPU DSCR setting powerpc: Split __SYSFS_SPRSETUP macro arch: powerpc/fadump: Cleaning up inconsistent NULL checks ...
2014-05-30KVM: PPC: Add CAP to indicate hcall fixesAlexander Graf1-0/+1
We worked around some nasty KVM magic page hcall breakages: 1) NX bit not honored, so ignore NX when we detect it 2) LE guests swizzle hypercall instruction Without these fixes in place, there's no way it would make sense to expose kvm hypercalls to a guest. Chances are immensely high it would trip over and break. So add a new CAP that gives user space a hint that we have workarounds for the bugs above in place. It can use those as hint to disable PV hypercalls when the guest CPU is anything POWER7 or higher and the host does not have fixes in place. Signed-off-by: Alexander Graf <[email protected]>
2014-05-30KVM: PPC: Disable NX for old magic page using guestsAlexander Graf1-2/+12
Old guests try to use the magic page, but map their trampoline code inside of an NX region. Since we can't fix those old kernels, try to detect whether the guest is sane or not. If not, just disable NX functionality in KVM so that old guests at least work at all. For newer guests, add a bit that we can set to keep NX functionality available. Signed-off-by: Alexander Graf <[email protected]>
2014-05-30KVM: PPC: Make shared struct aka magic page guest endianAlexander Graf1-1/+32
The shared (magic) page is a data structure that contains often used supervisor privileged SPRs accessible via memory to the user to reduce the number of exits we have to take to read/write them. When we actually share this structure with the guest we have to maintain it in guest endianness, because some of the patch tricks only work with native endian load/store operations. Since we only share the structure with either host or guest in little endian on book3s_64 pr mode, we don't have to worry about booke or book3s hv. For booke, the shared struct stays big endian. For book3s_64 hv we maintain the struct in host native endian, since it never gets shared with the guest. For book3s_64 pr we introduce a variable that tells us which endianness the shared struct is in and route every access to it through helper inline functions that evaluate this variable. Signed-off-by: Alexander Graf <[email protected]>
2014-05-30KVM: PPC: PR: Fill pvinfo hcall instructions in big endianAlexander Graf1-8/+8
We expose a blob of hypercall instructions to user space that it gives to the guest via device tree again. That blob should contain a stream of instructions necessary to do a hypercall in big endian, as it just gets passed into the guest and old guests use them straight away. Signed-off-by: Alexander Graf <[email protected]>
2014-05-28powerpc/kvm/book3s_hv: Use threads_per_subcore in KVMMichael Ellerman1-1/+1
To support split core on POWER8 we need to modify various parts of the KVM code to use threads_per_subcore instead of threads_per_core. On systems that do not support split core threads_per_subcore == threads_per_core and these changes are a nop. We use threads_per_subcore as the value reported by KVM_CAP_PPC_SMT. This communicates to userspace that guests can only be created with a value of threads_per_core that is less than or equal to the current threads_per_subcore. This ensures that guests can only be created with a thread configuration that we are able to run given the current split core mode. Although threads_per_subcore can change during the life of the system, the commit that enables that will ensure that threads_per_subcore does not change during the life of a KVM VM. Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Michael Neuling <[email protected]> Acked-by: Alexander Graf <[email protected]> Acked-by: Paul Mackerras <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-01-27kvm/ppc: IRQ disabling cleanupScott Wood1-17/+9
Simplify the handling of lazy EE by going directly from fully-enabled to hard-disabled. This replaces the lazy_irq_pending() check (including its misplaced kvm_guest_exit() call). As suggested by Tiejun Chen, move the interrupt disabling into kvmppc_prepare_to_enter() rather than have each caller do it. Also move the IRQ enabling on heavyweight exit into kvmppc_prepare_to_enter(). Signed-off-by: Scott Wood <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2014-01-27KVM: PPC: Book3S: MMIO emulation support for little endian guestsCédric Le Goater1-4/+24
MMIO emulation reads the last instruction executed by the guest and then emulates. If the guest is running in Little Endian order, or more generally in a different endian order of the host, the instruction needs to be byte-swapped before being emulated. This patch adds a helper routine which tests the endian order of the host and the guest in order to decide whether a byteswap is needed or not. It is then used to byteswap the last instruction of the guest in the endian order of the host before MMIO emulation is performed. Finally, kvmppc_handle_load() of kvmppc_handle_store() are modified to reverse the endianness of the MMIO if required. Signed-off-by: Cédric Le Goater <[email protected]> [agraf: add booke handling] Signed-off-by: Alexander Graf <[email protected]>
2014-01-09KVM: PPC: Store FP/VSX/VMX state in thread_fp/vr_state structuresPaul Mackerras1-2/+2
This uses struct thread_fp_state and struct thread_vr_state to store the floating-point, VMX/Altivec and VSX state, rather than flat arrays. This makes transferring the state to/from the thread_struct simpler and allows us to unify the get/set_one_reg implementations for the VSX registers. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-10-17kvm: powerpc: book3s: drop is_hv_enabledAneesh Kumar K.V1-1/+1
drop is_hv_enabled, because that should not be a callback property Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-10-17kvm: powerpc: book3s: Allow the HV and PR selection per virtual machineAneesh Kumar K.V1-21/+55
This moves the kvmppc_ops callbacks to be a per VM entity. This enables us to select HV and PR mode when creating a VM. We also allow both kvm-hv and kvm-pr kernel module to be loaded. To achieve this we move /dev/kvm ownership to kvm.ko module. Depending on which KVM mode we select during VM creation we take a reference count on respective module Signed-off-by: Aneesh Kumar K.V <[email protected]> [agraf: fix coding style] Signed-off-by: Alexander Graf <[email protected]>
2013-10-17kvm: Add struct kvm arg to memslot APIsAneesh Kumar K.V1-4/+5
We will use that in the later patch to find the kvm ops handler Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-10-17kvm: powerpc: book3s: Support building HV and PR KVM as moduleAneesh Kumar K.V1-0/+10
Signed-off-by: Aneesh Kumar K.V <[email protected]> [agraf: squash in compile fix] Signed-off-by: Alexander Graf <[email protected]>
2013-10-17kvm: powerpc: book3s: Add is_hv_enabled to kvmppc_opsAneesh Kumar K.V1-25/+29
This help us to identify whether we are running with hypervisor mode KVM enabled. The change is needed so that we can have both HV and PR kvm enabled in the same kernel. If both HV and PR KVM are included, interrupts come in to the HV version of the kvmppc_interrupt code, which then jumps to the PR handler, renamed to kvmppc_interrupt_pr, if the guest is a PR guest. Allowing both PR and HV in the same kernel required some changes to kvm_dev_ioctl_check_extension(), since the values returned now can't be selected with #ifdefs as much as previously. We look at is_hv_enabled to return the right value when checking for capabilities.For capabilities that are only provided by HV KVM, we return the HV value only if is_hv_enabled is true. For capabilities provided by PR KVM but not HV, we return the PR value only if is_hv_enabled is false. NOTE: in later patch we replace is_hv_enabled with a static inline function comparing kvm_ppc_ops Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-10-17kvm: powerpc: Add kvmppc_ops callbackAneesh Kumar K.V1-44/+14
This patch add a new callback kvmppc_ops. This will help us in enabling both HV and PR KVM together in the same kernel. The actual change to enable them together is done in the later patch in the series. Signed-off-by: Aneesh Kumar K.V <[email protected]> [agraf: squash in booke changes] Signed-off-by: Alexander Graf <[email protected]>
2013-09-05Merge branch 'for-linus' of ↵Linus Torvalds1-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "Unfortunately, this merge window it'll have a be a lot of small piles - my fault, actually, for not keeping #for-next in anything that would resemble a sane shape ;-/ This pile: assorted fixes (the first 3 are -stable fodder, IMO) and cleanups + %pd/%pD formats (dentry/file pathname, up to 4 last components) + several long-standing patches from various folks. There definitely will be a lot more (starting with Miklos' check_submount_and_drop() series)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) direct-io: Handle O_(D)SYNC AIO direct-io: Implement generic deferred AIO completions add formats for dentry/file pathnames kvm eventfd: switch to fdget powerpc kvm: use fdget switch fchmod() to fdget switch epoll_ctl() to fdget switch copy_module_from_fd() to fdget git simplify nilfs check for busy subtree ibmasmfs: don't bother passing superblock when not needed don't pass superblock to hypfs_{mkdir,create*} don't pass superblock to hypfs_diag_create_files don't pass superblock to hypfs_vm_create_files() oprofile: get rid of pointless forward declarations of struct super_block oprofilefs_create_...() do not need superblock argument oprofilefs_mkdir() doesn't need superblock argument don't bother with passing superblock to oprofile_create_stats_files() oprofile: don't bother with passing superblock to ->create_files() don't bother passing sb to oprofile_create_files() coh901318: don't open-code simple_read_from_buffer() ...
2013-09-03powerpc kvm: use fdgetAl Viro1-10/+10
Signed-off-by: Al Viro <[email protected]>
2013-08-29Merge remote-tracking branch 'origin/next' into kvm-ppc-nextAlexander Graf1-0/+4
Conflicts: mm/Kconfig CMA DMA split and ZSWAP introduction were conflicting, fix up manually.
2013-07-18KVM: Introduce kvm_arch_memslots_updated()Takuya Yoshikawa1-0/+4
This is called right after the memslots is updated, i.e. when the result of update_memslots() gets installed in install_new_memslots(). Since the memslots needs to be updated twice when we delete or move a memslot, kvm_arch_commit_memory_region() does not correspond to this exactly. In the following patch, x86 will use this new API to check if the mmio generation has reached its maximum value, in which case mmio sptes need to be flushed out. Signed-off-by: Takuya Yoshikawa <[email protected]> Acked-by: Alexander Graf <[email protected]> Reviewed-by: Xiao Guangrong <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2013-07-11kvm/ppc: Call trace_hardirqs_on before entryScott Wood1-2/+0
Currently this is only being done on 64-bit. Rather than just move it out of the 64-bit ifdef, move it to kvm_lazy_ee_enable() so that it is consistent with lazy ee state, and so that we don't track more host code as interrupts-enabled than necessary. Rename kvm_lazy_ee_enable() to kvm_fix_ee_before_entry() to reflect that this function now has a role on 32-bit as well. Signed-off-by: Scott Wood <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-05-02KVM: PPC: Book3S: Add API for in-kernel XICS emulationPaul Mackerras1-0/+22
This adds the API for userspace to instantiate an XICS device in a VM and connect VCPUs to it. The API consists of a new device type for the KVM_CREATE_DEVICE ioctl, a new capability KVM_CAP_IRQ_XICS, which functions similarly to KVM_CAP_IRQ_MPIC, and the KVM_IRQ_LINE ioctl, which is used to assert and deassert interrupt inputs of the XICS. The XICS device has one attribute group, KVM_DEV_XICS_GRP_SOURCES. Each attribute within this group corresponds to the state of one interrupt source. The attribute number is the same as the interrupt source number. This does not support irq routing or irqfd yet. Signed-off-by: Paul Mackerras <[email protected]> Acked-by: David Gibson <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-05-02kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/writeScott Wood1-4/+19
These functions do an srcu_dereference without acquiring the srcu lock themselves. Signed-off-by: Scott Wood <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-04-26KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controllerBenjamin Herrenschmidt1-0/+3
This adds in-kernel emulation of the XICS (eXternal Interrupt Controller Specification) interrupt controller specified by PAPR, for both HV and PR KVM guests. The XICS emulation supports up to 1048560 interrupt sources. Interrupt source numbers below 16 are reserved; 0 is used to mean no interrupt and 2 is used for IPIs. Internally these are represented in blocks of 1024, called ICS (interrupt controller source) entities, but that is not visible to userspace. Each vcpu gets one ICP (interrupt controller presentation) entity, used to store the per-vcpu state such as vcpu priority, pending interrupt state, IPI request, etc. This does not include any API or any way to connect vcpus to their ICP state; that will be added in later patches. This is based on an initial implementation by Michael Ellerman <[email protected]> reworked by Benjamin Herrenschmidt and Paul Mackerras. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Paul Mackerras <[email protected]> [agraf: fix typo, add dependency on !KVM_MPIC] Signed-off-by: Alexander Graf <[email protected]>
2013-04-26KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS callsMichael Ellerman1-0/+8
For pseries machine emulation, in order to move the interrupt controller code to the kernel, we need to intercept some RTAS calls in the kernel itself. This adds an infrastructure to allow in-kernel handlers to be registered for RTAS services by name. A new ioctl, KVM_PPC_RTAS_DEFINE_TOKEN, then allows userspace to associate token values with those service names. Then, when the guest requests an RTAS service with one of those token values, it will be handled by the relevant in-kernel handler rather than being passed up to userspace as at present. Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Paul Mackerras <[email protected]> [agraf: fix warning] Signed-off-by: Alexander Graf <[email protected]>
2013-04-26KVM: PPC: MPIC: Add support for KVM_IRQ_LINEAlexander Graf1-0/+13
Now that all pieces are in place for reusing generic irq infrastructure, we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply reuse it for PPC, as it will work there just as well. Signed-off-by: Alexander Graf <[email protected]>
2013-04-26kvm/ppc/mpic: add KVM_CAP_IRQ_MPICScott Wood1-0/+30
Enabling this capability connects the vcpu to the designated in-kernel MPIC. Using explicit connections between vcpus and irqchips allows for flexibility, but the main benefit at the moment is that it simplifies the code -- KVM doesn't need vm-global state to remember which MPIC object is associated with this vm, and it doesn't need to care about ordering between irqchip creation and vcpu creation. Signed-off-by: Scott Wood <[email protected]> [agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu] Signed-off-by: Alexander Graf <[email protected]>
2013-04-26kvm/ppc/mpic: in-kernel MPIC emulationScott Wood1-6/+6
Hook the MPIC code up to the KVM interfaces, add locking, etc. Signed-off-by: Scott Wood <[email protected]> [agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit] Signed-off-by: Alexander Graf <[email protected]>
2013-04-26KVM: PPC: debug stub interface parameter definedBharat Bhushan1-6/+0
This patch defines the interface parameter for KVM_SET_GUEST_DEBUG ioctl support. Follow up patches will use this for setting up hardware breakpoints, watchpoints and software breakpoints. Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below. This is because I am not sure what is required for book3s. So this ioctl behaviour will not change for book3s. Signed-off-by: Bharat Bhushan <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-04-17kvm/ppc: don't call complete_mmio_load when it's a storeScott Wood1-1/+0
complete_mmio_load writes back the mmio result into the destination register. Doing this on a store results in register corruption. Signed-off-by: Scott Wood <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-03-22KVM: PPC: Remove unused argument to kvmppc_core_dequeue_externalPaul Mackerras1-1/+1
Currently kvmppc_core_dequeue_external() takes a struct kvm_interrupt * argument and does nothing with it, in any of its implementations. This removes it in order to make things easier for forthcoming in-kernel interrupt controller emulation code. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-03-04KVM: set_memory_region: Refactor commit_memory_region()Takuya Yoshikawa1-1/+2
This patch makes the parameter old a const pointer to the old memory slot and adds a new parameter named change to know the change being requested: the former is for removing extra copying and the latter is for cleaning up the code. Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-03-04KVM: set_memory_region: Refactor prepare_memory_region()Takuya Yoshikawa1-2/+2
This patch drops the parameter old, a copy of the old memory slot, and adds a new parameter named change to know the change being requested. This not only cleans up the code but also removes extra copying of the memory slot structure. Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-03-04KVM: set_memory_region: Drop user_alloc from prepare/commit_memory_region()Takuya Yoshikawa1-7/+5
X86 does not use this any more. The remaining user, s390's !user_alloc check, can be simply removed since KVM_SET_MEMORY_REGION ioctl is no longer supported. Note: fixed powerpc's indentations with spaces to suppress checkpatch errors. Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-10KVM: PPC: BookE: Implement EPR exitAlexander Graf1-0/+10
The External Proxy Facility in FSL BookE chips allows the interrupt controller to automatically acknowledge an interrupt as soon as a core gets its pending external interrupt delivered. Today, user space implements the interrupt controller, so we need to check on it during such a cycle. This patch implements logic for user space to enable EPR exiting, disable EPR exiting and EPR exiting itself, so that user space can acknowledge an interrupt when an external interrupt has successfully been delivered into the guest vcpu. Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: Only WARN on invalid emulationAlexander Graf1-1/+2
When we hit an emulation result that we didn't expect, that is an error, but it's nothing that warrants a BUG(), because it can be guest triggered. So instead, let's only WARN() the user that this happened. Signed-off-by: Alexander Graf <[email protected]>
2012-12-13KVM: struct kvm_memory_slot.user_alloc -> boolAlex Williamson1-2/+2
There's no need for this to be an int, it holds a boolean. Move to the end of the struct for alignment. Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Alex Williamson <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2012-12-06KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPTPaul Mackerras1-0/+17
A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor. Reads on this fd return the contents of the HPT (hashed page table), writes create and/or remove entries in the HPT. There is a new capability, KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl. The ioctl takes an argument structure with the index of the first HPT entry to read out and a set of flags. The flags indicate whether the user is intending to read or write the HPT, and whether to return all entries or only the "bolted" entries (those with the bolted bit, 0x10, set in the first doubleword). This is intended for use in implementing qemu's savevm/loadvm and for live migration. Therefore, on reads, the first pass returns information about all HPTEs (or all bolted HPTEs). When the first pass reaches the end of the HPT, it returns from the read. Subsequent reads only return information about HPTEs that have changed since they were last read. A read that finds no changed HPTEs in the HPT following where the last read finished will return 0 bytes. The format of the data provides a simple run-length compression of the invalid entries. Each block of data starts with a header that indicates the index (position in the HPT, which is just an array), the number of valid entries starting at that index (may be zero), and the number of invalid entries following those valid entries. The valid entries, 16 bytes each, follow the header. The invalid entries are not explicitly represented. Signed-off-by: Paul Mackerras <[email protected]> [agraf: fix documentation] Signed-off-by: Alexander Graf <[email protected]>
2012-12-06KVM: PPC: Support eventfdAlexander Graf1-1/+16
In order to support the generic eventfd infrastructure on PPC, we need to call into the generic KVM in-kernel device mmio code. Signed-off-by: Alexander Graf <[email protected]>
2012-11-27KVM: x86: add kvm_arch_vcpu_postcreate callback, move TSC initializationMarcelo Tosatti1-0/+5
TSC initialization will soon make use of online_vcpus. Signed-off-by: Marcelo Tosatti <[email protected]>
2012-10-05KVM: PPC: set IN_GUEST_MODE before checking requestsScott Wood1-5/+9
Avoid a race as described in the code comment. Also remove a related smp_wmb() from booke's kvmppc_prepare_to_enter(). I can't see any reason for it, and the book3s_pr version doesn't have it. Signed-off-by: Scott Wood <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2012-10-05KVM: PPC: Book3S HV: Fix updates of vcpu->cpuPaul Mackerras1-2/+0
This removes the powerpc "generic" updates of vcpu->cpu in load and put, and moves them to the various backends. The reason is that "HV" KVM does its own sauce with that field and the generic updates might corrupt it. The field contains the CPU# of the -first- HW CPU of the core always for all the VCPU threads of a core (the one that's online from a host Linux perspective). However, the preempt notifiers are going to be called on the threads VCPUs when they are running (due to them sleeping on our private waitqueue) causing unload to be called, potentially clobbering the value. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2012-10-05KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctlyPaul Mackerras1-1/+2
This adds an implementation of kvm_arch_flush_shadow_memslot for Book3S HV, and arranges for kvmppc_core_commit_memory_region to flush the dirty log when modifying an existing slot. With this, we can handle deletion and modification of memory slots. kvm_arch_flush_shadow_memslot calls kvmppc_core_flush_memslot, which on Book3S HV now traverses the reverse map chains to remove any HPT (hashed page table) entries referring to pages in the memslot. This gets called by generic code whenever deleting a memslot or changing the guest physical address for a memslot. We flush the dirty log in kvmppc_core_commit_memory_region for consistency with what x86 does. We only need to flush when an existing memslot is being modified, because for a new memslot the rmap array (which stores the dirty bits) is all zero, meaning that every page is considered clean already, and when deleting a memslot we obviously don't care about the dirty bits any more. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2012-10-05KVM: PPC: Move kvm->arch.slot_phys into memslot.archPaul Mackerras1-10/+3
Now that we have an architecture-specific field in the kvm_memory_slot structure, we can use it to store the array of page physical addresses that we need for Book3S HV KVM on PPC970 processors. This reduces the size of struct kvm_arch for Book3S HV, and also reduces the size of struct kvm_arch_memory_slot for other PPC KVM variants since the fields in it are now only compiled in for Book3S HV. This necessitates making the kvm_arch_create_memslot and kvm_arch_free_memslot operations specific to each PPC KVM variant. That in turn means that we now don't allocate the rmap arrays on Book3S PR and Book E. Since we now unpin pages and free the slot_phys array in kvmppc_core_free_memslot, we no longer need to do it in kvmppc_core_destroy_vm, since the generic code takes care to free all the memslots when destroying a VM. We now need the new memslot to be passed in to kvmppc_core_prepare_memory_region, since we need to initialize its arch.slot_phys member on Book3S HV. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2012-10-05KVM: PPC: booke: Add watchdog emulationBharat Bhushan1-2/+12
This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu <[email protected]> Signed-off-by: Scott Wood <[email protected]> [[email protected]: reworked patch] Signed-off-by: Bharat Bhushan <[email protected]> [agraf: adjust to new request framework] Signed-off-by: Alexander Graf <[email protected]>
2012-10-05KVM: PPC: Add return value to core_check_requestsAlexander Graf1-2/+4
Requests may want to tell us that we need to go back into host state, so add a return value for the checks. Signed-off-by: Alexander Graf <[email protected]>