aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-09-08arm64: KVM: Route asynchronous abortsMarc Zyngier1-3/+8
As we now have some basic handling to EL1-triggered aborts, we can actually report them to KVM. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Add EL1 async abort handlerMarc Zyngier1-0/+3
If we've exited the guest because it has triggered an asynchronous abort from EL1, a possible course of action is to let it know it screwed up by giving it a Virtual Abort to chew on. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Add exception code to report EL1 asynchronous abortsMarc Zyngier1-2/+3
So far, we don't have a code to indicate that we've taken an asynchronous abort from EL1. Let's add one. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Add Virtual Abort injection helperMarc Zyngier2-0/+13
Now that we're able to context switch the HCR_EL2.VA bit, let's introduce a helper that injects an Abort into a vcpu. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Preserve pending vSError in world switchMarc Zyngier1-0/+9
The HCR_EL2.VSE bit is used to signal an SError to a guest, and has the peculiar feature of getting cleared when the guest has taken the abort (this is the only bit that behaves as such in this register). This means that if we signal such an abort, we must leave it in the guest context until it disappears from HCR_EL2, and at which point it must be cleared from the context. This is achieved by reading back from HCR_EL2 until the guest takes the fault. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Rename HCR_VA to HCR_VSEMarc Zyngier1-2/+2
HCR_VA is a leftover from ARMv7, On ARMv8, this is HCR_VSE (which stands for Virtual System Error), and has better defined semantics. Let's rename the constant. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: vgic-v2: Enable GICV access from HYP if access from guest is unsafeMarc Zyngier1-27/+42
So far, we've been disabling KVM on systems where the GICV region couldn't be safely given to a guest. Now that we're able to handle this access safely by emulating it in HYP, we can enable this feature when we detect an unsafe configuration. Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: vgic-v2: Add GICV access from HYPMarc Zyngier2-0/+42
Now that we have the necessary infrastructure to handle MMIO accesses in HYP, perform the GICV access on behalf of the guest. This requires checking that the access is strictly 32bit, properly aligned, and falls within the expected range. When all condition are satisfied, we perform the access and tell the rest of the HYP code that the instruction has been correctly emulated. Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: vgic-v2: Add the GICV emulation infrastructureMarc Zyngier5-0/+45
In order to efficiently perform the GICV access on behalf of the guest, we need to be able to avoid going back all the way to the host kernel. For this, we introduce a new hook in the world switch code, conveniently placed just after populating the fault info. At that point, we only have saved/restored the GP registers, and we can quickly perform all the required checks (data abort, translation fault, valid faulting syndrome, not an external abort, not a PTW). Coming back from the emulation code, we need to skip the emulated instruction. This involves an additional bit of save/restore in order to be able to access the guest's PC (and possibly CPSR if this is a 32bit guest). At this stage, no emulation code is provided. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Make kvm_skip_instr32 available to HYPMarc Zyngier1-2/+3
As we plan to do some emulation at HYP, let's make kvm_skip_instr32 as part of the hyp_text section. This doesn't preclude the kernel from using it. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm: KVM: Use common AArch32 conditional execution codeMarc Zyngier4-104/+33
Add the bit of glue and const-ification that is required to use the code inherited from the arm64 port, and move over to it. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Move the AArch32 conditional execution to common codeMarc Zyngier3-7/+4
It would make some sense to share the conditional execution code between 32 and 64bit. In order to achieve this, let's move that code to virt/kvm/arm/aarch32.c. While we're at it, drop a superfluous BUG_ON() that wasn't that useful. Following patches will migrate the 32bit port to that code base. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Move kvm_vcpu_get_condition out of emulate.cMarc Zyngier2-11/+10
In order to make emulate.c more generic, move the arch-specific manupulation bits out of emulate.c. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: VHE: reset PSTATE.PAN on entry to EL2Vladimir Murzin1-0/+2
SCTLR_EL2.SPAN bit controls what happens with the PSTATE.PAN bit on an exception. However, this bit has no effect on the PSTATE.PAN when HCR_EL2.E2H or HCR_EL2.TGE is unset. Thus when VHE is used and exception taken from a guest PSTATE.PAN bit left unchanged and we continue with a value guest has set. To address that always reset PSTATE.PAN on entry from EL1. Fixes: 1f364c8c48a0 ("arm64: VHE: Add support for running Linux in EL2 mode") Signed-off-by: Vladimir Murzin <[email protected]> Reviewed-by: James Morse <[email protected]> Acked-by: Marc Zyngier <[email protected]> Cc: <[email protected]> # v4.6+ Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08KVM: arm/arm64: Get rid of exported aliases to static functionsChristoffer Dall6-33/+11
When rewriting the assembly code to C code, it was useful to have exported aliases or static functions so that we could keep the existing common C code unmodified and at the same time rewrite arm64 from assembly to C code, and later do the arm part. Now when both are done, we really don't need this level of indirection anymore, and it's time to save a few lines and brain cells. Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64/kvm: remove unused stub functionsMark Rutland1-6/+0
Now that 32-bit KVM no longer performs cache maintenance for page table updates, we no longer need empty stubs for arm64. Remove them. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoffer Dall <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: [email protected] Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm/kvm: excise redundant cache maintenanceMark Rutland2-28/+2
When modifying Stage-2 page tables, we perform cache maintenance to account for non-coherent page table walks. However, this is unnecessary, as page table walks are guaranteed to be coherent in the presence of the virtualization extensions. Per ARM DDI 0406C.c, section B1.7 ("The Virtualization Extensions"), the virtualization extensions mandate the multiprocessing extensions. Per ARM DDI 0406C.c, section B3.10.1 ("General TLB maintenance requirements"), as described in the sub-section titled "TLB maintenance operations and the memory order model", this maintenance is not required in the presence of the multiprocessing extensions. Hence, we need not perform this cache maintenance when modifying Stage-2 entries. This patch removes the logic for performing the redundant maintenance. To ensure visibility and ordering of updates, a dsb(ishst) that was otherwise implicit in the maintenance is folded into kvm_set_pmd() and kvm_set_pte(). Signed-off-by: Mark Rutland <[email protected]> Cc: Christoffer Dall <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: [email protected] Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08KVM: arm: vgic: Drop build compatibility hack for older kernel versionsMarc Zyngier1-6/+0
As kvm_set_routing_entry() was changing prototype between 4.7 and 4.8, an ugly hack was put in place in order to survive both building in -next and the merge window. Now that everything has been merged, let's dump the compatibility hack for good. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Eric Auger <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08arm64: KVM: Optimize __guest_enter/exit() to save a few instructionsShanker Donthineni2-75/+63
We are doing an unnecessary stack push/pop operation when restoring the guest registers x0-x18 in __guest_enter(). This patch saves the two instructions by using x18 as a base register. No need to store the vcpu context pointer in stack because it is redundant, the same information is available in tpidr_el2. The function __guest_exit() calling convention is slightly modified, caller only pushes the regs x0-x1 to stack instead of regs x0-x3. Signed-off-by: Shanker Donthineni <[email protected]> Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-08KVM: arm/arm64: Rename vgic_attr_regs_access to vgic_attr_regs_access_v2Christoffer Dall1-15/+11
Just a rename so we can implement a v3-specific function later. We take the chance to get rid of the V2/V3 ops comments as well. No functional change. Signed-off-by: Christoffer Dall <[email protected]> Reviewed-by: Eric Auger <[email protected]>
2016-09-08KVM: arm/arm64: Factor out vgic_attr_regs_access functionalityChristoffer Dall1-27/+73
As we are about to deal with multiple data types and situations where the vgic should not be initialized when doing userspace accesses on the register attributes, factor out the functionality of vgic_attr_regs_access into smaller bits which can be reused by a new function later. Signed-off-by: Christoffer Dall <[email protected]> Reviewed-by: Eric Auger <[email protected]>
2016-09-08KVM: arm/arm64: Add VGICv3 save/restore API documentationChristoffer Dall3-35/+261
Factor out the GICv3 and ITS-specific documentation into a separate documentation file. Add description for how to access distributor, redistributor, and CPU interface registers for GICv3 in this new file, and add a group for accessing level triggered IRQ information for GICv3 as well. Reviewed-by: Marc Zyngier <[email protected]> Reviewed-by: Peter Maydell <[email protected]> Signed-off-by: Christoffer Dall <[email protected]>
2016-09-07KVM: nVMX: expose INS/OUTS information supportJan Dakinevich1-0/+9
Expose the feature to L1 hypervisor if host CPU supports it, since certain hypervisors requires it for own purposes. According to Intel SDM A.1, if CPU supports the feature, VMX_INSTRUCTION_INFO field of VMCS will contain detailed information about INS/OUTS instructions handling. This field is already copied to VMCS12 for L1 hypervisor (see prepare_vmcs12 routine) independently feature presence. Signed-off-by: Jan Dakinevich <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-07KVM: VMX: not use vmcs_config in setup_vmcs_configPaolo Bonzini1-1/+1
setup_vmcs_config takes a pointer to the vmcs_config global. The indirection is somewhat pointless, but just keep things consistent for now. Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-07KVM: x86: remove stale commentsPaolo Bonzini1-1/+0
handle_external_intr does not enable interrupts anymore, vcpu_enter_guest does it after calling guest_exit_irqoff. Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-07KVM: x86: ratelimit and decrease severity for guest-triggered printkPaolo Bonzini1-9/+9
These are mostly related to nested VMX. They needn't have a loglevel as high as KERN_WARN, and mustn't be allowed to pollute the host logs. Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-07KVM: nVMX: pass valid guest linear-address to the L1Jan Dakinevich1-0/+3
If EPT support is exposed to L1 hypervisor, guest linear-address field of VMCS should contain GVA of L2, the access to which caused EPT violation. Signed-off-by: Jan Dakinevich <[email protected]> Reviewed-by: Wanpeng Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-07KVM: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar3-34/+5
The workqueue "irqfd_cleanup_wq" queues a single work item &irqfd->shutdown and hence doesn't require ordering. It is a host-wide workqueue for issuing deferred shutdown requests aggregated from all vm* instances. It is not being used on a memory reclaim path. Hence, it has been converted to use system_wq. The work item has been flushed in kvm_irqfd_release(). The workqueue "wqueue" queues a single work item &timer->expired and hence doesn't require ordering. Also, it is not being used on a memory reclaim path. Hence, it has been converted to use system_wq. System workqueues have been able to handle high level of concurrency for a long time now and hence it's not required to have a singlethreaded workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue created with create_singlethread_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantee unless the target CPU is explicitly specified and thus the increase of local concurrency shouldn't make any difference. Signed-off-by: Bhaktipriya Shridhar <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-07KVM: nVMX: make emulated nested preemption timer pinnedWanpeng Li1-1/+1
Commit 61abdbe0bc ("kvm: x86: make lapic hrtimer pinned") pins the emulated lapic timer. This patch does the same for the emulated nested preemption timer to avoid vmexit an unrelated vCPU and the latency of kicking IPI to another vCPU. Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Yunhong Jiang <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-07vmx: refine validity check for guest linear addressLiang Li1-1/+1
The validity check for the guest line address is inefficient, check the invalid value instead of enumerating the valid ones. Signed-off-by: Liang Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-09-05Merge branch 'x86/amd-avic' of ↵Paolo Bonzini6-66/+801
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into HEAD Merge IOMMU bits for virtualization of interrupt injection into virtual machines.
2016-09-05iommu/amd: Enable vAPIC interrupt remapping mode by defaultSuravee Suthikulpanit3-10/+48
Introduce struct iommu_dev_data.use_vapic flag, which IOMMU driver uses to determine if it should enable vAPIC support, by setting the ga_mode bit in the device's interrupt remapping table entry. Currently, it is enabled for all pass-through device if vAPIC mode is enabled. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for ↵Suravee Suthikulpanit3-4/+79
pass-through devices This patch implements irq_set_vcpu_affinity() function to set up interrupt remapping table entry with vapic mode for pass-through devices. In case requirements for vapic mode are not met, it falls back to set up the IRTE in legacy mode. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Introduce amd_iommu_update_ga()Suravee Suthikulpanit3-0/+49
Introduces a new IOMMU API, amd_iommu_update_ga(), which allows KVM (SVM) to update existing posted interrupt IOMMU IRTE when load/unload vcpu. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Adding GALOG interrupt handlerSuravee Suthikulpanit2-6/+87
This patch adds AMD IOMMU guest virtual APIC log (GALOG) handler. When IOMMU hardware receives an interrupt targeting a blocking vcpu, it creates an entry in the GALOG, and generates an interrupt to notify the AMD IOMMU driver. At this point, the driver processes the log entry, and notify the SVM driver via the registered iommu_ga_log_notifier function. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Detect and initialize guest vAPIC logSuravee Suthikulpanit2-7/+133
This patch adds support to detect and initialize IOMMU Guest vAPIC log (GALOG). By default, it also enable GALog interrupt to notify IOMMU driver when GA Log entry is created. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Add support for multiple IRTE formatsSuravee Suthikulpanit3-25/+50
This patch enables support for the new 128-bit IOMMU IRTE format, which can be used for both legacy and vapic interrupt remapping modes. It replaces the existing operations on IRTE, which can only support the older 32-bit IRTE format, with calls to the new struct amd_irt_ops. It also provides helper functions for setting up, accessing, and updating interrupt remapping table entries in different mode. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Introduce interrupt remapping ops structureSuravee Suthikulpanit2-5/+205
Currently, IOMMU support two interrupt remapping table entry formats, 32-bit (legacy) and 128-bit (GA). The spec also implies that it might support additional modes/formats in the future. So, this patch introduces the new struct amd_irte_ops, which allows the same code to work with different irte formats by providing hooks for various operations on an interrupt remapping table entry. Suggested-by: Joerg Roedel <[email protected]> Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Move and introduce new IRTE-related unions and structuresSuravee Suthikulpanit2-28/+76
Move existing unions and structs for accessing/managing IRTE to a proper header file. This is mainly to simplify variable declarations in subsequent patches. Besides, this patch also introduces new struct irte_ga for the new 128-bit IRTE format. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-09-05iommu/amd: Detect and enable guest vAPIC supportSuravee Suthikulpanit4-6/+99
This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir, which can be used to specify different interrupt remapping mode for passthrough devices to VM guest: * legacy: Legacy interrupt remapping (w/ 32-bit IRTE) * vapic : Guest vAPIC interrupt remapping (w/ GA mode 128-bit IRTE) Note that in vapic mode, it can also supports legacy interrupt remapping for non-passthrough devices with the 128-bit IRTE. Signed-off-by: Suravee Suthikulpanit <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2016-08-19KVM: x86: Expose more Intel AVX512 feature to guestLuwei Kang1-1/+2
Expose AVX512DQ, AVX512BW, AVX512VL feature to guest. Its spec can be found at: https://software.intel.com/sites/default/files/managed/b4/3a/319433-024.pdf Signed-off-by: Luwei Kang <[email protected]> [Resolved a trivial conflict with removed F(PCOMMIT).] Signed-off-by: Radim Krčmář <[email protected]>
2016-08-19mmu: don't pass *kvm to spte_write_protect and spte_*_dirtyBandan Das1-6/+6
That parameter isn't used in these functions, it's probably a historical artifact. Signed-off-by: Bandan Das <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-08-19KVM: lapic: don't recalculate apic map table twice when enabling LAPICWanpeng Li1-2/+3
APIC map table is recalculated during reset APIC ID to the initial value when enabling LAPIC. This patch move the recalculate_apic_map() to the next branch since we don't need to recalculate apic map twice in current codes. Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-08-19MIPS: KVM: Check for pfn noslot caseJames Hogan1-1/+1
When mapping a page into the guest we error check using is_error_pfn(), however this doesn't detect a value of KVM_PFN_NOSLOT, indicating an error HVA for the page. This can only happen on MIPS right now due to unusual memslot management (e.g. being moved / removed / resized), or with an Enhanced Virtual Memory (EVA) configuration where the default KVM_HVA_ERR_* and kvm_is_error_hva() definitions are unsuitable (fixed in a later patch). This case will be treated as a pfn of zero, mapping the first page of physical memory into the guest. It would appear the MIPS KVM port wasn't updated prior to being merged (in v3.10) to take commit 81c52c56e2b4 ("KVM: do not treat noslot pfn as a error pfn") into account (merged v3.8), which converted a bunch of is_error_pfn() calls to is_error_noslot_pfn(). Switch to using is_error_noslot_pfn() instead to catch this case properly. Fixes: 858dd5d45733 ("KVM/MIPS32: MMU/TLB operations for the Guest.") Signed-off-by: James Hogan <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: [email protected] Cc: [email protected] Cc: <[email protected]> # 3.10.y- Signed-off-by: Paolo Bonzini <[email protected]>
2016-08-18Merge tag 'kvm-arm-for-v4.8-rc3' of ↵Paolo Bonzini11-71/+164
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Fixes for v4.8-rc3 This tag contains the following fixes on top of v4.8-rc1: - ITS init issues - ITS error handling issues - ITS IRQ leakage fix - Plug a couple of ITS race conditions - An erratum workaround for timers - Some removal of misleading use of errors and comments - A fix for GICv3 on 32-bit guests
2016-08-18kvm: nVMX: fix nested tsc scalingPeter Feiner1-4/+12
When the host supported TSC scaling, L2 would use a TSC multiplier of 0, which causes a VM entry failure. Now L2's TSC uses the same multiplier as L1. Signed-off-by: Peter Feiner <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2016-08-18KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE writeRadim Krčmář1-0/+13
If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the write with vmcs02 as the current VMCS. This will incorrectly apply modifications intended for vmcs01 to vmcs02 and L2 can use it to gain access to L0's x2APIC registers by disabling virtualized x2APIC while using msr bitmap that assumes enabled. Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the current VMCS. An alternative solution would temporarily make vmcs01 the current VMCS, but it requires more care. Fixes: 8d14695f9542 ("x86, apicv: add virtual x2apic support") Reported-by: Jim Mattson <[email protected]> Reviewed-by: Wanpeng Li <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2016-08-18KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APICRadim Krčmář1-63/+44
msr bitmap can be used to avoid a VM exit (interception) on guest MSR accesses. In some configurations of VMX controls, the guest can even directly access host's x2APIC MSRs. See SDM 29.5 VIRTUALIZING MSR-BASED APIC ACCESSES. L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI. To do so, L1 would first trick KVM to disable all possible interceptions by enabling APICv features and then would turn those features off; nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would not intercept previously enabled MSRs even though they were not safe with the new configuration. Correctly re-enabling interceptions is not enough as a second bug would still allow L1+L2 to access host's MSRs: msr bitmap was shared for all VMCSs, so L1 could trigger a race to get the desired combination of msr bitmap and VMX controls. This fix allocates a msr bitmap for every L1 VCPU, allows only safe x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would have to intercept everything anyway. Fixes: 3af18d9c5fe9 ("KVM: nVMX: Prepare for using hardware MSR bitmap") Reported-by: Jim Mattson <[email protected]> Suggested-by: Wincy Van <[email protected]> Reviewed-by: Wanpeng Li <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2016-08-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds116-1109/+1486
Pull networking fixes from David Miller: 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix Fietkau. 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme. 3) Fix some tg3 ethtool logic bugs, and one that would cause no interrupts to be generated when rx-coalescing is set to 0. From Satish Baddipadige and Siva Reddy Kallam. 4) QLCNIC mailbox corruption and napi budget handling fix from Manish Chopra. 5) Fix fib_trie logic when walking the trie during /proc/net/route output than can access a stale node pointer. From David Forster. 6) Several sctp_diag fixes from Phil Sutter. 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel. 8) Checksum fixup fixes in bpf from Daniel Borkmann. 9) Memork leaks in nfnetlink, from Liping Zhang. 10) Use after free in rxrpc, from David Howells. 11) Use after free in new skb_array code of macvtap driver, from Jason Wang. 12) Calipso resource leak, from Colin Ian King. 13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang. 14) Fix bpf non-linear packet write helpers, from Daniel Borkmann. 15) Fix lockdep splats in macsec, from Sabrina Dubroca. 16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF handling. 17) Various tc-action bug fixes, from CONG Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) net_sched: allow flushing tc police actions net_sched: unify the init logic for act_police net_sched: convert tcf_exts from list to pointer array net_sched: move tc offload macros to pkt_cls.h net_sched: fix a typo in tc_for_each_action() net_sched: remove an unnecessary list_del() net_sched: remove the leftover cleanup_a() mlxsw: spectrum: Allow packets to be trapped from any PG mlxsw: spectrum: Unmap 802.1Q FID before destroying it mlxsw: spectrum: Add missing rollbacks in error path mlxsw: reg: Fix missing op field fill-up mlxsw: spectrum: Trap loop-backed packets mlxsw: spectrum: Add missing packet traps mlxsw: spectrum: Mark port as active before registering it mlxsw: spectrum: Create PVID vPort before registering netdevice mlxsw: spectrum: Remove redundant errors from the code mlxsw: spectrum: Don't return upon error in removal path i40e: check for and deal with non-contiguous TCs ixgbe: Re-enable ability to toggle VLAN filtering ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths ...
2016-08-17Merge branch 'tc_action-fixes'David S. Miller8-119/+112
Cong Wang says: ==================== net_sched: tc action fixes and updates This patchset fixes a few regressions caused by the previous code refactor and more. Thanks to Jamal for catching them! Note, patch 3/7 and 4/7 are not strictly necessary for this patchset, I just want to carry them together. --- v4: adjust an indention for Jamal add two more patches v3: avoid list for fast path, suggested by Jamal v2: replace flex_array with regular dynamic array keep tcf_action_stats_update() in act_api.h fix macro typos found by Amir ==================== Signed-off-by: David S. Miller <[email protected]>