blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2016-03-09	Merge branch 'ib-mfd-regulator-gpio-4.6' of ↵	Linus Walleij	82	-442/+598
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into devel
2016-03-09	arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit	Marc Zyngier	1	-5/+4
	So far, we're always writing all possible LRs, setting the empty ones with a zero value. This is obvious doing a low of work for nothing, and we're better off clearing those we've actually dirtied on the exit path (it is very rare to inject more than one interrupt at a time anyway). Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2016-03-09	arm64: KVM: vgic-v3: Reset LRs at boot time	Marc Zyngier	2	-0/+10
	In order to let the GICv3 code be more lazy in the way it accesses the LRs, it is necessary to start with a clean slate. Let's reset the LRs on each CPU when the vgic is probed (which includes a round trip to EL2...). Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2016-03-09	arm64: KVM: vgic-v3: Do not save an LR known to be empty	Marc Zyngier	1	-2/+9
	On exit, any empty LR will be signaled in ICH_ELRSR_EL2. Which means that we do not have to save it, and we can just clear its state in the in-memory copy. Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2016-03-09	arm64: KVM: vgic-v3: Save maintenance interrupt state only if required	Marc Zyngier	1	-2/+31
	Next on our list of useless accesses is the maintenance interrupt status registers (ICH_MISR_EL2, ICH_EISR_EL2). It is pointless to save them if we haven't asked for a maintenance interrupt the first place, which can only happen for two reasons: - Underflow: ICH_HCR_UIE will be set, - EOI: ICH_LR_EOI will be set. These conditions can be checked on the in-memory copies of the regs. Should any of these two condition be valid, we must read GICH_MISR. We can then check for ICH_MISR_EOI, and only when set read ICH_EISR_EL2. This means that in most case, we don't have to save them at all. Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2016-03-09	arm64: KVM: vgic-v3: Avoid accessing ICH registers	Marc Zyngier	1	-113/+180
	Just like on GICv2, we're a bit hammer-happy with GICv3, and access them more often than we should. Adopt a policy similar to what we do for GICv2, only save/restoring the minimal set of registers. As we don't access the registers linearly anymore (we may skip some), the convoluted accessors become slightly simpler, and we can drop the ugly indexing macro that tended to confuse the reviewers. Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2016-03-09	powerpc: New possible return value from hcall	Christophe Lombard	1	-0/+1
	The hcalls introduced for cxl use a possible new value: H_STATE (invalid state). Co-authored-by: Frederic Barrat <[email protected]> Signed-off-by: Frederic Barrat <[email protected]> Signed-off-by: Christophe Lombard <[email protected]> Reviewed-by: Manoj Kumar <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: eeh_pci_enable(): fix checking of post-request state	Andrew Donnellan	1	-1/+1
	In eeh_pci_enable(), after making the request to set the new options, we call eeh_ops->wait_state() to check that the request finished successfully. At the moment, if eeh_ops->wait_state() returns 0, we return 0 without checking that it reflects the expected outcome. This can lead to callers further up the chain incorrectly assuming the slot has been successfully unfrozen and continuing to attempt recovery. On powernv, this will occur if pnv_eeh_get_pe_state() or pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or OPAL_EEH_PHB_ERROR respectively. On pseries, this will occur if pseries_eeh_get_state() returns 0, which in turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA Stopped states. Obviously, none of these cases represent a successful completion of a request to thaw MMIO or DMA. Fix the check so that a wait_state() return value of 0 won't be considered successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases. Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Gavin Shan <[email protected]> Reviewed-by: Daniel Axtens <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: Remove duplicated check in eeh_dump_pe_log()	Gavin Shan	1	-7/+0
	When eeh_dump_pe_log() is only called by eeh_slot_error_detail(), we already have the check that the PE isn't in PCI config blocked state in eeh_slot_error_detail(). So we needn't the duplicated check in eeh_dump_pe_log(). This removes the duplicated check in eeh_dump_pe_log(). No logical changes introduced. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: Synchronize recovery in host/guest	Gavin Shan	1	-0/+11
	When passing through SRIOV VFs to guest, we possibly encounter EEH error on PF. In this case, the VF PEs are put into frozen state. The error could be reported to guest before it's captured by the host. That means the guest could attempt to recover errors on VFs before host gets chance to recover errors on PFs. The VFs won't be recovered successfully. This enforces the recovery order for above case: the recovery on child PE in guest is hold until the recovery on parent PE in host is completed. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Russell Currey <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: Don't remove passed VFs	Gavin Shan	1	-0/+3
	When we have partial hotplug as part of the error recovery on PF, the VFs that are bound with vfio-pci driver will experience hotplug. That's not allowed. This checks if the VF PE is passed or not. If it does, we leave the VF without removing it. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Russell Currey <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: Don't propagate error to guest	Gavin Shan	1	-5/+5
	When EEH error happened to the parent PE of those PEs that have been passed through to guest, the error is propagated to guest domain and the VFIO driver's error handlers are called. It's not correct as the error in the host domain shouldn't be propagated to guests and affect them. This adds one more limitation when calling EEH error handlers. If the PE has been passed through to guest, the error handlers won't be called. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Russell Currey <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: powerpc/eeh: Support error recovery for VF PE	Wei Yang	6	-26/+127
	PFs are enumerated on PCI bus, while VFs are created by PF's driver. In EEH recovery, it has two cases: 1. Device and driver is EEH aware, error handlers are called. 2. Device and driver is not EEH aware, un-plug the device and plug it again by enumerating it. The special thing happens on the second case. For a PF, we could use the original pci core to enumerate the bus, while for VF we need to record the VFs which aer un-plugged then plug it again. Also The patch caches the VF index in pci_dn, which can be used to calculate VF's bus, device and function number. Those information helps to locate the VF's PCI device instance when doing hotplug during EEH recovery if necessary. Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/powernv: Support PCI config restore for VFs	Wei Yang	2	-3/+93
	After PE reset, OPAL API opal_pci_reinit() is called on all devices contained in the PE to reinitialize them. While skiboot is not aware of VFs, we have to implement the function in kernel to reinitialize VFs after reset on PE for VFs. In this patch, two functions pnv_pci_fixup_vf_mps() and pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since for a VF it has three cases. 1. Normal creation for a VF In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper value compared with its parent. 2. EEH recovery without VF removed In this case, MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is called to restore it and reinitialize other part. 3. EEH recovery with VF removed In this case, VF will be removed then re-created. Both functions are called. First pnv_pci_fixup_vf_mps() is called to store the proper MPS to pci_dn and then pnv_eeh_restore_vf_config() is called to do proper thing. This introduces two functions: pnv_pci_fixup_vf_mps() to fixup the VF's MPS to make sure it is equal to parent's and store this value in pci_dn for future use. pnv_eeh_restore_vf_config() to re-initialize on VF by restoring MPS, disabling completion timeout, enabling SERR, etc. Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/powernv: Support EEH reset for VF PE	Wei Yang	3	-4/+133
	PEs for VFs don't have primary bus. So they have to have their own reset backend, which is used during EEH recovery. The patch implements the reset backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained in the PE. Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: Create PE for VFs	Wei Yang	3	-2/+25
	This creates PEs for VFs in the weak function pcibios_bus_add_device(). Those PEs for VFs are identified with newly introduced flag EEH_PE_VF so that we treat them differently during EEH recovery. Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: EEH device for VF	Wei Yang	2	-0/+16
	VFs and their corresponding pdn are created and released dynamically when their PF's SRIOV capability is enabled and disabled. This creates and releases EEH devices for VFs when creating and releasing their pdn instances, which means EEH devices and pdn instances have same life cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn). Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: Cache normal BARs, not windows or IOV BARs	Wei Yang	1	-6/+5
	This restricts the EEH address cache to use only the first 7 BARs. This makes __eeh_addr_cache_insert_dev() ignore PCI bridge window and IOV BARs. As the result of this change, eeh_addr_cache_get_dev() will return VFs from VF's resource addresses instead of parent PFs. This also removes PCI bridge check as we limit __eeh_addr_cache_insert_dev() to 7 BARs and this effectively excludes PCI bridges from being cached. Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/pci: Remove VFs prior to PF	Wei Yang	1	-1/+1
	As commit ac205b7bb72f ("PCI: make sriov work with hotplug remove") indicates, VFs which is on the same PCI bus as their PF, should be removed before the PF. Otherwise, we might run into kernel crash at PCI unplugging time. This applies the above pattern to powerpc PCI hotplug path. Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-09	powerpc/eeh: Reworked eeh_pe_bus_get()	Gavin Shan	1	-16/+12
	The original implementation is ugly: unnecessary if statements and "out" tag. This reworks the function to avoid above weaknesses. No functional changes introduced. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-03-08	PCI: Include pci/hotplug Kconfig directly from pci/Kconfig	Bjorn Helgaas	11	-21/+0
	Include pci/hotplug/Kconfig directly from pci/Kconfig, so arches don't have to source both pci/Kconfig and pci/hotplug/Kconfig. Note that this effectively adds pci/hotplug/Kconfig to the following arches, because they already sourced drivers/pci/Kconfig but they previously did not source drivers/pci/hotplug/Kconfig: alpha arm avr32 frv m68k microblaze mn10300 sparc unicore32 Inspired-by-patch-from: Bogicevic Sasa <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2016-03-08	PCI: Include pci/pcie/Kconfig directly from pci/Kconfig	Bogicevic Sasa	9	-14/+0
	Include pci/pcie/Kconfig directly from pci/Kconfig, so arches don't have to source both pci/Kconfig and pci/pcie/Kconfig. Note that this effectively adds pci/pcie/Kconfig to the following arches, because they already sourced drivers/pci/Kconfig but they previously did not source drivers/pci/pcie/Kconfig: alpha avr32 blackfin frv m32r m68k microblaze mn10300 parisc sparc unicore32 xtensa [bhelgaas: changelog, source pci/pcie/Kconfig at top of pci/Kconfig, whitespace] Signed-off-by: Sasa Bogicevic <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2016-03-08	microblaze/PCI: Support generic Xilinx AXI PCIe Host Bridge IP driver	Bharat Kumar Gogada	2	-46/+13
	Modify the Microblaze PCI subsystem to work with the generic drivers/pci/host/pcie-xilinx.c driver on Microblaze and Zynq. [bhelgaas: changelog] Signed-off-by: Bharat Kumar Gogada <[email protected]> Signed-off-by: Ravi Kiran Gummaluri <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Michal Simek <[email protected]>
2016-03-08	Input: ad7879 - move header to platform_data directory	Stefan Agner	5	-5/+5
	The header file is used by the SPI and I2C variant of the driver. Therefore, move it to a more generic place under platform_data. Signed-off-by: Stefan Agner <[email protected]> Signed-off-by: Dmitry Torokhov <[email protected]>
2016-03-08	PCI: Set ROM shadow location in arch code, not in PCI core	Bjorn Helgaas	2	-12/+32
	IORESOURCE_ROM_SHADOW means there is a copy of a device's option ROM in RAM. The existence of such a copy and its location are arch-specific. Previously the IORESOURCE_ROM_SHADOW flag was set in arch code, but the 0xC0000-0xDFFFF location was hard-coded into the PCI core. If we're using a shadow copy in RAM, disable the ROM BAR and release the address space it was consuming. Move the location information from the PCI core to the arch code that sets IORESOURCE_ROM_SHADOW. Save the location of the RAM copy in the struct resource for PCI_ROM_RESOURCE. After this change, pci_map_rom() will call pci_assign_resource() and pci_enable_rom() for these IORESOURCE_ROM_SHADOW resources, which we did not do before. This is safe because: - pci_assign_resource() will do nothing because the resource is marked IORESOURCE_PCI_FIXED, which means we can't move it, and - pci_enable_rom() will not turn on the ROM BAR's enable bit because the resource is marked IORESOURCE_ROM_SHADOW, which means it is in RAM rather than in PCI memory space. Storing the location in the struct resource means "lspci" will show the shadow location, not the value from the ROM BAR. Signed-off-by: Bjorn Helgaas <[email protected]>
2016-03-08	PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED	Bjorn Helgaas	2	-2/+4
	A shadow copy of an option ROM is placed by the BIOS as a fixed address. Set IORESOURCE_PCI_FIXED to indicate that we can't move the shadow copy. This prevents warnings like the following when we assign resources: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment This warning is emitted by pdev_sort_resources(), which already ignores IORESOURCE_PCI_FIXED resources. Link: http://lkml.kernel.org/r/CA+55aFyVMfTBB0oz_yx8+eQOEJnzGtCsYSj9QuhEpdZ9BHdq5A@mail.gmail.com Signed-off-by: Bjorn Helgaas <[email protected]>
2016-03-08	x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs	Bjorn Helgaas	1	-0/+7
	The Home Agent and PCU PCI devices in Broadwell-EP have a non-BAR register where a BAR should be. We don't know what the side effects of sizing the "BAR" would be, and we don't know what address space the "BAR" might appear to describe. Mark these devices as having non-compliant BARs so the PCI core doesn't touch them. Signed-off-by: Bjorn Helgaas <[email protected]> Tested-by: Andi Kleen <[email protected]> CC: [email protected]
2016-03-08	unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition	Bjorn Helgaas	1	-5/+0
	HAVE_ARCH_PCI_SET_DMA_MASK is unused, so remove the #define for it. Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: GUAN Xuetao <[email protected]>
2016-03-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	108	-381/+603
	Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <[email protected]>
2016-03-08	x86/mm, x86/mce: Add memcpy_mcsafe()	Tony Luck	3	-0/+132
	Make use of the EXTABLE_FAULT exception table entries to write a kernel copy routine that doesn't crash the system if it encounters a machine check. Prime use case for this is to copy from large arrays of non-volatile memory used as storage. We have to use an unrolled copy loop for now because current hardware implementations treat a machine check in "rep mov" as fatal. When that is fixed we can simplify. Return type is a "bool". True means that we copied OK, false means that it didn't. Signed-off-by: Tony Luck <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Link: http://lkml.kernel.org/r/a44e1055efc2d2a9473307b22c91caa437aa3f8b.1456439214.git.tony.luck@intel.com Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08	perf/x86/intel/rapl: Simplify quirk handling even more	Borislav Petkov	1	-19/+13
	Drop the quirk() function pointer in favor of a simple boolean which says whether the quirk should be applied or not. Update comment while at it. Signed-off-by: Borislav Petkov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Harish Chegondi <[email protected]> Cc: Jacob Pan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08	s390: Fix misspellings in comments	Adam Buchbinder	5	-6/+6
	Signed-off-by: Adam Buchbinder <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2016-03-08	s390/mm: split arch/s390/mm/pgtable.c	Martin Schwidefsky	13	-1405/+1464
	The pgtable.c file is quite big, before it grows any larger split it into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related header definitions into the new gmap.h header and all of the pgste helpers from pgtable.h to pgtable.c. Signed-off-by: Martin Schwidefsky <[email protected]>
2016-03-08	s390/mm: uninline pmdp_xxx functions from pgtable.h	Martin Schwidefsky	3	-112/+124
	The pmdp_xxx function are smaller than their ptep_xxx counterparts but to keep things symmetrical unline them as well. Signed-off-by: Martin Schwidefsky <[email protected]>
2016-03-08	s390/mm: uninline ptep_xxx functions from pgtable.h	Martin Schwidefsky	3	-353/+318
	The code in the various ptep_xxx functions has grown quite large, consolidate them to four out-of-line functions: ptep_xchg_direct to exchange a pte with another with immediate flushing ptep_xchg_lazy to exchange a pte with another in a batched update ptep_modify_prot_start to begin a protection flags update ptep_modify_prot_commit to commit a protection flags update Signed-off-by: Martin Schwidefsky <[email protected]>
2016-03-08	arm64: dts: juno/vexpress: fix node name unit-address presence warnings	Sudeep Holla	6	-36/+36
	Commit fa38a82096a1 ("scripts/dtc: Update to upstream version 53bf130b1cdd") added warnings on node name unit-address presence/absence mismatch in device trees. This patch fixes those warning on all the juno/vexpress platforms where unit-address is present in node name while the reg/ranges property is not present. It also adds unit-address to all smb bus node. Signed-off-by: Sudeep Holla <[email protected]>
2016-03-08	Merge branch 'linus' into x86/fpu, to pick up fixes	Ingo Molnar	188	-659/+1244
	Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08	x86/asm-offsets: Remove PARAVIRT_enabled	Andy Lutomirski	1	-1/+0
	It no longer has any users. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Andrew Cooper <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08	x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled	Andy Lutomirski	3	-13/+35
	x86_64 has very clean espfix handling on paravirt: espfix64 is set up in native_iret, so paravirt systems that override iret bypass espfix64 automatically. This is robust and straightforward. x86_32 is messier. espfix is set up before the IRET paravirt patch point, so it can't be directly conditionalized on whether we use native_iret. We also can't easily move it into native_iret without regressing performance due to a bizarre consideration. Specifically, on 64-bit kernels, the logic is: if (regs->ss & 0x4) setup_espfix; On 32-bit kernels, the logic is: if ((regs->ss & 0x4) && (regs->cs & 0x3) == 3 && (regs->flags & X86_EFLAGS_VM) == 0) setup_espfix; The performance of setup_espfix itself is essentially irrelevant, but the comparison happens on every IRET so its performance matters. On x86_64, there's no need for any registers except flags to implement the comparison, so we fold the whole thing into native_iret. On x86_32, we don't do that because we need a free register to implement the comparison efficiently. We therefore do espfix setup before restoring registers on x86_32. This patch gets rid of the explicit paravirt_enabled check by introducing X86_BUG_ESPFIX on 32-bit systems and using an ALTERNATIVE to skip espfix on paravirt systems where iret != native_iret. This is also messy, but it's at least in line with other things we do. This improves espfix performance by removing a branch, but no one cares. More importantly, it removes a paravirt_enabled user, which is good because paravirt_enabled is ill-defined and is going away. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Andrew Cooper <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08	KVM: s390: allocate only one DMA page per VM	David Hildenbrand	4	-48/+41
	We can fit the 2k for the STFLE interpretation and the crypto control block into one DMA page. As we now only have to allocate one DMA page, we can clean up the code a bit. As a nice side effect, this also fixes a problem with crycbd alignment in case special allocation debug options are enabled, debugged by Sascha Silbe. Acked-by: Christian Borntraeger <[email protected]> Reviewed-by: Dominik Dingel <[email protected]> Acked-by: Cornelia Huck <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: enable STFLE interpretation only if enabled for the guest	David Hildenbrand	1	-1/+2
	Not setting the facility list designation disables STFLE interpretation, this is what we want if the guest was told to not have it. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: wake up when the VCPU cpu timer expires	David Hildenbrand	1	-13/+35
	When the VCPU cpu timer expires, we have to wake up just like when the ckc triggers. For now, setting up a cpu timer in the guest and going into enabled wait will never lead to a wakeup. This patch fixes this problem. Just as for the ckc, we have to take care of waking up too early. We have to recalculate the sleep time and go back to sleep. Please note that the timer callback calls kvm_s390_get_cpu_timer() from interrupt context. As the timer is canceled when leaving handle_wait(), and we don't do any VCPU cpu timer writes/updates in that function, we can be sure that we will never try to read the VCPU cpu timer from the same cpu that is currentyl updating the timer (deadlock). Reported-by: Sascha Silbe <[email protected]> Tested-by: Sascha Silbe <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: step the VCPU timer while in enabled wait	David Hildenbrand	2	-2/+7
	The cpu timer is a mean to measure task execution time. We want to account everything for a VCPU for which it is responsible. Therefore, if the VCPU wants to sleep, it shall be accounted for it. We can easily get this done by not disabling cpu timer accounting when scheduled out while sleeping because of enabled wait. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: protect VCPU cpu timer with a seqcount	David Hildenbrand	2	-8/+30
	For now, only the owning VCPU thread (that has loaded the VCPU) can get a consistent cpu timer value when calculating the delta. However, other threads might also be interested in a more recent, consistent value. Of special interest will be the timer callback of a VCPU that executes without having the VCPU loaded and could run in parallel with the VCPU thread. The cpu timer has a nice property: it is only updated by the owning VCPU thread. And speaking about accounting, a consistent value can only be calculated by looking at cputm_start and the cpu timer itself in one shot, otherwise the result might be wrong. As we only have one writing thread at a time (owning VCPU thread), we can use a seqcount instead of a seqlock and retry if the VCPU refreshed its cpu timer. This avoids any heavy locking and only introduces a counter update/check plus a handful of smp_wmb(). The owning VCPU thread should never have to retry on reads, and also for other threads this might be a very rare scenario. Please note that we have to use the raw_* variants for locking the seqcount as lockdep will produce false warnings otherwise. The rq->lock held during vcpu_load/put is also acquired from hardirq context. Lockdep cannot know that we avoid potential deadlocks by disabling preemption and thereby disable concurrent write locking attempts (via vcpu_put/load). Reviewed-by: Christian Borntraeger <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: step VCPU cpu timer during kvm_run ioctl	David Hildenbrand	2	-2/+76
	Architecturally we should only provide steal time if we are scheduled away, and not if the host interprets a guest exit. We have to step the guest CPU timer in these cases. In the first shot, we will step the VCPU timer only during the kvm_run ioctl. Therefore all time spent e.g. in interception handlers or on irq delivery will be accounted for that VCPU. We have to take care of a few special cases: - Other VCPUs can test for pending irqs. We can only report a consistent value for the VCPU thread itself when adding the delta. - We have to take care of STP sync, therefore we have to extend kvm_clock_sync() and disable preemption accordingly - During any call to disable/enable/start/stop we could get premeempted and therefore get start/stop calls. Therefore we have to make sure we don't get into an inconsistent state. Whenever a VCPU is scheduled out, sleeping, in user space or just about to enter the SIE, the guest cpu timer isn't stepped. Please note that all primitives are prepared to be called from both environments (cpu timer accounting enabled or not), although not completely used in this patch yet (e.g. kvm_s390_set_cpu_timer() will never be called while cpu timer accounting is enabled). Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: abstract access to the VCPU cpu timer	David Hildenbrand	3	-10/+28
	We want to manually step the cpu timer in certain scenarios in the future. Let's abstract any access to the cpu timer, so we can hide the complexity internally. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: store cpu id in vcpu->cpu when scheduled in	David Hildenbrand	1	-0/+2
	By storing the cpu id, we have a way to verify if the current cpu is owning a VCPU. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	KVM: s390: Add diag "watchdog functions" to trace event decoding	Alexander Yarygin	1	-0/+1
	DIAG 0x288 may occur now. Let's add its code to the diag table in sie.h. Signed-off-by: Alexander Yarygin <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2016-03-08	x86/nmi: Mark 'ignore_nmis' as __read_mostly	Kostenzer Felix	1	-1/+2
	ignore_nmis is used in two distinct places: 1. modified through {stop,restart}_nmi by alternative_instructions 2. read by do_nmi to determine if default_do_nmi should be called or not thus the access pattern conforms to __read_mostly and do_nmi() is a fastpath. Signed-off-by: Kostenzer Felix <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08	KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS	David Hildenbrand	1	-1/+1
	With MACHINE_HAS_VX, we convert the floating point registers from the vector registeres when storing the status. For other VCPUs, these are stored to vcpu->run->s.regs.vrs, but we are using current->thread.fpu.vxrs, which resolves to the currently loaded VCPU. So kvm_s390_store_status_unloaded() currently writes the wrong floating point registers (converted from the vector registers) when called from another VCPU on a z13. This is only the case for old user space not handling SIGP STORE STATUS and SIGP STOP AND STORE STATUS, but relying on the kernel implementation. All other calls come from the loaded VCPU via kvm_s390_store_status(). Fixes: 9abc2a08a7d6 (KVM: s390: fix memory overwrites when vx is disabled) Reviewed-by: Christian Borntraeger <[email protected]> Cc: [email protected] # v4.4+ Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>