|
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a bug in the implementation of the x86 accelerated version of
poly1305"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/poly1305 - fix overflow during partial reduction
|
|
Family 17h differs from prior families in the following ways:
- It does not support an L2 cache miss event.
- It has re-enumerated PMC counters for:
  - L2 cache references
  - front-end and back-end stalled cycles
So add a new amd_f17h_perfmon_event_map[] so that the generic
perf event names resolve to the correct hardware events on
family 17h and later processors.
Reference sections 2.1.13.3.3 (stalls) and 2.1.13.3.6 (L2):
https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
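For illustration, a minimal sketch of such a map. The PERF_COUNT_HW_* keys are the generic perf events; the event encodings below are placeholders for the family-17h values documented in the PPR sections above and may not match the in-tree table:

  static u64 amd_f17h_perfmon_event_map[PERF_COUNT_HW_MAX] =
  {
          [PERF_COUNT_HW_CPU_CYCLES]              = 0x0076,
          [PERF_COUNT_HW_INSTRUCTIONS]            = 0x00c0,
          /* re-enumerated L2 cache reference event */
          [PERF_COUNT_HW_CACHE_REFERENCES]        = 0xff60,
          /* no L2 cache miss event on family 17h, so no CACHE_MISSES entry */
          [PERF_COUNT_HW_BRANCH_INSTRUCTIONS]     = 0x00c2,
          [PERF_COUNT_HW_BRANCH_MISSES]           = 0x00c3,
          /* re-enumerated front-end/back-end stall counters */
          [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x0287,
          [PERF_COUNT_HW_STALLED_CYCLES_BACKEND]  = 0x0187,
  };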
Signed-off-by: Kim Phillips <[email protected]>
Cc: <[email protected]> # v4.9+
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Pu Wen <[email protected]>
Cc: Suravee Suthikulpanit <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: e40ed1542dd7 ("perf/x86: Add perf support for AMD family-17h processors")
[ Improved the formatting a bit. ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Add MDS to the new 'mitigations=' cmdline option.
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
kernel_randomize_memory() uses __PHYSICAL_MASK_SHIFT to calculate
the maximum amount of system RAM supported. The size of the direct
mapping section is obtained from the smaller of the following two
values:
(actual system RAM size + padding size) vs (max system RAM size supported)
This calculation is wrong since commit
b83ce5ee9147 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52").
That commit changed __PHYSICAL_MASK_SHIFT to always be 52, regardless of
whether the kernel uses 4-level or 5-level page tables. As a result, the
calculation always uses 4 PB as the maximum amount of system RAM, even in
4-level paging mode where it should actually be 64 TB.
Thus, the size of the direct mapping section is always the sum of the
actual system RAM size plus the padding size.
Even when the amount of system RAM is 64 TB, the following layout will
still be used, which obviously weakens KASLR significantly:
|____|_______actual RAM_______|_padding_|______the rest_______|
0 64TB ~120TB
Instead, it should be like this:
|____|_______actual RAM_______|_________the rest______________|
0 64TB ~120TB
The size of the padding region is controlled by
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING, which is 10 TB by default.
The above issue only exists when
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING is set to a non-zero value,
which is the case when CONFIG_MEMORY_HOTPLUG is enabled. Otherwise,
using __PHYSICAL_MASK_SHIFT doesn't affect KASLR.
Fix it by replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.
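A simplified sketch of the capping logic, not the in-kernel code: with 4-level paging MAX_PHYSMEM_BITS is 46 (64 TB), while __PHYSICAL_MASK_SHIFT is now always 52 (4 PB), so only the former can serve as the cap:

  /* TB_SHIFT is 40, so 1UL << (bits - TB_SHIFT) is the limit in TB. */
  static unsigned long direct_map_size_tb(unsigned long actual_ram_tb,
                                          unsigned long padding_tb)
  {
          unsigned long max_tb = 1UL << (MAX_PHYSMEM_BITS - TB_SHIFT);
          unsigned long needed_tb = actual_ram_tb + padding_tb;

          /*
           * Using __PHYSICAL_MASK_SHIFT here would make max_tb 4096 TB
           * (4 PB), so the comparison below would never cap the region
           * on 4-level kernels.
           */
          return needed_tb < max_tb ? needed_tb : max_tb;
  }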
[ bp: Massage commit message. ]
Fixes: b83ce5ee9147 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
Signed-off-by: Baoquan He <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Thomas Garnier <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: x86-ml <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/20190417083536.GE7065@MiWiFi-R3L-srv
|
|
$(call if_changed,...) must have FORCE as a prerequisite.
Signed-off-by: Masahiro Yamada <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Currently, when a new resource group is created, the allocation values
of the MBA resource are not initialized and contain meaningless stale data.
For example:
mkdir /sys/fs/resctrl/p1
cat /sys/fs/resctrl/p1/schemata
MB:0=100;1=100
echo "MB:0=10;1=20" > /sys/fs/resctrl/p1/schemata
cat /sys/fs/resctrl/p1/schemata
MB:0= 10;1= 20
rmdir /sys/fs/resctrl/p1
mkdir /sys/fs/resctrl/p2
cat /sys/fs/resctrl/p2/schemata
MB:0= 10;1= 20
Therefore, when a new group is created, initialize the MBA resource with
default values.
Initialize the MBA and cache resources in separate functions.
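A hedged sketch of the kind of helper this introduces; member names such as new_ctrl/have_new_ctrl are assumptions and the in-tree function differs in detail:

  static void rdtgroup_init_mba(struct rdt_resource *r)
  {
          struct rdt_domain *d;

          /*
           * Reset every domain of the MB resource to its default value,
           * e.g. 100% memory bandwidth, before a schemata write is applied.
           */
          list_for_each_entry(d, &r->domains, list) {
                  d->new_ctrl = r->default_ctrl;
                  d->have_new_ctrl = true;
          }
  }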
[ bp: Add newlines between code blocks for better readability. ]
Signed-off-by: Xiaochen Shen <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Carve out per rdt_domain initialization code from rdtgroup_init_alloc()
into a separate function.
No functional change; this makes the code more readable and saves at least
two indentation levels.
Signed-off-by: Xiaochen Shen <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Reinette Chatre <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Merge it to pick up dependent urgent changes before applying more
resctrl stuff.
Signed-off-by: Borislav Petkov <[email protected]>
|
|
In pcibios_irq_init(), the PCI IRQ routing table 'pirq_table' is first
found through pirq_find_routing_table(). If the table is not found and
CONFIG_PCI_BIOS is defined, the table is then allocated in
pcibios_get_irq_routing_table() using kmalloc(). Later, if the I/O APIC is
in use, this table is not actually needed; in that case, the allocated table
is not freed, which is a memory leak.
Free the allocated table if it is not used.
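A heavily simplified sketch of the idea behind the fix; the real pcibios_irq_init() does much more in between, and the local variable name follows the note below:

  struct irq_routing_table *irq_routing_table = NULL;

  pirq_table = pirq_find_routing_table();
  #ifdef CONFIG_PCI_BIOS
  if (!pirq_table) {
          irq_routing_table = pcibios_get_irq_routing_table(); /* kmalloc()ed */
          pirq_table = irq_routing_table;
  }
  #endif

  if (io_apic_assign_pci_irqs) {
          kfree(irq_routing_table);       /* table is unused with the I/O APIC */
          pirq_table = NULL;
  }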
Signed-off-by: Wenwen Wang <[email protected]>
[bhelgaas: added Ingo's reviewed-by, since the only change since v1 was to
use the irq_routing_table local variable name he suggested]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
Pull in the command line updates from the tip tree so the MDS parts can be
added.
|
|
Configure x86 runtime CPU speculation bug mitigations in accordance with
the 'mitigations=' cmdline option. This affects Meltdown, Spectre v2,
Speculative Store Bypass, and L1TF.
The default behavior is unchanged.
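As an illustration of the plumbing, a bug-specific command line parser can treat the generic switch as another way to request "off". This is a simplified sketch; the real selection logic handles more cases:

  static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
  {
          /*
           * Either the bug-specific knob or the generic mitigations=off
           * switch disables the mitigation; otherwise keep auto selection.
           */
          if (cmdline_find_option_bool(boot_command_line, "nospectre_v2") ||
              cpu_mitigations_off())
                  return SPECTRE_V2_CMD_NONE;

          return SPECTRE_V2_CMD_AUTO;
  }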
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Jiri Kosina <[email protected]> (on x86)
Reviewed-by: Jiri Kosina <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Jon Masters <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: [email protected]
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Tyler Hicks <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Steven Price <[email protected]>
Cc: Phil Auld <[email protected]>
Link: https://lkml.kernel.org/r/6616d0ae169308516cfdf5216bedd169f8a8291b.1555085500.git.jpoimboe@redhat.com
|
|
This code is only for CPUs which are affected by MSBDS, but are *not*
affected by the other two MDS issues.
For such CPUs, enabling the mds_idle_clear mitigation is enough to
mitigate SMT.
However, if the user boots with 'mds=off' and still has SMT enabled, we should
not report that SMT is mitigated:
$ cat /sys/devices/system/cpu/vulnerabilities/mds
Vulnerable; SMT mitigated
But rather:
Vulnerable; SMT vulnerable
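A hedged sketch of the corresponding sysfs reporting logic; names like mds_strings are assumptions and the in-tree function differs in detail:

  static ssize_t mds_show_state(char *buf)
  {
          /* MSBDS-only parts: SMT is only mitigated if mds=off was not given. */
          if (boot_cpu_has(X86_FEATURE_MSBDS_ONLY)) {
                  return sprintf(buf, "%s; SMT %s\n",
                                 mds_strings[mds_mitigation],
                                 (mds_mitigation == MDS_MITIGATION_OFF ?
                                  "vulnerable" :
                                  sched_smt_active() ? "mitigated" : "disabled"));
          }

          /* Fully affected parts: SMT itself remains vulnerable. */
          return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
                         sched_smt_active() ? "vulnerable" : "disabled");
  }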
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Tyler Hicks <[email protected]>
Reviewed-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
s/L1TF/MDS/
Signed-off-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Tyler Hicks <[email protected]>
Reviewed-by: Josh Poimboeuf <[email protected]>
|
|
All stack types on 64-bit x86 have guard pages now, so there is no point in
executing probabilistic overflow checks: the guard pages provide accurate and
reliable overflow prevention.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The IRQ stack lives in percpu space, so an IRQ handler that overflows it
will overwrite other data structures.
Use vmap() to remap the IRQ stack so that it will have the usual guard
pages that vmap()/vmalloc() allocations have. With this, the kernel will
panic immediately on an IRQ stack overflow.
[ tglx: Move the map code to a proper place and invoke it only when a CPU
is about to be brought online. No point in installing the map at
early boot for all possible CPUs. Fail the CPU bringup if the vmap()
fails as done for all other preparatory stages in CPU hotplug. ]
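A hedged sketch of what remapping a per-CPU backing store through vmap() looks like; irq_stack_backing_store and hardirq_stack_ptr are assumed names:

  static int map_irq_stack(unsigned int cpu)
  {
          char *stack = (char *)per_cpu_ptr(&irq_stack_backing_store, cpu);
          struct page *pages[IRQ_STACK_SIZE / PAGE_SIZE];
          void *va;
          int i;

          for (i = 0; i < IRQ_STACK_SIZE / PAGE_SIZE; i++) {
                  phys_addr_t pa = per_cpu_ptr_to_phys(stack + (i << PAGE_SHIFT));

                  pages[i] = pfn_to_page(pa >> PAGE_SHIFT);
          }

          /* The vmap()ed alias gets the usual vmalloc guard page behind it. */
          va = vmap(pages, IRQ_STACK_SIZE / PAGE_SIZE, VM_MAP, PAGE_KERNEL);
          if (!va)
                  return -ENOMEM; /* fail CPU bringup instead of crashing later */

          per_cpu(hardirq_stack_ptr, cpu) = va + IRQ_STACK_SIZE;
          return 0;
  }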
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Currently, the IRQ stack is hardcoded as the first page of the percpu
area, and the stack canary lives on the IRQ stack. The former gets in
the way of adding an IRQ stack guard page, and the latter is a potential
weakness in the stack canary mechanism.
Split the IRQ stack into its own private percpu pages.
[ tglx: Make 64 and 32 bit share struct irq_stack ]
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brijesh Singh <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: Feng Tang <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jordan Borgner <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Maran Wilson <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Pu Wen <[email protected]>
Cc: "Rafael Ávila de Espíndola" <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: x86-ml <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Preparatory change for disentangling the irq stack union as a
prerequisite for irq stacks with guard pages.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Cc: Yi Wang <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
irq_ctx_init() crashes hard on page allocation failures. While that's ok
during early boot, it's just wrong in the CPU hotplug bringup code.
Check for page allocation failure, return -ENOMEM and handle it at the
call sites. During early boot the only way out is to BUG(), but on CPU hotplug
there is no reason to crash, so just abort the operation.
Rename the function to something more sensible while at it.
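A minimal sketch of the renamed 32-bit allocation path, assuming a name like irq_init_percpu_irqstack() and per-CPU stack pointer variables (the in-tree details differ):

  int irq_init_percpu_irqstack(unsigned int cpu)
  {
          int node = cpu_to_node(cpu);
          struct page *ph, *ps;

          if (per_cpu(hardirq_stack_ptr, cpu))
                  return 0;       /* already set up, e.g. the boot CPU */

          ph = alloc_pages_node(node, THREADINFO_GFP, THREAD_SIZE_ORDER);
          if (!ph)
                  return -ENOMEM;
          ps = alloc_pages_node(node, THREADINFO_GFP, THREAD_SIZE_ORDER);
          if (!ps) {
                  __free_pages(ph, THREAD_SIZE_ORDER);
                  return -ENOMEM;
          }

          per_cpu(hardirq_stack_ptr, cpu) = page_address(ph);
          per_cpu(softirq_stack_ptr, cpu) = page_address(ps);
          return 0;
  }

Early boot can then BUG() on a non-zero return, while the CPU hotplug prepare stage simply propagates the error and aborts the bringup.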
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Alison Schofield <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Pu Wen <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Shaokun Zhang <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Suravee Suthikulpanit <[email protected]>
Cc: x86-ml <[email protected]>
Cc: [email protected]
Cc: Yazen Ghannam <[email protected]>
Cc: Yi Wang <[email protected]>
Cc: Zhenzhong Duan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
irq_ctx_init() is invoked from native_init_IRQ() or from xen_init_IRQ()
code. There is no reason to have this split. The interrupt stacks must be
allocated no matter what.
Invoke it from init_IRQ() before invoking the native or XEN init
implementation.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Abraham <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: x86-ml <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Preparatory patch to share code with 32-bit.
No functional changes.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The percpu storage holds a pointer to the stack, not the stack
itself. Rename it before sharing struct irq_stack with 64-bit.
No functional changes.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
There is no reason to have a u32 array in struct irq_stack. The only
purpose of the array is to size the struct properly.
Preparatory change for sharing struct irq_stack with 64-bit.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Pu Wen <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
On 32-bit IRQ_STACK_SIZE is the same as THREAD_SIZE.
To allow sharing struct irq_stack with 32-bit, define IRQ_STACK_SIZE for
32-bit and use it for struct irq_stack.
No functional change.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Suravee Suthikulpanit <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The current implementation of in_exception_stack() iterates over the
exception stacks array. Most of the time this is a useless exercise, but
even for the actual use cases (perf and ftrace) it takes at least two
iterations to get to the NMI stack.
As the exception stacks and the guard pages are page aligned, the loop can
be avoided completely.
Add an initial check whether the stack pointer is inside the full exception
stack area and leave early if not.
Create a lookup table which describes the stack area. The table index is
the page offset from the beginning of the exception stacks. So for any
given stack pointer the page offset is computed and a lookup in the
description table is performed. If it is inside a guard page, return. If
not, use the descriptor to fill in the info structure.
The table is filled at compile time and for the !KASAN case the interesting
page descriptors exactly fit into a single cache line. Just the last guard
page descriptor is in the next cacheline, but that should not be accessed
in the regular case.
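A simplified sketch of the lookup-table idea; the descriptor layout and names are illustrative, not the in-tree ones:

  struct estack_page_desc {
          u32 offs;       /* offset of the owning stack from the area start */
          u16 size;       /* size of that stack, 0 for a guard page         */
          u16 type;       /* STACK_TYPE_* value to report                   */
  };

  static bool sp_on_exception_stack(unsigned long sp, unsigned long area_begin,
                                    unsigned long area_size,
                                    const struct estack_page_desc *table,
                                    struct stack_info *info)
  {
          const struct estack_page_desc *ep;

          /* Early exit: not inside the exception stack area at all. */
          if (sp < area_begin || sp >= area_begin + area_size)
                  return false;

          /* One descriptor per page, indexed by the page offset. */
          ep = &table[(sp - area_begin) >> PAGE_SHIFT];
          if (!ep->size)
                  return false;   /* guard page */

          info->type  = ep->type;
          info->begin = (unsigned long *)(area_begin + ep->offs);
          info->end   = (unsigned long *)(area_begin + ep->offs + ep->size);
          return true;
  }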
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The debug IST stack is actually two separate debug stacks to handle #DB
recursion. This is required because the CPU always starts at the top of the
stack on exception entry, which means that on #DB recursion the second #DB
would overwrite the stack of the first.
The low level entry code therefore adjusts the top of stack on entry so a
secondary #DB starts from a different stack page. But the stack pages are
adjacent without a guard page between them.
Split the debug stack into 3 stacks which are separated by guard pages. The
3rd stack is never mapped into the cpu_entry_area and is only there to
catch triple #DB nesting:
--- top of DB_stack <- Initial stack
--- end of DB_stack
guard page
--- top of DB1_stack <- Top of stack after entering first #DB
--- end of DB1_stack
guard page
--- top of DB2_stack <- Top of stack after entering second #DB
--- end of DB2_stack
guard page
If DB2 did not act as the final guard hole, a second #DB would point the
top of the #DB stack to the stack below DB1, which would be valid and would
not catch the undesired triple nesting.
The backing store does not allocate any memory for DB2 and its guard page
as it is not going to be mapped into the cpu_entry_area.
- Adjust the low level entry code so it adjusts top of #DB with the offset
between the stacks instead of exception stack size.
- Make the dumpstack code aware of the new stacks.
- Adjust the in_debug_stack() implementation and move it into the NMI code
where it belongs. As this is NMI hotpath code, it just checks the full
area between the top of DB_stack and the bottom of DB1_stack without checking
for the guard page. That's correct because the NMI cannot hit a
stack pointer pointing to the guard page between the DB and DB1 stacks. Even
if it did, the NMI operation would still be unaffected, but the resume
of the debug exception on the topmost DB stack would crash by touching
the guard page.
[ bp: Make exception_stack_names static const char * const ]
Suggested-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: [email protected]
Cc: Masahiro Yamada <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
All usage sites which expected that the exception stacks in the CPU entry
area are mapped linearly are fixed up. Enable guard pages between the
IST stacks.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The entry order of the TSS.IST array and the order of the stack
storage/mapping are not required to be the same.
With the upcoming split of the debug stack this is going to fall apart as
the number of TSS.IST array entries stays the same while the actual stacks
are increasing.
Make them separate so that code like dumpstack can just utilize the mapping
order. The IST index is solely required for the actual TSS.IST array
initialization.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: Dou Liyang <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
All users gone.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Pu Wen <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Convert the TSS.IST setup code to use the cpu entry area information
directly instead of assuming a linear mapping of the IST stacks.
The store to orig_ist[] is no longer required as there are no users
anymore.
This is the last preparatory step towards IST guard pages.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The orig_ist[] array is a shadow copy of the IST array in the TSS. The
reason why it exists is that older kernels used two TSS variants with
different pointers into the debug stack. orig_ist[] contains the real
starting points.
There is no longer any point in doing so because the same information can be
retrieved using the base address of the cpu entry area mapping and the
offsets of the various exception stacks.
No functional change. Preparation for removing orig_ist.
Cc: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The orig_ist[] array is a shadow copy of the IST array in the TSS. The
reason why it exists is that older kernels used two TSS variants with
different pointers into the debug stack. orig_ist[] contains the real
starting points.
There is no longer any point in doing so because the same information can be
retrieved using the base address of the cpu entry area mapping and the
offsets of the various exception stacks.
No functional change. Preparation for removing orig_ist.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The orig_ist[] array is a shadow copy of the IST array in the TSS. The
reason why it exists is that older kernels used two TSS variants with
different pointers into the debug stack. orig_ist[] contains the real
starting points.
There is no longer any point in doing so because the same information can be
retrieved using the base address of the cpu entry area mapping and the
offsets of the various exception stacks.
No functional change. Preparation for removing orig_ist.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Store a pointer to the per cpu entry area exception stack mappings to allow
fast retrieval.
Required for converting various places from using the shadow IST array to
directly doing address calculations on the actual mapping address.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
To allow guard pages between the IST stacks each stack needs to be
mapped individually.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
At the moment everything assumes a full linear mapping of the various
exception stacks. Adding guard pages to the cpu entry area mapping of the
exception stacks will break that assumption.
As a preparatory step, convert both the real storage and the effective
mapping in the cpu entry area from character arrays to structures.
To ensure that both structures have the same ordering and the same
individual stack sizes, fill in the members with a macro. The guard size is
the only difference between the two resulting structures. For now, both have
a guard size of 0 until the preparation of all usage sites is done.
Provide a couple of helper macros which are used in the following
conversions.
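A hedged sketch of the macro trick; member and macro names are illustrative:

  #define ESTACK_MEMBERS(guard_size)                      \
          char    DF_stack_guard[guard_size];             \
          char    DF_stack[EXCEPTION_STKSZ];              \
          char    NMI_stack_guard[guard_size];            \
          char    NMI_stack[EXCEPTION_STKSZ];             \
          char    DB_stack_guard[guard_size];             \
          char    DB_stack[EXCEPTION_STKSZ];              \
          char    MCE_stack_guard[guard_size];            \
          char    MCE_stack[EXCEPTION_STKSZ]

  /* The real backing store: guard size 0 for now. */
  struct exception_stacks {
          ESTACK_MEMBERS(0);
  };

  /* The effective cpu_entry_area mapping: also guard size 0 for now. */
  struct cea_exception_stacks {
          ESTACK_MEMBERS(0);
  };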
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
No point in retrieving the entry area pointer over and over. Do it once
and use unsigned int for 'cpu' everywhere.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The defines for the exception stack (IST) array in the TSS use the
SDM convention IST1 - IST7. That causes all sorts of code to subtract 1 for
array indices related to IST. That's confusing at best and does not provide
any value.
Make the indices zero-based and fix up the usage sites. The only code which
needs to adjust the zero-based index is the interrupt descriptor setup, which
needs to add 1 now.
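For illustration (names assumed), the indices become plain array indices and only the IDT setup re-adds the architectural offset:

  /* Zero-based software indices into the TSS.IST array. */
  enum {
          IST_INDEX_DF = 0,
          IST_INDEX_NMI,
          IST_INDEX_DB,
          IST_INDEX_MCE,
  };

  /* The IDT descriptor field is 1-based (IST1..IST7), so add 1 only here. */
  #define IST_TO_IDT(idx)        ((idx) + 1)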
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: "Chang S. Bae" <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: Dou Liyang <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: [email protected]
Cc: Nicolai Stange <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Nothing requires those for 32-bit builds.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Nothing uses that and before people get the wrong ideas, get rid of it.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Commit
d8ba61ba58c8 ("x86/entry/64: Don't use IST entry for #BP stack")
removed the last user but left the macro around.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Dou Liyang <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
On x86, stacks grow from top to bottom, but the stack overflow check works
the other way round, which is just confusing. Clean it up and sanitize
the warning string a bit.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
stack_overflow_check() is using both irq_stack_ptr and irq_stack_union
to find the IRQ stack. That's going to break when vmapped irq stacks are
introduced.
Change it to just use irq_stack_ptr.
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The get_stack_info() function is off-by-one when checking whether an
address is on an IRQ stack or an IST stack. This prevents an overflowed
IRQ or IST stack from being dumped properly.
[ tglx: Do the same for 32-bit ]
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Commit
37fe6a42b343 ("x86: Check stack overflow in detail")
added a broad check for the full exception stack area, i.e. it considers
the full exception stack area as valid.
That's wrong in two aspects:
1) It does not check the individual areas one by one.
2) #DF, NMI and #MCE do not enable interrupts, which means that a
regular device interrupt cannot happen in their context. In fact, if a
device interrupt hits one of those IST stacks, that's a bug because some
code path enabled interrupts while handling the exception.
Limit the check to the #DB stack and consider all other IST stacks as
'overflow' or invalid.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Mitsuo Hayasaka <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Pull KVM fixes from Paolo Bonzini:
"5.1 keeps its reputation as a big bugfix release for KVM x86.
- Fix for a memory leak introduced during the merge window
- Fixes for nested VMX with ept=0
- Fixes for AMD (APIC virtualization, NMI injection)
- Fixes for Hyper-V under KVM and KVM under Hyper-V
- Fixes for 32-bit SMM and tests for SMM virtualization
- More array_index_nospec peppering"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing
KVM: fix spectrev1 gadgets
KVM: x86: fix warning Using plain integer as NULL pointer
selftests: kvm: add a selftest for SMM
selftests: kvm: fix for compilers that do not support -no-pie
selftests: kvm/evmcs_test: complete I/O before migrating guest state
KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels
KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
KVM: x86: clear SMM flags before loading state while leaving SMM
KVM: x86: Open code kvm_set_hflags
KVM: x86: Load SMRAM in a single shot when leaving SMM
KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU
KVM: x86: Raise #GP when guest vCPU do not support PMU
x86/kvm: move kvm_load/put_guest_xcr0 into atomic context
KVM: x86: svm: make sure NMI is injected after nmi_singlestep
svm/avic: Fix invalidate logical APIC id entry
Revert "svm: Fix AVIC incomplete IPI emulation"
kvm: mmu: Fix overflow on kvm mmu page limit calculation
KVM: nVMX: always use early vmcs check when EPT is disabled
KVM: nVMX: allow tests to use bad virtual-APIC page address
...
|
|
All architectures except MIPS were defining it in the same way,
and memory slots are handled entirely by common code, so there
is no point in keeping the definition per-architecture.
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
EFER.LME and EFER.NX are considered reserved if their respective feature
bits are not advertised to the guest.
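A hedged sketch of the check; the helper name and structure are illustrative, the in-tree code lives in the EFER validation path:

  static bool kvm_efer_cpuid_ok(struct kvm_vcpu *vcpu, u64 efer)
  {
          /* LME is reserved unless long mode is advertised to the guest. */
          if ((efer & EFER_LME) && !guest_cpuid_has(vcpu, X86_FEATURE_LM))
                  return false;

          /* NX is reserved unless the NX feature bit is advertised. */
          if ((efer & EFER_NX) && !guest_cpuid_has(vcpu, X86_FEATURE_NX))
                  return false;

          return true;
  }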
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
KVM allows userspace to violate consistency checks related to the
guest's CPUID model to some degree. Generally speaking, userspace has
carte blanche when it comes to guest state so long as jamming invalid
state won't negatively affect the host.
Currently this seems to be a non-issue as most of the interesting
EFER checks are missing, e.g. NX and LME, but those will be added
shortly. Proactively exempt userspace from the CPUID checks so as not
to break userspace.
Note, the efer_reserved_bits check still applies to userspace writes as
that mask reflects the host's capabilities, e.g. KVM shouldn't allow a
guest to run with NX=1 if it has been disabled in the host.
Fixes: d80174745ba39 ("KVM: SVM: Only allow setting of EFER_SVME when CPUID SVM is set")
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Most, but not all, helpers that are related to emulating consistency
checks for nested VM-Entry return -EINVAL when a check fails. Convert
the holdouts to have consistency throughout and to make it clear that
the functions are signaling pass/fail as opposed to "resume guest" vs.
"exit to userspace".
Opportunistically fix bad indentation in nested_vmx_check_guest_state().
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Convert all top-level nested VM-Enter consistency check functions to
return 0/-EINVAL instead of failure codes, since now they can only
ever return one failure code.
This also avoids giving the false impression that failure information is
always consumed and/or relevant, e.g. vmx_set_nested_state() only
cares whether or not the checks were successful.
nested_check_host_control_regs() can also now be inlined into its caller,
nested_vmx_check_host_state(), since the two have effectively become the
same function.
Based on a patch by Sean Christopherson.
Signed-off-by: Paolo Bonzini <[email protected]>
|