path: root/arch/x86
2018-08-03  x86/speculation: Support Enhanced IBRS on future CPUs  (Sai Praneeth; 4 files, -2/+23)

Future Intel processors will support "Enhanced IBRS", which is an "always on" mode, i.e. the IBRS bit in the SPEC_CTRL MSR is enabled once and never disabled. From the specification [1]:

"With enhanced IBRS, the predicted targets of indirect branches executed cannot be controlled by software that was executed in a less privileged predictor mode or on another logical processor. As a result, software operating on a processor with enhanced IBRS need not use WRMSR to set IA32_SPEC_CTRL.IBRS after every transition to a more privileged predictor mode. Software can isolate predictor modes effectively simply by setting the bit once. Software need not disable enhanced IBRS prior to entering a sleep state such as MWAIT or HLT."

If Enhanced IBRS is supported by the processor, use it as the preferred spectre v2 mitigation mechanism instead of Retpoline. Intel's Retpoline white paper [2] states:

"Retpoline is known to be an effective branch target injection (Spectre variant 2) mitigation on Intel processors belonging to family 6 (enumerated by the CPUID instruction) that do not have support for enhanced IBRS. On processors that support enhanced IBRS, it should be used for mitigation instead of retpoline."

The reason why Enhanced IBRS is the recommended mitigation on processors which support it is that these processors also support CET, which provides a defense against ROP attacks. Retpoline is very similar to ROP techniques and might trigger false positives in the CET defense.

If Enhanced IBRS is selected as the mitigation technique for spectre v2, the IBRS bit in the SPEC_CTRL MSR is set once at boot time and never cleared. The kernel also has to make sure that the IBRS bit remains set after VMEXIT because the guest might have cleared the bit. This is already covered by the existing x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() speculation control functions.

Enhanced IBRS still requires IBPB for full mitigation.

[1] Speculative-Execution-Side-Channel-Mitigations.pdf
[2] Retpoline-A-Branch-Target-Injection-Mitigation.pdf
Both documents are available at: https://bugzilla.kernel.org/show_bug.cgi?id=199511

Originally-by: David Woodhouse <[email protected]>
Signed-off-by: Sai Praneeth Prakhya <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Tim C Chen <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ravi Shankar <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
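A minimal sketch of the resulting selection logic, assuming the feature and mode names from this changelog (this is not the literal upstream diff):

  /* spectre_v2_select_mitigation() sketch; surrounding code assumed. */
  if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
          mode = SPECTRE_V2_IBRS_ENHANCED;
          /* Set IBRS once at boot; it is never cleared afterwards. */
          x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
          wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
  } else if (IS_ENABLED(CONFIG_RETPOLINE)) {
          /* Otherwise fall back to the existing retpoline mitigation. */
          mode = SPECTRE_V2_RETPOLINE_GENERIC;
  }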
2018-08-03  x86/cpufeatures: Add EPT_AD feature bit  (Peter Feiner; 2 files, -2/+10)

Some Intel processors have an EPT feature whereby the accessed & dirty bits in EPT entries can be updated by HW. MSR IA32_VMX_EPT_VPID_CAP exposes the presence of this capability. There is no point in trying to use that new feature bit in the VMX code as VMX needs to read the MSR anyway to access other bits, but having the feature bit for EPT_AD in place helps virtualization management as it exposes "ept_ad" in /proc/cpuinfo/$proc/flags if the feature is present.

[ tglx: Amended changelog ]

Signed-off-by: Peter Feiner <[email protected]>
Signed-off-by: Peter Shier <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: David Woodhouse <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
2018-08-03  Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux  (Herbert Xu; 60 files, -152/+478)

Merge mainline to pick up c7513c2a2714 ("crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pair").
2018-08-02  Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net  (David S. Miller; 11 files, -39/+40)

The BTF conflicts were simple overlapping changes. The virtio_net conflict was an overlap of a fix of a statistics counter, happening alongside a move over to a bona fide statistics structure rather than counting values on the stack.

Signed-off-by: David S. Miller <[email protected]>
2018-08-02  x86/iommu: Use NULL instead of 0  (Zhong Jiang; 1 file, -1/+1)

Fixes the following sparse warning:

  arch/x86/kernel/pci-iommu_table.c:63:37: warning: Using plain integer as NULL pointer

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
2018-08-02  x86/boot: Use CC_SET()/CC_OUT() instead of open coding it  (Uros Bizjak; 2 files, -3/+5)

Replace open-coded uses of the SETcc instructions with CC_SET()/CC_OUT().

Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
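For illustration, the conversion pattern looks like this (the helper and operands are made up; CC_OUT() hands the flag to the compiler via an "=@cc<cond>" output constraint where supported, falling back to an emitted SETcc otherwise):

  #include <linux/types.h>
  #include <asm/asm.h>            /* CC_SET(), CC_OUT() */

  static inline bool ul_less(unsigned long a, unsigned long b)
  {
          bool lt;

          /* The open-coded form would be:
           *   asm("cmpq %2, %1; setb %0" : "=q" (lt) : "r" (a), "r" (b));
           */
          asm("cmpq %2, %1" CC_SET(b)
              : CC_OUT(b) (lt) : "r" (a), "r" (b));
          return lt;
  }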
2018-08-02  x86/mm: Remove redundant check for kmem_cache_create()  (Chengguang Xu; 1 file, -3/+0)

The flag 'SLAB_PANIC' implies panic on failure, so there is no need to check the returned pointer for NULL.

Signed-off-by: Chengguang Xu <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
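The removed pattern looks like this (cache name and size/align variables are illustrative); with SLAB_PANIC the allocator panics on failure instead of returning NULL, so the error branch is unreachable:

  pgd_cache = kmem_cache_create("pgd_cache", size, align, SLAB_PANIC, NULL);
  /* Redundant: with SLAB_PANIC this can never be NULL. */
  if (!pgd_cache)
          return;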
2018-08-02  x86/platform/UV: Remove redundant check of p == q  (Colin Ian King; 1 file, -2/+0)

The check for p == q is dead code because the preceding switch statements jump to the end of the outer for-loop with continue statements. Remove the dead code.

Detected by CoverityScan, CID#145071 ("Structurally dead code")

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
2018-08-02  x86/platform/olpc: Use PTR_ERR_OR_ZERO()  (zhong jiang; 1 file, -3/+1)

Replace the open coded equivalent with PTR_ERR_OR_ZERO().

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
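The transformation is mechanical; sketched on a hypothetical pointer:

  #include <linux/err.h>

  /* Before: open-coded */
  if (IS_ERR(pdev))
          return PTR_ERR(pdev);
  return 0;

  /* After */
  return PTR_ERR_OR_ZERO(pdev);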
2018-08-02  x86/boot/compressed/64: Validate trampoline placement against E820  (Kirill A. Shutemov; 1 file, -18/+55)

There were two reports of boot failures caused by the trampoline being placed in a reserved memory region. This can happen on machines that don't report the EBDA correctly.

Fix the problem by re-validating the found address against the E820 table. If the address is in a reserved area, find the next usable region below the initial address.

Fixes: 3548e131ec6a ("x86/boot/compressed/64: Find a place for 32-bit trampoline")
Reported-by: Dmitry Malkin <[email protected]>
Reported-by: youling 257 <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
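A hedged sketch of the re-validation idea (the upstream search loop differs in detail; boot_params is the global pointer available in the compressed-boot code):

  /* Reject a trampoline start that overlaps a non-RAM E820 range
   * and slide it below the conflicting range. */
  for (i = 0; i < boot_params->e820_entries; i++) {
          struct boot_e820_entry *entry = &boot_params->e820_table[i];

          if (entry->type == E820_TYPE_RAM)
                  continue;
          if (bios_start >= entry->addr + entry->size ||
              bios_start + TRAMPOLINE_32BIT_SIZE <= entry->addr)
                  continue;       /* no overlap */
          bios_start = round_down(entry->addr - TRAMPOLINE_32BIT_SIZE,
                                  PAGE_SIZE);
  }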
2018-08-02  Merge branch 'perf/urgent' into perf/core, to pick up fixes  (Ingo Molnar; 9 files, -24/+25)

Signed-off-by: Ingo Molnar <[email protected]>
2018-08-02  kconfig: include kernel/Kconfig.preempt from init/Kconfig  (Christoph Hellwig; 1 file, -2/+0)

Almost all architectures include it. Add an ARCH_NO_PREEMPT symbol to disable preempt support for alpha, hexagon, non-coldfire m68k and user mode Linux.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
2018-08-02  Kconfig: consolidate the "Kernel hacking" menu  (Christoph Hellwig; 2 files, -7/+0)

Move the source of lib/Kconfig.debug and arch/$(ARCH)/Kconfig.debug to the top-level Kconfig. For two architectures that means moving their arch-specific symbols in that menu into a new arch Kconfig.debug file, and for a few more creating a dummy file so that we can include it unconditionally.

Also move the actual 'Kernel hacking' menu to lib/Kconfig.debug, where it belongs.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
2018-08-02  kconfig: include common Kconfig files from top-level Kconfig  (Christoph Hellwig; 1 file, -21/+1)

Instead of duplicating the source statements in every architecture, just do it once in the top-level Kconfig file. Note that with this the inclusion of arch/$(SRCARCH)/Kconfig moves out of the top-level Kconfig into arch/Kconfig so that we don't violate ordering constraints while keeping a sensible menu structure.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
2018-08-02  um: cleanup Kconfig files  (Christoph Hellwig; 1 file, -10/+0)

We can handle all non-architecture-specific UM configuration directly in the newly added arch/um/Kconfig. Do so by merging the Kconfig.common, Kconfig.rest and Kconfig.um files into arch/um/Kconfig, and move the main UML menu as well.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Richard Weinberger <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
2018-08-02  um: stop abusing KBUILD_KCONFIG  (Christoph Hellwig; 1 file, -5/+0)

Instead create an arch/um/Kconfig file that just includes the actual per-arch Kconfig file. Note that we use HEADER_ARCH to find the per-arch Kconfig file, as that variable already includes the normalization from i386 or x86_64 to x86.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Richard Weinberger <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
2018-07-31  perf/x86/intel/uncore: Fix hardcoded index of Broadwell extra PCI devices  (Kan Liang; 2 files, -4/+8)

Masayoshi Mizuma reported that a warning message is shown while a CPU is hot-removed on Broadwell servers:

  WARNING: CPU: 126 PID: 6 at arch/x86/events/intel/uncore.c:988 uncore_pci_remove+0x10b/0x150
  Call Trace:
   pci_device_remove+0x42/0xd0
   device_release_driver_internal+0x148/0x220
   pci_stop_bus_device+0x76/0xa0
   pci_stop_root_bus+0x44/0x60
   acpi_pci_root_remove+0x1f/0x80
   acpi_bus_trim+0x57/0x90
   acpi_bus_trim+0x2e/0x90
   acpi_device_hotplug+0x2bc/0x4b0
   acpi_hotplug_work_fn+0x1a/0x30
   process_one_work+0x174/0x3a0
   worker_thread+0x4c/0x3d0
   kthread+0xf8/0x130

This bug was introduced by:

  commit 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")

The index of "QPI Port 2 filter" was hardcoded to 2, but this conflicts with the index of "PCU.3", which is "HSWEP_PCI_PCU_3" and equals 2 as well. To fix the conflict, the hardcoded index needs to be cleaned up:

 - introduce a new enumerator "BDX_PCI_QPI_PORT2_FILTER" for "QPI Port 2 filter" on Broadwell,
 - increase UNCORE_EXTRA_PCI_DEV_MAX by one,
 - clean up the hardcoded index.

Debugged-by: Masayoshi Mizuma <[email protected]>
Suggested-by: Ingo Molnar <[email protected]>
Reported-by: Masayoshi Mizuma <[email protected]>
Tested-by: Masayoshi Mizuma <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
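A sketch of the resulting enumeration (ordering and neighboring entries are illustrative; only the names in the changelog are taken from the patch):

  enum {
          SNBEP_PCI_QPI_PORT0_FILTER,
          SNBEP_PCI_QPI_PORT1_FILTER,
          BDX_PCI_QPI_PORT2_FILTER,       /* new, replaces the bare 2 */
          HSWEP_PCI_PCU_3,                /* no longer collides */
          UNCORE_EXTRA_PCI_DEV_MAX,       /* grew by one */
  };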
2018-07-30  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  (Linus Torvalds; 1 file, -4/+4)

Pull networking fixes from David Miller:
 "Several smallish fixes, I don't think any of this requires another -rc but I'll leave that up to you:

  1) Don't leak uninitialized bytes to userspace in xfrm_user, from Eric Dumazet.

  2) Route leak in xfrm_lookup_route(), from Tommi Rantala.

  3) Premature poll() returns in AF_XDP, from Björn Töpel.

  4) devlink leak in netdevsim, from Jakub Kicinski.

  5) Don't BUG_ON in fib_compute_spec_dst, the condition can legitimately happen. From Lorenzo Bianconi.

  6) Fix some spectre v1 gadgets in generic socket code, from Jeremy Cline.

  7) Don't allow user to bind to out of range multicast groups, from Dmitry Safonov with a follow-up by Dmitry Safonov.

  8) Fix metrics leak in fib6_drop_pcpu_from(), from Sabrina Dubroca"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  netlink: Don't shift with UB on nlk->ngroups
  net/ipv6: fix metrics leak
  xen-netfront: wait xenbus state change when load module manually
  can: ems_usb: Fix memory leak on ems_usb_disconnect()
  openvswitch: meter: Fix setting meter id for new entries
  netlink: Do not subscribe to non-existent groups
  NET: stmmac: align DMA stuff to largest cache line length
  tcp_bbr: fix bw probing to raise in-flight data for very small BDPs
  net: socket: Fix potential spectre v1 gadget in sock_is_registered
  net: socket: fix potential spectre v1 gadget in socketcall
  net: mdio-mux: bcm-iproc: fix wrong getter and setter pair
  ipv4: remove BUG_ON() from fib_compute_spec_dst
  enic: handle mtu change for vf properly
  net: lan78xx: fix rx handling before first packet is send
  nfp: flower: fix port metadata conversion bug
  bpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog()
  bpf: fix bpf_skb_load_bytes_relative pkt length check
  perf build: Build error in libbpf missing initialization
  net: ena: Fix use of uninitialized DMA address bits field
  bpf: btf: Use exact btf value_size match in map_check_btf()
  ...
2018-07-31  x86/speculation: Protect against userspace-userspace spectreRSB  (Jiri Kosina; 1 file, -31/+7)

The article "Spectre Returns! Speculation Attacks using the Return Stack Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks, making use solely of the RSB contents, even on CPUs that don't fall back to the BTB on RSB underflow (Skylake+).

Mitigate userspace-userspace attacks by always unconditionally filling the RSB on context switch when the generic spectrev2 mitigation has been enabled.

[1] https://arxiv.org/pdf/1807.07940.pdf

Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Josh Poimboeuf <[email protected]>
Acked-by: Tim Chen <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
2018-07-30  PCI: Call dma_debug_add_bus() for pci_bus_type from PCI core  (Christoph Hellwig; 1 file, -3/+0)

There is nothing arch-specific about PCI or dma-debug, so call dma_debug_add_bus() from the PCI core just after registering the bus type. Most of dma-debug is already generic; this just adds reporting of pending dma-allocations on driver unload for arches other than powerpc, sh, and x86.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Acked-by: Michael Ellerman <[email protected]> (powerpc)
2018-07-30  Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 3 files, -16/+13)

Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - a build race fix

   - a Xen entry fix

   - a TSC_DEADLINE quirk future-proofing fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Fix if_changed build flip/flop bug
  x86/entry/64: Remove %ebx handling from error_entry/exit
  x86/apic: Future-proof the TSC_DEADLINE quirk for SKX
2018-07-30  Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 3 files, -15/+19)

Pull perf fixes from Ingo Molnar:
 "Misc fixes:

   - AMD IBS data corruptor fix (uncovered by UBSAN)

   - an Intel PEBS entry unwind error fix

   - a HW-tracing crash fix

   - a MAINTAINERS update"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix crash when using HW tracing kernel filters
  perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)
  MAINTAINERS: Add Naveen N. Rao as kprobes co-maintainer
  perf/x86/amd/ibs: Don't access non-started event
2018-07-30  Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 1 file, -1/+1)

Pull locking fixes from Ingo Molnar:
 "A paravirt UP-patching fix, and an I2C MUX driver lockdep warning fix"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/pvqspinlock/x86: Use LOCK_PREFIX in __pv_queued_spin_unlock() assembly code
  i2c/mux, locking/core: Annotate the nested rt_mutex usage
  locking/rtmutex: Allow specifying a subclass for nested locking
2018-07-30  Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds; 1 file, -1/+1)

Pull EFI fix from Ingo Molnar:
 "A UEFI variables fix for SEV guests"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Access EFI MMIO data as unencrypted when SEV is active
2018-07-30  x86/apic: Trivial coding style fixes  (Yi Wang; 2 files, -2/+2)

There is inconsistent indenting in calibrate_APIC_clock() and activate_managed(). Remove the surplus TAB.

Signed-off-by: Yi Wang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Jiang Biao <[email protected]>
Acked-by: Steven Rostedt (VMware) <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
2018-07-30  x86/platform/UV: Mark memblock related init code and data correctly  (Dou Liyang; 1 file, -2/+2)

parse_mem_block_size() and mem_block_size are only used during init. Mark them accordingly.

Fixes: d7609f4210cb ("x86/platform/UV: Add kernel parameter to set memory block size")
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Andrew Banman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
2018-07-30  x86/boot/KASLR: Make local variable mem_limit static  (zhong jiang; 1 file, -1/+1)

Fix the following sparse warning:

  arch/x86/boot/compressed/kaslr.c:102:20: warning: symbol 'mem_limit' was not declared. Should it be static?

Signed-off-by: zhong jiang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
2018-07-30  x86/kvmclock: Mark kvm_get_preset_lpj() as __init  (Dou Liyang; 1 file, -1/+1)

kvm_get_preset_lpj() is only called from kvmclock_init(), so mark it __init as well.

Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Pavel Tatashin <[email protected]>
Cc: <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: "Radim Krčmář" <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
2018-07-30  x86/tsc: Consolidate init code  (Dou Liyang; 1 file, -10/+12)

Split out duplicated code from tsc_early_init() and tsc_init() into a common helper and fix up some comment typos.

[ tglx: Massaged changelog and renamed function ]

Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Pavel Tatashin <[email protected]>
Cc: <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
2018-07-30  x86/kexec: Allocate 8k PGDs for PTI  (Joerg Roedel; 1 file, -2/+3)

Fuzzing the PTI-x86-32 code with trinity showed unhandled kernel paging request oops-messages that looked a lot like silent data corruption. Lots of debugging and testing led to the kexec-32bit code, which is still allocating 4k PGDs when PTI is enabled. But since it uses native_set_pud() to build the page-table, it will inevitably call into __pti_set_user_pgtbl(), which writes beyond the allocated 4k page.

Use PGD_ALLOCATION_ORDER to allocate PGDs in the kexec code to fix the issue.

Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: David H. Gutteridge <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: David Laight <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Andrea Arcangeli <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
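The fix amounts to allocating by order instead of assuming a single page; a sketch (field name as in the 32-bit kexec code):

  #include <asm/pgalloc.h>        /* PGD_ALLOCATION_ORDER */

  /* With PTI the PGD is a kernel/user page pair (order 1);
   * without PTI the order is 0, so this stays correct either way. */
  image->arch.pgd = (pgd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
                                              PGD_ALLOCATION_ORDER);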
2018-07-30  x86/mm: Remove in_nmi() warning from vmalloc_fault()  (Joerg Roedel; 1 file, -2/+0)

It is perfectly okay to take page-faults, especially on the vmalloc area, while executing an NMI handler. Remove the warning.

Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: David H. Gutteridge <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: David Laight <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Andrea Arcangeli <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
2018-07-30  BackMerge v4.18-rc7 into drm-next  (Dave Airlie; 33 files, -77/+209)

rmk requested this for armada and I think we've had a few conflicts build up.

Signed-off-by: Dave Airlie <[email protected]>
2018-07-29  Drivers: hv: vmbus: Get rid of MSR access from vmbus_drv.c  (Sunil Muthuswamy; 1 file, -0/+3)

Get rid of ISA-specific code from vmbus_drv.c, which is common code.

Fixes: 81b18bce48af ("Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic")
Signed-off-by: Sunil Muthuswamy <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-07-28  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf  (David S. Miller; 1 file, -4/+4)

Daniel Borkmann says:

====================
pull-request: bpf 2018-07-28

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) API fixes for libbpf's BTF mapping of map key/value types in order to make them compatible with iproute2's BPF_ANNOTATE_KV_PAIR() markings, from Martin.

2) Fix AF_XDP to not report POLLIN prematurely by using the non-cached consumer pointer of the RX queue, from Björn.

3) Fix __xdp_return() to check for NULL pointer after the rhashtable lookup that retrieves the allocator object, from Taehee.

4) Fix x86-32 JIT to adjust ebp register in prologue and epilogue by 4 bytes which got removed from overall stack usage, from Wang.

5) Fix bpf_skb_load_bytes_relative() length check to use actual packet length, from Daniel.

6) Fix uninitialized return code in libbpf bpf_perf_event_read_simple() handler, from Thomas.
====================

Signed-off-by: David S. Miller <[email protected]>
2018-07-27  dma-mapping: Generalise dma_32bit_limit flag  (Robin Murphy; 1 file, -1/+1)

Whilst the notion of an upstream DMA restriction is most commonly seen in PCI host bridges saddled with a 32-bit native interface, a more general version of the same issue can exist on complex SoCs where a bus or point-to-point interconnect link from a device's DMA master interface to another component along the path to memory (often an IOMMU) may carry fewer address bits than the interfaces at both ends nominally support.

In order to properly deal with this, the first step is to expand the dma_32bit_limit flag into an arbitrary mask. To minimise the impact on existing code, we'll make sure to only consider this new mask valid if set. That makes sense anyway, since a mask of zero would represent DMA not being wired up at all, and that would be better handled by not providing valid ops in the first place.

Signed-off-by: Robin Murphy <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
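Conceptually, mask resolution in a consumer then looks like this (a sketch; the struct device field introduced by this series is bus_dma_mask):

  /* Combine the driver-set mask with the bus-side limit;
   * zero means "no bus restriction expressed". */
  u64 end_mask = *dev->dma_mask;

  if (dev->bus_dma_mask)
          end_mask = min_t(u64, end_mask, dev->bus_dma_mask);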
2018-07-27  iommu: Add config option to set passthrough as default  (Olof Johansson; 1 file, -0/+8)

This allows the default behavior to be controlled by a kernel config option instead of changing the commandline for the kernel to include "iommu.passthrough=on" or "iommu=pt" on machines where this is desired. Likewise, for machines where this config option is enabled, it can be disabled at boot time with "iommu.passthrough=off" or "iommu=nopt".

Also corrected iommu=pt documentation for IA-64, since it has no code that parses iommu= at all.

Signed-off-by: Olof Johansson <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
2018-07-26  xen/spinlock: Don't use pvqspinlock if only 1 vCPU  (Waiman Long; 1 file, -0/+4)

On a VM with only 1 vCPU, the locking fast paths will always be successful. In this case, there is no need to use the PV qspinlock code, which has higher overhead on the unlock side than the native qspinlock code.

The xen_pvspin variable is also turned off in this 1-vCPU case to eliminate unneeded pvqspinlock initialization in xen_init_lock_cpu(), which is run after xen_init_spinlocks().

Signed-off-by: Waiman Long <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
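The guard is a one-liner at spinlock-init time; a sketch:

  void __init xen_init_spinlocks(void)
  {
          /* PV qspinlocks only pay off when vCPUs can contend. */
          if (num_possible_cpus() == 1)
                  xen_pvspin = false;

          if (!xen_pvspin) {
                  printk(KERN_DEBUG "xen: PV spinlocks disabled\n");
                  return;
          }
          /* ... pv_lock_ops wiring ... */
  }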
2018-07-26  kvm, mm: account shadow page tables to kmemcg  (Shakeel Butt; 1 file, -1/+1)

The size of kvm's shadow page tables corresponds to the size of the guest virtual machines on the system. Large VMs can spend a significant amount of memory as shadow page tables, which cannot be left as system memory overhead. So, account shadow page tables to the kmemcg.

[[email protected]: replace (GFP_KERNEL|__GFP_ACCOUNT) with GFP_KERNEL_ACCOUNT]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Shakeel Butt <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Peter Feiner <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
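The mechanics are a GFP-flag change at the shadow-page allocation site; a sketch (the surrounding topup helper is simplified away):

  /* Charged to the VM's memcg instead of bare kernel memory. */
  page = (void *)__get_free_page(GFP_KERNEL_ACCOUNT);  /* was GFP_KERNEL */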
2018-07-26  mm: use vma_init() to initialize VMAs on stack and data segments  (Kirill A. Shutemov; 1 file, -1/+1)

Make sure to initialize all VMAs properly, not only those which come from vm_area_cachep.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kirill A. Shutemov <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
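A usage sketch; vma_init() centralizes the field setup that on-stack users previously open-coded:

  #include <linux/mm.h>

  struct vm_area_struct vma;

  /* Instead of memset() plus manual vma.vm_mm = mm; ... */
  vma_init(&vma, mm);     /* sets vm_mm, a dummy vm_ops, anon_vma_chain */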
2018-07-26  x86/power/hibernate_64: Remove VLA usage  (Kees Cook; 1 file, -15/+21)

In the quest to remove all stack VLA usage from the kernel [1], this removes the discouraged use of AHASH_REQUEST_ON_STACK by switching to shash directly and allocating the descriptor in heap memory (which should be fine: the tfm has already been allocated there too).

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <[email protected]>
Acked-by: Pavel Machek <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
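A hedged sketch of the shash-on-heap pattern (the algorithm name follows the hibernation code's existing MD5 digest of the e820 map; variable names are illustrative and error paths are trimmed):

  struct crypto_shash *tfm;
  struct shash_desc *desc;

  tfm = crypto_alloc_shash("md5", 0, 0);
  if (IS_ERR(tfm))
          return PTR_ERR(tfm);

  /* Descriptor lives on the heap, next to the tfm, not the stack. */
  desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL);
  if (!desc) {
          crypto_free_shash(tfm);
          return -ENOMEM;
  }
  desc->tfm = tfm;
  desc->flags = 0;        /* field existed in this era's shash API */

  ret = crypto_shash_digest(desc, (u8 *)table, size, digest);

  kzfree(desc);
  crypto_free_shash(tfm);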
2018-07-26  bpf, x32: Fix regression caused by commit 24dea04767e6  (Wang YanQing; 1 file, -4/+4)

Commit 24dea04767e6 ("bpf, x32: remove ld_abs/ld_ind") removed the "4 /* Extra space for skb_copy_bits buffer */" term from _STACK_SIZE, but it didn't adjust the corresponding code in emit_prologue and emit_epilogue, and this mismatch leads to very strange kernel runtime errors. This patch fixes it.

Fixes: 24dea04767e6 ("bpf, x32: remove ld_abs/ld_ind")
Reported-by: Meelis Roos <[email protected]>
Bisected-by: Meelis Roos <[email protected]>
Signed-off-by: Wang YanQing <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
2018-07-25  x86/boot: Fix if_changed build flip/flop bug  (Kees Cook; 1 file, -2/+6)

Dirk Gouders reported that two consecutive "make" invocations on an already compiled tree will show alternating behaviors:

  $ make
    CALL    scripts/checksyscalls.sh
    DESCEND objtool
    CHK     include/generated/compile.h
    DATAREL arch/x86/boot/compressed/vmlinux
  Kernel: arch/x86/boot/bzImage is ready  (#48)
    Building modules, stage 2.
    MODPOST 165 modules

  $ make
    CALL    scripts/checksyscalls.sh
    DESCEND objtool
    CHK     include/generated/compile.h
    LD      arch/x86/boot/compressed/vmlinux
    ZOFFSET arch/x86/boot/zoffset.h
    AS      arch/x86/boot/header.o
    LD      arch/x86/boot/setup.elf
    OBJCOPY arch/x86/boot/setup.bin
    OBJCOPY arch/x86/boot/vmlinux.bin
    BUILD   arch/x86/boot/bzImage
  Setup is 15644 bytes (padded to 15872 bytes).
  System is 6663 kB
  CRC 3eb90f40
  Kernel: arch/x86/boot/bzImage is ready  (#48)
    Building modules, stage 2.
    MODPOST 165 modules

He bisected it back to:

  commit 98f78525371b ("x86/boot: Refuse to build with data relocations")

The root cause was the use of the "if_changed" kbuild function multiple times for the same target. It was designed to only be used once per target, otherwise it will effectively always trigger, flipping back and forth between the two commands getting recorded by "if_changed". Instead, this patch merges the two commands into a single function to get stable build artifacts (i.e. .vmlinux.cmd), and a single build behavior.

Bisected-and-Reported-by: Dirk Gouders <[email protected]>
Fix-Suggested-by: Masahiro Yamada <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Reviewed-by: Masahiro Yamada <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/20180724230827.GA37823@beast
Signed-off-by: Ingo Molnar <[email protected]>
2018-07-25  locking/atomics: Instrument xchg()  (Mark Rutland; 3 files, -3/+3)

While we instrument all of the (non-relaxed) atomic_*() functions and cmpxchg(), we missed xchg(). Let's add instrumentation for xchg(), fixing up x86 to implement arch_xchg().

Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
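A simplified sketch of the instrumented wrapper (the real macro lives in include/asm-generic/atomic-instrumented.h):

  #define xchg(ptr, new)                                          \
  ({                                                              \
          typeof(ptr) __ai_ptr = (ptr);                           \
          kasan_check_write(__ai_ptr, sizeof(*__ai_ptr));         \
          arch_xchg(__ai_ptr, (new));                             \
  })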
2018-07-25  locking/atomics/x86: Reduce arch_cmpxchg64*() instrumentation  (Mark Rutland; 1 file, -2/+2)

Currently x86's arch_cmpxchg64() and arch_cmpxchg64_local() are instrumented twice, as they call into instrumented atomics rather than their arch_ equivalents. A call to cmpxchg64() results in:

  cmpxchg64()
    kasan_check_write()
    arch_cmpxchg64()
      cmpxchg()
        kasan_check_write()
        arch_cmpxchg()

Let's fix this up and call the arch_ equivalents, resulting in:

  cmpxchg64()
    kasan_check_write()
    arch_cmpxchg64()
      arch_cmpxchg()

Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Will Deacon <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
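On x86 the fix is one level of macro plumbing; sketched (the BUILD_BUG_ON size guard is omitted):

  /* Before: expands to the *instrumented* cmpxchg(), double-checking. */
  #define arch_cmpxchg64(ptr, o, n)       cmpxchg((ptr), (o), (n))

  /* After: bottoms out in the arch_ variant; one KASAN check per call. */
  #define arch_cmpxchg64(ptr, o, n)       arch_cmpxchg((ptr), (o), (n))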
2018-07-25  perf/x86/intel: Support Extended PEBS for Goldmont Plus  (Kan Liang; 2 files, -7/+1)

Enable the extended PEBS for Goldmont Plus. There are no Goldmont Plus specific PEBS constraints, so remove its pebs_constraints table.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
2018-07-25  perf/x86/intel/ds: Handle PEBS overflow for fixed counters  (Kan Liang; 3 files, -11/+33)

The pebs_drain() path needs to support fixed counters. The DS Save Area now includes "counter reset value" fields for each fixed counter, so extend the related variables (e.g. mask, counters, error) to cover the fixed counters. There is no extended PEBS in PEBS v2 and earlier PEBS formats; only the code for PEBS v3 and later formats needs to change.

Extend the pebs_event_reset[] logic to support the new "counter reset value" fields.

Increase the reserved space for fixed counters.

Based-on-code-from: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
2018-07-25  perf/x86/intel: Support PEBS on fixed counters  (Kan Liang; 1 file, -11/+13)

The Extended PEBS feature supports PEBS on fixed-function performance counters as well as on all four general purpose counters. The order of PEBS and fixed counter enabling has to change to make sure PEBS is enabled for the fixed counters. The changed order doesn't impact the behavior of current code on platforms that don't support extended PEBS, because there is no dependency among those enable/disable functions.

Don't enable IRQ generation (0x8) for MSR_ARCH_PERFMON_FIXED_CTR_CTRL; the PEBS ucode will handle the interrupt generation.

Based-on-code-from: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
2018-07-25  perf/x86/intel: Introduce PMU flag for Extended PEBS  (Kan Liang; 2 files, -0/+8)

The Extended PEBS feature, introduced in the Goldmont Plus microarchitecture, supports all events as "Extended PEBS". Introduce the flag PMU_FL_PEBS_ALL to indicate the platforms which support extended PEBS.

Supporting all events means supporting all constraints for PEBS. To avoid duplicating all the constraints in the PEBS table, make the PEBS code search the normal constraints too.

Based-on-code-from: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
2018-07-25  Merge branch 'perf/urgent' into perf/core, to pick up fixes  (Ingo Molnar; 41 files, -121/+262)

Signed-off-by: Ingo Molnar <[email protected]>
2018-07-25  perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)  (Peter Zijlstra; 2 files, -14/+14)

Vince reported the perf_fuzzer giving various unwinder warnings, and Josh reported:

> Deja vu. Most of these are related to perf PEBS, similar to the
> following issue:
>
>   b8000586c90b ("perf/x86/intel: Cure bogus unwind from PEBS entries")
>
> This is basically the ORC version of that. setup_pebs_sample_data() is
> assembling a franken-pt_regs which ORC isn't happy about. RIP is
> inconsistent with some of the other registers (like RSP and RBP).

And where the previous unwinder only needed BP and SP, ORC also requires IP. But we cannot spoof IP because then the sample will get displaced, entirely negating the point of PEBS.

So cure the whole thing differently by doing the unwind early; this does however require a means to communicate that we did the unwind early. We (ab)use an unused sample_type bit for this, which we set on events that fill out data->callchain before the normal perf_prepare_sample().

Debugged-by: Josh Poimboeuf <[email protected]>
Reported-by: Vince Weaver <[email protected]>
Tested-by: Josh Poimboeuf <[email protected]>
Tested-by: Prashant Bhole <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>