aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2018-06-22Merge branch 'linus' into x86/urgentThomas Gleixner1725-28656/+51133
Required to queue a dependent fix.
2018-06-22rseq: Avoid infinite recursion when delivering SIGSEGVWill Deacon4-6/+6
When delivering a signal to a task that is using rseq, we call into __rseq_handle_notify_resume() so that the registers pushed in the sigframe are updated to reflect the state of the restartable sequence (for example, ensuring that the signal returns to the abort handler if necessary). However, if the rseq management fails due to an unrecoverable fault when accessing userspace or certain combinations of RSEQ_CS_* flags, then we will attempt to deliver a SIGSEGV. This has the potential for infinite recursion if the rseq code continuously fails on signal delivery. Avoid this problem by using force_sigsegv() instead of force_sig(), which is explicitly designed to reset the SEGV handler to SIG_DFL in the case of a recursive fault. In doing so, remove rseq_signal_deliver() from the internal rseq API and have an optional struct ksignal * parameter to rseq_handle_notify_resume() instead. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Mathieu Desnoyers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-06-22arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenanceWill Deacon1-2/+3
When rewriting swapper using nG mappings, we must performance cache maintenance around each page table access in order to avoid coherency problems with the host's cacheable alias under KVM. To ensure correct ordering of the maintenance with respect to Device memory accesses made with the Stage-1 MMU disabled, DMBs need to be added between the maintenance and the corresponding memory access. This patch adds a missing DMB between writing a new page table entry and performing a clean+invalidate on the same line. Fixes: f992b4dfd58b ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings") Cc: <[email protected]> # 4.16.x- Acked-by: Mark Rutland <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2018-06-22arm64: kpti: Use early_param for kpti= command-line optionWill Deacon1-1/+1
We inspect __kpti_forced early on as part of the cpufeature enable callback which remaps the swapper page table using non-global entries. Ensure that __kpti_forced has been updated to reflect the kpti= command-line option before we start using it. Fixes: ea1e3de85e94 ("arm64: entry: Add fake CPU feature for unmapping the kernel at EL0") Cc: <[email protected]> # 4.16.x- Reported-by: Wei Xu <[email protected]> Tested-by: Sudeep Holla <[email protected]> Tested-by: Wei Xu <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2018-06-22kvm: vmx: Nested VM-entry prereqs for event inj.Marc Orr3-0/+79
This patch extends the checks done prior to a nested VM entry. Specifically, it extends the check_vmentry_prereqs function with checks for fields relevant to the VM-entry event injection information, as described in the Intel SDM, volume 3. This patch is motivated by a syzkaller bug, where a bad VM-entry interruption information field is generated in the VMCS02, which causes the nested VM launch to fail. Then, KVM fails to resume L1. While KVM should be improved to correctly resume L1 execution after a failed nested launch, this change is justified because the existing code to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is sparse. Reported-by: syzbot <[email protected]> Signed-off-by: Marc Orr <[email protected]> [Removed comment whose parts were describing previous revisions and the rest was obvious from function/variable naming. - Radim] Signed-off-by: Radim Krčmář <[email protected]>
2018-06-22x86/microcode/intel: Fix memleak in save_microcode_patch()Zhenzhong Duan1-1/+4
Free useless ucode_patch entry when it's replaced. [ bp: Drop the memfree_patch() two-liner. ] Signed-off-by: Zhenzhong Duan <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Srinivas REDDY Eeda <[email protected]> Link: http://lkml.kernel.org/r/888102f0-fd22-459d-b090-a1bd8a00cb2b@default
2018-06-22x86/mce: Fix incorrect "Machine check from unknown source" messageTony Luck1-8/+18
Some injection testing resulted in the following console log: mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134 mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]} mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88 mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043 mce: [Hardware Error]: Run the above through 'mcelog --ascii' Kernel panic - not syncing: Machine check from unknown source This confused everybody because the first line quite clearly shows that we found a logged error in "Bank 1", while the last line says "unknown source". The problem is that the Linux code doesn't do the right thing for a local machine check that results in a fatal error. It turns out that we know very early in the handler whether the machine check is fatal. The call to mce_no_way_out() has checked all the banks for the CPU that took the local machine check. If it says we must crash, we can do so right away with the right messages. We do scan all the banks again. This means that we might initially not see a problem, but during the second scan find something fatal. If this happens we print a slightly different message (so I can see if it actually every happens). [ bp: Remove unneeded severity assignment. ] Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Ashok Raj <[email protected]> Cc: Dan Williams <[email protected]> Cc: Qiuxu Zhuo <[email protected]> Cc: linux-edac <[email protected]> Cc: [email protected] # 4.2 Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
2018-06-22x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()Borislav Petkov1-8/+10
mce_no_way_out() does a quick check during #MC to see whether some of the MCEs logged would require the kernel to panic immediately. And it passes a struct mce where MCi_STATUS gets written. However, after having saved a valid status value, the next iteration of the loop which goes over the MCA banks on the CPU, overwrites the valid status value because we're using struct mce as storage instead of a temporary variable. Which leads to MCE records with an empty status value: mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000 mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10} In order to prevent the loss of the status register value, return immediately when severity is a panic one so that we can panic immediately with the first fatal MCE logged. This is also the intention of this function and not to noodle over the banks while a fatal MCE is already logged. Tony: read the rest of the MCA bank to populate the struct mce fully. Suggested-by: Tony Luck <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-21ARC: [plat-hsdk] Add PCIe supportGustavo Pimentel1-0/+1
Add PCI support to the ARC HSDK platform allowing to use the generic PCI setup functions. Signed-off-by: Gustavo Pimentel <[email protected]> Acked-by: Alexey Brodkin <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2018-06-21uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()Oleg Nesterov1-1/+1
insn_get_length() has the side-effect of processing the entire instruction but only if it was decoded successfully, otherwise insn_complete() can fail and in this case we need to just return an error without warning. Reported-by: [email protected] Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/lkml/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21x86/platform/UV: Add kernel parameter to set memory block size[email protected]1-0/+11
Add a kernel parameter that allows setting UV memory block size. This is to provide an adjustment for new forms of PMEM and other DIMM memory that might require alignment restrictions other than scanning the global address table for the required minimum alignment. The value set will be further adjusted by both the GAM range table scan as well as restrictions imposed by set_memory_block_size_order(). Signed-off-by: Mike Travis <[email protected]> Reviewed-by: Andrew Banman <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dimitri Sivanich <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russ Anderson <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/lkml/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21x86/platform/UV: Use new set memory block size function[email protected]1-3/+46
Add a call to the new function to "adjust" the current fixed UV memory block size of 2GB so it can be changed to a different physical boundary. This accommodates changes in the Intel BIOS, and therefore UV BIOS, which now can align boundaries different than the previous UV standard of 2GB. It also flags any UV Global Address boundaries from BIOS that cause a change in the mem block size (boundary). The current boundary of 2GB has been used on UV since the first system release in 2009 with Linux 2.6 and has worked fine. But the new NVDIMM persistent memory modules (PMEM), along with the Intel BIOS changes to support these modules caused the memory block size boundary to be set to a lower limit. Intel only guarantees that this minimum boundary at 64MB though the current Linux limit is 128MB. Note that the default remains 2GB if no changes occur. Signed-off-by: Mike Travis <[email protected]> Reviewed-by: Andrew Banman <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dimitri Sivanich <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russ Anderson <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/lkml/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21x86/platform/UV: Add adjustable set memory block size function[email protected]1-4/+16
Add a new function to "adjust" the current fixed UV memory block size of 2GB so it can be changed to a different physical boundary. This is out of necessity so arch dependent code can accommodate specific BIOS requirements which can align these new PMEM modules at less than the default boundaries. A "set order" type of function was used to insure that the memory block size will be a power of two value without requiring a validity check. 64GB was chosen as the upper limit for memory block size values to accommodate upcoming 4PB systems which have 6 more bits of physical address space (46 becoming 52). Signed-off-by: Mike Travis <[email protected]> Reviewed-by: Andrew Banman <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dimitri Sivanich <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russ Anderson <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/lkml/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()Dan Williams1-1/+1
Mark Rutland noticed that GCC optimization passes have the potential to elide necessary invocations of the array_index_mask_nospec() instruction sequence, so mark the asm() volatile. Mark explains: "The volatile will inhibit *some* cases where the compiler could lift the array_index_nospec() call out of a branch, e.g. where there are multiple invocations of array_index_nospec() with the same arguments: if (idx < foo) { idx1 = array_idx_nospec(idx, foo) do_something(idx1); } < some other code > if (idx < foo) { idx2 = array_idx_nospec(idx, foo); do_something_else(idx2); } ... since the compiler can determine that the two invocations yield the same result, and reuse the first result (likely the same register as idx was in originally) for the second branch, effectively re-writing the above as: if (idx < foo) { idx = array_idx_nospec(idx, foo); do_something(idx); } < some other code > if (idx < foo) { do_something_else(idx); } ... if we don't take the first branch, then speculatively take the second, we lose the nospec protection. There's more info on volatile asm in the GCC docs: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile " Reported-by: Mark Rutland <[email protected]> Signed-off-by: Dan Williams <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Linus Torvalds <[email protected]> Cc: <[email protected]> Cc: Peter Zijlstra <[email protected]> Fixes: babdde2698d4 ("x86: Implement array_index_mask_nospec") Link: https://lkml.kernel.org/lkml/152838798950.14521.4893346294059739135.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21x86/pti: Don't report XenPV as vulnerableJiri Kosina1-0/+4
Xen PV domain kernel is not by design affected by meltdown as it's enforcing split CR3 itself. Let's not report such systems as "Vulnerable" in sysfs (we're also already forcing PTI to off in X86_HYPER_XEN_PV cases); the security of the system ultimately depends on presence of mitigation in the Hypervisor, which can't be easily detected from DomU; let's report that. Reported-and-tested-by: Mike Latimer <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Juergen Gross <[email protected]> Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] [ Merge the user-visible string into a single line. ] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21microblaze: consolidate GPIO reset handlingRob Herring4-30/+7
Now that platform.c only has the GPIO reset handling left, move the initcall to reset.c and remove platform.c. Cc: Michal Simek <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Michal Simek <[email protected]>
2018-06-21microblaze: remove unecessary of_platform_bus_probe callRob Herring1-7/+0
The call to of_platform_bus_probe has no effect because the DT core already probes default buses like "simple-bus" before this call. Michal Simek said 'xlnx,compound' hasn't been used in a long time, so that match entry isn't needed. Cc: Michal Simek <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Michal Simek <[email protected]>
2018-06-21microblaze: Add new syscalls io_pgetevents and rseqMichal Simek3-1/+5
Wire up new syscalls io_pgetevents and rseq. Signed-off-by: Michal Simek <[email protected]>
2018-06-21x86/build: Remove unnecessary preparation for purgatoryMasahiro Yamada1-5/+0
kexec-purgatory.c is properly generated when Kbuild descend into the arch/x86/purgatory/. Thus the 'archprepare' target is redundant. Signed-off-by: Masahiro Yamada <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michal Marek <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21Revert "kexec/purgatory: Add clean-up for purgatory directory"Masahiro Yamada1-1/+0
Reverts the following commit: b0108f9e93d0 ("kexec: purgatory: add clean-up for purgatory directory") ... which incorrectly stated that the kexec-purgatory.c and purgatory.ro files were not removed after 'make mrproper'. In fact, they are. You can confirm it after reverting it. $ make mrproper $ touch arch/x86/purgatory/kexec-purgatory.c $ touch arch/x86/purgatory/purgatory.ro $ make mrproper CLEAN arch/x86/purgatory $ ls arch/x86/purgatory/ entry64.S Makefile purgatory.c setup-x86_64.S stack.S string.c This is obvious from the build system point of view. arch/x86/Makefile adds 'arch/x86' to core-y. Hence 'make clean' descends like this: arch/x86/Kbuild -> arch/x86/purgatory/Makefile Signed-off-by: Masahiro Yamada <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michal Marek <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21x86/xen: Add call of speculative_store_bypass_ht_init() to PV pathsJuergen Gross1-0/+5
Commit: 1f50ddb4f418 ("x86/speculation: Handle HT correctly on AMD") ... added speculative_store_bypass_ht_init() to the per-CPU initialization sequence. speculative_store_bypass_ht_init() needs to be called on each CPU for PV guests, too. Reported-by: Brian Woods <[email protected]> Tested-by: Brian Woods <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Cc: <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Fixes: 1f50ddb4f4189243c05926b842dc1a0332195f31 ("x86/speculation: Handle HT correctly on AMD") Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-06-21KVM: arm64: Avoid mistaken attempts to save SVE state for vcpusDave Martin1-3/+3
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") uses fpsimd_save() to save the FPSIMD state for a vcpu when scheduling the vcpu out. However, currently current's value of TIF_SVE is restored before calling fpsimd_save() which means that fpsimd_save() may erroneously attempt to save SVE state from the vcpu. This enables current's vector state to be polluted with guest data. current->thread.sve_state may be unallocated or not large enough, so this can also trigger a NULL dereference or buffer overrun. Instead of this, TIF_SVE should be configured properly for the guest when calling fpsimd_save() with the vcpu context loaded. This patch ensures this by delaying restoration of current's TIF_SVE until after the call to fpsimd_save(). Fixes: e6b673b741ea ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") Signed-off-by: Dave Martin <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2018-06-21KVM: arm64/sve: Fix SVE trap restoration for non-current tasksDave Martin2-4/+21
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") attempts to restore the configuration of userspace SVE trapping via a call to fpsimd_bind_task_to_cpu(), but the logic for determining when to do this is not correct. The patch makes the errnoenous assumption that the only task that may try to enter userspace with the currently loaded FPSIMD/SVE register content is current. This may not be the case however: if some other user task T is scheduled on the CPU during the execution of the KVM run loop, and the vcpu does not try to use the registers in the meantime, then T's state may be left there intact. If T happens to be the next task to enter userspace on this CPU then the hooks for reloading the register state and configuring traps will be skipped. (Also, current never has SVE state at this point anyway and should always have the trap enabled, as a side-effect of the ioctl() syscall needed to reach the KVM run loop in the first place.) This patch instead restores the state of the EL0 trap from the state observed at the most recent vcpu_load(), ensuring that the trap is set correctly for the loaded context (if any). Fixes: e6b673b741ea ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") Signed-off-by: Dave Martin <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2018-06-21KVM: arm64: Don't mask softirq with IRQs disabled in vcpu_put()Dave Martin1-3/+5
Commit e6b673b ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") introduces a specific helper kvm_arch_vcpu_put_fp() for saving the vcpu FPSIMD state during vcpu_put(). This function uses local_bh_disable()/_enable() to protect the FPSIMD context manipulation from interruption by softirqs. This approach is not correct, because vcpu_put() can be invoked either from the KVM host vcpu thread (when exiting the vcpu run loop), or via a preempt notifier. In the former case, only preemption is disabled. In the latter case, the function is called from inside __schedule(), which means that IRQs are disabled. Use of local_bh_disable()/_enable() with IRQs disabled is considerd an error, resulting in lockdep splats while running VMs if lockdep is enabled. This patch disables IRQs instead of attempting to disable softirqs, avoiding the problem of calling local_bh_enable() with IRQs disabled in the __schedule() path. This creates an additional interrupt blackout during vcpu run loop exit, but this is the rare case and the blackout latency is still less than that of __schedule(). Fixes: e6b673b741ea ("KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing") Reported-by: Andre Przywara <[email protected]> Signed-off-by: Dave Martin <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2018-06-21arm64: Introduce sysreg_clear_set()Mark Rutland1-0/+11
Currently we have a couple of helpers to manipulate bits in particular sysregs: * config_sctlr_el1(u32 clear, u32 set) * change_cpacr(u64 val, u64 mask) The parameters of these differ in naming convention, order, and size, which is unfortunate. They also differ slightly in behaviour, as change_cpacr() skips the sysreg write if the bits are unchanged, which is a useful optimization when sysreg writes are expensive. Before we gain yet another sysreg manipulation function, let's unify these with a common helper, providing a consistent order for clear/set operands, and the write skipping behaviour from change_cpacr(). Code will be migrated to the new helper in subsequent patches. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Dave Martin <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2018-06-20ARC: Enable machine_desc->init_per_cpu for !CONFIG_SMPAlexey Brodkin2-3/+1
machine_desc->init_per_cpu() hook is supposed to be per cpu initialization and would seem to apply equally to UP and/or SMP. Infact the comment in header file seems to suggest it works for UP too, which was not the case and this patch. This enables !CONFIG_SMP build for platforms such as hsdk. Signed-off-by: Alexey Brodkin <[email protected]> Signed-off-by: Vineet Gupta <[email protected]> [vgupta: trimmeed changelog]
2018-06-20x86: Call fixup_exception() before notify_die() in math_error()Siarhei Liakh1-6/+8
fpu__drop() has an explicit fwait which under some conditions can trigger a fixable FPU exception while in kernel. Thus, we should attempt to fixup the exception first, and only call notify_die() if the fixup failed just like in do_general_protection(). The original call sequence incorrectly triggers KDB entry on debug kernels under particular FPU-intensive workloads. Andy noted, that this makes the whole conditional irq enable thing even more inconsistent, but fixing that it outside the scope of this. Signed-off-by: Siarhei Liakh <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: "Borislav Petkov" <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/DM5PR11MB201156F1CAB2592B07C79A03B17D0@DM5PR11MB2011.namprd11.prod.outlook.com
2018-06-19MIPS: Wire up io_pgetevents syscallPaul Burton5-6/+13
Wire up the io_pgetevents syscall that was introduced by commit 7a074e96dee6 ("aio: implement io_pgetevents"). Signed-off-by: Paul Burton <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/19593/ Cc: James Hogan <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: [email protected]
2018-06-19MIPS: Wire up the restartable sequences (rseq) syscallPaul Burton5-6/+13
Wire up the restartable sequences (rseq) syscall for MIPS. This was introduced by commit d7822b1e24f2 ("rseq: Introduce restartable sequences system call") & MIPS now supports the prerequisites. Signed-off-by: Paul Burton <[email protected]> Reviewed-by: James Hogan <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/19525/ Cc: Ralf Baechle <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Boqun Feng <[email protected]> Cc: [email protected] Cc: [email protected]
2018-06-19MIPS: Add syscall detection for restartable sequencesPaul Burton1-0/+8
Syscalls are not allowed inside restartable sequences, so add a call to rseq_syscall() at the very beginning of the system call exit path when CONFIG_DEBUG_RSEQ=y. This will help us to detect whether there is a syscall issued erroneously inside a restartable sequence. Signed-off-by: Paul Burton <[email protected]> Reviewed-by: James Hogan <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/19522/ Cc: Ralf Baechle <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Boqun Feng <[email protected]> Cc: [email protected] Cc: [email protected]
2018-06-19MIPS: Add support for restartable sequencesPaul Burton2-0/+4
Implement support for restartable sequences on MIPS, which requires 3 simple things: - Call rseq_handle_notify_resume() on return to userspace if TIF_NOTIFY_RESUME is set. - Call rseq_signal_deliver() to fixup the pre-signal stack frame when a signal is delivered whilst executing a restartable sequence critical section. - Select CONFIG_HAVE_RSEQ. Signed-off-by: Paul Burton <[email protected]> Reviewed-by: James Hogan <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/19523/ Cc: Ralf Baechle <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Boqun Feng <[email protected]> Cc: [email protected] Cc: [email protected]
2018-06-19MIPS: io: Add barrier after register read in inX()Huacai Chen1-0/+2
While a barrier is present in the outX() functions before the register write, a similar barrier is missing in the inX() functions after the register read. This could allow memory accesses following inX() to observe stale data. This patch is very similar to commit a1cc7034e33d12dc1 ("MIPS: io: Add barrier after register read in readX()"). Because war_io_reorder_wmb() is both used by writeX() and outX(), if readX() need a barrier then so does inX(). Cc: [email protected] Signed-off-by: Huacai Chen <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/19516/ Signed-off-by: Paul Burton <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: Fuxin Zhang <[email protected]> Cc: Zhangjin Wu <[email protected]> Cc: Huacai Chen <[email protected]>
2018-06-20powerpc/mm/hash/4k: Free hugetlb page table caches correctly.Aneesh Kumar K.V8-1/+52
With 4k page size for hugetlb we allocate hugepage directories from its on slab cache. With patch 0c4d26802 ("powerpc/book3s64/mm: Simplify the rcu callback for page table free") we missed to free these allocated hugepd tables. Update pgtable_free to handle hugetlb hugepd directory table. Fixes: 0c4d268029bf ("powerpc/book3s64/mm: Simplify the rcu callback for page table free") Signed-off-by: Aneesh Kumar K.V <[email protected]> [mpe: Add CONFIG_HUGETLB_PAGE guard to fix build break] Signed-off-by: Michael Ellerman <[email protected]>
2018-06-20powerpc/64s/radix: Fix radix_kvm_prefetch_workaround paca access of not ↵Nicholas Piggin1-0/+2
possible CPU If possible CPUs are limited (e.g., by kexec), then the kvm prefetch workaround function can access the paca pointer for a !possible CPU. Fixes: d2e60075a3d44 ("powerpc/64: Use array of paca pointers and allocate pacas individually") Cc: [email protected] Reported-by: Pridhiviraj Paidipeddi <[email protected]> Tested-by: Pridhiviraj Paidipeddi <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19mips: ftrace: fix static function graph tracingMatthias Schiffer1-15/+12
ftrace_graph_caller was never run after calling ftrace_trace_function, breaking the function graph tracer. Fix this, bringing it in line with the x86 implementation. While we're at it, also streamline the control flow of _mcount a bit to reduce the number of branches. This issue was reported before: https://www.linux-mips.org/archives/linux-mips/2014-11/msg00295.html Signed-off-by: Matthias Schiffer <[email protected]> Tested-by: Matt Redfearn <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/18929/ Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] # v3.17+
2018-06-19arm64: make secondary_start_kernel() notraceZhizhou Zhang1-1/+1
We can't call function trace hook before setup percpu offset. When entering secondary_start_kernel(), percpu offset has not been initialized. So this lead hotplug malfunction. Here is the flow to reproduce this bug: echo 0 > /sys/devices/system/cpu/cpu1/online echo function > /sys/kernel/debug/tracing/current_tracer echo 1 > /sys/kernel/debug/tracing/tracing_on echo 1 > /sys/devices/system/cpu/cpu1/online Acked-by: Mark Rutland <[email protected]> Tested-by: Suzuki K Poulose <[email protected]> Signed-off-by: Zhizhou Zhang <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2018-06-19arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flagMarek Szyprowski1-4/+5
dma_alloc_*() buffers might be exposed to userspace via mmap() call, so they should be cleared on allocation. In case of IOMMU-based dma-mapping implementation such buffer clearing was missing in the code path for DMA_ATTR_FORCE_CONTIGUOUS flag handling, because dma_alloc_from_contiguous() doesn't honor __GFP_ZERO flag. This patch fixes this issue. For more information on clearing buffers allocated by dma_alloc_* functions, see commit 6829e274a623 ("arm64: dma-mapping: always clear allocated buffers"). Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU") Signed-off-by: Marek Szyprowski <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2018-06-19powerpc/64s: Fix build failures with CONFIG_NMI_IPI=nMichael Ellerman2-3/+3
I broke the build when CONFIG_NMI_IPI=n with my recent commit to add arch_trigger_cpumask_backtrace(), eg: stacktrace.c:(.text+0x1b0): undefined reference to `.smp_send_safe_nmi_ipi' We should rework the CONFIG symbols here in future to avoid these double barrelled ifdefs but for now they fix the build. Fixes: 5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()") Reported-by: Christophe LEROY <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19xen: share start flags between PV and PVHRoger Pau Monne4-3/+13
Use a global variable to store the start flags for both PV and PVH. This allows the xen_initial_domain macro to work properly on PVH. Note that ARM is also switched to use the new variable. Signed-off-by: Boris Ostrovsky <[email protected]> Signed-off-by: Roger Pau Monné <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-06-19powerpc/64: hard disable irqs on the panic()ing CPUNicholas Piggin1-2/+10
Similar to previous patches, hard disable interrupts when a CPU is in panic. This reduces the chance the watchdog has to interfere with the panic, and avoids any other type of masked interrupt being executed when crashing which minimises the length of the crash path. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19powerpc: smp_send_stop do not offline stopped CPUsNicholas Piggin1-6/+0
Marking CPUs stopped by smp_send_stop as offline can cause warnings due to cross-CPU wakeups. This trace was noticed on a busy system running a sysrq+c crash test, after the injected crash: WARNING: CPU: 51 PID: 1546 at kernel/sched/core.c:1179 set_task_cpu+0x22c/0x240 CPU: 51 PID: 1546 Comm: kworker/u352:1 Tainted: G D Workqueue: mlx5e mlx5e_update_stats_work [mlx5_core] [...] NIP [c00000000017c21c] set_task_cpu+0x22c/0x240 LR [c00000000017d580] try_to_wake_up+0x230/0x720 Call Trace: [c000000001017700] runqueues+0x0/0xb00 (unreliable) [c00000000017d580] try_to_wake_up+0x230/0x720 [c00000000015a214] insert_work+0x104/0x140 [c00000000015adb0] __queue_work+0x230/0x690 [c000003fc5007910] [c00000000015b26c] queue_work_on+0x5c/0x90 [c0080000135fc8f8] mlx5_cmd_exec+0x538/0xcb0 [mlx5_core] [c008000013608fd0] mlx5_core_access_reg+0x140/0x1d0 [mlx5_core] [c00800001362777c] mlx5e_update_pport_counters.constprop.59+0x6c/0x90 [mlx5_core] [c008000013628868] mlx5e_update_ndo_stats+0x28/0x90 [mlx5_core] [c008000013625558] mlx5e_update_stats_work+0x68/0xb0 [mlx5_core] [c00000000015bcec] process_one_work+0x1bc/0x5f0 [c00000000015ecac] worker_thread+0xac/0x6b0 [c000000000168338] kthread+0x168/0x1b0 [c00000000000b628] ret_from_kernel_thread+0x5c/0xb4 This happens because firstly the CPU is not really offline in the usual sense, processes and interrupts have not been migrated away. Secondly smp_send_stop does not happen atomically on all CPUs, so one CPU can have marked itself offline, while another CPU is still running processes or interrupts which can affect the first CPU. Fix this by just not marking the CPU as offline. It's more like frozen in time, so offline does not really reflect its state properly anyway. There should be nothing in the crash/panic path that walks online CPUs and synchronously waits for them, so this change should not introduce new hangs. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19powerpc/64: hard disable irqs in panic_smp_self_stopNicholas Piggin1-0/+8
Similarly to commit 855bfe0de1 ("powerpc: hard disable irqs in smp_send_stop loop"), irqs should be hard disabled by panic_smp_self_stop. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19powerpc/64s: Fix DT CPU features Power9 DD2.1 logicMichael Ellerman1-1/+2
In the device tree CPU features quirk code we want to set CPU_FTR_POWER9_DD2_1 on all Power9s that aren't DD2.0 or earlier. But we got the logic wrong and instead set it on all CPUs that aren't Power9 DD2.0 or earlier, ie. including Power8. Fix it by making sure we're on a Power9. This isn't a bug in practice because the only code that checks the feature is Power9 only to begin with. But we'll backport it anyway to avoid confusion. Fixes: 9e9626ed3a4a ("powerpc/64s: Fix POWER9 DD2.2 and above in DT CPU features") Cc: [email protected] # v4.17+ Reported-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Acked-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19powerpc/64s/radix: Fix MADV_[FREE|DONTNEED] TLB flush miss problem with THPNicholas Piggin1-21/+75
The patch 99baac21e4 ("mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem") added a force flush mode to the mmu_gather flush, which unconditionally flushes the entire address range being invalidated (even if actual ptes only covered a smaller range), to solve a problem with concurrent threads invalidating the same PTEs causing them to miss TLBs that need flushing. This does not work with powerpc that invalidates mmu_gather batches according to page size. Have powerpc flush all possible page sizes in the range if it encounters this concurrency condition. Patch 4647706ebe ("mm: always flush VMA ranges affected by zap_page_range") does add a TLB flush for all page sizes on powerpc for the zap_page_range case, but that is to be removed and replaced with the mmu_gather flush to avoid redundant flushing. It is also thought to not cover other obscure race conditions: https://lkml.kernel.org/r/[email protected] Hash does not have a problem because it invalidates TLBs inside the page table locks. Reported-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19powerpc/e500mc: Set assembler machine type to e500mcMichael Jeanson1-0/+1
In binutils 2.26 a new opcode for the "wait" instruction was added for the POWER9 and has precedence over the one specific to the e500mc. Commit ebf714ff3756 ("powerpc/e500mc: Add support for the wait instruction in e500_idle") uses this instruction specifically on the e500mc to work around an erratum. This results in an invalid instruction in idle_e500 when we build for the e500mc on bintutils >= 2.26 with the default assembler machine type. Since multiplatform between e500 and non-e500 is not supported, set the assembler machine type globaly when CONFIG_PPC_E500MC=y. Signed-off-by: Michael Jeanson <[email protected]> Reviewed-by: Mathieu Desnoyers <[email protected]> CC: Benjamin Herrenschmidt <[email protected]> CC: Paul Mackerras <[email protected]> CC: Michael Ellerman <[email protected]> CC: Kumar Gala <[email protected]> CC: Vakul Garg <[email protected]> CC: Scott Wood <[email protected]> CC: Mathieu Desnoyers <[email protected]> CC: [email protected] CC: [email protected] CC: [email protected] Signed-off-by: Michael Ellerman <[email protected]>
2018-06-19nds32: Fix build error caused by configuration flag renameGuenter Roeck1-6/+6
Fix build error on nds32 due to the merge of commit e3d5980568f ("lib: Rename compiler intrinsic selects to GENERIC_LIB_*") during the 4.18 merge window which renames Kconfig symbols. This had raced with commit aeaa7af744fa ("nds32: lib: To use generic lib instead of libgcc to prevent the symbol undefined issue.") merged late in the 4.17 cycle, which added selects to nds32 using the original Kconfig symbol names. When they came together in merge commit 763f96944c95 ("Merge tag 'mips_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux") this resulted in the following build errors: nds32le-linux-ld: kernel/time/timekeeping.o: in function `timekeeping_init': timekeeping.c:(.init.text+0x140): undefined reference to `__ashldi3' nds32le-linux-ld: timekeeping.c:(.init.text+0x144): undefined reference to `__ashldi3' nds32le-linux-ld: timekeeping.c:(.init.text+0x17e): undefined reference to `__lshrdi3' nds32le-linux-ld: timekeeping.c:(.init.text+0x182): undefined reference to `__lshrdi3' nds32le-linux-ld: drivers/clocksource/mmio.o: in function `clocksource_mmio_init': mmio.c:(.init.text+0x54): undefined reference to `__lshrdi3' nds32le-linux-ld: mmio.c:(.init.text+0x58): undefined reference to `__lshrdi3' Rename all 6 selects in nds32 and adjust the ordering accordingly to be alphabetical. Fixes: 763f96944c95 ("Merge tag 'mips_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux") Signed-off-by: Guenter Roeck <[email protected]> [[email protected]: Rename all 6 symbols, sort, update commit message] Signed-off-by: James Hogan <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Vincent Chen <[email protected]> Cc: Matt Redfearn <[email protected]> Cc: Palmer Dabbelt <[email protected]> Acked-by: Greentime Hu <[email protected]> Signed-off-by: Greentime Hu <[email protected]>
2018-06-19nds32: define __NDS32_E[BL]__ for sparseLuc Van Oostenryck1-0/+2
nds32 depends on the macros '__NDS32_E[BL]__' to correctly select or define endian-specific macros, structures or pieces of code. These macros are predefined by the compiler but sparse knows nothing about them and thus may pre-process files differently from what GCC would. Fix this by adding '-D__NDS32_E[BL]__' to CHECKFLAGS. Signed-off-by: Luc Van Oostenryck <[email protected]> Acked-by: Greentime Hu <[email protected]> Signed-off-by: Greentime Hu <[email protected]>
2018-06-19ARM: dts: imx6sx: fix irq for pcie bridgeOleksij Rempel1-1/+1
Use the correct IRQ line for the MSI controller in the PCIe host controller. Apparently a different IRQ line is used compared to other i.MX6 variants. Without this change MSI IRQs aren't properly propagated to the upstream interrupt controller. Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Lucas Stach <[email protected]> Fixes: b1d17f68e5c5 ("ARM: dts: imx: add initial imx6sx device tree source") Signed-off-by: Shawn Guo <[email protected]>
2018-06-19Merge branch 'for-linus' of ↵Linus Torvalds1-30/+32
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "common I/O layer - Fix bit-fields crossing storage-unit boundaries in css_general_char dasd driver - Avoid a sparse warning in regard to the queue lock - Allocate the struct dasd_ccw_req as per request data. Only for internal I/O is the structure allocated separately - Remove the unused function dasd_kmalloc_set_cda - Save a few bytes in struct dasd_ccw_req by reordering fields - Convert remaining users of dasd_kmalloc_request to dasd_smalloc_request and remove the now unused function vfio/ccw - Refactor and improve pfn_array_alloc_pin/pfn_array_pin - Add a new tracepoint for failed vfio/ccw requests - Add a CCW translation improvement to accept more requests as valid - Bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: only use preallocated requests s390/dasd: reshuffle struct dasd_ccw_req s390/dasd: remove dasd_kmalloc_set_cda s390/dasd: move dasd_ccw_req to per request data s390/dasd: simplify locking in process_final_queue s390/cio: sanitize css_general_characteristics definition vfio: ccw: add tracepoints for interesting error paths vfio: ccw: set ccw->cda to NULL defensively vfio: ccw: refactor and improve pfn_array_alloc_pin() vfio: ccw: shorten kernel doc description for pfn_array_pin() vfio: ccw: push down unsupported IDA check vfio: ccw: fix error return in vfio_ccw_sch_event s390/archrandom: Rework arch random implementation. s390/net: add pnetid support
2018-06-18MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratumTokunori Ikegami2-0/+9
The erratum and workaround are described by BCM5300X-ES300-RDS.pdf as below. R10: PCIe Transactions Periodically Fail Description: The BCM5300X PCIe does not maintain transaction ordering. This may cause PCIe transaction failure. Fix Comment: Add a dummy PCIe configuration read after a PCIe configuration write to ensure PCIe configuration access ordering. Set ES bit of CP0 configu7 register to enable sync function so that the sync instruction is functional. Resolution: hndpci.c: extpci_write_config() hndmips.c: si_mips_init() mipsinc.h CONF7_ES This is fixed by the CFE MIPS bcmsi chipset driver also for BCM47XX. Also the dummy PCIe configuration read is already implemented in the Linux BCMA driver. Enable ExternalSync in Config7 when CONFIG_BCMA_DRIVER_PCI_HOSTMODE=y too so that the sync instruction is externalised. Signed-off-by: Tokunori Ikegami <[email protected]> Reviewed-by: Paul Burton <[email protected]> Acked-by: Hauke Mehrtens <[email protected]> Cc: Chris Packham <[email protected]> Cc: Rafał Miłecki <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/19461/ Signed-off-by: James Hogan <[email protected]>