aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2020-10-19Merge tag 'riscv-for-linus-5.10-mw0' of ↵Linus Torvalds25-239/+936
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "A handful of cleanups and new features: - A handful of cleanups for our page fault handling - Improvements to how we fill out cacheinfo - Support for EFI-based systems" * tag 'riscv-for-linus-5.10-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (22 commits) RISC-V: Add page table dump support for uefi RISC-V: Add EFI runtime services RISC-V: Add EFI stub support. RISC-V: Add PE/COFF header for EFI stub RISC-V: Implement late mapping page table allocation functions RISC-V: Add early ioremap support RISC-V: Move DT mapping outof fixmap RISC-V: Fix duplicate included thread_info.h riscv/mm/fault: Set FAULT_FLAG_INSTRUCTION flag in do_page_fault() riscv/mm/fault: Fix inline placement in vmalloc_fault() declaration riscv: Add cache information in AUX vector riscv: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO riscv: Set more data to cacheinfo riscv/mm/fault: Move access error check to function riscv/mm/fault: Move FAULT_FLAG_WRITE handling in do_page_fault() riscv/mm/fault: Simplify mm_fault_error() riscv/mm/fault: Move fault error handling to mm_fault_error() riscv/mm/fault: Simplify fault error handling riscv/mm/fault: Move vmalloc fault handling to vmalloc_fault() riscv/mm/fault: Move bad area handling to bad_area() ...
2020-10-19Merge tag 'm68knommu-for-v5.10' of ↵Linus Torvalds6-559/+402
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu updates from Greg Ungerer: "A collection of fixes for 5.10: - switch to using asm-generic uaccess code - fix sparse warnings in signal code - fix compilation of ColdFire MMC support - support sysrq in ColdFire serial driver" * tag 'm68knommu-for-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: serial: mcf: add sysrq capability m68knommu: include SDHC support only when hardware has it m68knommu: fix sparse warnings in signal code m68knommu: switch to using asm-generic/uaccess.h
2020-10-20powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulationMichael Neuling1-1/+1
__get_user_atomic_128_aligned() stores to kaddr using stvx which is a VMX store instruction, hence kaddr must be 16 byte aligned otherwise the store won't occur as expected. Unfortunately when we call __get_user_atomic_128_aligned() in p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't guaranteed to be 16B aligned. This means that the write to vbuf in __get_user_atomic_128_aligned() has the bottom bits of the address truncated. This results in other local variables being overwritten. Also vbuf will not contain the correct data which results in the userspace emulation being wrong and hence undetected user data corruption. In the past we've been mostly lucky as vbuf has ended up aligned but this is fragile and isn't always true. CONFIG_STACKPROTECTOR in particular can change the stack arrangement enough that our luck runs out. This issue only occurs on POWER9 Nimbus <= DD2.1 bare metal. The fix is to align vbuf to a 16 byte boundary. Fixes: 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector CI load issue") Cc: [email protected] # v4.15+ Signed-off-by: Michael Neuling <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19x86/boot/64: Explicitly map boot_params and command lineArvind Sankar2-3/+23
Commits ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table") 8570978ea030 ("x86/boot/compressed/64: Don't pre-map memory in KASLR code") set up a new page table in the decompressor stub, but without explicit mappings for boot_params and the kernel command line, relying on the #PF handler instead. This is fragile, as boot_params and the command line mappings are required for the main kernel. If EARLY_PRINTK and RANDOMIZE_BASE are disabled, a QEMU/OVMF boot never accesses the command line in the decompressor stub, and so it never gets mapped. The main kernel accesses it from the identity mapping if AMD_MEM_ENCRYPT is enabled, and will crash. Fix this by adding back the explicit mapping of boot_params and the command line. Note: the changes also removed the explicit mapping of the main kernel, with the result that .bss and .brk may not be in the identity mapping, but those don't get accessed by the main kernel before it switches to its own page tables. [ bp: Pass boot_params with a MOV %rsp... instead of PUSH/POP. Use block formatting for the comment. ] Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Joerg Roedel <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-19KVM: VMX: Fix x2APIC MSR intercept handling on !APICV platformsPeter Xu1-2/+3
Fix an inverted flag for intercepting x2APIC MSRs and intercept writes by default, even when APICV is enabled. Fixes: 3eb900173c71 ("KVM: x86: VMX: Prevent MSR passthrough when MSR access is denied") Co-developed-by: Peter Xu <[email protected]> [sean: added changelog] Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-10-19powerpc/powernv/dump: Handle multiple writes to ack attributeVasant Hegde1-3/+8
Even though we use self removing sysfs helper, we still need to make sure we do the final kobject delete conditionally. sysfs_remove_file_self() will handle parallel calls to remove the sysfs attribute file and returns true only in the caller that removed the attribute file. The other parallel callers are returned false. Do the final kobject delete checking the return value of sysfs_remove_file_self(). Signed-off-by: Vasant Hegde <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19powerpc/powernv/dump: Fix race while processing OPAL dumpVasant Hegde1-12/+29
Every dump reported by OPAL is exported to userspace through a sysfs interface and notified using kobject_uevent(). The userspace daemon (opal_errd) then reads the dump and acknowledges that the dump is saved safely to disk. Once acknowledged the kernel removes the respective sysfs file entry causing respective resources to be released including kobject. However it's possible the userspace daemon may already be scanning dump entries when a new sysfs dump entry is created by the kernel. User daemon may read this new entry and ack it even before kernel can notify userspace about it through kobject_uevent() call. If that happens then we have a potential race between dump_ack_store->kobject_put() and kobject_uevent which can lead to use-after-free of a kernfs object resulting in a kernel crash. This patch fixes this race by protecting the sysfs file creation/notification by holding a reference count on kobject until we safely send kobject_uevent(). The function create_dump_obj() returns the dump object which if used by caller function will end up in use-after-free problem again. However, the return value of create_dump_obj() function isn't being used today and there is no need as well. Hence change it to return void to make this fix complete. Fixes: c7e64b9ce04a ("powerpc/powernv Platform dump interface") Signed-off-by: Vasant Hegde <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19x86/head/64: Disable stack protection for head$(BITS).oArvind Sankar1-0/+2
On 64-bit, the startup_64_setup_env() function added in 866b556efa12 ("x86/head/64: Install startup GDT") has stack protection enabled because of set_bringup_idt_handler(). This happens when CONFIG_STACKPROTECTOR_STRONG is enabled. It also currently needs CONFIG_AMD_MEM_ENCRYPT enabled because then set_bringup_idt_handler() is not an empty stub but that might change in the future, when the other vendor adds their similar technology. At this point, %gs is not yet initialized, and this doesn't cause a crash only because the #PF handler from the decompressor stub is still installed and handles the page fault. Disable stack protection for the whole file, and do it on 32-bit as well to avoid surprises. [ bp: Extend commit message with the exact explanation how it happens. ] Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Joerg Roedel <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-19x86/boot/64: Initialize 5-level paging variables earlierArvind Sankar3-14/+16
Commit ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table") started using a new set of pagetables even without KASLR. After that commit, initialize_identity_maps() is called before the 5-level paging variables are setup in choose_random_location(), which will not work if 5-level paging is actually enabled. Fix this by moving the initialization of __pgtable_l5_enabled, pgdir_shift and ptrs_per_p4d into cleanup_trampoline(), which is called immediately after the finalization of whether the kernel is executing with 4- or 5-level paging. This will be earlier than anything that might require those variables, and keeps the 4- vs 5-level paging code all in one place. Fixes: ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table") Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Joerg Roedel <[email protected]> Tested-by: Joerg Roedel <[email protected]> Tested-by: Kirill A. Shutemov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-19powerpc/smp: Use GFP_ATOMIC while allocating tmp maskSrikar Dronamraju1-26/+31
Qian Cai reported a regression where CPU Hotplug fails with the latest powerpc/next BUG: sleeping function called from invalid context at mm/slab.h:494 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/88 no locks held by swapper/88/0. irq event stamp: 18074448 hardirqs last enabled at (18074447): [<c0000000001a2a7c>] tick_nohz_idle_enter+0x9c/0x110 hardirqs last disabled at (18074448): [<c000000000106798>] do_idle+0x138/0x3b0 do_idle at kernel/sched/idle.c:253 (discriminator 1) softirqs last enabled at (18074440): [<c0000000000bbec4>] irq_enter_rcu+0x94/0xa0 softirqs last disabled at (18074439): [<c0000000000bbea0>] irq_enter_rcu+0x70/0xa0 CPU: 88 PID: 0 Comm: swapper/88 Tainted: G W 5.9.0-rc8-next-20201007 #1 Call Trace: [c00020000a4bfcf0] [c000000000649e98] dump_stack+0xec/0x144 (unreliable) [c00020000a4bfd30] [c0000000000f6c34] ___might_sleep+0x2f4/0x310 [c00020000a4bfdb0] [c000000000354f94] slab_pre_alloc_hook.constprop.82+0x124/0x190 [c00020000a4bfe00] [c00000000035e9e8] __kmalloc_node+0x88/0x3a0 slab_alloc_node at mm/slub.c:2817 (inlined by) __kmalloc_node at mm/slub.c:4013 [c00020000a4bfe80] [c0000000006494d8] alloc_cpumask_var_node+0x38/0x80 kmalloc_node at include/linux/slab.h:577 (inlined by) alloc_cpumask_var_node at lib/cpumask.c:116 [c00020000a4bfef0] [c00000000003eedc] start_secondary+0x27c/0x800 update_mask_by_l2 at arch/powerpc/kernel/smp.c:1267 (inlined by) add_cpu_to_masks at arch/powerpc/kernel/smp.c:1387 (inlined by) start_secondary at arch/powerpc/kernel/smp.c:1420 [c00020000a4bff90] [c00000000000c468] start_secondary_resume+0x10/0x14 Allocating a temporary mask while performing a CPU Hotplug operation with CONFIG_CPUMASK_OFFSTACK enabled, leads to calling a sleepable function from a atomic context. Fix this by allocating the temporary mask with GFP_ATOMIC flag. Also instead of having to allocate twice, allocate the mask in the caller so that we only have to allocate once. If the allocation fails, assume the mask to be same as sibling mask, which will make the scheduler to drop this domain for this CPU. Fixes: 70a94089d7f7 ("powerpc/smp: Optimize update_coregroup_mask") Fixes: 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2") Reported-by: Qian Cai <[email protected]> Signed-off-by: Srikar Dronamraju <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-19powerpc/smp: Remove unnecessary variableSrikar Dronamraju1-9/+4
Commit 3ab33d6dc3e9 ("powerpc/smp: Optimize update_mask_by_l2") introduced submask_fn in update_mask_by_l2 to track the right submask. However commit f6606cfdfbcd ("powerpc/smp: Dont assume l2-cache to be superset of sibling") introduced sibling_mask in update_mask_by_l2 to track the same submask. Remove sibling_mask in favour of submask_fn. Signed-off-by: Srikar Dronamraju <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-18Merge tag 'core-rcu-2020-10-12' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: - Debugging for smp_call_function() - RT raw/non-raw lock ordering fixes - Strict grace periods for KASAN - New smp_call_function() torture test - Torture-test updates - Documentation updates - Miscellaneous fixes [ This doesn't actually pull the tag - I've dropped the last merge from the RCU branch due to questions about the series. - Linus ] * tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits) smp: Make symbol 'csd_bug_count' static kernel/smp: Provide CSD lock timeout diagnostics smp: Add source and destination CPUs to __call_single_data rcu: Shrink each possible cpu krcp rcu/segcblist: Prevent useless GP start if no CBs to accelerate torture: Add gdb support rcutorture: Allow pointer leaks to test diagnostic code rcutorture: Hoist OOM registry up one level refperf: Avoid null pointer dereference when buf fails to allocate rcutorture: Properly synchronize with OOM notifier rcutorture: Properly set rcu_fwds for OOM handling torture: Add kvm.sh --help and update help message rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05 torture: Update initrd documentation rcutorture: Replace HTTP links with HTTPS ones locktorture: Make function torture_percpu_rwsem_init() static torture: document --allcpus argument added to the kvm.sh script rcutorture: Output number of elapsed grace periods rcutorture: Remove KCSAN stubs rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp() ...
2020-10-18Merge branch 'akpm' (patches from Andrew)Linus Torvalds21-10/+40
Merge yet more updates from Andrew Morton: "Subsystems affected by this patch series: mm (memcg, migration, pagemap, gup, madvise, vmalloc), ia64, and misc" * emailed patches from Andrew Morton <[email protected]>: (31 commits) mm: remove duplicate include statement in mmu.c mm: remove the filename in the top of file comment in vmalloc.c mm: cleanup the gfp_mask handling in __vmalloc_area_node mm: remove alloc_vm_area x86/xen: open code alloc_vm_area in arch_gnttab_valloc xen/xenbus: use apply_to_page_range directly in xenbus_map_ring_pv drm/i915: use vmap in i915_gem_object_map drm/i915: stop using kmap in i915_gem_object_map drm/i915: use vmap in shmem_pin_map zsmalloc: switch from alloc_vm_area to get_vm_area mm: allow a NULL fn callback in apply_to_page_range mm: add a vmap_pfn function mm: add a VM_MAP_PUT_PAGES flag for vmap mm: update the documentation for vfree mm/madvise: introduce process_madvise() syscall: an external memory hinting API pid: move pidfd_get_pid() to pid.c mm/madvise: pass mm to do_madvise selftests/vm: 10x speedup for hmm-tests binfmt_elf: take the mmap lock around find_extend_vma() mm/gup_benchmark: take the mmap lock around GUP ...
2020-10-18Merge tag 'for-linus-5.10-rc1' of ↵Linus Torvalds14-61/+91
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: - Improve support for non-glibc systems - Vector: Add support for scripting and dynamic tap devices - Various fixes for the vector networking driver - Various fixes for time travel mode * tag 'for-linus-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: vector: Add dynamic tap interfaces and scripting um: Clean up stacktrace dump um: Fix incorrect assumptions about max pid length um: Remove dead usage of TIF_IA32 um: Remove redundant NULL check um: change sigio_spinlock to a mutex um: time-travel: Return the sequence number in ACK messages um: time-travel: Fix IRQ handling in time_travel_handle_message() um: Allow static linking for non-glibc implementations um: Some fixes to build UML with musl um: vector: Use GFP_ATOMIC under spin lock um: Fix null pointer dereference in vector_user_bpf
2020-10-18mm: remove duplicate include statement in mmu.cTian Tao1-1/+0
asm/sections.h is included more than once, Remove the one that isn't necessary. Signed-off-by: Tian Tao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-18x86/xen: open code alloc_vm_area in arch_gnttab_vallocChristoph Hellwig1-7/+20
Replace the last call to alloc_vm_area with an open coded version using an iterator in struct gnttab_vm_area instead of the triple indirection magic in alloc_vm_area. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Matthew Auld <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-18mm/madvise: introduce process_madvise() syscall: an external memory hinting APIMinchan Kim18-1/+19
There is usecase that System Management Software(SMS) want to give a memory hint like MADV_[COLD|PAGEEOUT] to other processes and in the case of Android, it is the ActivityManagerService. The information required to make the reclaim decision is not known to the app. Instead, it is known to the centralized userspace daemon(ActivityManagerService), and that daemon must be able to initiate reclaim on its own without any app involvement. To solve the issue, this patch introduces a new syscall process_madvise(2). It uses pidfd of an external process to give the hint. It also supports vector address range because Android app has thousands of vmas due to zygote so it's totally waste of CPU and power if we should call the syscall one by one for each vma.(With testing 2000-vma syscall vs 1-vector syscall, it showed 15% performance improvement. I think it would be bigger in real practice because the testing ran very cache friendly environment). Another potential use case for the vector range is to amortize the cost ofTLB shootdowns for multiple ranges when using MADV_DONTNEED; this could benefit users like TCP receive zerocopy and malloc implementations. In future, we could find more usecases for other advises so let's make it happens as API since we introduce a new syscall at this moment. With that, existing madvise(2) user could replace it with process_madvise(2) with their own pid if they want to have batch address ranges support feature. ince it could affect other process's address range, only privileged process(PTRACE_MODE_ATTACH_FSCREDS) or something else(e.g., being the same UID) gives it the right to ptrace the process could use it successfully. The flag argument is reserved for future use if we need to extend the API. I think supporting all hints madvise has/will supported/support to process_madvise is rather risky. Because we are not sure all hints make sense from external process and implementation for the hint may rely on the caller being in the current context so it could be error-prone. Thus, I just limited hints as MADV_[COLD|PAGEOUT] in this patch. If someone want to add other hints, we could hear the usecase and review it for each hint. It's safer for maintenance rather than introducing a buggy syscall but hard to fix it later. So finally, the API is as follows, ssize_t process_madvise(int pidfd, const struct iovec *iovec, unsigned long vlen, int advice, unsigned int flags); DESCRIPTION The process_madvise() system call is used to give advice or directions to the kernel about the address ranges from external process as well as local process. It provides the advice to address ranges of process described by iovec and vlen. The goal of such advice is to improve system or application performance. The pidfd selects the process referred to by the PID file descriptor specified in pidfd. (See pidofd_open(2) for further information) The pointer iovec points to an array of iovec structures, defined in <sys/uio.h> as: struct iovec { void *iov_base; /* starting address */ size_t iov_len; /* number of bytes to be advised */ }; The iovec describes address ranges beginning at address(iov_base) and with size length of bytes(iov_len). The vlen represents the number of elements in iovec. The advice is indicated in the advice argument, which is one of the following at this moment if the target process specified by pidfd is external. MADV_COLD MADV_PAGEOUT Permission to provide a hint to external process is governed by a ptrace access mode PTRACE_MODE_ATTACH_FSCREDS check; see ptrace(2). The process_madvise supports every advice madvise(2) has if target process is in same thread group with calling process so user could use process_madvise(2) to extend existing madvise(2) to support vector address ranges. RETURN VALUE On success, process_madvise() returns the number of bytes advised. This return value may be less than the total number of requested bytes, if an error occurred. The caller should check return value to determine whether a partial advice occurred. FAQ: Q.1 - Why does any external entity have better knowledge? Quote from Sandeep "For Android, every application (including the special SystemServer) are forked from Zygote. The reason of course is to share as many libraries and classes between the two as possible to benefit from the preloading during boot. After applications start, (almost) all of the APIs end up calling into this SystemServer process over IPC (binder) and back to the application. In a fully running system, the SystemServer monitors every single process periodically to calculate their PSS / RSS and also decides which process is "important" to the user for interactivity. So, because of how these processes start _and_ the fact that the SystemServer is looping to monitor each process, it does tend to *know* which address range of the application is not used / useful. Besides, we can never rely on applications to clean things up themselves. We've had the "hey app1, the system is low on memory, please trim your memory usage down" notifications for a long time[1]. They rely on applications honoring the broadcasts and very few do. So, if we want to avoid the inevitable killing of the application and restarting it, some way to be able to tell the OS about unimportant memory in these applications will be useful. - ssp Q.2 - How to guarantee the race(i.e., object validation) between when giving a hint from an external process and get the hint from the target process? process_madvise operates on the target process's address space as it exists at the instant that process_madvise is called. If the space target process can run between the time the process_madvise process inspects the target process address space and the time that process_madvise is actually called, process_madvise may operate on memory regions that the calling process does not expect. It's the responsibility of the process calling process_madvise to close this race condition. For example, the calling process can suspend the target process with ptrace, SIGSTOP, or the freezer cgroup so that it doesn't have an opportunity to change its own address space before process_madvise is called. Another option is to operate on memory regions that the caller knows a priori will be unchanged in the target process. Yet another option is to accept the race for certain process_madvise calls after reasoning that mistargeting will do no harm. The suggested API itself does not provide synchronization. It also apply other APIs like move_pages, process_vm_write. The race isn't really a problem though. Why is it so wrong to require that callers do their own synchronization in some manner? Nobody objects to write(2) merely because it's possible for two processes to open the same file and clobber each other's writes --- instead, we tell people to use flock or something. Think about mmap. It never guarantees newly allocated address space is still valid when the user tries to access it because other threads could unmap the memory right before. That's where we need synchronization by using other API or design from userside. It shouldn't be part of API itself. If someone needs more fine-grained synchronization rather than process level, there were two ideas suggested - cookie[2] and anon-fd[3]. Both are applicable via using last reserved argument of the API but I don't think it's necessary right now since we have already ways to prevent the race so don't want to add additional complexity with more fine-grained optimization model. To make the API extend, it reserved an unsigned long as last argument so we could support it in future if someone really needs it. Q.3 - Why doesn't ptrace work? Injecting an madvise in the target process using ptrace would not work for us because such injected madvise would have to be executed by the target process, which means that process would have to be runnable and that creates the risk of the abovementioned race and hinting a wrong VMA. Furthermore, we want to act the hint in caller's context, not the callee's, because the callee is usually limited in cpuset/cgroups or even freezed state so they can't act by themselves quick enough, which causes more thrashing/kill. It doesn't work if the target process are ptraced(e.g., strace, debugger, minidump) because a process can have at most one ptracer. [1] https://developer.android.com/topic/performance/memory" [2] process_getinfo for getting the cookie which is updated whenever vma of process address layout are changed - Daniel Colascione - https://lore.kernel.org/lkml/[email protected]/T/#m7694416fd179b2066a2c62b5b139b14e3894e224 [3] anonymous fd which is used for the object(i.e., address range) validation - Michal Hocko - https://lore.kernel.org/lkml/[email protected]/ [[email protected]: fix process_madvise build break for arm64] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: fix build error for mips of process_madvise] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: fix patch ordering issue] [[email protected]: fix arm64 whoops] [[email protected]: make process_madvise() vlen arg have type size_t, per Florian] [[email protected]: fix i386 build] [[email protected]: fix syscall numbering] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: madvise.c needs compat.h] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix mips build] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: remove duplicate header which is included twice] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: do not use helper functions for process_madvise] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: pidfd_get_pid() gained an argument] [[email protected]: fix up for "iov_iter: transparently handle compat iovecs in import_iovec"] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Minchan Kim <[email protected]> Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Suren Baghdasaryan <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Brian Geffon <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Daniel Colascione <[email protected]> Cc: Jann Horn <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: John Dias <[email protected]> Cc: Kirill Tkhai <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Oleksandr Natalenko <[email protected]> Cc: Sandeep Patil <[email protected]> Cc: SeongJae Park <[email protected]> Cc: SeongJae Park <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Sonny Rao <[email protected]> Cc: Tim Murray <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Florian Weimer <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-18ia64: fix build error with !COREDUMPKrzysztof Kozlowski1-1/+1
Fix linkage error when CONFIG_BINFMT_ELF is selected but CONFIG_COREDUMP is not: ia64-linux-ld: arch/ia64/kernel/elfcore.o: in function `elf_core_write_extra_phdrs': elfcore.c:(.text+0x172): undefined reference to `dump_emit' ia64-linux-ld: arch/ia64/kernel/elfcore.o: in function `elf_core_write_extra_data': elfcore.c:(.text+0x2b2): undefined reference to `dump_emit' Fixes: 1fcccbac89f5 ("elf coredump: replace ELF_CORE_EXTRA_* macros by functions") Reported-by: kernel test robot <[email protected]> Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Tony Luck <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-17task_work: cleanup notification modesJens Axboe2-2/+2
A previous commit changed the notification mode from true/false to an int, allowing notify-no, notify-yes, or signal-notify. This was backwards compatible in the sense that any existing true/false user would translate to either 0 (on notification sent) or 1, the latter which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2. Clean this up properly, and define a proper enum for the notification mode. Now we have: - TWA_NONE. This is 0, same as before the original change, meaning no notification requested. - TWA_RESUME. This is 1, same as before the original change, meaning that we use TIF_NOTIFY_RESUME. - TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the notification. Clean up all the callers, switching their 0/1/false/true to using the appropriate TWA_* mode for notifications. Fixes: e91b48162332 ("task_work: teach task_work_add() to do signal_wake_up()") Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-10-17tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()Jens Axboe24-40/+15
All the callers currently do this, clean it up and move the clearing into tracehook_notify_resume() instead. Reviewed-by: Oleg Nesterov <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-10-17Merge tag 'mtd/for-5.10' of ↵Linus Torvalds25-25/+26
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD updates from Richard Weinberger: "NAND core changes: - Drop useless 'depends on' in Kconfig - Add an extra level in the Kconfig hierarchy - Trivial spellings - Dynamic allocation of the interface configurations - Dropping the default ONFI timing mode - Various cleanup (types, structures, naming, comments) - Hide the chip->data_interface indirection - Add the generic rb-gpios property - Add the ->choose_interface_config() hook - Introduce nand_choose_best_sdr_timings() - Use default values for tPROG_max and tBERS_max - Avoid redefining tR_max and tCCS_min - Add a helper to find the closest ONFI mode - bcm63xx MTD parsers: simplify CFE detection Raw NAND controller drivers changes: - fsl-upm: Deprecation of specific DT properties - fsl_upm: Driver rework and cleanup in favor of ->exec_op() - Ingenic: Cleanup ARRAY_SIZE() vs sizeof() use - brcmnand: ECC error handling on EDU transfers - brcmnand: Don't default to EDU transfers - qcom: Set BAM mode only if not set already - qcom: Avoid write to unavailable register - gpio: Driver rework in favor of ->exec_op() - tango: ->exec_op() conversion - mtk: ->exec_op() conversion Raw NAND chip drivers changes: - toshiba: Implement ->choose_interface_config() for TH58NVG2S3HBAI4 - toshiba: Implement ->choose_interface_config() for TC58NVG0S3E - toshiba: Implement ->choose_interface_config() for TC58TEG5DCLTA00 - hynix: Implement ->choose_interface_config() for H27UCG8T2ATR-BC HyperBus changes: - DMA support for TI's AM654 HyperBus controller driver. - HyperBus frontend driver for Renesas RPC-IF driver. SPI NOR core changes: - Support for Winbond w25q64jwm flash - Enable 4K sector support for mx25l12805d SPI NOR controller drivers changes: - intel-spi Add Alder Lake-S PCI ID MTD Core changes: - mtdoops: Don't run panic write twice - mtdconcat: Correctly handle panic write - Use DEFINE_SHOW_ATTRIBUTE" * tag 'mtd/for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (76 commits) mtd: hyperbus: Fix build failure when only RPCIF_HYPERBUS is enabled mtd: hyperbus: add Renesas RPC-IF driver Revert "mtd: spi-nor: Prefer asynchronous probe" mtd: parsers: bcm63xx: Do not make it modular mtd: spear_smi: Enable compile testing mtd: maps: vmu-flash: fix typos for struct memcard mtd: physmap: Add Baikal-T1 physically mapped ROM support mtd: maps: vmu-flash: simplify the return expression of probe_maple_vmu mtd: onenand: simplify the return expression of onenand_transfer_auto_oob mtd: rawnand: cadence: remove a redundant dev_err call mtd: rawnand: ams-delta: Fix non-OF build warning mtd: rawnand: Don't overwrite the error code from nand_set_ecc_soft_ops() mtd: rawnand: Introduce nand_set_ecc_on_host_ops() mtd: rawnand: atmel: Check return values for nand_read_data_op mtd: rawnand: vf610: Remove unused function vf610_nfc_transfer_size() mtd: rawnand: qcom: Simplify with dev_err_probe() mtd: rawnand: marvell: Fix and update kerneldoc mtd: rawnand: marvell: Simplify with dev_err_probe() mtd: rawnand: gpmi: Simplify with dev_err_probe() mtd: rawnand: atmel: Simplify with dev_err_probe() ...
2020-10-16Merge tag 'mips_5.10' of ↵Linus Torvalds158-3672/+1700
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - removed support for PNX833x alias NXT_STB22x - included Ingenic SoC support into generic MIPS kernels - added support for new Ingenic SoCs - converted workaround selection to use Kconfig - replaced old boot mem functions by memblock_* - enabled COP2 usage in kernel for Loongson64 to make use of 16byte load/stores possible - cleanups and fixes * tag 'mips_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (92 commits) MIPS: DEC: Restore bootmem reservation for firmware working memory area MIPS: dec: fix section mismatch bcm963xx_tag.h: fix duplicated word mips: ralink: enable zboot support MIPS: ingenic: Remove CPU_SUPPORTS_HUGEPAGES MIPS: cpu-probe: remove MIPS_CPU_BP_GHIST option bit MIPS: cpu-probe: introduce exclusive R3k CPU probe MIPS: cpu-probe: move fpu probing/handling into its own file MIPS: replace add_memory_region with memblock MIPS: Loongson64: Clean up numa.c MIPS: Loongson64: Select SMP in Kconfig to avoid build error mips: octeon: Add Ubiquiti E200 and E220 boards MIPS: SGI-IP28: disable use of ll/sc in kernel MIPS: tx49xx: move tx4939_add_memory_regions into only user MIPS: pgtable: Remove used PAGE_USERIO define MIPS: alchemy: Share prom_init implementation MIPS: alchemy: Fix build breakage, if TOUCHSCREEN_WM97XX is disabled MIPS: process: include exec.h header in process.c MIPS: process: Add prototype for function arch_dup_task_struct MIPS: idle: Add prototype for function check_wait ...
2020-10-16Merge tag 's390-5.10-1' of ↵Linus Torvalds87-1202/+1745
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Remove address space overrides using set_fs() - Convert to generic vDSO - Convert to generic page table dumper - Add ARCH_HAS_DEBUG_WX support - Add leap seconds handling support - Add NVMe firmware-assisted kernel dump support - Extend NVMe boot support with memory clearing control and addition of kernel parameters - AP bus and zcrypt api code rework. Add adapter configure/deconfigure interface. Extend debug features. Add failure injection support - Add ECC secure private keys support - Add KASan support for running protected virtualization host with 4-level paging - Utilize destroy page ultravisor call to speed up secure guests shutdown - Implement ioremap_wc() and ioremap_prot() with MIO in PCI code - Various checksum improvements - Other small various fixes and improvements all over the code * tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (85 commits) s390/uaccess: fix indentation s390/uaccess: add default cases for __put_user_fn()/__get_user_fn() s390/zcrypt: fix wrong format specifications s390/kprobes: move insn_page to text segment s390/sie: fix typo in SIGP code description s390/lib: fix kernel doc for memcmp() s390/zcrypt: Introduce Failure Injection feature s390/zcrypt: move ap_msg param one level up the call chain s390/ap/zcrypt: revisit ap and zcrypt error handling s390/ap: Support AP card SCLP config and deconfig operations s390/sclp: Add support for SCLP AP adapter config/deconfig s390/ap: add card/queue deconfig state s390/ap: add error response code field for ap queue devices s390/ap: split ap queue state machine state from device state s390/zcrypt: New config switch CONFIG_ZCRYPT_DEBUG s390/zcrypt: introduce msg tracking in zcrypt functions s390/startup: correct early pgm check info formatting s390: remove orphaned extern variables declarations s390/kasan: make sure int handler always run with DAT on s390/ipl: add support to control memory clearing for nvme re-IPL ...
2020-10-16Merge tag 'powerpc-5.10-1' of ↵Linus Torvalds171-2361/+2660
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting it for powerpc, as well as a related fix for sparc. - Remove support for PowerPC 601. - Some fixes for watchpoints & addition of a new ptrace flag for detecting ISA v3.1 (Power10) watchpoint features. - A fix for kernels using 4K pages and the hash MMU on bare metal Power9 systems with > 16TB of RAM, or RAM on the 2nd node. - A basic idle driver for shallow stop states on Power10. - Tweaks to our sched domains code to better inform the scheduler about the hardware topology on Power9/10, where two SMT4 cores can be presented by firmware as an SMT8 core. - A series doing further reworks & cleanups of our EEH code. - Addition of a filter for RTAS (firmware) calls done via sys_rtas(), to prevent root from overwriting kernel memory. - Other smaller features, fixes & cleanups. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero, Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha, Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt, Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang Yingliang, zhengbin. * tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits) Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed" selftests/powerpc: Fix eeh-basic.sh exit codes cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier powerpc/time: Make get_tb() common to PPC32 and PPC64 powerpc/time: Make get_tbl() common to PPC32 and PPC64 powerpc/time: Remove get_tbu() powerpc/time: Avoid using get_tbl() and get_tbu() internally powerpc/time: Make mftb() common to PPC32 and PPC64 powerpc/time: Rename mftbl() to mftb() powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S powerpc/32s: Rename head_32.S to head_book3s_32.S powerpc/32s: Setup the early hash table at all time. powerpc/time: Remove ifdef in get_dec() and set_dec() powerpc: Remove get_tb_or_rtc() powerpc: Remove __USE_RTC() powerpc: Tidy up a bit after removal of PowerPC 601. powerpc: Remove support for PowerPC 601 powerpc: Remove PowerPC 601 powerpc: Drop SYNC_601() ISYNC_601() and SYNC() powerpc: Remove CONFIG_PPC601_SYNC_FIX ...
2020-10-16Merge branch 'akpm' (patches from Andrew)Linus Torvalds6-20/+27
Merge more updates from Andrew Morton: "155 patches. Subsystems affected by this patch series: mm (dax, debug, thp, readahead, page-poison, util, memory-hotplug, zram, cleanups), misc, core-kernel, get_maintainer, MAINTAINERS, lib, bitops, checkpatch, binfmt, ramfs, autofs, nilfs, rapidio, panic, relay, kgdb, ubsan, romfs, and fault-injection" * emailed patches from Andrew Morton <[email protected]>: (155 commits) lib, uaccess: add failure injection to usercopy functions lib, include/linux: add usercopy failure capability ROMFS: support inode blocks calculation ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang sched.h: drop in_ubsan field when UBSAN is in trap mode scripts/gdb/tasks: add headers and improve spacing format scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command kernel/relay.c: drop unneeded initialization panic: dump registers on panic_on_warn rapidio: fix the missed put_device() for rio_mport_add_riodev rapidio: fix error handling path nilfs2: fix some kernel-doc warnings for nilfs2 autofs: harden ioctl table ramfs: fix nommu mmap with gaps in the page cache mm: remove the now-unnecessary mmget_still_valid() hack mm/gup: take mmap_lock in get_dump_page() binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot coredump: rework elf/elf_fdpic vma_dump_size() into common helper coredump: refactor page range dumping into common helper coredump: let dump_emit() bail out on short writes ...
2020-10-16mm/memory_hotplug: prepare passing flags to add_memory() and friendsDavid Hildenbrand2-2/+2
We soon want to pass flags, e.g., to mark added System RAM resources. mergeable. Prepare for that. This patch is based on a similar patch by Oscar Salvador: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Juergen Gross <[email protected]> # Xen related part Reviewed-by: Pankaj Gupta <[email protected]> Acked-by: Wei Liu <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Dan Williams <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Baoquan He <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Len Brown <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Jiang <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Wei Liu <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Jason Wang <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: "Oliver O'Halloran" <[email protected]> Cc: Pingfan Liu <[email protected]> Cc: Nathan Lynch <[email protected]> Cc: Libor Pechacek <[email protected]> Cc: Anton Blanchard <[email protected]> Cc: Leonardo Bras <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Julien Grall <[email protected]> Cc: Kees Cook <[email protected]> Cc: Roger Pau Monné <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wei Yang <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-16mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone()David Hildenbrand1-2/+2
On the memory onlining path, we want to start with MIGRATE_ISOLATE, to un-isolate the pages after memory onlining is complete. Let's allow passing in the migratetype. Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Wei Yang <[email protected]> Cc: Baoquan He <[email protected]> Cc: Pankaj Gupta <[email protected]> Cc: Tony Luck <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Logan Gunthorpe <[email protected]> Cc: Dan Williams <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Michel Lespinasse <[email protected]> Cc: Charan Teja Reddy <[email protected]> Cc: Mel Gorman <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-16powerpc/mm: move setting pte specific flags to pfn_pteAneesh Kumar K.V3-16/+9
powerpc used to set the pte specific flags in set_pte_at(). This is different from other architectures. To be consistent with other architecture update pfn_pte to set _PAGE_PTE on ppc64. Also, drop now unused pte_mkpte. We add a VM_WARN_ON() to catch the usage of calling set_pte_at() without setting _PAGE_PTE bit. We will remove that after a few releases. With respect to huge pmd entries, pmd_mkhuge() takes care of adding the _PAGE_PTE bit. [[email protected]: whitespace fix, per Christophe] Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Michael Ellerman <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-16powerpc/mm: add DEBUG_VM WARN for pmd_clearAneesh Kumar K.V1-0/+14
Patch series "mm/debug_vm_pgtable fixes", v4. This patch series includes fixes for debug_vm_pgtable test code so that they follow page table updates rules correctly. The first two patches introduce changes w.r.t ppc64. Hugetlb test is disabled on ppc64 because that needs larger change to satisfy page table update rules. These tests are broken w.r.t page table update rules and results in kernel crash as below. [ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304! cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0] pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380 lr: c0000000005eeeec: pte_update+0x11c/0x190 sp: c000000c6d1e7950 msr: 8000000002029033 current = 0xc000000c6d172c80 paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01 pid = 1, comm = swapper/0 kernel BUG at arch/powerpc/mm/pgtable.c:304! [link register ] c0000000005eeeec pte_update+0x11c/0x190 [c000000c6d1e7950] 0000000000000001 (unreliable) [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190 [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8 [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338 [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0 [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4 [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160 [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c With DEBUG_VM disabled [ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000 [ 20.530183] Faulting instruction address: 0xc0000000000df330 cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700] pc: c0000000000df330: memset+0x68/0x104 lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0 sp: c000000c6d19f990 msr: 8000000002009033 dar: 0 current = 0xc000000c6d177480 paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01 pid = 1, comm = swapper/0 [link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0 [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable) [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378 [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244 [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0 [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4 [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160 [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c This patch (of 13): With the hash page table, the kernel should not use pmd_clear for clearing huge pte entries. Add a DEBUG_VM WARN to catch the wrong usage. Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Christophe Leroy <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-16PM: AVS: smartreflex Move driver to soc specific driversUlf Hansson1-1/+1
The avs drivers are all SoC specific drivers that doesn't share any code. Instead they are located in a directory, mostly to keep similar functionality together. From a maintenance point of view, it makes better sense to collect SoC specific drivers like these, into the SoC specific directories. Therefore, let's move the smartreflex driver for OMAP to the ti directory. Signed-off-by: Ulf Hansson <[email protected]> Reviewed-by: Nishanth Menon <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2020-10-16powerpc/mce: Avoid nmi_enter/exit in real mode on pseries hashGanesh Goudar1-4/+3
Use of nmi_enter/exit in real mode handler causes the kernel to panic and reboot on injecting SLB mutihit on pseries machine running in hash MMU mode, because these calls try to accesses memory outside RMO region in real mode handler where translation is disabled. Add check to not to use these calls on pseries machine running in hash MMU mode. Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting") Cc: [email protected] # v5.8+ Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-16powerpc/opal_elog: Handle multiple writes to ack attributeAneesh Kumar K.V1-3/+8
Even though we use self removing sysfs helper, we still need to make sure we do the final kobject delete conditionally. sysfs_remove_file_self() will handle parallel calls to remove the sysfs attribute file and returns true only in the caller that removed the attribute file. The other parallel callers are returned false. Do the final kobject delete checking the return value of sysfs_remove_file_self(). Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-15Merge tag 'net-next-5.10' of ↵Linus Torvalds13-128/+586
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: - Add redirect_neigh() BPF packet redirect helper, allowing to limit stack traversal in common container configs and improving TCP back-pressure. Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain. - Expand netlink policy support and improve policy export to user space. (Ge)netlink core performs request validation according to declared policies. Expand the expressiveness of those policies (min/max length and bitmasks). Allow dumping policies for particular commands. This is used for feature discovery by user space (instead of kernel version parsing or trial and error). - Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge. - Allow more than 255 IPv4 multicast interfaces. - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK packets of TCPv6. - In Multi-patch TCP (MPTCP) support concurrent transmission of data on multiple subflows in a load balancing scenario. Enhance advertising addresses via the RM_ADDR/ADD_ADDR options. - Support SMC-Dv2 version of SMC, which enables multi-subnet deployments. - Allow more calls to same peer in RxRPC. - Support two new Controller Area Network (CAN) protocols - CAN-FD and ISO 15765-2:2016. - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit kernel problem. - Add TC actions for implementing MPLS L2 VPNs. - Improve nexthop code - e.g. handle various corner cases when nexthop objects are removed from groups better, skip unnecessary notifications and make it easier to offload nexthops into HW by converting to a blocking notifier. - Support adding and consuming TCP header options by BPF programs, opening the doors for easy experimental and deployment-specific TCP option use. - Reorganize TCP congestion control (CC) initialization to simplify life of TCP CC implemented in BPF. - Add support for shipping BPF programs with the kernel and loading them early on boot via the User Mode Driver mechanism, hence reusing all the user space infra we have. - Support sleepable BPF programs, initially targeting LSM and tracing. - Add bpf_d_path() helper for returning full path for given 'struct path'. - Make bpf_tail_call compatible with bpf-to-bpf calls. - Allow BPF programs to call map_update_elem on sockmaps. - Add BPF Type Format (BTF) support for type and enum discovery, as well as support for using BTF within the kernel itself (current use is for pretty printing structures). - Support listing and getting information about bpf_links via the bpf syscall. - Enhance kernel interfaces around NIC firmware update. Allow specifying overwrite mask to control if settings etc. are reset during update; report expected max time operation may take to users; support firmware activation without machine reboot incl. limits of how much impact reset may have (e.g. dropping link or not). - Extend ethtool configuration interface to report IEEE-standard counters, to limit the need for per-vendor logic in user space. - Adopt or extend devlink use for debug, monitoring, fw update in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx, dpaa2-eth). - In mlxsw expose critical and emergency SFP module temperature alarms. Refactor port buffer handling to make the defaults more suitable and support setting these values explicitly via the DCBNL interface. - Add XDP support for Intel's igb driver. - Support offloading TC flower classification and filtering rules to mscc_ocelot switches. - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as fixed interval period pulse generator and one-step timestamping in dpaa-eth. - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3) offload. - Add Lynx PHY/PCS MDIO module, and convert various drivers which have this HW to use it. Convert mvpp2 to split PCS. - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as 7-port Mediatek MT7531 IP. - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver, and wcn3680 support in wcn36xx. - Improve performance for packets which don't require much offloads on recent Mellanox NICs by 20% by making multiple packets share a descriptor entry. - Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto subtree to drivers/net. Move MDIO drivers out of the phy directory. - Clean up a lot of W=1 warnings, reportedly the actively developed subsections of networking drivers should now build W=1 warning free. - Make sure drivers don't use in_interrupt() to dynamically adapt their code. Convert tasklets to use new tasklet_setup API (sadly this conversion is not yet complete). * tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits) Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" net, sockmap: Don't call bpf_prog_put() on NULL pointer bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo bpf, sockmap: Add locking annotations to iterator netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements net: fix pos incrementment in ipv6_route_seq_next net/smc: fix invalid return code in smcd_new_buf_create() net/smc: fix valid DMBE buffer sizes net/smc: fix use-after-free of delayed events bpfilter: Fix build error with CONFIG_BPFILTER_UMH cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info bpf: Fix register equivalence tracking. rxrpc: Fix loss of final ack on shutdown rxrpc: Fix bundle counting for exclusive connections netfilter: restore NF_INET_NUMHOOKS ibmveth: Identify ingress large send packets. ibmveth: Switch order of ibmveth_helper calls. cxgb4: handle 4-tuple PEDIT to NAT mode translation selftests: Add VRF route leaking tests ...
2020-10-15Merge tag 'trace-v5.10' of ↵Linus Torvalds1-11/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "Updates for tracing and bootconfig: - Add support for "bool" type in synthetic events - Add per instance tracing for bootconfig - Support perf-style return probe ("SYMBOL%return") in kprobes and uprobes - Allow for kprobes to be enabled earlier in boot up - Added tracepoint helper function to allow testing if tracepoints are enabled in headers - Synthetic events can now have dynamic strings (variable length) - Various fixes and cleanups" * tag 'trace-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (58 commits) tracing: support "bool" type in synthetic trace events selftests/ftrace: Add test case for synthetic event syntax errors tracing: Handle synthetic event array field type checking correctly selftests/ftrace: Change synthetic event name for inter-event-combined test tracing: Add synthetic event error logging tracing: Check that the synthetic event and field names are legal tracing: Move is_good_name() from trace_probe.h to trace.h tracing: Don't show dynamic string internals in synthetic event description tracing: Fix some typos in comments tracing/boot: Add ftrace.instance.*.alloc_snapshot option tracing: Fix race in trace_open and buffer resize call tracing: Check return value of __create_val_fields() before using its result tracing: Fix synthetic print fmt check for use of __get_str() tracing: Remove a pointless assignment ftrace: ftrace_global_list is renamed to ftrace_ops_list ftrace: Format variable declarations of ftrace_allocate_records ftrace: Simplify the calculation of page number for ftrace_page->records ftrace: Simplify the dyn_ftrace->flags macro ftrace: Simplify the hash calculation ftrace: Use fls() to get the bits for dup_hash() ...
2020-10-15Merge tag 'hyperv-next-signed' of ↵Linus Torvalds2-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull another Hyper-V update from Wei Liu: "One patch from Michael to get VMbus interrupt from ACPI DSDT" * tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Add parsing of VMbus interrupt in ACPI DSDT
2020-10-15Merge branch 'parisc-5.10-1' of ↵Linus Torvalds19-68/+116
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: - Added fw_cfg support for parisc on qemu - Added font support in sti text console driver for byte- and word-mode ROMs - Switch to more fine grained lws locks and improve spinlock handling - Add ioread64_hi_lo() and iowrite64_hi_lo() to avoid 0-day linking errors - Mark pointers volatile in __xchg8(), __xchg32() and __xchg64() to help compiler - Header file cleanups, mostly removal of unused HP-UX compat defines - Drop one bit from our O_NONBLOCK define to become now 000200000 - Add MAP_UNINITIALIZED define to avoid userspace compile errors - Drop CONFIG_IDE from defconfigs - Speed up synchronize_caches() on UP machines - Rewrite tlb flush threshold calculation - Comment fixes and cleanups * 'parisc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc/sticon: Add user font support parisc/sticon: Always register sticon console driver parisc: Add MAP_UNINITIALIZED define parisc: Improve spinlock handling parisc: Install vmlinuz instead of zImage file parisc: Rewrite tlb flush threshold calculation parisc: Switch to more fine grained lws locks parisc: Mark pointers volatile in __xchg8(), __xchg32() and __xchg64() parisc: Fix comments and enable interrupts later parisc: Add alternative patching to synchronize_caches define parisc: Add ioread64_hi_lo() and iowrite64_hi_lo() parisc: disable CONFIG_IDE in defconfigs parisc: Drop useless comments in uapi/asm/signal.h parisc: Define O_NONBLOCK to become 000200000 parisc: Drop HP-UX specific fcntl and signal flags parisc: Avoid external interrupts when IPI finishes parisc: Add qemu fw_cfg interface fw_cfg: Add support for parisc architecture
2020-10-15Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial updates from Jiri Kosina: "The latest advances in computer science from the trivial queue" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: xtensa: fix Kconfig typo spelling.txt: Remove some duplicate entries mtd: rawnand: oxnas: cleanup/simplify code selftests: vm: add fragment CONFIG_GUP_BENCHMARK perf: Fix opt help text for --no-bpf-event HID: logitech-dj: Fix spelling in comment bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG MAINTAINERS: rectify MMP SUPPORT after moving cputype.h scif: Fix spelling of EACCES printk: fix global comment lib/bitmap.c: fix spello fs: Fix missing 'bit' in comment
2020-10-15Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds88-391/+256
Pull dma-mapping updates from Christoph Hellwig: - rework the non-coherent DMA allocator - move private definitions out of <linux/dma-mapping.h> - lower CMA_ALIGNMENT (Paul Cercueil) - remove the omap1 dma address translation in favor of the common code - make dma-direct aware of multiple dma offset ranges (Jim Quinlan) - support per-node DMA CMA areas (Barry Song) - increase the default seg boundary limit (Nicolin Chen) - misc fixes (Robin Murphy, Thomas Tai, Xu Wang) - various cleanups * tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits) ARM/ixp4xx: add a missing include of dma-map-ops.h dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling dma-direct: factor out a dma_direct_alloc_from_pool helper dma-direct check for highmem pages in dma_direct_alloc_pages dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h> dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma dma-mapping: move dma-debug.h to kernel/dma/ dma-mapping: remove <asm/dma-contiguous.h> dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h> dma-contiguous: remove dma_contiguous_set_default dma-contiguous: remove dev_set_cma_area dma-contiguous: remove dma_declare_contiguous dma-mapping: split <linux/dma-mapping.h> cma: decrease CMA_ALIGNMENT lower limit to 2 firewire-ohci: use dma_alloc_pages dma-iommu: implement ->alloc_noncoherent dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods dma-mapping: add a new dma_alloc_pages API dma-mapping: remove dma_cache_sync 53c700: convert to dma_alloc_noncoherent ...
2020-10-15Merge tag 'sound-5.10-rc1' of ↵Linus Torvalds3-0/+51
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "The amount of changes is smaller at this round (what a surprise), but lots of activity is seen. Most of changes are about ASoC driver development, especially Intel platforms. Here are some highlights: General: - Replace all tasklet usages with other alternatives - Cleanup of the ASoC error unwinding code - Fixes for trivial issues caught by static checker - Spell fixes allover the places ALSA Core: - Lockdep fix for control devices - Fix for potential OSS sequencer mutex stalls HD-audio and USB-audio: - SoundBlaster AE-7 support - Changes in quirk table for the rename handling - Quirks for HP and ASUS machines, Pioneer DJ DJM-250MK2. ASoC: - Lots of updates for Intel SOF and SoundWire enablement - Replacement of the DSP driver for some older x86 systems; the new code was written from scratch, better maintenance expected - Helpers for parsing auxiluary devices from the device tree - New support for AllWinner A64, Cirrus Logic CS4234, Mediatek MT6359 Microchip S/PDIF TX and RX controllers, Realtek RT1015P, and Texas Instruments J721E, TAS2110, TAS2564 and TAS2764" * tag 'sound-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (498 commits) ALSA: hda/hdmi: fix incorrect locking in hdmi_pcm_close ALSA: hda: fix jack detection with Realtek codecs when in D3 ALSA: fireworks: use semicolons rather than commas to separate statements ALSA: hda: use semicolons rather than commas to separate statements ALSA: hda/i915 - fix list corruption with concurrent probes ASoC: dmaengine: Document support for TX only or RX only streams ASoC: mchp-spdiftx: remove 'TX' from playback stream name ASoC: ti: davinci-mcasp: Use &pdev->dev for early dev_warn ASoC: tas2764: Add the driver for the TAS2764 dt-bindings: tas2764: Add the TAS2764 binding doc ASoC: Intel: catpt: Add explicit DMADEVICES kconfig dependency ASoC: Intel: catpt: Fix compilation when CONFIG_MODULES is disabled ASoC: stm32: dfsdm: add actual resolution trace ASoC: stm32: dfsdm: change rate limits ASoC: qcom: sc7180: Add support for audio over DP Asoc: qcom: lpass-platform : Increase buffer size ASoC: qcom: Add support for lpass hdmi driver Asoc: qcom: lpass:Update lpaif_dmactl members order Asoc:qcom:lpass-cpu:Update dts property read API ASoC: dt-bindings: Add dt binding for lpass hdmi ...
2020-10-15Merge tag 'usb-5.10-rc1' of ↵Linus Torvalds3-0/+49
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB/PHY/Thunderbolt driver updates from Greg KH: "Here is the big set of USB, PHY, and Thunderbolt driver updates for 5.10-rc1. Lots of tiny different things for these subsystems are in here, including: - phy driver updates - thunderbolt / USB 4 updates and additions - USB gadget driver updates - xhci fixes and updates - typec driver additions and updates - api conversions to various drivers for core kernel api changes - new USB control message functions to make it harder to get wrong, as found by syzbot (took 2 tries to get it right) - lots of tiny USB driver fixes and updates all over the place All of these have been in linux-next for a while, with the exception of the last "obviously correct" patch that updated a FALLTHROUGH comment that got merged last weekend" * tag 'usb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (374 commits) usb: musb: gadget: Use fallthrough pseudo-keyword usb: typec: Add QCOM PMIC typec detection driver USB: serial: option: add Cellient MPL200 card usb: typec: tcpci_maxim: Add support for Sink FRS usb: typec: tcpci: Implement callbacks for FRS usb: typec: tcpm: Add support for Sink Fast Role SWAP(FRS) usb: typec: tcpci_maxim: Chip level TCPC driver usb: typec: tcpci: Add set_vbus tcpci callback usb: typec: tcpci: Add a getter method to retrieve tcpm_port reference usbip: vhci_hcd: fix calling usb_hcd_giveback_urb() with irqs enabled usb: cdc-acm: add quirk to blacklist ETAS ES58X devices USB: serial: ftdi_sio: use cur_altsetting for consistency USB: serial: option: Add Telit FT980-KS composition USB: core: remove polling for /sys/kernel/debug/usb/devices usb: typec: add support for STUSB160x Type-C controller family usb: typec: add typec_find_pwr_opmode usb: typec: hd3ss3220: Use OF graph API to get the connector fwnode dt-bindings: usb: renesas,usb3-peri: Document HS and SS data bus dt-bindings: usb: convert ti,hd3ss3220 bindings to json-schema usb: dwc2: Fix INTR OUT transfers in DDMA mode. ...
2020-10-15arm64: mremap speedup - Enable HAVE_MOVE_PMDKalesh Singh1-0/+1
HAVE_MOVE_PMD enables remapping pages at the PMD level if both the source and destination addresses are PMD-aligned. HAVE_MOVE_PMD is already enabled on x86. The original patch [1] that introduced this config did not enable it on arm64 at the time because of performance issues with flushing the TLB on every PMD move. These issues have since been addressed in more recent releases with improvements to the arm64 TLB invalidation and core mmu_gather code as Will Deacon mentioned in [2]. >From the data below, it can be inferred that there is approximately 8x improvement in performance when HAVE_MOVE_PMD is enabled on arm64. --------- Test Results ---------- The following results were obtained on an arm64 device running a 5.4 kernel, by remapping a PMD-aligned, 1GB sized region to a PMD-aligned destination. The results from 10 iterations of the test are given below. All times are in nanoseconds. Control HAVE_MOVE_PMD 9220833 1247761 9002552 1219896 9254115 1094792 8725885 1227760 9308646 1043698 9001667 1101771 8793385 1159896 8774636 1143594 9553125 1025833 9374010 1078125 9100885.4 1134312.6 <-- Mean Time in nanoseconds Total mremap time for a 1GB sized PMD-aligned region drops from ~9.1 milliseconds to ~1.1 milliseconds. (~8x speedup). [1] https://lore.kernel.org/r/[email protected] [2] https://www.mail-archive.com/[email protected]/msg140837.html Signed-off-by: Kalesh Singh <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andrew Morton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/kvmarm/[email protected]/ Signed-off-by: Will Deacon <[email protected]>
2020-10-15arm64: mm: use single quantity to represent the PA to VA translationArd Biesheuvel3-25/+14
On arm64, the global variable memstart_addr represents the physical address of PAGE_OFFSET, and so physical to virtual translations or vice versa used to come down to simple additions or subtractions involving the values of PAGE_OFFSET and memstart_addr. When support for 52-bit virtual addressing was introduced, we had to deal with PAGE_OFFSET potentially being outside of the region that can be covered by the virtual range (as the 52-bit VA capable build needs to be able to run on systems that are only 48-bit VA capable), and for this reason, another translation was introduced, and recorded in the global variable physvirt_offset. However, if we go back to the original definition of memstart_addr, i.e., the physical address of PAGE_OFFSET, it turns out that there is no need for two separate translations: instead, we can simply subtract the size of the unaddressable VA space from memstart_addr to make the available physical memory appear in the 48-bit addressable VA region. This simplifies things, but also fixes a bug on KASLR builds, which may update memstart_addr later on in arm64_memblock_init(), but fails to update vmemmap and physvirt_offset accordingly. Fixes: 5383cc6efed1 ("arm64: mm: Introduce vabits_actual") Signed-off-by: Ard Biesheuvel <[email protected]> Reviewed-by: Steve Capper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-10-15arm64: reject prctl(PR_PAC_RESET_KEYS) on compat tasksPeter Collingbourne2-2/+6
It doesn't make sense to issue prctl(PR_PAC_RESET_KEYS) on a compat task because the 32-bit instruction set does not offer PAuth instructions. For consistency with other 64-bit only prctls such as {SET,GET}_TAGGED_ADDR_CTRL, reject the prctl on compat tasks. Although this is a userspace-visible change, maybe it isn't too late to make this change given that the hardware isn't available yet and it's very unlikely that anyone has 32-bit software that actually depends on this succeeding. Signed-off-by: Peter Collingbourne <[email protected]> Link: https://linux-review.googlesource.com/id/Ie885a1ff84ab498cc9f62d6451e9f2cfd4b1d06a Link: https://lore.kernel.org/r/[email protected] [will: Do the same for the SVE prctl()s] Signed-off-by: Will Deacon <[email protected]>
2020-10-15parisc: Add MAP_UNINITIALIZED defineHelge Deller1-0/+1
We will not allow unitialized anon mmaps, but we need this define to prevent build errors, e.g. the debian foot package. Suggested-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2020-10-15parisc: Improve spinlock handlingJohn David Anglin1-10/+13
Use READ_ONCE() to check if spinlock is locked. The other changes are cleanups. Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2020-10-15parisc: Install vmlinuz instead of zImage fileHelge Deller1-1/+1
Tested-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2020-10-15parisc: Rewrite tlb flush threshold calculationJohn David Anglin1-10/+8
Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2020-10-15parisc: Switch to more fine grained lws locksJohn David Anglin2-7/+7
Increase the number of lws locks to 256 entries (instead of 16) and choose lock entry based on bits 3-11 (instead of 4-7) of the relevant address. With this change we archieve more fine-grained locking in futex syscalls and thus reduce the number of possible stalls. Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2020-10-15parisc: Mark pointers volatile in __xchg8(), __xchg32() and __xchg64()John David Anglin2-10/+10
Let the complier treat the pointers volatile to ensure that they get accessed atomicly. Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2020-10-15parisc: Fix comments and enable interrupts laterJohn David Anglin1-7/+7
Correct the comments: The jump is forwards, not backwards. Enable the interrupts after %r29 (reference param area) was loaded. Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]>