blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2024-03-14	Merge tag 'mm-stable-2024-03-13-20-04' of ↵	Linus Torvalds	1	-1/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames from hotplugged memory rather than only from main memory. Series "implement "memmap on memory" feature on s390". - More folio conversions from Matthew Wilcox in the series "Convert memcontrol charge moving to use folios" "mm: convert mm counter to take a folio" - Chengming Zhou has optimized zswap's rbtree locking, providing significant reductions in system time and modest but measurable reductions in overall runtimes. The series is "mm/zswap: optimize the scalability of zswap rb-tree". - Chengming Zhou has also provided the series "mm/zswap: optimize zswap lru list" which provides measurable runtime benefits in some swap-intensive situations. - And Chengming Zhou further optimizes zswap in the series "mm/zswap: optimize for dynamic zswap_pools". Measured improvements are modest. - zswap cleanups and simplifications from Yosry Ahmed in the series "mm: zswap: simplify zswap_swapoff()". - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has contributed several DAX cleanups as well as adding a sysfs tunable to control the memmap_on_memory setting when the dax device is hotplugged as system memory. - Johannes Weiner has added the large series "mm: zswap: cleanups", which does that. - More DAMON work from SeongJae Park in the series "mm/damon: make DAMON debugfs interface deprecation unignorable" "selftests/damon: add more tests for core functionalities and corner cases" "Docs/mm/damon: misc readability improvements" "mm/damon: let DAMOS feeds and tame/auto-tune itself" - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs extension" Rakie Kim has developed a new mempolicy interleaving policy wherein we allocate memory across nodes in a weighted fashion rather than uniformly. This is beneficial in heterogeneous memory environments appearing with CXL. - Christophe Leroy has contributed some cleanup and consolidation work against the ARM pagetable dumping code in the series "mm: ptdump: Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute". - Luis Chamberlain has added some additional xarray selftesting in the series "test_xarray: advanced API multi-index tests". - Muhammad Usama Anjum has reworked the selftest code to make its human-readable output conform to the TAP ("Test Anything Protocol") format. Amongst other things, this opens up the use of third-party tools to parse and process out selftesting results. - Ryan Roberts has added fork()-time PTE batching of THP ptes in the series "mm/memory: optimize fork() with PTE-mapped THP". Mainly targeted at arm64, this significantly speeds up fork() when the process has a large number of pte-mapped folios. - David Hildenbrand also gets in on the THP pte batching game in his series "mm/memory: optimize unmap/zap with PTE-mapped THP". It implements batching during munmap() and other pte teardown situations. The microbenchmark improvements are nice. - And in the series "Transparent Contiguous PTEs for User Mappings" Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte mappings"). Kernel build times on arm64 improved nicely. Ryan's series "Address some contpte nits" provides some followup work. - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has fixed an obscure hugetlb race which was causing unnecessary page faults. He has also added a reproducer under the selftest code. - In the series "selftests/mm: Output cleanups for the compaction test", Mark Brown did what the title claims. - Kinsey Ho has added the series "mm/mglru: code cleanup and refactoring". - Even more zswap material from Nhat Pham. The series "fix and extend zswap kselftests" does as claimed. - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX regression" Mathieu Desnoyers has cleaned up and fixed rather a mess in our handling of DAX on archiecctures which have virtually aliasing data caches. The arm architecture is the main beneficiary. - Lokesh Gidra's series "per-vma locks in userfaultfd" provides dramatic improvements in worst-case mmap_lock hold times during certain userfaultfd operations. - Some page_owner enhancements and maintenance work from Oscar Salvador in his series "page_owner: print stacks and their outstanding allocations" "page_owner: Fixup and cleanup" - Uladzislau Rezki has contributed some vmalloc scalability improvements in his series "Mitigate a vmap lock contention". It realizes a 12x improvement for a certain microbenchmark. - Some kexec/crash cleanup work from Baoquan He in the series "Split crash out from kexec and clean up related config items". - Some zsmalloc maintenance work from Chengming Zhou in the series "mm/zsmalloc: fix and optimize objects/page migration" "mm/zsmalloc: some cleanup for get/set_zspage_mapping()" - Zi Yan has taught the MM to perform compaction on folios larger than order=0. This a step along the path to implementaton of the merging of large anonymous folios. The series is named "Enable >0 order folio memory compaction". - Christoph Hellwig has done quite a lot of cleanup work in the pagecache writeback code in his series "convert write_cache_pages() to an iterator". - Some modest hugetlb cleanups and speedups in Vishal Moola's series "Handle hugetlb faults under the VMA lock". - Zi Yan has changed the page splitting code so we can split huge pages into sizes other than order-0 to better utilize large folios. The series is named "Split a folio to any lower order folios". - David Hildenbrand has contributed the series "mm: remove total_mapcount()", a cleanup. - Matthew Wilcox has sought to improve the performance of bulk memory freeing in his series "Rearrange batched folio freeing". - Gang Li's series "hugetlb: parallelize hugetlb page init on boot" provides large improvements in bootup times on large machines which are configured to use large numbers of hugetlb pages. - Matthew Wilcox's series "PageFlags cleanups" does that. - Qi Zheng's series "minor fixes and supplement for ptdesc" does that also. S390 is affected. - Cleanups to our pagemap utility functions from Peter Xu in his series "mm/treewide: Replace pXd_large() with pXd_leaf()". - Nico Pache has fixed a few things with our hugepage selftests in his series "selftests/mm: Improve Hugepage Test Handling in MM Selftests". - Also, of course, many singleton patches to many things. Please see the individual changelogs for details. * tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits) mm/zswap: remove the memcpy if acomp is not sleepable crypto: introduce: acomp_is_async to expose if comp drivers might sleep memtest: use {READ,WRITE}_ONCE in memory scanning mm: prohibit the last subpage from reusing the entire large folio mm: recover pud_leaf() definitions in nopmd case selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements selftests/mm: skip uffd hugetlb tests with insufficient hugepages selftests/mm: dont fail testsuite due to a lack of hugepages mm/huge_memory: skip invalid debugfs new_order input for folio split mm/huge_memory: check new folio order when split a folio mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure mm: add an explicit smp_wmb() to UFFDIO_CONTINUE mm: fix list corruption in put_pages_list mm: remove folio from deferred split list before uncharging it filemap: avoid unnecessary major faults in filemap_fault() mm,page_owner: drop unnecessary check mm,page_owner: check for null stack_record before bumping its refcount mm: swap: fix race between free_swap_and_cache() and swapoff() mm/treewide: align up pXd_leaf() retval across archs mm/treewide: drop pXd_large() ...
2024-03-06	mm: Introduce VM_SPARSE kind and vm_area_[un]map_pages().	Alexei Starovoitov	1	-0/+5
	vmap/vmalloc APIs are used to map a set of pages into contiguous kernel virtual space. get_vm_area() with appropriate flag is used to request an area of kernel address range. It's used for vmalloc, vmap, ioremap, xen use cases. - vmalloc use case dominates the usage. Such vm areas have VM_ALLOC flag. - the areas created by vmap() function should be tagged with VM_MAP. - ioremap areas are tagged with VM_IOREMAP. BPF would like to extend the vmap API to implement a lazily-populated sparse, yet contiguous kernel virtual space. Introduce VM_SPARSE flag and vm_area_map_pages(area, start_addr, count, pages) API to map a set of pages within a given area. It has the same sanity checks as vmap() does. It also checks that get_vm_area() was created with VM_SPARSE flag which identifies such areas in /proc/vmallocinfo and returns zero pages on read through /proc/kcore. The next commits will introduce bpf_arena which is a sparsely populated shared memory region between bpf program and user space process. It will map privately-managed pages into a sparse vm area with the following steps: // request virtual memory region during bpf prog verification area = get_vm_area(area_size, VM_SPARSE); // on demand vm_area_map_pages(area, kaddr, kend, pages); vm_area_unmap_pages(area, kaddr, kend); // after bpf program is detached and unloaded free_vm_area(area); Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Pasha Tatashin <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-02-23	mm/vmalloc: remove vmap_area_list	Baoquan He	1	-1/+0
	Earlier, vmap_area_list is exported to vmcoreinfo so that makedumpfile get the base address of vmalloc area. Now, vmap_area_list is empty, so export VMALLOC_START to vmcoreinfo instead, and remove vmap_area_list. [[email protected]: fix a warning in the crash_save_vmcoreinfo_init()] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Baoquan He <[email protected]> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]> Acked-by: Lorenzo Stoakes <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Kazuhito Hagio <[email protected]> Cc: Liam R. Howlett <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Oleksiy Avramchenko <[email protected]> Cc: Paul E. McKenney <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: vmalloc: convert vread() to vread_iter()	Lorenzo Stoakes	1	-1/+2
	Having previously laid the foundation for converting vread() to an iterator function, pull the trigger and do so. This patch attempts to provide minimal refactoring and to reflect the existing logic as best we can, for example we continue to zero portions of memory not read, as before. Overall, there should be no functional difference other than a performance improvement in /proc/kcore access to vmalloc regions. Now we have eliminated the need for a bounce buffer in read_kcore_iter(), we dispense with it, and try to write to user memory optimistically but with faults disabled via copy_page_to_iter_nofault(). We already have preemption disabled by holding a spin lock. We continue faulting in until the operation is complete. Additionally, we must account for the fact that at any point a copy may fail (most likely due to a fault not being able to occur), we exit indicating fewer bytes retrieved than expected. [[email protected]: fix sparc64 warning] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: redo Stephen's sparc build fix] Link: https://lkml.kernel.org/r/8506cbc667c39205e65a323f750ff9c11a463798.1679566220.git.lstoakes@gmail.com [[email protected]: unbreak uio.h includes] Link: https://lkml.kernel.org/r/941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Alexander Viro <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Liu Shixin <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-05	mm: move vmalloc_init() declaration to mm/internal.h	Mike Rapoport (IBM)	1	-4/+0
	vmalloc_init() is called only from mm_core_init(), there is no need to declare it in include/linux/vmalloc.h Move vmalloc_init() declaration to mm/internal.h Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (IBM) <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Doug Berger <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-09	mm/vmalloc.c: add flags to mark vm_map_ram area	Baoquan He	1	-0/+1
	Through vmalloc API, a virtual kernel area is reserved for physical address mapping. And vmap_area is used to track them, while vm_struct is allocated to associate with the vmap_area to store more information and passed out. However, area reserved via vm_map_ram() is an exception. It doesn't have vm_struct to associate with vmap_area. And we can't recognize the vmap_area with '->vm == NULL' as a vm_map_ram() area because the normal freeing path will set va->vm = NULL before unmapping, please see function remove_vm_area(). Meanwhile, there are two kinds of handling for vm_map_ram area. One is the whole vmap_area being reserved and mapped at one time through vm_map_area() interface; the other is the whole vmap_area with VMAP_BLOCK_SIZE size being reserved, while mapped into split regions with smaller size via vb_alloc(). To mark the area reserved through vm_map_ram(), add flags field into struct vmap_area. Bit 0 indicates this is vm_map_ram area created through vm_map_ram() interface, while bit 1 marks out the type of vm_map_ram area which makes use of vmap_block to manage split regions via vb_alloc/free(). This is a preparation for later use. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Baoquan He <[email protected]> Reviewed-by: Lorenzo Stoakes <[email protected]> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Stephen Brennan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-13	usercopy: Handle vm_map_ram() areas	Matthew Wilcox (Oracle)	1	-0/+1
	vmalloc does not allocate a vm_struct for vm_map_ram() areas. That causes us to deny usercopies from those areas. This affects XFS which uses vm_map_ram() for its directories. Fix this by calling find_vmap_area() instead of find_vm_area(). Fixes: 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns") Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]> Tested-by: Zorro Lang <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-19	vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP	Song Liu	1	-2/+2
	Huge page backed vmalloc memory could benefit performance in many cases. However, some users of vmalloc may not be ready to handle huge pages for various reasons: hardware constraints, potential pages split, etc. VM_NO_HUGE_VMAP was introduced to allow vmalloc users to opt-out huge pages. However, it is not easy to track down all the users that require the opt-out, as the allocation are passed different stacks and may cause issues in different layers. To address this issue, replace VM_NO_HUGE_VMAP with an opt-in flag, VM_ALLOW_HUGE_VMAP, so that users that benefit from huge pages could ask specificially. Also, remove vmalloc_no_huge() and add opt-in helper vmalloc_huge(). Fixes: fac54e2bfb5b ("x86/Kconfig: Select HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP") Link: https://lore.kernel.org/netdev/[email protected]/" Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Song Liu <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24	kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged	Andrey Konovalov	1	-0/+7
	HW_TAGS KASAN relies on ARM Memory Tagging Extension (MTE). With MTE, a memory region must be mapped as MT_NORMAL_TAGGED to allow setting memory tags via MTE-specific instructions. Add proper protection bits to vmalloc() allocations. These allocations are always backed by page_alloc pages, so the tags will actually be getting set on the corresponding physical memory. Link: https://lkml.kernel.org/r/983fc33542db2f6b1e77b34ca23448d4640bbb9e.1643047180.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Co-developed-by: Vincenzo Frascino <[email protected]> Signed-off-by: Vincenzo Frascino <[email protected]> Acked-by: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Collingbourne <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24	kasan, vmalloc: drop outdated VM_KASAN comment	Andrey Konovalov	1	-11/+0
	The comment about VM_KASAN in include/linux/vmalloc.c is outdated. VM_KASAN is currently only used to mark vm_areas allocated for kernel modules when CONFIG_KASAN_VMALLOC is disabled. Drop the comment. Link: https://lkml.kernel.org/r/780395afea83a147b3b5acc36cf2e38f7f8479f9.1643047180.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Reviewed-by: Alexander Potapenko <[email protected]> Acked-by: Marco Elver <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Collingbourne <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-24	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	1	-0/+5
	Pull kvm updates from Paolo Bonzini: "ARM: - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes RISC-V: - Prevent KVM_COMPAT from being selected - Optimize __kvm_riscv_switch_to() implementation - RISC-V SBI v0.3 support s390: - memop selftest - fix SCK locking - adapter interruptions virtualization for secure guests - add Claudio Imbrenda as maintainer - first step to do proper storage key checking x86: - Continue switching kvm_x86_ops to static_call(); introduce static_call_cond() and __static_call_ret0 when applicable. - Cleanup unused arguments in several functions - Synthesize AMD 0x80000021 leaf - Fixes and optimization for Hyper-V sparse-bank hypercalls - Implement Hyper-V's enlightened MSR bitmap for nested SVM - Remove MMU auditing - Eager splitting of page tables (new aka "TDP" MMU only) when dirty page tracking is enabled - Cleanup the implementation of the guest PGD cache - Preparation for the implementation of Intel IPI virtualization - Fix some segment descriptor checks in the emulator - Allow AMD AVIC support on systems with physical APIC ID above 255 - Better API to disable virtualization quirks - Fixes and optimizations for the zapping of page tables: - Zap roots in two passes, avoiding RCU read-side critical sections that last too long for very large guests backed by 4 KiB SPTEs. - Zap invalid and defunct roots asynchronously via concurrency-managed work queue. - Allowing yielding when zapping TDP MMU roots in response to the root's last reference being put. - Batch more TLB flushes with an RCU trick. Whoever frees the paging structure now holds RCU as a proxy for all vCPUs running in the guest, i.e. to prolongs the grace period on their behalf. It then kicks the the vCPUs out of guest mode before doing rcu_read_unlock(). Generic: - Introduce __vcalloc and use it for very large allocations that need memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits) KVM: use kvcalloc for array allocations KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 kvm: x86: Require const tsc for RT KVM: x86: synthesize CPUID leaf 0x80000021h if useful KVM: x86: add support for CPUID leaf 0x80000021 KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()" kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU KVM: arm64: fix typos in comments KVM: arm64: Generalise VM features into a set of flags KVM: s390: selftests: Add error memop tests KVM: s390: selftests: Add more copy memop tests KVM: s390: selftests: Add named stages for memop test KVM: s390: selftests: Add macro as abstraction for MEM_OP KVM: s390: selftests: Split memop tests KVM: s390x: fix SCK locking RISC-V: KVM: Implement SBI HSM suspend call RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function RISC-V: Add SBI HSM suspend related defines RISC-V: KVM: Implement SBI v0.3 SRST extension ...
2022-03-22	mm/vmalloc: fix comments about vmap_area struct	Bang Li	1	-2/+2
	The vmap_area_root should be in the "busy" tree and the free_vmap_area_root should be in the "free" tree. Link: https://lkml.kernel.org/r/[email protected] Fixes: 688fcbfc06e4 ("mm/vmalloc: modify struct vmap_area to reduce its size") Signed-off-by: Bang Li <[email protected]> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]> Cc: Pengfei Li <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-08	mm: vmalloc: introduce array allocation functions	Paolo Bonzini	1	-0/+5
	Linux has dozens of occurrences of vmalloc(array_size()) and vzalloc(array_size()). Allow to simplify the code by providing vmalloc_array and vcalloc, as well as the underscored variants that let the caller specify the GFP flags. Acked-by: Michal Hocko <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-01-15	mm: defer kmemleak object creation of module_alloc()	Kefeng Wang	1	-0/+7
	Yongqiang reports a kmemleak panic when module insmod/rmmod with KASAN enabled(without KASAN_VMALLOC) on x86[1]. When the module area allocates memory, it's kmemleak_object is created successfully, but the KASAN shadow memory of module allocation is not ready, so when kmemleak scan the module's pointer, it will panic due to no shadow memory with KASAN check. module_alloc __vmalloc_node_range kmemleak_vmalloc kmemleak_scan update_checksum kasan_module_alloc kmemleak_ignore Note, there is no problem if KASAN_VMALLOC enabled, the modules area entire shadow memory is preallocated. Thus, the bug only exits on ARCH which supports dynamic allocation of module area per module load, for now, only x86/arm64/s390 are involved. Add a VM_DEFER_KMEMLEAK flags, defer vmalloc'ed object register of kmemleak in module_alloc() to fix this issue. [1] https://lore.kernel.org/all/[email protected]/ [[email protected]: fix build] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: simplify ifdefs, per Andrey] Link: https://lkml.kernel.org/r/CA+fCnZcnwJHUQq34VuRxpdoY6_XbJCDJ-jopksS5Eia4PijPzw@mail.gmail.com Link: https://lkml.kernel.org/r/[email protected] Fixes: 793213a82de4 ("s390/kasan: dynamic shadow mem allocation for modules") Fixes: 39d114ddc682 ("arm64: add KASAN support") Fixes: bebf56a1b176 ("kasan: enable instrumentation of global variables") Signed-off-by: Kefeng Wang <[email protected]> Reported-by: Yongqiang Liu <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Kefeng Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-11-06	mm/vmalloc: don't allow VM_NO_GUARD on vmap()	Peter Zijlstra	1	-1/+1
	The vmalloc guard pages are added on top of each allocation, thereby isolating any two allocations from one another. The top guard of the lower allocation is the bottom guard guard of the higher allocation etc. Therefore VM_NO_GUARD is dangerous; it breaks the basic premise of isolating separate allocations. There are only two in-tree users of this flag, neither of which use it through the exported interface. Ensure it stays this way. Link: https://lkml.kernel.org/r/YUMfdA36fuyZ+/[email protected] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Uladzislau Rezki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-11-06	mm/vmalloc: add __alloc_size attributes for better bounds checking	Kees Cook	1	-11/+11
	As already done in GrapheneOS, add the __alloc_size attribute for appropriate vmalloc allocator interfaces, to provide additional hinting for better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler optimizations. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Co-developed-by: Daniel Micay <[email protected]> Signed-off-by: Daniel Micay <[email protected]> Cc: Andy Whitcroft <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Dwaipayan Ray <[email protected]> Cc: Joe Perches <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Lukas Bulwahn <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Alexandre Bounine <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jing Xiangfeng <[email protected]> Cc: John Hubbard <[email protected]> Cc: kernel test robot <[email protected]> Cc: Matt Porter <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Souptick Joarder <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-08	mm: move ioremap_page_range to vmalloc.c	Christoph Hellwig	1	-3/+0
	Patch series "small ioremap cleanups". The first patch moves a little code around the vmalloc/ioremap boundary following a bigger move by Nick earlier. The second enforces non-executable mapping on ioremap just like we do for vmap. No driver currently uses executable mappings anyway, as they should. This patch (of 2): This keeps it together with the implementation, and to remove the vmap_range wrapper. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-08	mm: fix spelling mistakes in header files	Zhen Lei	1	-2/+2
	Fix some spelling mistakes in comments: successfull ==> successful potentialy ==> potentially alloced ==> allocated indicies ==> indices wont ==> won't resposible ==> responsible dirtyness ==> dirtiness droppped ==> dropped alread ==> already occured ==> occurred interupts ==> interrupts extention ==> extension slighly ==> slightly Dont't ==> Don't Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Zhen Lei <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-30	mm/vmalloc: enable mapping of huge pages at pte level in vmalloc	Christophe Leroy	1	-0/+7
	On some architectures like powerpc, there are huge pages that are mapped at pte level. Enable it in vmalloc. For that, architectures can provide arch_vmap_pte_supported_shift() that returns the shift for pages to map at pte level. Link: https://lkml.kernel.org/r/2c717e3b1fba1894d890feb7669f83025bfa314d.1620795204.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Uladzislau Rezki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-30	mm/vmalloc: enable mapping of huge pages at pte level in vmap	Christophe Leroy	1	-0/+8
	On some architectures like powerpc, there are huge pages that are mapped at pte level. Enable it in vmap. For that, architectures can provide arch_vmap_pte_range_map_size() that returns the size of pages to map at pte level. Link: https://lkml.kernel.org/r/fb3ccc73377832ac6708181ec419128a2f98ce36.1620795204.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Uladzislau Rezki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-24	mm/vmalloc: add vmalloc_no_huge	Claudio Imbrenda	1	-0/+1
	Patch series "mm: add vmalloc_no_huge and use it", v4. Add vmalloc_no_huge() and export it, so modules can allocate memory with small pages. Use the newly added vmalloc_no_huge() in KVM on s390 to get around a hardware limitation. This patch (of 2): Commit 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") added support for hugepage vmalloc mappings, it also added the flag VM_NO_HUGE_VMAP for __vmalloc_node_range to request the allocation to be performed with 0-order non-huge pages. This flag is not accessible when calling vmalloc, the only option is to call directly __vmalloc_node_range, which is not exported. This means that a module can't vmalloc memory with small pages. Case in point: KVM on s390x needs to vmalloc a large area, and it needs to be mapped with non-huge pages, because of a hardware limitation. This patch adds the function vmalloc_no_huge, which works like vmalloc, but it is guaranteed to always back the mapping using small pages. This new function is exported, therefore it is usable by modules. [[email protected]: whitespace fixes, per Christoph] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") Signed-off-by: Claudio Imbrenda <[email protected]> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]> Acked-by: Nicholas Piggin <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Cornelia Huck <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-07	mm: fix typos in comments	Ingo Molnar	1	-2/+2
	Fix ~94 single-word typos in locking code comments, plus a few very obvious grammar mistakes. Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Cc: Bhaskar Chowdhury <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-07	mm/vmalloc: remove vwrite()	David Hildenbrand	1	-1/+0
	The last user (/dev/kmem) is gone. Let's drop it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Oleksiy Avramchenko <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Minchan Kim <[email protected]> Cc: huang ying <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-07	drivers/char: remove /dev/kmem for good	David Hildenbrand	1	-1/+1
	Patch series "drivers/char: remove /dev/kmem for good". Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and memory ballooning, I started questioning the existence of /dev/kmem. Comparing it with the /proc/kcore implementation, it does not seem to be able to deal with things like a) Pages unmapped from the direct mapping (e.g., to be used by secretmem) -> kern_addr_valid(). virt_addr_valid() is not sufficient. b) Special cases like gart aperture memory that is not to be touched -> mem_pfn_is_ram() Unless I am missing something, it's at least broken in some cases and might fault/crash the machine. Looks like its existence has been questioned before in 2005 and 2010 [1], after ~11 additional years, it might make sense to revive the discussion. CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by mistake?). All distributions disable it: in Ubuntu it has been disabled for more than 10 years, in Debian since 2.6.31, in Fedora at least starting with FC3, in RHEL starting with RHEL4, in SUSE starting from 15sp2, and OpenSUSE has it disabled as well. 1) /dev/kmem was popular for rootkits [2] before it got disabled basically everywhere. Ubuntu documents [3] "There is no modern user of /dev/kmem any more beyond attackers using it to load kernel rootkits.". RHEL documents in a BZ [5] "it served no practical purpose other than to serve as a potential security problem or to enable binary module drivers to access structures/functions they shouldn't be touching" 2) /proc/kcore is a decent interface to have a controlled way to read kernel memory for debugging puposes. (will need some extensions to deal with memory offlining/unplug, memory ballooning, and poisoned pages, though) 3) It might be useful for corner case debugging [1]. KDB/KGDB might be a better fit, especially, to write random memory; harder to shoot yourself into the foot. 4) "Kernel Memory Editor" [4] hasn't seen any updates since 2000 and seems to be incompatible with 64bit [1]. For educational purposes, /proc/kcore might be used to monitor value updates -- or older kernels can be used. 5) It's broken on arm64, and therefore, completely disabled there. Looks like it's essentially unused and has been replaced by better suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's just remove it. [1] https://lwn.net/Articles/147901/ [2] https://www.linuxjournal.com/article/10505 [3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled [4] https://sourceforge.net/projects/kme/ [5] https://bugzilla.redhat.com/show_bug.cgi?id=154796 Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Alexander A. Klimov" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Alexandre Belloni <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Andrey Zhizhikin <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Brian Cain <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Corentin Labbe <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Gregory Clement <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: Hillf Danton <[email protected]> Cc: huang ying <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: James Troup <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kairui Song <[email protected]> Cc: Krzysztof Kozlowski <[email protected]> Cc: Kuninori Morimoto <[email protected]> Cc: Liviu Dudau <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Luc Van Oostenryck <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Mikulas Patocka <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Niklas Schnelle <[email protected]> Cc: Oleksiy Avramchenko <[email protected]> Cc: [email protected] Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "Pavel Machek (CIP)" <[email protected]> Cc: Pavel Machek <[email protected]> Cc: "Peter Zijlstra (Intel)" <[email protected]> Cc: Pierre Morel <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Rich Felker <[email protected]> Cc: Robert Richter <[email protected]> Cc: Rob Herring <[email protected]> Cc: Russell King <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Sebastian Hesselbarth <[email protected]> Cc: [email protected] Cc: Stafford Horne <[email protected]> Cc: Stefan Kristiansson <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: Theodore Dubois <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: William Cohen <[email protected]> Cc: Xiaoming Ni <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-30	mm/vmalloc: remove unmap_kernel_range	Nicholas Piggin	1	-7/+1
	This is a shim around vunmap_range, get rid of it. Move the main API comment from the _noflush variant to the normal variant, and make _noflush internal to mm/. [[email protected]: fix nommu builds and a comment bug per sfr] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: move vunmap_range_noflush() stub inside !CONFIG_MMU, not !CONFIG_NUMA] [[email protected]: fix nommu builds] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Cédric Le Goater <[email protected]> Cc: Uladzislau Rezki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-30	mm/vmalloc: remove map_kernel_range	Nicholas Piggin	1	-11/+0
	Patch series "mm/vmalloc: cleanup after hugepage series", v2. Christoph pointed out some overdue cleanups required after the huge vmalloc series, and I had another failure error message improvement as well. This patch (of 5): This is a shim around vmap_pages_range, get rid of it. Move the main API comment from the _noflush variant to the normal variant, and make _noflush internal to mm/. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Cédric Le Goater <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-30	mm/vmalloc: hugepage vmalloc mappings	Nicholas Piggin	1	-0/+21
	Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC enables support on architectures that define HAVE_ARCH_HUGE_VMAP and supports PMD sized vmap mappings. vmalloc will attempt to allocate PMD-sized pages if allocating PMD size or larger, and fall back to small pages if that was unsuccessful. Architectures must ensure that any arch specific vmalloc allocations that require PAGE_SIZE mappings (e.g., module allocations vs strict module rwx) use the VM_NOHUGE flag to inhibit larger mappings. This can result in more internal fragmentation and memory overhead for a given allocation, an option nohugevmalloc is added to disable at boot. [[email protected]: fix read of uninitialized pointer area] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ding Tianhong <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-30	mm: move vmap_range from mm/ioremap.c to mm/vmalloc.c	Nicholas Piggin	1	-0/+3
	This is a generic kernel virtual memory mapper, not specific to ioremap. Code is unchanged other than making vmap_range non-static. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Ding Tianhong <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-30	mm/vmalloc: provide fallback arch huge vmap support functions	Nicholas Piggin	1	-4/+20
	If an architecture doesn't support a particular page table level as a huge vmap page size then allow it to skip defining the support query function. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Suggested-by: Christoph Hellwig <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Ding Tianhong <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-30	mm: HUGE_VMAP arch support cleanup	Nicholas Piggin	1	-0/+6
	This changes the awkward approach where architectures provide init functions to determine which levels they can provide large mappings for, to one where the arch is queried for each call. This removes code and indirection, and allows constant-folding of dead code for unsupported levels. This also adds a prot argument to the arch query. This is unused currently but could help with some architectures (e.g., some powerpc processors can't map uncacheable memory with large pages). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Ding Tianhong <[email protected]> Acked-by: Catalin Marinas <[email protected]> [arm64] Cc: Will Deacon <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Russell King <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-03-08	mm: Don't build mm_dump_obj() on CONFIG_PRINTK=n kernels	Paul E. McKenney	1	-1/+1
	The mem_dump_obj() functionality adds a few hundred bytes, which is a small price to pay. Except on kernels built with CONFIG_PRINTK=n, in which mem_dump_obj() messages will be suppressed. This commit therefore makes mem_dump_obj() be a static inline empty function on kernels built with CONFIG_PRINTK=n and excludes all of its support functions as well. This avoids kernel bloat on systems that cannot use mem_dump_obj(). Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: <[email protected]> Suggested-by: Andrew Morton <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-02-12	Merge branch 'for-mingo-rcu' of ↵	Ingo Molnar	1	-0/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return addresses to more easily locate bugs. This has a couple of RCU-related commits, but is mostly MM. Was pulled in with akpm's agreement. - Per-callback-batch tracking of numbers of callbacks, which enables better debugging information and smarter reactions to large numbers of callbacks. - The first round of changes to allow CPUs to be runtime switched from and to callback-offloaded state. - CONFIG_PREEMPT_RT-related changes. - RCU CPU stall warning updates. - Addition of polling grace-period APIs for SRCU. - Torture-test and torture-test scripting updates, including a "torture everything" script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale. Plus does an allmodconfig build. Signed-off-by: Ingo Molnar <[email protected]>
2021-02-05	mm/vmalloc: separate put pages and flush VM flags	Rick Edgecombe	1	-7/+2
	When VM_MAP_PUT_PAGES was added, it was defined with the same value as VM_FLUSH_RESET_PERMS. This doesn't seem like it will cause any big functional problems other than some excess flushing for VM_MAP_PUT_PAGES allocations. Redefine VM_MAP_PUT_PAGES to have its own value. Also, rearrange things so flags are less likely to be missed in the future. Link: https://lkml.kernel.org/r/[email protected] Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap") Signed-off-by: Rick Edgecombe <[email protected]> Suggested-by: Matthew Wilcox <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Daniel Axtens <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-01-22	mm: Make mem_dump_obj() handle vmalloc() memory	Paul E. McKenney	1	-0/+6
	This commit adds vmalloc() support to mem_dump_obj(). Note that the vmalloc_dump_obj() function combines the checking and dumping, in contrast with the split between kmem_valid_obj() and kmem_dump_obj(). The reason for the difference is that the checking in the vmalloc() case involves acquiring a global lock, and redundant acquisitions of global locks should be avoided, even on not-so-fast paths. Note that this change causes on-stack variables to be reported as vmalloc() storage from kernel_clone() or similar, depending on the degree of inlining that your compiler does. This is likely more helpful than the earlier "non-paged (local) memory". Cc: Andrew Morton <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: <[email protected]> Reported-by: Andrii Nakryiko <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Tested-by: Naresh Kamboju <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-12-15	mm/vmalloc: rework the drain logic	Uladzislau Rezki (Sony)	1	-5/+3
	A current "lazy drain" model suffers from at least two issues. First one is related to the unsorted list of vmap areas, thus in order to identify the [min:max] range of areas to be drained, it requires a full list scan. What is a time consuming if the list is too long. Second one and as a next step is about merging all fragments with a free space. What is also a time consuming because it has to iterate over entire list which holds outstanding lazy areas. See below the "preemptirqsoff" tracer that illustrates a high latency. It is ~24676us. Our workloads like audio and video are effected by such long latency: <snip> tracer: preemptirqsoff preemptirqsoff latency trace v1.1.5 on 4.9.186-perf+ -------------------------------------------------------------------- latency: 24676 us, #4/4, CPU#1 \| (M:preempt VP:0, KP:0, SP:0 HP:0 P:8) ----------------- \| task: crtc_commit:112-261 (uid:0 nice:0 policy:1 rt_prio:16) ----------------- => started at: __purge_vmap_area_lazy => ended at: __purge_vmap_area_lazy _------=> CPU# / _-----=> irqs-off \| / _----=> need-resched \|\| / _---=> hardirq/softirq \|\|\| / _--=> preempt-depth \|\|\|\| / delay cmd pid \|\|\|\|\| time \| caller \ / \|\|\|\|\| \ \| / crtc_com-261 1...1 1us*: _raw_spin_lock <-__purge_vmap_area_lazy [...] crtc_com-261 1...1 24675us : _raw_spin_unlock <-__purge_vmap_area_lazy crtc_com-261 1...1 24677us : trace_preempt_on <-__purge_vmap_area_lazy crtc_com-261 1...1 24683us : <stack trace> => free_vmap_area_noflush => remove_vm_area => __vunmap => vfree => drm_property_free_blob => drm_mode_object_unreference => drm_property_unreference_blob => __drm_atomic_helper_crtc_destroy_state => sde_crtc_destroy_state => drm_atomic_state_default_clear => drm_atomic_state_clear => drm_atomic_state_free => complete_commit => _msm_drm_commit_work_cb => kthread_worker_fn => kthread => ret_from_fork <snip> To address those two issues we can redesign a purging of the outstanding lazy areas. Instead of queuing vmap areas to the list, we replace it by the separate rb-tree. In hat case an area is located in the tree/list in ascending order. It will give us below advantages: a) Outstanding vmap areas are merged creating bigger coalesced blocks, thus it becomes less fragmented. b) It is possible to calculate a flush range [min:max] without scanning all elements. It is O(1) access time or complexity; c) The final merge of areas with the rb-tree that represents a free space is faster because of (a). As a result the lock contention is also reduced. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Uladzislau Rezki (Sony) <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Oleksiy Avramchenko <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Minchan Kim <[email protected]> Cc: huang ying <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-10-18	mm: remove alloc_vm_area	Christoph Hellwig	1	-4/+1
	All users are gone now. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Matthew Auld <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-18	mm: add a vmap_pfn function	Christoph Hellwig	1	-0/+1
	Add a proper helper to remap PFNs into kernel virtual space so that drivers don't have to abuse alloc_vm_area and open coded PTE manipulation for it. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Matthew Auld <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-18	mm: add a VM_MAP_PUT_PAGES flag for vmap	Christoph Hellwig	1	-0/+1
	Add a flag so that vmap takes ownership of the passed in page array. When vfree is called on such an allocation it will put one reference on each page, and free the page array itself. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Matthew Auld <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-26	mm: remove vmalloc_exec	Christoph Hellwig	1	-1/+0
	Merge vmalloc_exec into its only caller. Note that for !CONFIG_MMU __vmalloc_node_range maps to __vmalloc, which directly clears the __GFP_HIGHMEM added by the vmalloc_exec stub anyway. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dexuan Cui <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Wei Liu <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove vmalloc_sync_(un)mappings()	Joerg Roedel	1	-2/+0
	These functions are not needed anymore because the vmalloc and ioremap mappings are now synchronized when they are created or torn down. Remove all callers and function definitions. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Steven Rostedt (VMware) <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm/vmalloc: track which page-table levels were modified	Joerg Roedel	1	-0/+16
	Track at which levels in the page-table entries were modified by vmap/vunmap. After the page-table has been modified, use that information do decide whether the new arch_sync_kernel_mappings() needs to be called. [[email protected]: map_kernel_range_noflush() needs the arch_sync_kernel_mappings() call] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove vmalloc_user_node_flags	Christoph Hellwig	1	-1/+0
	Open code it in __bpf_map_area_alloc, which is the only caller. Also clean up __bpf_map_area_alloc to have a single vmalloc call with slightly different flags instead of the current two different calls. For this to compile for the nommu case add a __vmalloc_node_range stub to nommu.c. [[email protected]: fix nommu.c build] Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Cc: Stephen Rothwell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove __vmalloc_node_flags_caller	Christoph Hellwig	1	-2/+2
	Just use __vmalloc_node instead which gets and extra argument. To be able to to use __vmalloc_node in all caller make it available outside of vmalloc and implement it in nommu.c. [[email protected]: fix nommu build] Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Cc: Stephen Rothwell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove both instances of __vmalloc_node_flags	Christoph Hellwig	1	-9/+0
	The real version just had a few callers that can open code it and remove one layer of indirection. The nommu stub was public but only had a single caller, so remove it and avoid a CONFIG_MMU ifdef in vmalloc.h. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove the pgprot argument to __vmalloc	Christoph Hellwig	1	-1/+1
	The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Michael Kelley <[email protected]> [hyperv] Acked-by: Gao Xiang <[email protected]> [erofs] Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Wei Liu <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove the prot argument from vm_map_ram	Christoph Hellwig	1	-2/+1
	This is always PAGE_KERNEL - for long term mappings with other properties vmap should be used. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove map_vm_range	Christoph Hellwig	1	-6/+4
	Switch all callers to map_kernel_range, which symmetric to the unmap side (as well as the _noflush versions). Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02	mm: remove __get_vm_area	Christoph Hellwig	1	-2/+0
	Switch the two remaining callers to use __get_vm_area_caller instead. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-04-21	vmalloc: fix remap_vmalloc_range() bounds checks	Jann Horn	1	-1/+1
	remap_vmalloc_range() has had various issues with the bounds checks it promises to perform ("This function checks that addr is a valid vmalloc'ed area, and that it is big enough to cover the vma") over time, e.g.: - not detecting pgoff<<PAGE_SHIFT overflow - not detecting (pgoff<<PAGE_SHIFT)+usize overflow - not checking whether addr and addr+(pgoff<<PAGE_SHIFT) are the same vmalloc allocation - comparing a potentially wildly out-of-bounds pointer with the end of the vmalloc region In particular, since commit fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY"), unprivileged users can cause kernel null pointer dereferences by calling mmap() on a BPF map with a size that is bigger than the distance from the start of the BPF map to the end of the address space. This could theoretically be used as a kernel ASLR bypass, by using whether mmap() with a given offset oopses or returns an error code to perform a binary search over the possible address range. To allow remap_vmalloc_range_partial() to verify that addr and addr+(pgoff<<PAGE_SHIFT) are in the same vmalloc region, pass the offset to remap_vmalloc_range_partial() instead of adding it to the pointer in remap_vmalloc_range(). In remap_vmalloc_range_partial(), fix the check against get_vm_area_size() by using size comparisons instead of pointer comparisons, and add checks for pgoff. Fixes: 833423143c3a ("[PATCH] mm: introduce remap_vmalloc_range()") Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: [email protected] Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Song Liu <[email protected]> Cc: Yonghong Song <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-03-21	x86/mm: split vmalloc_sync_all()	Joerg Roedel	1	-2/+3
	Commit 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in the vunmap() code-path. While this change was necessary to maintain correctness on x86-32-pae kernels, it also adds additional cycles for architectures that don't need it. Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported severe performance regressions in micro-benchmarks because it now also calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But the vmalloc_sync_all() implementation on x86-64 is only needed for newly created mappings. To avoid the unnecessary work on x86-64 and to gain the performance back, split up vmalloc_sync_all() into two functions: * vmalloc_sync_mappings(), and * vmalloc_sync_unmappings() Most call-sites to vmalloc_sync_all() only care about new mappings being synchronized. The only exception is the new call-site added in the above mentioned commit. Shile Zhang directed us to a report of an 80% regression in reaim throughput. Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") Reported-by: kernel test robot <[email protected]> Reported-by: Shile Zhang <[email protected]> Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Borislav Petkov <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> [GHES] Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: https://lists.01.org/hyperkitty/list/[email protected]/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/ Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>