path: root/arch/s390/mm/dump_pagetables.c
2024-09-21  Merge tag 's390-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux  (Linus Torvalds, 1 file, -114/+77)

Pull s390 updates from Vasily Gorbik:

 - Optimize ftrace and kprobes code patching and avoid stop machine for kprobes if the sequential instruction fetching facility is available

 - Add hiperdispatch feature to dynamically adjust CPU capacity in vertical polarization to improve scheduling efficiency and overall performance. Also add infrastructure for handling warning track interrupts (WTI), allowing for graceful CPU preemption

 - Rework the pkey crypto module and split it into separate, independent modules for sysfs, PCKMO, CCA, and EP11, allowing modules to load only when the relevant hardware is available

 - Add hardware acceleration for HMAC modes and the full AES-XTS cipher, utilizing message-security assist extensions (MSA) 10 and 11. This introduces new shash implementations for HMAC-SHA224/256/384/512 and registers the hardware-accelerated AES-XTS cipher as the preferred option. Also add clear key token support

 - Add MSA 10 and 11 processor activity instrumentation counters to perf and update PAI Extension 1 NNPA counters

 - Clean up the cpu sampling facility code and rework debug/WARN_ON_ONCE statements

 - Add support for SHA3 performance enhancements introduced with MSA 12

 - Add support for the query authentication information feature of MSA 13 and introduce the KDSA CPACF instruction. Provide query and query authentication information in sysfs, enabling tools like cpacfinfo to present this data in a human-readable form

 - Update kernel disassembler instructions

 - Always enable EXPOLINE_EXTERN if supported by the compiler to ensure kpatch compatibility

 - Add missing warning handling and relocated lowcore support to the early program check handler

 - Optimize ftrace_return_address() and avoid calling the unwinder

 - Make modules use kernel ftrace trampolines

 - Strip relocs from the final vmlinux ELF file to make it roughly 2 times smaller

 - Dump register contents and call trace for early crashes to the console

 - Generate the ptdump address marker array dynamically

 - Fix rcu_sched stalls that might occur when adding or removing large amounts of pages at once to or from the CMM balloon

 - Fix a deadlock caused by recursive locking of the AP bus scan mutex

 - Unify sync and async register save areas in entry code

 - Clean up debug prints in crypto code

 - Various cleanup and sanitizing patches for the decompressor

 - Various small ftrace cleanups

* tag 's390-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (84 commits)
  s390/crypto: Display Query and Query Authentication Information in sysfs
  s390/crypto: Add Support for Query Authentication Information
  s390/crypto: Rework RRE and RRF CPACF inline functions
  s390/crypto: Add KDSA CPACF Instruction
  s390/disassembler: Remove duplicate instruction format RSY_RDRU
  s390/boot: Move boot_printk() code to own file
  s390/boot: Use boot_printk() instead of sclp_early_printk()
  s390/boot: Rename decompressor_printk() to boot_printk()
  s390/boot: Compile all files with the same march flag
  s390: Use MARCH_HAS_*_FEATURES defines
  s390: Provide MARCH_HAS_*_FEATURES defines
  s390/facility: Disable compile time optimization for decompressor code
  s390/boot: Increase minimum architecture to z10
  s390/als: Remove obsolete comment
  s390/sha3: Fix SHA3 selftests failures
  s390/pkey: Add AES xts and HMAC clear key token support
  s390/cpacf: Add MSA 10 and 11 new PCKMO functions
  s390/mm: Add cond_resched() to cmm_alloc/free_pages()
  s390/pai_ext: Update PAI extension 1 counters
  s390/pai_crypto: Add support for MSA 10 and 11 pai counters
  ...
2024-08-07  s390/mm/ptdump: Generate address marker array dynamically  (Heiko Carstens, 1 file, -114/+77)

Generate the address marker array dynamically instead of modifying a large static array at kernel startup. Each marker is added twice to the array: with and without a "start" indicator. This way the code and logic stay similar to other architectures. Acked-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
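A minimal sketch of that scheme — the struct layout and helper name below are assumptions for illustration, not the actual kernel code:

    struct addr_marker {
        int is_start;                   /* 1 for the "<name> Start" entry */
        unsigned long start_address;
        const char *name;
    };

    static struct addr_marker markers[80];
    static int nr_markers;

    /* Each area contributes two markers: one for its start, one for its end. */
    static void add_marker(unsigned long start, unsigned long end, const char *name)
    {
        markers[nr_markers++] = (struct addr_marker){ 1, start, name };
        markers[nr_markers++] = (struct addr_marker){ 0, end, name };
    }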
2024-07-31  s390/mm/ptdump: Improve sorting of markers  (Heiko Carstens, 1 file, -47/+53)

Use sort() from lib/sort.c to sort markers instead of the private implementation. The current implementation does not sort markers properly if they have to be moved downwards:

    ---[ Real Memory Copy Area Start ]---
    0x0000035b903ff000-0x0000035b90400000        4K PTE I
    ---[ vmalloc Area Start ]---
    ---[ Real Memory Copy Area End ]---

Add a new member to each marker which indicates whether the marker is the start of an area. If the addresses of two areas are equal, consider the address which defines the start of an area higher than the address which defines the end of an area. As a result the output is sorted as intended:

    ---[ Real Memory Copy Area Start ]---
    0x0000019cedcff000-0x0000019cedd00000        4K PTE I
    ---[ Real Memory Copy Area End ]---
    ---[ vmalloc Area Start ]---

Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
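A sketch of what such a comparison callback for lib/sort.c could look like, assuming the marker struct sketched in the previous entry — illustrative, not the actual patch:

    static int markers_cmp(const void *a, const void *b)
    {
        const struct addr_marker *ma = a, *mb = b;

        if (ma->start_address != mb->start_address)
            return ma->start_address > mb->start_address ? 1 : -1;
        /* Equal addresses: an area start sorts after an area end. */
        return ma->is_start - mb->is_start;
    }

    /* sort() from lib/sort.c; the NULL swap callback selects the generic swap. */
    sort(markers, nr_markers, sizeof(*markers), markers_cmp, NULL);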
2024-07-31  s390/mm/ptdump: Add support for relocated lowcore mapping  (Heiko Carstens, 1 file, -13/+22)

The page table dumper contains a hard coded assumption that the first mapped area starts at address zero. With a relocated lowcore this is not true anymore. Subsequently the first entry (lowcore) is printed as if it contained everything from address zero until the end of the location of the lowcore area. Fix this by adding a single "Kernel Virtual Address Space" entry, which always starts at address zero. It ends where the lowcore area starts, which is either address zero or its relocated address. Reviewed-by: Sven Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2024-07-31  s390/mm/ptdump: Fix handling of identity mapping area  (Heiko Carstens, 1 file, -9/+12)

Since virtual and real addresses are no longer the same, the assumption that the kernel image is contained within the identity mapping is also no longer true. Fix this by adding two explicit areas at the correct locations: one for the 8KB lowcore area, and one for the identity mapping. Fixes: c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address spaces") Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2024-07-23  s390/ptdump: Add KMSAN page markers  (Ilya Leoshkevich, 1 file, -0/+30)

Add KMSAN vmalloc metadata areas to /sys/kernel/debug/kernel_page_tables. Example output:

    0x000003a95fff9000-0x000003a960000000       28K PTE I
    ---[ vmalloc Area End ]---
    ---[ Kmsan vmalloc Shadow Start ]---
    0x000003a960000000-0x000003a960010000       64K PTE RW NX
    [...]
    0x000003d3dfff9000-0x000003d3e0000000       28K PTE I
    ---[ Kmsan vmalloc Shadow End ]---
    ---[ Kmsan vmalloc Origins Start ]---
    0x000003d3e0000000-0x000003d3e0010000       64K PTE RW NX
    [...]
    0x000003fe5fff9000-0x000003fe60000000       28K PTE I
    ---[ Kmsan vmalloc Origins End ]---
    ---[ Kmsan Modules Shadow Start ]---
    0x000003fe60000000-0x000003fe60001000        4K PTE RW NX
    [...]
    0x000003fe60100000-0x000003fee0000000     2047M PMD I
    ---[ Kmsan Modules Shadow End ]---
    ---[ Kmsan Modules Origins Start ]---
    0x000003fee0000000-0x000003fee0001000        4K PTE RW NX
    [...]
    0x000003fee0100000-0x000003ff60000000     2047M PMD I
    ---[ Kmsan Modules Origins End ]---
    ---[ Modules Area Start ]---
    0x000003ff60000000-0x000003ff60001000        4K PTE RO X

Signed-off-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vasily Gorbik <[email protected]>
2024-06-18  s390: Replace S390_lowcore by get_lowcore()  (Sven Schnelle, 1 file, -1/+1)
Replace all S390_lowcore usages in arch/s390/ by get_lowcore(). Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Sven Schnelle <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2024-02-22  mm: ptdump: have ptdump_check_wx() return bool  (Christophe Leroy, 1 file, -4/+9)
Have ptdump_check_wx() return true when the check is successful or false otherwise. [[email protected]: fix a couple of build issues (x86_64 allmodconfig)] Link: https://lkml.kernel.org/r/7943149fe955458cb7b57cd483bf41a3aad94684.1706610398.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: "Aneesh Kumar K.V (IBM)" <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Greg KH <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kees Cook <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Phong Tran <[email protected]> Cc: Russell King <[email protected]> Cc: Steven Price <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
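A minimal sketch of the changed contract — the counting helper here is hypothetical and the body is reduced to the reporting logic:

    bool ptdump_check_wx(void)
    {
        /* Walk the kernel page tables and count W+X pages (hypothetical helper). */
        unsigned long wx_pages = count_wx_pages();

        if (wx_pages) {
            pr_warn("checked W+X mappings: failed, %lu W+X pages found\n", wx_pages);
            return false;
        }
        pr_info("checked W+X mappings: passed, no W+X pages found\n");
        return true;
    }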
2024-02-22  powerpc,s390: ptdump: define ptdump_check_wx() regardless of CONFIG_DEBUG_WX  (Christophe Leroy, 1 file, -5/+2)
Following patch will use ptdump_check_wx() regardless of CONFIG_DEBUG_WX, so define it at all times on powerpc and s390 just like other architectures. Though keep the WARN_ON_ONCE() only when CONFIG_DEBUG_WX is set. Link: https://lkml.kernel.org/r/07bfb04c7fec58e84413e91d2533581be357a696.1706610398.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: "Aneesh Kumar K.V (IBM)" <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Greg KH <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kees Cook <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Phong Tran <[email protected]> Cc: Russell King <[email protected]> Cc: Steven Price <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22  arm64, powerpc, riscv, s390, x86: ptdump: refactor CONFIG_DEBUG_WX  (Christophe Leroy, 1 file, -1/+0)
All architectures using the core ptdump functionality also implement CONFIG_DEBUG_WX, and they all do it more or less the same way, with a function called debug_checkwx() that is called by mark_rodata_ro(), which is a substitute to ptdump_check_wx() when CONFIG_DEBUG_WX is set and a no-op otherwise. Refactor by centrally defining debug_checkwx() in linux/ptdump.h and call debug_checkwx() immediately after calling mark_rodata_ro() instead of calling it at the end of every mark_rodata_ro(). On x86_32, mark_rodata_ro() first checks __supported_pte_mask has _PAGE_NX before calling debug_checkwx(). Now the check is inside the callee ptdump_walk_pgd_level_checkwx(). On powerpc_64, mark_rodata_ro() bails out early before calling ptdump_check_wx() when the MMU doesn't have KERNEL_RO feature. The check is now also done in ptdump_check_wx() as it is called outside mark_rodata_ro(). Link: https://lkml.kernel.org/r/a59b102d7964261d31ead0316a9f18628e4e7a8e.1706610398.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Alexandre Ghiti <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: "Aneesh Kumar K.V (IBM)" <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Greg KH <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kees Cook <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Phong Tran <[email protected]> Cc: Russell King <[email protected]> Cc: Steven Price <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
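Roughly the shape of the centralized definition described above — a simplified sketch of what linux/ptdump.h ends up providing, not a verbatim copy:

    #ifdef CONFIG_DEBUG_WX
    static inline void debug_checkwx(void)
    {
        ptdump_check_wx();
    }
    #else
    static inline void debug_checkwx(void) { }
    #endif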
2023-09-19  s390/ctlreg: add struct ctlreg  (Heiko Carstens, 1 file, -1/+1)
Add struct ctlreg to enforce strict type checking / usage for control register functions. Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
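The idea, sketched: wrapping the raw value in a one-member struct turns control register values into a distinct type, so passing a plain unsigned long to a control register helper becomes a compile error. The accessor below is illustrative, not the kernel's API:

    struct ctlreg {
        unsigned long val;
    };

    /* Illustrative accessor: only accepts a struct ctlreg, never a raw long. */
    static inline void ctlreg_set_bit(struct ctlreg *reg, unsigned int bit)
    {
        reg->val |= 1UL << bit;
    }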
2023-08-30  s390/amode31: change type of __samode31, __eamode31, etc  (Heiko Carstens, 1 file, -2/+2)

For consistency reasons change the type of __samode31, __eamode31, __stext_amode31, and __etext_amode31 to a char pointer so they (nearly) match the type of all other sections. This allows for code simplifications with follow-on patches. Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2023-08-16  s390/mm: define Real Memory Copy size and mask macros  (Alexander Gordeev, 1 file, -1/+1)

Make the Real Memory Copy area size and mask explicit. This does not bring any functional change and is only needed for clarity. Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2023-01-31  s390/mm,ptdump: avoid Kasan vs Memcpy Real markers swapping  (Vasily Gorbik, 1 file, -8/+8)

    ---[ Real Memory Copy Area Start ]---
    0x001bfffffffff000-0x001c000000000000        4K PTE I
    ---[ Kasan Shadow Start ]---
    ---[ Real Memory Copy Area End ]---
    0x001c000000000000-0x001c000200000000        8G PMD RW NX
    ...
    ---[ Kasan Shadow End ]---

ptdump does a stable sort of markers. Move the kasan markers after the memcpy real markers to avoid swapping. Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-09-14  s390/mm,ptdump: add real memory copy page markers  (Alexander Gordeev, 1 file, -0/+7)
Add "Real Memory Copy Area Start" and "Real Memory Copy Area End" markers that fence the page used for real memory copying. Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2022-09-14  s390/smp,ptdump: add absolute lowcore markers  (Alexander Gordeev, 1 file, -0/+7)
Add "Lowcore Area Start" and "Lowcore Area End" markers that fence pages where absolute lowcore resides. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2022-09-07  s390/ptdump: add missing amode31 markers  (Heiko Carstens, 1 file, -0/+6)

Add amode31 markers, which make a read-only mapping in the middle of nowhere in the kernel_page_tables output less magic. Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2021-10-26  s390: add support for BEAR enhancement facility  (Sven Schnelle, 1 file, -3/+11)

The Breaking-Event-Address-Register (BEAR) stores the address of the last breaking-event instruction. Breaking events are usually instructions that change the program flow - for example branches, and instructions that modify the address in the PSW, like lpswe. This is useful for debugging wild branches, because one can easily figure out where the wild branch originated. What is problematic is that lpswe is considered a breaking event, and therefore overwrites BEAR on kernel exit. The BEAR enhancement facility adds new instructions that allow saving/restoring BEAR and also an lpswey instruction that doesn't cause a breaking event. So we can save BEAR on kernel entry and restore it on exit to user space. Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2021-07-30  s390: add kfence region to pagetable dumper  (Sven Schnelle, 1 file, -0/+16)
Signed-off-by: Sven Schnelle <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
2020-11-20  s390: unify identity mapping limits handling  (Vasily Gorbik, 1 file, -1/+1)

Currently we have to consider too many different values which in the end only affect identity mapping size. These are:

 1. max_physmem_end - end of physical memory online or standby. Always <= end of the last online memory block (get_mem_detect_end()).
 2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the kernel is able to support.
 3. "mem=" kernel command line option which limits physical memory usage.
 4. OLDMEM_BASE which is a kdump memory limit when the kernel is executed as crash kernel.
 5. "hsa" size which is a memory limit when the kernel is executed during zfcp/nvme dump.

Throughout kernel startup and run we juggle all those values at once, but that does not bring any amusement, only confusion and complexity. Unify all those values to the single one we should really care about: our identity mapping size. Signed-off-by: Vasily Gorbik <[email protected]> Reviewed-by: Alexander Gordeev <[email protected]> Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2020-09-16  s390/mm,ptdump: sort markers  (Vasily Gorbik, 1 file, -0/+19)

Kasan configuration options and the size of physical memory present could affect the kernel memory layout. In particular vmemmap, vmalloc and modules might come before the kasan shadow or after it. To make ptdump output markers in the right order, the markers have to be sorted. To preserve the original order of markers with the same start address, avoid using sort() from lib/sort.c (which is not a stable sorting algorithm) and sort the markers in place. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
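A stable in-place sort of the marker array could look like the insertion sort below — a sketch assuming an addr_marker struct with a start_address member, not the actual patch:

    static void sort_address_markers(struct addr_marker *m, int n)
    {
        struct addr_marker tmp;
        int i, j;

        /* Insertion sort is stable: equal start addresses keep their order. */
        for (i = 1; i < n; i++) {
            tmp = m[i];
            for (j = i - 1; j >= 0 && m[j].start_address > tmp.start_address; j--)
                m[j + 1] = m[j];
            m[j + 1] = tmp;
        }
    }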
2020-09-16  s390/mm,ptdump: add proper ifdefs  (Heiko Carstens, 1 file, -2/+3)

Use ifdefs instead of IS_ENABLED() to avoid a compile error for !PTDUMP_DEBUGFS:

    arch/s390/mm/dump_pagetables.c: In function ‘pt_dump_init’:
    arch/s390/mm/dump_pagetables.c:248:64: error: ‘ptdump_fops’ undeclared (first use in this function); did you mean ‘pidfd_fops’?
      debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, &ptdump_fops);

Reported-by: Julian Wiedmann <[email protected]> Fixes: 08c8e685c7c9 ("s390: add ARCH_HAS_DEBUG_WX support") Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
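The shape of the fix, sketched — guard the debugfs registration with a preprocessor conditional so ptdump_fops is only referenced when the config option that defines it is set:

    #ifdef CONFIG_PTDUMP_DEBUGFS
        /* ptdump_fops only exists when PTDUMP_DEBUGFS is enabled. */
        debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, &ptdump_fops);
    #endif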
2020-09-14  s390/mm,ptdump: add couple of additional markers  (Vasily Gorbik, 1 file, -5/+21)
Signed-off-by: Vasily Gorbik <[email protected]> [[email protected]: add more markers, rename some markers] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-09-14  s390: add ARCH_HAS_DEBUG_WX support  (Heiko Carstens, 1 file, -1/+63)

Checks the whole kernel address space for W+X mappings. Note that currently the first lowcore page unfortunately has to be mapped W+X. Therefore this is not reported as an insecure mapping. For the very same reason the wording is also different to other architectures if the test passes: on s390 it is "no unexpected W+X pages found" instead of "no W+X pages found". Tested-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-09-14  s390/mm,ptdump: make page table dumping seq_file optional  (Heiko Carstens, 1 file, -10/+26)
s390 version of ae5d1cf358a5 ("arm64: dump: Make the page table dumping seq_file optional"). Tested-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-09-14  s390/mm,ptdump: hold cpa mutex while walking for kernel page table dump  (Heiko Carstens, 1 file, -0/+3)

This currently only prevents outdated information from being provided to user space. A concurrent split of huge/large pages does modify the kernel page tables, however either the huge/large mapping is reported or the split area is being walked. This also only "fixes" a potential future bug, since split pages could be merged again if page permissions are the same for larger memory areas. Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-09-14  s390/mm,ptdump: hold memory hotplug lock while walking for kernel page table dump  (Heiko Carstens, 1 file, -0/+2)

This is the s390 variant of commit bf2b59f60ee1 ("arm64/mm: Hold memory hotplug lock while walking for kernel page table dump"). Right now this doesn't fix any real bug, however as soon as kvm patches get merged which make use of memory remove we might end up dereferencing/accessing freed page tables. Therefore fix this potential bug already now. Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
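Taken together with the previous entry, the dump walk ends up bracketed by two locks. A sketch of that shape — the lock names come from the two commit subjects, while the function body and the walk entry point are illustrative:

    static int ptdump_show(struct seq_file *m, void *v)
    {
        get_online_mems();        /* block concurrent memory hot-remove */
        mutex_lock(&cpa_mutex);   /* block concurrent huge/large page splits */
        ptdump_walk(m);           /* hypothetical walk entry point */
        mutex_unlock(&cpa_mutex);
        put_online_mems();
        return 0;
    }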
2020-09-14  s390/mm,ptdump: convert to generic page table dumper  (Heiko Carstens, 1 file, -186/+47)
Make use of generic ptdump infrastructure. Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-06-09  mm: don't include asm/pgtable.h if linux/mm.h is already included  (Mike Rapoport, 1 file, -1/+0)

Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definitions of pgd_offset() for 25 supported architectures. Most of these definitions are actually identical and typically it boils down to, e.g.

    static inline unsigned long pmd_index(unsigned long address)
    {
        return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
    }

    static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
    {
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
    }

These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined. For architectures that really need a custom version there is always the possibility to override the generic version with the usual ifdefs magic. These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header.

This patch (of 12): The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>. The include statements in such cases are removed with a simple loop:

    for f in $(git grep -l "include <linux/mm.h>") ; do
        sed -i -e '/include <asm\/pgtable.h>/ d' $f
    done

Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Cain <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chris Zankel <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nick Hu <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vincent Chen <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2019-08-06  s390/mm: fix dump_pagetables top level page table walking  (Vasily Gorbik, 1 file, -6/+6)

Since commit d1874a0c2805 ("s390/mm: make the pxd_offset functions more robust") the behaviour of p4d_offset, pud_offset and pmd_offset has been changed so that they cannot be used to iterate through the top level page table, because the index for the top level page table is now calculated in pgd_offset. To avoid dumping the very first region/segment top level table entry 2048 times, simply iterate the entry pointer like it is already done in other page walking cases. Fixes: d1874a0c2805 ("s390/mm: make the pxd_offset functions more robust") Reported-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
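A sketch of the iteration pattern the fix switches to — compute the top-level entry pointer once and advance it, instead of re-deriving the offset from the address on every step (the loop body is illustrative):

    pgd_t *pgd = pgd_offset_k(0UL);   /* index computed once, for address 0 */
    unsigned long addr = 0;
    int i;

    for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
        /* ... dump this top-level entry, descend if present ... */
        addr += PGDIR_SIZE;
    }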
2018-12-28  kasan: rename kasan_zero_page to kasan_early_shadow_page  (Andrey Konovalov, 1 file, -8/+9)
With tag based KASAN mode the early shadow value is 0xff and not 0x00, so this patch renames kasan_zero_(page|pte|pmd|pud|p4d) to kasan_early_shadow_(page|pte|pmd|pud|p4d) to avoid confusion. Link: http://lkml.kernel.org/r/3fed313280ebf4f88645f5b89ccbc066d320e177.1544099024.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Suggested-by: Mark Rutland <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-10-09  s390/mm: improve debugfs ptdump markers walking  (Vasily Gorbik, 1 file, -1/+1)

This allows printing multiple markers when they happen to have the same value.

    ...
    0x001bfffff0100000-0x001c000000000000     255M PMD I
    ---[ Kasan Shadow End ]---
    ---[ vmemmap Area ]---
    0x001c000000000000-0x001c000002000000      32M PMD RW X
    ...

Reviewed-by: Martin Schwidefsky <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2018-10-09  s390/mm: optimize debugfs ptdump kasan zero page walking  (Vasily Gorbik, 1 file, -1/+34)

Kasan zero p4d/pud/pmd/pte are always filled in with the corresponding kasan zero entries. Walking a kasan zero page backed area is time consuming and unnecessary. When a kasan zero p4d/pud/pmd is encountered, it eventually points to the kasan zero page, always with the same attributes and nothing but it, therefore zero p4d/pud/pmd entries can be jumped over. Also add a space between the address range and the pages number to separate them from each other when the pages number is huge.

    0x0018000000000000-0x0018000010000000      256M PMD RW X
    0x0018000010000000-0x001bfffff0000000 1073741312M PTE RO X
    0x001bfffff0000000-0x001bfffff0001000        4K PTE RW X

Reviewed-by: Martin Schwidefsky <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
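A sketch of the kind of check this enables — recognize a pmd whose page table origin is the shared kasan zero pte table and skip the whole range. The name follows the pre-rename kasan_zero_* convention used at the time; the check itself is illustrative, not the actual s390 code:

    static int is_kasan_zero_pmd(pmd_t pmd)
    {
        /* A kasan zero pmd always points at the shared kasan_zero_pte table. */
        return IS_ENABLED(CONFIG_KASAN) &&
               (pmd_val(pmd) & PAGE_MASK) == __pa(kasan_zero_pte);
    }

When the walker sees such an entry it can account the whole pmd-sized range in one step instead of descending into the pte level.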
2018-10-09  s390/mm: add kasan shadow to the debugfs pgtable dump  (Vasily Gorbik, 1 file, -6/+15)
This change adds address space markers for kasan shadow memory. Reviewed-by: Martin Schwidefsky <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2018-02-27  s390: unify linker symbols usage  (Vasily Gorbik, 1 file, -2/+2)

Common code defines linker symbols which denote section start/end in the form of char []. Referencing those symbols as _symbol or &_symbol yields the same result, but the "_symbol" form is more widespread across newly written code. Convert s390 specific code to this style. Also remove the unused _text symbol definition in boot/compressed/misc.c. Signed-off-by: Vasily Gorbik <[email protected]> Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2017-11-02  License cleanup: add SPDX GPL-2.0 license identifier to files with no license  (Greg Kroah-Hartman, 1 file, -0/+1)
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boilerplate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne.

How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases:
 - file had no licensing information in it,
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5 lines of source.
 - File already had some variant of a license header in it (even if <5 lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license identifiers to apply.

 - When both scanners couldn't find any license traces, the file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was:

       SPDX license identifier                              # files
       ----------------------------------------------------|-------
       GPL-2.0                                                11139

   and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

       SPDX license identifier                              # files
       ----------------------------------------------------|-------
       GPL-2.0 WITH Linux-syscall-note                          930

   and resulted in the second patch in this series.

 - If a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point).
   Results summary:

       SPDX license identifier                              # files
       ----------------------------------------------------|------
       GPL-2.0 WITH Linux-syscall-note                          270
       GPL-2.0+ WITH Linux-syscall-note                         169
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
       LGPL-2.1+ WITH Linux-syscall-note                         15
       GPL-1.0+ WITH Linux-syscall-note                          14
       ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
       LGPL-2.0+ WITH Linux-syscall-note                          4
       LGPL-2.1 WITH Linux-syscall-note                           3
       ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
       ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1

   and that resulted in the third patch in this series.

 - When the two scanners agreed on the detected license(s), that became the concluded license(s).
 - When there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred.
 - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics).
 - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation.
 - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time.

In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there were new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related.

Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with the SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files.

In the initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with:
 - a full scancode scan run, collecting the matched texts, detected license ids and scores,
 - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct,
 - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct.

This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches.
Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Philippe Ombredanne <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-06-12  s390/mm: implement 5 level page tables  (Martin Schwidefsky, 1 file, -3/+20)
Add the logic to upgrade the page table for a 64-bit process to five levels. This increases the TASK_SIZE from 8PB to 16EB-4K. Signed-off-by: Martin Schwidefsky <[email protected]>
2017-02-17  s390/mm: add cond_resched call to kernel page table dumper  (Heiko Carstens, 1 file, -0/+2)
Walking kernel page tables within the kernel page table dumper may potentially take a lot of time. This may lead to soft lockup warning messages. To avoid this add a cond_resched call for each pgd_level iteration. This is the same as "x86/mm/ptdump: Fix soft lockup in page table walker" for x86. Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
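The shape of the fix, sketched — yield once per top-level iteration of the dumper walk, with the loop body reduced to a placeholder:

    static void walk_pgd_level(struct seq_file *m)
    {
        unsigned long addr = 0;
        int i;

        for (i = 0; i < PTRS_PER_PGD; i++) {
            /* ... dump the entry at addr, descending as needed ... */
            addr += PGDIR_SIZE;
            cond_resched();   /* yield per top-level entry to avoid soft lockups */
        }
    }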
2017-02-17  s390: mm: Audit and remove any unnecessary uses of module.h  (Paul Gortmaker, 1 file, -1/+0)

Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y in a Makefile or bool in Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. An instance where module_param was used without moduleparam.h was also fixed, as well as an implicit use of the asm/elf.h header. Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2017-02-08  s390: add no-execute support  (Martin Schwidefsky, 1 file, -5/+10)

Bit 0x100 of a page table, segment table or region table entry can be used to disallow code execution for the virtual addresses associated with the entry. There is one tricky bit: the system call to return from a signal is part of the signal frame written to the user stack. With a non-executable stack this would stop working. To avoid breaking things the protection fault handler checks the opcode that caused the fault for 0x0a77 (sys_sigreturn) and 0x0aad (sys_rt_sigreturn) and injects a system call. This is preferable to the alternative solution with a stub function in the vdso because it works for vdso=off and statically linked binaries as well. Signed-off-by: Martin Schwidefsky <[email protected]>
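A sketch of the opcode check described above — the two constants come straight from the commit text, while the surrounding function and the exact fault-address handling are illustrative:

    static int is_sigreturn_opcode(struct pt_regs *regs)
    {
        u16 opcode;

        /* Fetch the instruction that faulted on the non-executable stack. */
        if (get_user(opcode, (u16 __user *)regs->psw.addr))
            return 0;
        /* 0x0a77: svc for sys_sigreturn, 0x0aad: svc for sys_rt_sigreturn */
        return opcode == 0x0a77 || opcode == 0x0aad;
    }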
2016-06-13  s390/pgtable: get rid of _REGION3_ENTRY_RO  (Heiko Carstens, 1 file, -1/+1)
_REGION3_ENTRY_RO is a duplicate of _REGION_ENTRY_PROTECT. Signed-off-by: Heiko Carstens <[email protected]> Acked-by: Martin Schwidefsky <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2015-03-25  s390: remove 31 bit support  (Heiko Carstens, 1 file, -22/+2)

Remove the 31 bit support in order to reduce maintenance cost and effectively remove dead code. For a couple of years now there has been no distribution left that comes with a 31 bit kernel. The 31 bit kernel also had been broken for more than a year before anybody noticed. In addition I added a removal warning to the kernel shown at ipl for 5 minutes: a960062e5826 ("s390: add 31 bit warning message"), which let everybody know about the plan to remove 31 bit code. We didn't get any response. Given that the last 31 bit only machine was introduced in 1999, let's remove the code. Anybody with 31 bit user space code can still use the compat mode. Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2014-09-25  s390/mm: remove change bit override support  (Heiko Carstens, 1 file, -3/+2)
Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2013-08-22  s390/mm: cleanup page table definitions  (Martin Schwidefsky, 1 file, -9/+9)

Improve the encoding of the different pte types and the naming of the page, segment table and region table bits. Due to the different pte encoding the hugetlbfs primitives need to be adapted as well. To improve compatibility with common code make the huge ptes use the encoding of normal ptes. The conversion between the pte and pmd encoding for a huge pte is done with set_huge_pte_at and huge_ptep_get. Overall the code is now easier to understand. Reviewed-by: Gerald Schaefer <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2013-02-28  s390/page table dumper: add support for change-recording override bit  (Heiko Carstens, 1 file, -5/+20)
Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2012-11-23  s390/mm,vmem: use 2GB frames for identity mapping  (Heiko Carstens, 1 file, -1/+6)

Use 2GB frames for identity mapping if EDAT2 is available to reduce TLB pressure. Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2012-10-09  s390/vmalloc: have separate modules area  (Heiko Carstens, 1 file, -3/+10)

Add a special module area on top of the vmalloc area, which may only be used for modules and bpf jit generated code. This makes sure that inter-module branches will always happen without a trampoline; in addition, having all the code within a 2GB frame is branch prediction unit friendly. Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2012-10-09  s390/mm: add page table dumper  (Heiko Carstens, 1 file, -0/+219)

This is more or less the same as the x86 page table dumper which was merged four years ago: 926e5392 ("x86: add code to dump the (kernel) page tables for visual inspection by kernel developers"). We add a file at /sys/kernel/debug/kernel_page_tables for debugging purposes so it's quite easy to see the kernel page table layout and possible odd mappings:

    ---[ Identity Mapping ]---
    0x0000000000000000-0x0000000000100000         1M PTE RW
    ---[ Kernel Image Start ]---
    0x0000000000100000-0x0000000000800000         7M PMD RO
    0x0000000000800000-0x00000000008a9000       676K PTE RO
    0x00000000008a9000-0x0000000000900000       348K PTE RW
    0x0000000000900000-0x0000000001500000        12M PMD RW
    ---[ Kernel Image End ]---
    0x0000000001500000-0x0000000280000000     10219M PMD RW
    0x0000000280000000-0x000003d280000000      3904G PUD I
    ---[ vmemmap Area ]---
    0x000003d280000000-0x000003d288c00000       140M PTE RW
    0x000003d288c00000-0x000003d300000000      1908M PMD I
    0x000003d300000000-0x000003e000000000        52G PUD I
    ---[ vmalloc Area ]---
    0x000003e000000000-0x000003e000009000        36K PTE RW
    0x000003e000009000-0x000003e0000ee000       916K PTE I
    0x000003e0000ee000-0x000003e000146000       352K PTE RW
    0x000003e000146000-0x000003e000200000       744K PTE I
    0x000003e000200000-0x000003e080000000      2046M PMD I
    0x000003e080000000-0x0000040000000000       126G PUD I

This usually makes only sense for kernel developers. The output with CONFIG_DEBUG_PAGEALLOC is not very helpful, because of the huge number of mapped out pages, however I decided for the time being to not add a !DEBUG_PAGEALLOC dependency. Maybe it's helpful for somebody even with that option. Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>