aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2020-06-04arch/kunmap_atomic: consolidate duplicate codeIra Weiny2-9/+3
Every single architecture (including !CONFIG_HIGHMEM) calls... pagefault_enable(); preempt_enable(); ... before returning from __kunmap_atomic(). Lift this code into the kunmap_atomic() macro. While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to be consistent. [[email protected]: don't enable pagefault/preempt twice] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: coding style fixes] Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian König <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Helge Deller <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Max Filippov <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Guenter Roeck <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04arch/kmap_atomic: consolidate duplicate codeIra Weiny2-8/+2
Every arch has the same code to ensure atomic operations and a check for !HIGHMEM page. Remove the duplicate code by defining a core kmap_atomic() which only calls the arch specific kmap_atomic_high() when the page is high memory. [[email protected]: coding style fixes] Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian König <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Helge Deller <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Max Filippov <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04{x86,powerpc,microblaze}/kmap: move preempt disableIra Weiny2-8/+12
During this kmap() conversion series we must maintain bisect-ability. To do this, kmap_atomic_prot() in x86, powerpc, and microblaze need to remain functional. Create a temporary inline version of kmap_atomic_prot within these architectures so we can rework their kmap_atomic() calls and then lift kmap_atomic_prot() to the core. Suggested-by: Al Viro <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian König <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Helge Deller <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Max Filippov <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04arch/kunmap: remove duplicate kunmap implementationsIra Weiny1-9/+0
All architectures do exactly the same thing for kunmap(); remove all the duplicate definitions and lift the call to the core. This also has the benefit of changing kmap_unmap() on a number of architectures to be an inline call rather than an actual function. [[email protected]: fix CONFIG_HIGHMEM=n build on various architectures] Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian König <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Helge Deller <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Max Filippov <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04arch/kmap: remove redundant arch specific kmapsIra Weiny1-9/+0
The kmap code for all the architectures is almost 100% identical. Lift the common code to the core. Use ARCH_HAS_KMAP_FLUSH_TLB to indicate if an arch defines kmap_flush_tlb() and call if if needed. This also has the benefit of changing kmap() on a number of architectures to be an inline call rather than an actual function. Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian König <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Helge Deller <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Max Filippov <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04arch/kmap: remove BUG_ON()Ira Weiny1-1/+1
Patch series "Remove duplicated kmap code", v3. The kmap infrastructure has been copied almost verbatim to every architecture. This series consolidates obvious duplicated code by defining core functions which call into the architectures only when needed. Some of the k[un]map_atomic() implementations have some similarities but the similarities were not sufficient to warrant further changes. In addition we remove a duplicate implementation of kmap() in DRM. This patch (of 15): Replace the use of BUG_ON(in_interrupt()) in the kmap() and kunmap() in favor of might_sleep(). Besides the benefits of might_sleep(), this normalizes the implementations such that they can be made generic in subsequent patches. Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Cc: Christian König <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Helge Deller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Max Filippov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04mm/debug: add tests validating architecture page table helpersAnshuman Khandual1-0/+1
This adds tests which will validate architecture page table helpers and other accessors in their compliance with expected generic MM semantics. This will help various architectures in validating changes to existing page table helpers or addition of new ones. This test covers basic page table entry transformations including but not limited to old, young, dirty, clean, write, write protect etc at various level along with populating intermediate entries with next page table page and validating them. Test page table pages are allocated from system memory with required size and alignments. The mapped pfns at page table levels are derived from a real pfn representing a valid kernel text symbol. This test gets called via late_initcall(). This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any architecture, which is willing to subscribe this test will need to select ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390 and powerpc platforms where the test is known to build and run successfully Going forward, other architectures too can subscribe the test after fixing any build or runtime problems with their page table helpers. Folks interested in making sure that a given platform's page table helpers conform to expected generic MM semantics should enable the above config which will just trigger this test during boot. Any non conformity here will be reported as an warning which would need to be fixed. This test will help catch any changes to the agreed upon semantics expected from generic MM and enable platforms to accommodate it thereafter. [[email protected]: v17] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: v18] Link: http://lkml.kernel.org/r/[email protected] Suggested-by: Catalin Marinas <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Qian Cai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Gerald Schaefer <[email protected]> [s390] Tested-by: Christophe Leroy <[email protected]> [ppc32] Reviewed-by: Ingo Molnar <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04powerpc: add support for folded p4d page tablesMike Rapoport23-145/+200
Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h. [[email protected]: powerpc/xmon: drop unused pgdir varialble in show_pte() function] Link: http://lkml.kernel.org/r/[email protected] [[email protected]; build fix] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Christophe Leroy <[email protected]> # 8xx and 83xx Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Brian Cain <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: James Morse <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Julien Thierry <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Stefan Kristiansson <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Tony Luck <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-04Merge tag 'kvm-ppc-next-5.8-1' of ↵Paolo Bonzini22-237/+276
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD PPC KVM update for 5.8 - Updates and bug fixes for secure guest support - Other minor bug fixes and cleanups.
2020-06-03Merge branch 'akpm' (patches from Andrew)Linus Torvalds5-39/+12
Merge more updates from Andrew Morton: "More mm/ work, plenty more to come Subsystems affected by this patch series: slub, memcg, gup, kasan, pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs, thp, mmap, kconfig" * akpm: (131 commits) arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined riscv: support DEBUG_WX mm: add DEBUG_WX support drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid() powerpc/mm: drop platform defined pmd_mknotpresent() mm: thp: don't need to drain lru cache when splitting and mlocking THP hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs sparc32: register memory occupied by kernel as memblock.memory include/linux/memblock.h: fix minor typo and unclear comment mm, mempolicy: fix up gup usage in lookup_node tools/vm/page_owner_sort.c: filter out unneeded line mm: swap: memcg: fix memcg stats for huge pages mm: swap: fix vmstats for huge pages mm: vmscan: limit the range of LRU type balancing mm: vmscan: reclaim writepage is IO cost mm: vmscan: determine anon/file pressure balance at the reclaim root mm: balance LRU lists based on relative thrashing mm: only count actual rotations as LRU reclaim cost ...
2020-06-03powerpc/mm: drop platform defined pmd_mknotpresent()Anshuman Khandual1-4/+0
Patch series "mm/thp: Rename pmd_mknotpresent() as pmd_mknotvalid()", v2. This series renames pmd_mknotpresent() as pmd_mknotvalid(). Before that it drops an existing pmd_mknotpresent() definition from powerpc platform which was never required as it defines it's pmdp_invalidate() through subscribing __HAVE_ARCH_PMDP_INVALIDATE. This does not create any functional change. This rename was suggested by Catalin during a previous discussion while we were trying to change the THP helpers on arm64 platform for migration. https://patchwork.kernel.org/patch/11019637/ This patch (of 2): Platform needs to define pmd_mknotpresent() for generic pmdp_invalidate() only when __HAVE_ARCH_PMDP_INVALIDATE is not subscribed. Otherwise platform specific pmd_mknotpresent() is not required. Hence just drop it. Signed-off-by: Anshuman Khandual <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Russell King <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03mm/hugetlb: define a generic fallback for arch_clear_hugepage_flags()Anshuman Khandual1-4/+0
There are multiple similar definitions for arch_clear_hugepage_flags() on various platforms. Lets just add it's generic fallback definition for platforms that do not override. This help reduce code duplication. Signed-off-by: Anshuman Khandual <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Mike Kravetz <[email protected]> Cc: Russell King <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Tony Luck <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Helge Deller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Rich Felker <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03mm/hugetlb: define a generic fallback for is_hugepage_only_range()Anshuman Khandual1-0/+1
There are multiple similar definitions for is_hugepage_only_range() on various platforms. Lets just add it's generic fallback definition for platforms that do not override. This help reduce code duplication. Signed-off-by: Anshuman Khandual <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Mike Kravetz <[email protected]> Cc: Russell King <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Tony Luck <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Helge Deller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Rich Felker <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03hugetlbfs: remove hugetlb_add_hstate() warning for existing hstateMike Kravetz1-2/+1
hugetlb_add_hstate() prints a warning if the hstate already exists. This was originally done as part of kernel command line parsing. If 'hugepagesz=' was specified more than once, the warning pr_warn("hugepagesz= specified twice, ignoring\n"); would be printed. Some architectures want to enable all huge page sizes. They would call hugetlb_add_hstate for all supported sizes. However, this was done after command line processing and as a result hstates could have already been created for some sizes. To make sure no warning were printed, there would often be code like: if (!size_to_hstate(size) hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT) The only time we want to print the warning is as the result of command line processing. So, remove the warning from hugetlb_add_hstate and add it to the single arch independent routine processing "hugepagesz=". After this, calls to size_to_hstate() in arch specific code can be removed and hugetlb_add_hstate can be called without worrying about warning messages. [[email protected]: fix hugetlb initialization] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Kravetz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Anders Roxell <[email protected]> Acked-by: Mina Almasry <[email protected]> Acked-by: Gerald Schaefer <[email protected]> [s390] Acked-by: Will Deacon <[email protected]> Cc: Albert Ou <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Longpeng <[email protected]> Cc: Nitesh Narayan Lal <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Xu <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: "Aneesh Kumar K.V" <[email protected]> Cc: Qian Cai <[email protected]> Cc: Stephen Rothwell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03hugetlbfs: move hugepagesz= parsing to arch independent codeMike Kravetz1-15/+0
Now that architectures provide arch_hugetlb_valid_size(), parsing of "hugepagesz=" can be done in architecture independent code. Create a single routine to handle hugepagesz= parsing and remove all arch specific routines. We can also remove the interface hugetlb_bad_size() as this is no longer used outside arch independent code. This also provides consistent behavior of hugetlbfs command line options. The hugepagesz= option should only be specified once for a specific size, but some architectures allow multiple instances. This appears to be more of an oversight when code was added by some architectures to set up ALL huge pages sizes. Signed-off-by: Mike Kravetz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Sandipan Das <[email protected]> Reviewed-by: Peter Xu <[email protected]> Acked-by: Mina Almasry <[email protected]> Acked-by: Gerald Schaefer <[email protected]> [s390] Acked-by: Will Deacon <[email protected]> Cc: Albert Ou <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Longpeng <[email protected]> Cc: Nitesh Narayan Lal <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Anders Roxell <[email protected]> Cc: "Aneesh Kumar K.V" <[email protected]> Cc: Qian Cai <[email protected]> Cc: Stephen Rothwell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03hugetlbfs: add arch_hugetlb_valid_sizeMike Kravetz1-7/+13
Patch series "Clean up hugetlb boot command line processing", v4. Longpeng(Mike) reported a weird message from hugetlb command line processing and proposed a solution [1]. While the proposed patch does address the specific issue, there are other related issues in command line processing. As hugetlbfs evolved, updates to command line processing have been made to meet immediate needs and not necessarily in a coordinated manner. The result is that some processing is done in arch specific code, some is done in arch independent code and coordination is problematic. Semantics can vary between architectures. The patch series does the following: - Define arch specific arch_hugetlb_valid_size routine used to validate passed huge page sizes. - Move hugepagesz= command line parsing out of arch specific code and into an arch independent routine. - Clean up command line processing to follow desired semantics and document those semantics. [1] https://lore.kernel.org/linux-mm/[email protected] This patch (of 3): The architecture independent routine hugetlb_default_setup sets up the default huge pages size. It has no way to verify if the passed value is valid, so it accepts it and attempts to validate at a later time. This requires undocumented cooperation between the arch specific and arch independent code. For architectures that support more than one huge page size, provide a routine arch_hugetlb_valid_size to validate a huge page size. hugetlb_default_setup can use this to validate passed values. arch_hugetlb_valid_size will also be used in a subsequent patch to move processing of the "hugepagesz=" in arch specific code to a common routine in arch independent code. Signed-off-by: Mike Kravetz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Gerald Schaefer <[email protected]> [s390] Acked-by: Will Deacon <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David S. Miller <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Longpeng <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Mina Almasry <[email protected]> Cc: Peter Xu <[email protected]> Cc: Nitesh Narayan Lal <[email protected]> Cc: Anders Roxell <[email protected]> Cc: "Aneesh Kumar K.V" <[email protected]> Cc: Qian Cai <[email protected]> Cc: Stephen Rothwell <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODESMike Rapoport1-9/+0
The memmap_init() function was made to iterate over memblock regions and as the result the early_pfn_in_nid() function became obsolete. Since CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real implementation of early_pfn_in_nid(), it is also not needed anymore. Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES. Co-developed-by: Hoan Tran <[email protected]> Signed-off-by: Hoan Tran <[email protected]> Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Hoan Tran <[email protected]> [arm64] Cc: Baoquan He <[email protected]> Cc: Brian Cain <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Simek <[email protected]> Cc: Nick Hu <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03mm: use free_area_init() instead of free_area_init_nodes()Mike Rapoport1-1/+1
free_area_init() has effectively became a wrapper for free_area_init_nodes() and there is no point of keeping it. Still free_area_init() name is shorter and more general as it does not imply necessity to initialize multiple nodes. Rename free_area_init_nodes() to free_area_init(), update the callers and drop old version of free_area_init(). Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Hoan Tran <[email protected]> [arm64] Reviewed-by: Baoquan He <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Brian Cain <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Simek <[email protected]> Cc: Nick Hu <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP optionMike Rapoport1-1/+0
CONFIG_HAVE_MEMBLOCK_NODE_MAP is used to differentiate initialization of nodes and zones structures between the systems that have region to node mapping in memblock and those that don't. Currently all the NUMA architectures enable this option and for the non-NUMA systems we can presume that all the memory belongs to node 0 and therefore the compile time configuration option is not required. The remaining few architectures that use DISCONTIGMEM without NUMA are easily updated to use memblock_add_node() instead of memblock_add() and thus have proper correspondence of memblock regions to NUMA nodes. Still, free_area_init_node() must have a backward compatible version because its semantics with and without CONFIG_HAVE_MEMBLOCK_NODE_MAP is different. Once all the architectures will use the new semantics, the entire compatibility layer can be dropped. To avoid addition of extra run time memory to store node id for architectures that keep memblock but have only a single node, the node id field of the memblock_region is guarded by CONFIG_NEED_MULTIPLE_NODES and the corresponding accessors presume that in those cases it is always 0. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Hoan Tran <[email protected]> [arm64] Acked-by: Catalin Marinas <[email protected]> [arm64] Cc: Baoquan He <[email protected]> Cc: Brian Cain <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Simek <[email protected]> Cc: Nick Hu <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds6-71/+65
Pull kvm updates from Paolo Bonzini: "ARM: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches x86: - Rework of TLB flushing - Rework of event injection, especially with respect to nested virtualization - Nested AMD event injection facelift, building on the rework of generic code and fixing a lot of corner cases - Nested AMD live migration support - Optimization for TSC deadline MSR writes and IPIs - Various cleanups - Asynchronous page fault cleanups (from tglx, common topic branch with tip tree) - Interrupt-based delivery of asynchronous "page ready" events (host side) - Hyper-V MSRs and hypercalls for guest debugging - VMX preemption timer fixes s390: - Cleanups Generic: - switch vCPU thread wakeup from swait to rcuwait The other architectures, and the guest side of the asynchronous page fault work, will come next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (256 commits) KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test KVM: check userspace_addr for all memslots KVM: selftests: update hyperv_cpuid with SynDBG tests x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls x86/kvm/hyper-v: enable hypercalls regardless of hypercall page x86/kvm/hyper-v: Add support for synthetic debugger interface x86/hyper-v: Add synthetic debugger definitions KVM: selftests: VMX preemption timer migration test KVM: nVMX: Fix VMX preemption timer migration x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit KVM: x86/pmu: Support full width counting KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT KVM: x86: acknowledgment mechanism for async pf page ready notifications KVM: x86: interrupt based APF 'page ready' event delivery KVM: introduce kvm_read_guest_offset_cached() KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present() KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously" KVM: VMX: Replace zero-length array with flexible-array ...
2020-06-03Merge tag 'sched-core-2020-06-02' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The changes in this cycle are: - Optimize the task wakeup CPU selection logic, to improve scalability and reduce wakeup latency spikes - PELT enhancements - CFS bandwidth handling fixes - Optimize the wakeup path by remove rq->wake_list and replacing it with ->ttwu_pending - Optimize IPI cross-calls by making flush_smp_call_function_queue() process sync callbacks first. - Misc fixes and enhancements" * tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) irq_work: Define irq_work_single() on !CONFIG_IRQ_WORK too sched/headers: Split out open-coded prototypes into kernel/sched/smp.h sched: Replace rq::wake_list sched: Add rq::ttwu_pending irq_work, smp: Allow irq_work on call_single_queue smp: Optimize send_call_function_single_ipi() smp: Move irq_work_run() out of flush_smp_call_function_queue() smp: Optimize flush_smp_call_function_queue() sched: Fix smp_call_function_single_async() usage for ILB sched/core: Offload wakee task activation if it the wakee is descheduling sched/core: Optimize ttwu() spinning on p->on_cpu sched: Defend cfs and rt bandwidth quota against overflow sched/cpuacct: Fix charge cpuacct.usage_sys sched/fair: Replace zero-length array with flexible-array sched/pelt: Sync util/runnable_sum with PELT window when propagating sched/cpuacct: Use __this_cpu_add() instead of this_cpu_ptr() sched/fair: Optimize enqueue_task_fair() sched: Make scheduler_ipi inline sched: Clean up scheduler_ipi() sched/core: Simplify sched_init() ...
2020-06-03Merge branch 'topic/ppc-kvm' into nextMichael Ellerman2-6/+39
Merge one more commit from the topic branch we shared with the kvm-ppc tree. This brings in a fix to the code that scans for dirty pages during migration of a VM, which was incorrectly triggering a warning.
2020-06-02Merge tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-blockLinus Torvalds1-9/+10
Pull block driver updates from Jens Axboe: "On top of the core changes, here are the block driver changes for this merge window: - NVMe changes: - NVMe over Fibre Channel protocol updates, which also reach over to drivers/scsi/lpfc (James Smart) - namespace revalidation support on the target (Anthony Iliopoulos) - gcc zero length array fix (Arnd Bergmann) - nvmet cleanups (Chaitanya Kulkarni) - misc cleanups and fixes (me, Keith Busch, Sagi Grimberg) - use a SRQ per completion vector (Max Gurtovoy) - fix handling of runtime changes to the queue count (Weiping Zhang) - t10 protection information support for nvme-rdma and nvmet-rdma (Israel Rukshin and Max Gurtovoy) - target side AEN improvements (Chaitanya Kulkarni) - various fixes and minor improvements all over, icluding the nvme part of the lpfc driver" - Floppy code cleanup series (Willy, Denis) - Floppy contention fix (Jiri) - Loop CONFIGURE support (Martijn) - bcache fixes/improvements (Coly, Joe, Colin) - q->queuedata cleanups (Christoph) - Get rid of ioctl_by_bdev (Christoph, Stefan) - md/raid5 allocation fixes (Coly) - zero length array fixes (Gustavo) - swim3 task state fix (Xu)" * tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block: (166 commits) bcache: configure the asynchronous registertion to be experimental bcache: asynchronous devices registration bcache: fix refcount underflow in bcache_device_free() bcache: Convert pr_<level> uses to a more typical style bcache: remove redundant variables i and n lpfc: Fix return value in __lpfc_nvme_ls_abort lpfc: fix axchg pointer reference after free and double frees lpfc: Fix pointer checks and comments in LS receive refactoring nvme: set dma alignment to qword nvmet: cleanups the loop in nvmet_async_events_process nvmet: fix memory leak when removing namespaces and controllers concurrently nvmet-rdma: add metadata/T10-PI support nvmet: add metadata support for block devices nvmet: add metadata/T10-PI support nvme: add Metadata Capabilities enumerations nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len nvmet: rename nvmet_rw_len to nvmet_rw_data_len nvmet: add metadata characteristics for a namespace nvme-rdma: add metadata/T10-PI support nvme-rdma: introduce nvme_rdma_sgl structure ...
2020-06-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds6-87/+62
Merge updates from Andrew Morton: "A few little subsystems and a start of a lot of MM patches. Subsystems affected by this patch series: squashfs, ocfs2, parisc, vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup, swap, memcg, pagemap, memory-failure, vmalloc, kasan" * emailed patches from Andrew Morton <[email protected]>: (128 commits) kasan: move kasan_report() into report.c mm/mm_init.c: report kasan-tag information stored in page->flags ubsan: entirely disable alignment checks under UBSAN_TRAP kasan: fix clang compilation warning due to stack protector x86/mm: remove vmalloc faulting mm: remove vmalloc_sync_(un)mappings() x86/mm/32: implement arch_sync_kernel_mappings() x86/mm/64: implement arch_sync_kernel_mappings() mm/ioremap: track which page-table levels were modified mm/vmalloc: track which page-table levels were modified mm: add functions to track page directory modifications s390: use __vmalloc_node in stack_alloc powerpc: use __vmalloc_node in alloc_vm_stack arm64: use __vmalloc_node in arch_alloc_vmap_stack mm: remove vmalloc_user_node_flags mm: switch the test_vmalloc module to use __vmalloc_node mm: remove __vmalloc_node_flags_caller mm: remove both instances of __vmalloc_node_flags mm: remove the prot argument to __vmalloc_node mm: remove the pgprot argument to __vmalloc ...
2020-06-02powerpc: use __vmalloc_node in alloc_vm_stackChristoph Hellwig1-3/+2
alloc_vm_stack can use a slightly higher level vmalloc function. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02mm: remove __get_vm_areaChristoph Hellwig1-1/+2
Switch the two remaining callers to use __get_vm_area_caller instead. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02powerpc: remove __ioremap_at and __iounmap_atChristoph Hellwig3-65/+21
These helpers are only used for remapping the ISA I/O base. Replace the mapping side with a remap_isa_range helper in isa-bridge.c that hard codes all the known arguments, and just remove __iounmap_at in favour of open coding it in the only caller. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02powerpc: add an ioremap_phb helperChristoph Hellwig3-19/+38
Factor code shared between pci_64 and electra_cf into a ioremap_pbh helper that follows the normal ioremap semantics, and returns a useful __iomem pointer. Note that it opencodes __ioremap_at as we know from the callers the slab is available. Switch pci_64 to also store the result as __iomem pointer, and unmap the result using iounmap instead of force casting and using vmalloc APIs. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Gao Xiang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Sakari Ailus <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Wei Liu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-02powerpc/pseries: Make vio and ibmebus initcalls pseries specificOliver O'Halloran2-4/+6
The vio and ibmebus buses are used for pseries specific paravirtualised devices and currently they're initialised by the generic initcall types. This is mostly fine, but it can result in some nuisance errors in dmesg when booting on PowerNV on some OSes, e.g. [ 2.984439] synth uevent: /devices/vio: failed to send uevent [ 2.984442] vio vio: uevent: failed to send synthetic uevent [ 17.968551] synth uevent: /devices/vio: failed to send uevent [ 17.968554] vio vio: uevent: failed to send synthetic uevent We don't see anything similar for the ibmebus because that depends on !CONFIG_LITTLE_ENDIAN. This patch squashes those by switching to using machine_*_initcall() so the bus type is only registered when the kernel is running on a pseries machine. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Tyrel Datwyler <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc: Add POWER10 architected modeAlistair Popple6-6/+65
PVR value of 0x0F000006 means we are arch v3.1 compliant (i.e. POWER10). This is used by phyp and kvm when booting as a pseries guest to detect the presence of new P10 features and to enable the appropriate hwcap and facility bits. Signed-off-by: Alistair Popple <[email protected]> Signed-off-by: Cédric Le Goater <[email protected]> [mpe: Fall through to __init_FSCR rather than duplicating it, drop hack to set current->thread.fscr now that is handled elsewhere.] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/dt_cpu_ftrs: Add MMA featureAlistair Popple2-2/+18
Matrix multiple assist (MMA) is a new feature added to ISAv3.1 and POWER10. Support on powernv can be selected via a firmware CPU device tree feature which enables it via a PCR bit. Signed-off-by: Alistair Popple <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/dt_cpu_ftrs: Enable Prefixed InstructionsAlistair Popple1-0/+1
Prefix instructions have their own FSCR bit which needs to be enabled via a CPU feature. The kernel will save the FSCR for problem state but it needs to be enabled initially. Signed-off-by: Alistair Popple <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selectedAlistair Popple1-0/+6
On powernv hardware support for ISAv3.1 is advertised via a cpu feature bit in the device tree. This patch enables the associated HWCAP bit if the device tree indicates ISAv3.1 is available. Signed-off-by: Alistair Popple <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc: Add support for ISA v3.1Alistair Popple3-4/+3
Newer ISA versions are enabled by clearing all bits in the PCR associated with previous versions of the ISA. Enable ISA v3.1 support by updating the PCR mask to include ISA v3.0. This ensures all PCR bits corresponding to earlier architecture versions get cleared thereby enabling ISA v3.1 if supported by the hardware. Signed-off-by: Alistair Popple <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc: Add new HWCAP bitsAlistair Popple1-0/+2
POWER10 introduces two new architectural features - ISAv3.1 and matrix multiply assist (MMA) instructions. Userspace detects the presence of these features via two HWCAP bits introduced in this patch. These bits have been agreed to by the compiler and binutils team. According to ISAv3.1 MMA is an optional feature and software that makes use of it should first check for availability via this HWCAP bit and use alternate code paths if unavailable. Signed-off-by: Alistair Popple <[email protected]> Tested-by: Michael Neuling <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/64s: Don't set FSCR bits in INIT_THREADMichael Ellerman1-1/+0
Since the previous commit that saves the value of FSCR configured at boot into init_task.thread.fscr, the static initialisation in INIT_THREAD now no longer has any effect. So remove it. For non DT CPU features, the end result is the same, because __init_FSCR() is called on all CPUs that have an FSCR (Power8, Power9), and it sets FSCR_TAR & FSCR_EBB. Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/64s: Save FSCR to init_task.thread.fscr after feature initMichael Ellerman1-0/+19
At boot the FSCR is initialised via one of two paths. On most systems it's set to a hard coded value in __init_FSCR(). On newer skiboot systems we use the device tree CPU features binding, where firmware can tell Linux what bits to set in FSCR (and HFSCR). In both cases the value that's configured at boot is not propagated into the init_task.thread.fscr value prior to the initial fork of init (pid 1), which means the value is not used by any processes other than swapper (the idle task). For the __init_FSCR() case this is OK, because the value in init_task.thread.fscr is initialised to something sensible. However it does mean that the value set in __init_FSCR() is not used other than for swapper, which is odd and confusing. The bigger problem is for the device tree CPU features case it prevents firmware from setting (or clearing) FSCR bits for use by user space. This means all existing kernels can not have features enabled/disabled by firmware if those features require setting/clearing FSCR bits. We can handle both cases by saving the FSCR value into init_task.thread.fscr after we have initialised it at boot. This fixes the bug for device tree CPU features, and will allow us to simplify the initialisation for the __init_FSCR() case in a future patch. Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Cc: [email protected] # v4.12+ Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/64s: Don't let DT CPU features set FSCR_DSCRMichael Ellerman1-0/+8
The device tree CPU features binding includes FSCR bit numbers which Linux is instructed to set by firmware. Whether that's a good idea or not, in the case of the DSCR the Linux implementation has a hard requirement that the FSCR_DSCR bit not be set by default. We use it to track when a process reads/writes to DSCR, so it must be clear to begin with. So if firmware tells us to set FSCR_DSCR we must ignore it. Currently this does not cause a bug in our DSCR handling because the value of FSCR that the device tree CPU features code establishes is only used by swapper. All other tasks use the value hard coded in init_task.thread.fscr. However we'd like to fix that in a future commit, at which point this will become necessary. Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Cc: [email protected] # v4.12+ Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()Michael Ellerman1-1/+1
__init_FSCR() was added originally in commit 2468dcf641e4 ("powerpc: Add support for context switching the TAR register") (Feb 2013), and only set FSCR_TAR. At that point FSCR (Facility Status and Control Register) was not context switched, so the setting was permanent after boot. Later we added initialisation of FSCR_DSCR to __init_FSCR(), in commit 54c9b2253d34 ("powerpc: Set DSCR bit in FSCR setup") (Mar 2013), again that was permanent after boot. Then commit 2517617e0de6 ("powerpc: Fix context switch DSCR on POWER8") (Aug 2013) added a limited context switch of FSCR, just the FSCR_DSCR bit was context switched based on thread.dscr_inherit. That commit said "This clears the H/FSCR DSCR bit initially", but it didn't, it left the initialisation of FSCR_DSCR in __init_FSCR(). However the initial context switch from init_task to pid 1 would clear FSCR_DSCR because thread.dscr_inherit was 0. That commit also introduced the requirement that FSCR_DSCR be clear for user processes, so that we can take the facility unavailable interrupt in order to manage dscr_inherit. Then in commit 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") (Dec 2015) FSCR was added to thread_struct. However it still wasn't fully context switched, we just took the existing value and set FSCR_DSCR if the new thread had dscr_inherit set. FSCR was still initialised at boot to FSCR_DSCR | FSCR_TAR, but that value was not propagated into the thread_struct, so the initial context switch set FSCR_DSCR back to 0. Finally commit b57bd2de8c6c ("powerpc: Improve FSCR init and context switching") (Jun 2016) added a full context switch of the FSCR, and added an initialisation of init_task.thread.fscr to FSCR_TAR | FSCR_EBB, but omitted FSCR_DSCR. The end result is that swapper runs with FSCR_DSCR set because of the initialisation in __init_FSCR(), but no other processes do, they use the value from init_task.thread.fscr. Having FSCR_DSCR set for swapper allows it to access SPR 3 from userspace, but swapper never runs userspace, so it has no useful effect. It's also confusing to have the value initialised in two places to two different values. So remove FSCR_DSCR from __init_FSCR(), this at least gets us to the point where there's a single value of FSCR, even if it's still set in two places. Signed-off-by: Michael Ellerman <[email protected]> Tested-by: Alistair Popple <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-02powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUGChristophe Leroy1-1/+2
'thread' doesn't exist in kuap_check() macro. Use 'current' instead. Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection") Cc: [email protected] Reported-by: kbuild test robot <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/b459e1600b969047a74e34251a84a3d6fdf1f312.1590858925.git.christophe.leroy@csgroup.eu
2020-06-02powerpc/module_64: Use special stub for _mcount() with -mprofile-kernelNaveen N. Rao1-118/+104
Since commit c55d7b5e64265f ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE"), powerpc kernels with -mprofile-kernel can crash in certain scenarios with a trace like below: BUG: Unable to handle kernel instruction fetch (NULL pointer?) Faulting instruction address: 0x00000000 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV <snip> NIP [0000000000000000] 0x0 LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4] Call Trace: iomap_apply+0x20c/0x920 (unreliable) iomap_bmap+0xfc/0x160 ext4_bmap+0xa4/0x180 [ext4] bmap+0x4c/0x80 jbd2_journal_init_inode+0x44/0x1a0 [jbd2] ext4_load_journal+0x440/0x860 [ext4] ext4_fill_super+0x342c/0x3ab0 [ext4] mount_bdev+0x25c/0x290 ext4_mount+0x28/0x50 [ext4] legacy_get_tree+0x4c/0xb0 vfs_get_tree+0x4c/0x130 do_mount+0xa18/0xc50 sys_mount+0x158/0x180 system_call+0x5c/0x68 The NIP points to NULL, or a random location (data even), while the LR always points to the LEP of a function (with an offset of 8), indicating that something went wrong with ftrace. However, ftrace is not necessarily active when such crashes occur. The kernel OOPS sometimes follows a warning from ftrace indicating that some module functions could not be patched with a nop. Other times, if a module is loaded early during boot, instruction patching can fail due to a separate bug, but the error is not reported due to missing error reporting. In all the above cases when instruction patching fails, ftrace will be disabled but certain kernel module functions will be left with default calls to _mcount(). This is not a problem with ELFv1. However, with -mprofile-kernel, the default stub is problematic since it depends on a valid module TOC in r2. If the kernel (or a different module) calls into a function that does not use the TOC, the function won't have a prologue to setup the module TOC. When that function calls into _mcount(), we will end up in the relocation stub that will use the previous TOC, and end up trying to jump into a random location. From the above trace: iomap_apply+0x20c/0x920 [kernel TOC] | V ext4_iomap_end+0x8/0x30 [no GEP == kernel TOC] | V _mcount() stub [uses kernel TOC -> random entry] To address this, let's change over to using the special stub that is used for ftrace_[regs_]caller() for _mcount(). This ensures that we are not dependent on a valid module TOC in r2 for default _mcount() handling. Reported-by: Qian Cai <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Tested-by: Qian Cai <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/8affd4298d22099bbd82544fab8185700a6222b1.1587488954.git.naveen.n.rao@linux.vnet.ibm.com
2020-06-02powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocationsNaveen N. Rao1-14/+4
For -mprofile-kernel, we need special handling when generating stubs for ftrace calls such as _mcount(). To faciliate this, we check if a R_PPC64_REL24 relocation is for a symbol named "_mcount()" along with also checking the instruction sequence. The latter is not really required since "_mcount()" is an exported symbol and kernel modules cannot use it. As such, drop the additional checking and simplify the code. This helps unify stub creation for ftrace stubs with -mprofile-kernel and aids in code reuse. Also rename is_mprofile_mcount_callsite() to is_mprofile_ftrace_call() to reflect the checking being done. Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/7d9c316adfa1fb787ad268bb4691e7e4059ff2d5.1587488954.git.naveen.n.rao@linux.vnet.ibm.com
2020-06-02powerpc/module_64: Consolidate ftrace codeNaveen N. Rao2-39/+33
module_trampoline_target() is only used by ftrace. Move the prototype within the appropriate #ifdef in the header. Also, move the function body to the end of module_64.c so as to consolidate all ftrace code in one place. No functional changes. Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/2527351f65c53c5866068ae130dc34c5d4ee8ad9.1587488954.git.naveen.n.rao@linux.vnet.ibm.com
2020-06-02powerpc/32: Disable KASAN with pages bigger than 16kChristophe Leroy1-2/+2
Mapping of early shadow area is implemented by using a single static page table having all entries pointing to the same early shadow page. The shadow area must therefore occupy full PGD entries. The shadow area has a size of 128MB starting at 0xf8000000. With 4k pages, a PGD entry is 4MB With 16k pages, a PGD entry is 64MB With 64k pages, a PGD entry is 1GB which is too big. Until we rework the early shadow mapping, disable KASAN when the page size is too big. Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support") Cc: [email protected] # v5.2+ Reported-by: kbuild test robot <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/7195fcde7314ccbf7a081b356084a69d421b10d4.1590660977.git.christophe.leroy@csgroup.eu
2020-06-02powerpc/uaccess: Don't set KUEP by default on book3s/32Christophe Leroy1-1/+1
On book3s/32, KUEP is an heavy process as it requires to set/unset the NX bit in each of the 12 user segments everytime the kernel is entered/exited from/to user space. Don't select KUEP by default on book3s/32. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/1492bb150c1aaa53d99a604b49992e60ea20cd5f.1586962582.git.christophe.leroy@c-s.fr
2020-06-02powerpc/uaccess: Don't set KUAP by default on book3s/32Christophe Leroy1-1/+1
On book3s/32, KUAP is an heavy process as it requires to determine which segments are impacted and unlock/lock each of them. And since the implementation of user_access_begin/end, it is even worth for the time being because unlike __get_user(), user_access_begin doesn't make difference between read and write and unlocks access also for read allthought that's unneeded on book3s/32. As shown by the size of a kernel built with KUAP and one without, the overhead is 64k bytes of code. As a comparison a similar build on an 8xx has an overhead of only 8k bytes of code. text data bss dec hex filename 7230416 1425868 837376 9493660 90dc9c vmlinux.kuap6xx 7165012 1425548 837376 9427936 8fdbe0 vmlinux.nokuap6xx 6519796 1960028 477464 8957288 88ad68 vmlinux.kuap8xx 6511664 1959864 477464 8948992 888d00 vmlinux.nokuap8xx Until a more optimised KUAP is implemented on book3s/32, don't select it by default. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/154a99399317b096ac1f04827b9f8d7a9179ddc1.1586962586.git.christophe.leroy@c-s.fr
2020-06-02powerpc/8xx: Reduce time spent in allow_user_access() and friendsChristophe Leroy1-8/+8
To enable/disable kernel access to user space, the 8xx has to modify the properties of access group 1. This is done by writing predefined values into SPRN_Mx_AP registers. As of today, a __put_user() gives: 00000d64 <my_test>: d64: 3d 20 4f ff lis r9,20479 d68: 61 29 ff ff ori r9,r9,65535 d6c: 7d 3a c3 a6 mtspr 794,r9 d70: 39 20 00 00 li r9,0 d74: 90 83 00 00 stw r4,0(r3) d78: 3d 20 6f ff lis r9,28671 d7c: 61 29 ff ff ori r9,r9,65535 d80: 7d 3a c3 a6 mtspr 794,r9 d84: 4e 80 00 20 blr Because only groups 0 and 1 are used, the definition of groups 2 to 15 doesn't matter. By setting unused bits to 0 instead on 1, one instruction is removed for each lock and unlock action: 00000d5c <my_test>: d5c: 3d 20 40 00 lis r9,16384 d60: 7d 3a c3 a6 mtspr 794,r9 d64: 39 20 00 00 li r9,0 d68: 90 83 00 00 stw r4,0(r3) d6c: 3d 20 60 00 lis r9,24576 d70: 7d 3a c3 a6 mtspr 794,r9 d74: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/57425c33dd72f292b1a23570244b81419072a7aa.1586945153.git.christophe.leroy@c-s.fr
2020-06-02powerpc/entry32: Blacklist exception exit points for kprobe.Christophe Leroy1-1/+8
kprobe does not handle events happening in real mode. The very last part of exception exits cannot support a trap. Blacklist them from kprobe. While we are at it, remove exc_exit_start symbol which is not used to avoid having to blacklist it. Signed-off-by: Christophe Leroy <[email protected]> Acked-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/098b0fd3f6299aa1bd692bd576bd7012c84608de.1585670437.git.christophe.leroy@c-s.fr
2020-06-02powerpc/entry32: Blacklist syscall exit points for kprobe.Christophe Leroy1-0/+3
kprobe does not handle events happening in real mode. The very last part of syscall cannot support a trap. Add a symbol syscall_exit_finish to identify that part and blacklist it from kprobe. Signed-off-by: Christophe Leroy <[email protected]> Acked-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/23eddf49abb03d1359fa0be4206998eb3800f42c.1585670437.git.christophe.leroy@c-s.fr
2020-06-02powerpc/entry32: Blacklist exception entry points for kprobe.Christophe Leroy1-15/+22
kprobe does not handle events happening in real mode. As exception entry points are running with MMU disabled, blacklist them. The handling of TLF_NAPPING and TLF_SLEEPING is moved before the CONFIG_TRACE_IRQFLAGS which contains 'reenable_mmu' because from there kprobe will be possible as the kernel will run with MMU enabled. Signed-off-by: Christophe Leroy <[email protected]> Acked-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/f61ac599855e674ebb592464d0ea32a3ba9c6644.1585670437.git.christophe.leroy@c-s.fr