aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)AuthorFilesLines
2019-05-07powerpc/book3s/64: check for NULL pointer in pgd_alloc()Rick Lindsley1-0/+3
When the memset code was added to pgd_alloc(), it failed to consider that kmem_cache_alloc() can return NULL. It's uncommon, but not impossible under heavy memory contention. Example oops: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc0000000000a4000 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA pSeries CPU: 70 PID: 48471 Comm: entrypoint.sh Kdump: loaded Not tainted 4.14.0-115.6.1.el7a.ppc64le #1 task: c000000334a00000 task.stack: c000000331c00000 NIP: c0000000000a4000 LR: c00000000012f43c CTR: 0000000000000020 REGS: c000000331c039c0 TRAP: 0300 Not tainted (4.14.0-115.6.1.el7a.ppc64le) MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 44022840 XER: 20040000 CFAR: c000000000008874 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 ... NIP [c0000000000a4000] memset+0x68/0x104 LR [c00000000012f43c] mm_init+0x27c/0x2f0 Call Trace: mm_init+0x260/0x2f0 (unreliable) copy_mm+0x11c/0x638 copy_process.isra.28.part.29+0x6fc/0x1080 _do_fork+0xdc/0x4c0 ppc_clone+0x8/0xc Instruction dump: 409e000c b0860000 38c60002 409d000c 90860000 38c60004 78a0d183 78a506a0 7c0903a6 41820034 60000000 60420000 <f8860000> f8860008 f8860010 f8860018 Fixes: fc5c2f4a55a2 ("powerpc/mm/hash64: Zero PGD pages on allocation") Cc: [email protected] # v4.16+ Signed-off-by: Rick Lindsley <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-06Merge branch 'linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Add support for AEAD in simd - Add fuzz testing to testmgr - Add panic_on_fail module parameter to testmgr - Use per-CPU struct instead multiple variables in scompress - Change verify API for akcipher Algorithms: - Convert x86 AEAD algorithms over to simd - Forbid 2-key 3DES in FIPS mode - Add EC-RDSA (GOST 34.10) algorithm Drivers: - Set output IV with ctr-aes in crypto4xx - Set output IV in rockchip - Fix potential length overflow with hashing in sun4i-ss - Fix computation error with ctr in vmx - Add SM4 protected keys support in ccree - Remove long-broken mxc-scc driver - Add rfc4106(gcm(aes)) cipher support in cavium/nitrox" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (179 commits) crypto: ccree - use a proper le32 type for le32 val crypto: ccree - remove set but not used variable 'du_size' crypto: ccree - Make cc_sec_disable static crypto: ccree - fix spelling mistake "protedcted" -> "protected" crypto: caam/qi2 - generate hash keys in-place crypto: caam/qi2 - fix DMA mapping of stack memory crypto: caam/qi2 - fix zero-length buffer DMA mapping crypto: stm32/cryp - update to return iv_out crypto: stm32/cryp - remove request mutex protection crypto: stm32/cryp - add weak key check for DES crypto: atmel - remove set but not used variable 'alg_name' crypto: picoxcell - Use dev_get_drvdata() crypto: crypto4xx - get rid of redundant using_sd variable crypto: crypto4xx - use sync skcipher for fallback crypto: crypto4xx - fix cfb and ofb "overran dst buffer" issues crypto: crypto4xx - fix ctr-aes missing output IV crypto: ecrdsa - select ASN1 and OID_REGISTRY for EC-RDSA crypto: ux500 - use ccflags-y instead of CFLAGS_<basename>.o crypto: ccree - handle tee fips error during power management resume crypto: ccree - add function to handle cryptocell tee fips error ...
2019-05-06Merge tag 'arm64-mmiowb' of ↵Linus Torvalds4-48/+26
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull mmiowb removal from Will Deacon: "Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb()) Remove mmiowb() from the kernel memory barrier API and instead, for architectures that need it, hide the barrier inside spin_unlock() when MMIO has been performed inside the critical section. The only relatively recent changes have been addressing review comments on the documentation, which is in a much better shape thanks to the efforts of Ben and Ingo. I was initially planning to split this into two pull requests so that you could run the coccinelle script yourself, however it's been plain sailing in linux-next so I've just included the whole lot here to keep things simple" * tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (23 commits) docs/memory-barriers.txt: Update I/O section to be clearer about CPU vs thread docs/memory-barriers.txt: Fix style, spacing and grammar in I/O section arch: Remove dummy mmiowb() definitions from arch code net/ethernet/silan/sc92031: Remove stale comment about mmiowb() i40iw: Redefine i40iw_mmiowb() to do nothing scsi/qla1280: Remove stale comment about mmiowb() drivers: Remove explicit invocations of mmiowb() drivers: Remove useless trailing comments from mmiowb() invocations Documentation: Kill all references to mmiowb() riscv/mmiowb: Hook up mmwiob() implementation to asm-generic code powerpc/mmiowb: Hook up mmwiob() implementation to asm-generic code ia64/mmiowb: Add unconditional mmiowb() to arch_spin_unlock() mips/mmiowb: Add unconditional mmiowb() to arch_spin_unlock() sh/mmiowb: Add unconditional mmiowb() to arch_spin_unlock() m68k/io: Remove useless definition of mmiowb() nds32/io: Remove useless definition of mmiowb() x86/io: Remove useless definition of mmiowb() arm64/io: Remove useless definition of mmiowb() ARM/io: Remove useless definition of mmiowb() mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors ...
2019-05-06Merge branch 'locking-core-for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "Here are the locking changes in this cycle: - rwsem unification and simpler micro-optimizations to prepare for more intrusive (and more lucrative) scalability improvements in v5.3 (Waiman Long) - Lockdep irq state tracking flag usage cleanups (Frederic Weisbecker) - static key improvements (Jakub Kicinski, Peter Zijlstra) - misc updates, cleanups and smaller fixes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) locking/lockdep: Remove unnecessary unlikely() locking/static_key: Don't take sleeping locks in __static_key_slow_dec_deferred() locking/static_key: Factor out the fast path of static_key_slow_dec() locking/static_key: Add support for deferred static branches locking/lockdep: Test all incompatible scenarios at once in check_irq_usage() locking/lockdep: Avoid bogus Clang warning locking/lockdep: Generate LOCKF_ bit composites locking/lockdep: Use expanded masks on find_usage_*() functions locking/lockdep: Map remaining magic numbers to lock usage mask names locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING locking/rwsem: Prevent unneeded warning during locking selftest locking/rwsem: Optimize rwsem structure for uncontended lock acquisition locking/rwsem: Enable lock event counting locking/lock_events: Don't show pvqspinlock events on bare metal locking/lock_events: Make lock_events available for all archs & other locks locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro locking/rwsem: Add debug check for __down_read*() locking/rwsem: Micro-optimize rwsem_try_read_lock_unqueued() locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h ...
2019-05-06Merge branch 'core-mm-for-linus' of ↵Linus Torvalds1-17/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull unified TLB flushing from Ingo Molnar: "This contains the generic mmu_gather feature from Peter Zijlstra, which is an all-arch unification of TLB flushing APIs, via the following (broad) steps: - enhance the <asm-generic/tlb.h> APIs to cover more arch details - convert most TLB flushing arch implementations to the generic <asm-generic/tlb.h> APIs. - remove leftovers of per arch implementations After this series every single architecture makes use of the unified TLB flushing APIs" * 'core-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: mm/resource: Use resource_overlaps() to simplify region_intersects() ia64/tlb: Eradicate tlb_migrate_finish() callback asm-generic/tlb: Remove tlb_table_flush() asm-generic/tlb: Remove tlb_flush_mmu_free() asm-generic/tlb: Remove CONFIG_HAVE_GENERIC_MMU_GATHER asm-generic/tlb: Remove arch_tlb*_mmu() s390/tlb: Convert to generic mmu_gather asm-generic/tlb: Introduce CONFIG_HAVE_MMU_GATHER_NO_GATHER=y arch/tlb: Clean up simple architectures um/tlb: Convert to generic mmu_gather sh/tlb: Convert SH to generic mmu_gather ia64/tlb: Convert to generic mmu_gather arm/tlb: Convert to generic mmu_gather asm-generic/tlb, arch: Invert CONFIG_HAVE_RCU_TABLE_INVALIDATE asm-generic/tlb, ia64: Conditionally provide tlb_migrate_finish() asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm() asm-generic/tlb, arch: Provide generic tlb_flush() based on flush_tlb_range() asm-generic/tlb, arch: Provide generic VIPT cache flush asm-generic/tlb, arch: Provide CONFIG_HAVE_MMU_GATHER_PAGE_SIZE asm-generic/tlb: Provide a comment
2019-05-03powerpc/booke64: set RI in default MSRLaurentiu Tudor1-1/+1
Set RI in the default kernel's MSR so that the architected way of detecting unrecoverable machine check interrupts has a chance to work. This is inline with the MSR setup of the rest of booke powerpc architectures configured here. Signed-off-by: Laurentiu Tudor <[email protected]> Cc: [email protected] Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/include: Add data structures and macros for IMC trace modeAnju T Sudhakar2-0/+40
Add the macros needed for IMC (In-Memory Collection Counters) trace-mode and data structure to hold the trace-imc record data. Also, add the new type "OPAL_IMC_COUNTERS_TRACE" in 'opal-api.h', since there is a new switch case added in the opal-calls for IMC. Signed-off-by: Anju T Sudhakar <[email protected]> Reviewed-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/hmi: Fix kernel hang when TB is in error state.Mahesh Salgaonkar3-0/+14
On TOD/TB errors timebase register stops/freezes until HMI error recovery gets TOD/TB back into running state. On successful recovery, TB starts running again and udelay() that relies on TB value continues to function properly. But in case when HMI fails to recover from TOD/TB errors, the TB register stay freezed. With TB not running the __delay() function keeps looping and never return. If __delay() is called while in panic path then system hangs and never reboots after panic. Signed-off-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: Warn if W+X pages found on bootRussell Currey1-0/+6
Implement code to walk all pages and warn if any are found to be both writable and executable. Depends on STRICT_KERNEL_RWX enabled, and is behind the DEBUG_WX config option. This only runs on boot and has no runtime performance implications. Very heavily influenced (and in some cases copied verbatim) from the ARM64 code written by Laura Abbott (thanks!), since our ptdump infrastructure is similar. Signed-off-by: Russell Currey <[email protected]> [mpe: Fixup build error when disabled] Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: define an empty mm_iommu_init()Christophe Leroy1-0/+1
To avoid ifdefs, define a empty static inline mm_iommu_init() function when CONFIG_SPAPR_TCE_IOMMU is not selected. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/fadump: define an empty fadump_cleanup()Christophe Leroy1-0/+1
To avoid #ifdefs, define an static inline fadump_cleanup() function when CONFIG_FADUMP is not selected Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/32: Add KASAN supportChristophe Leroy1-0/+9
This patch adds KASAN support for PPC32. The following patch will add an early activation of hash table for book3s. Until then, a warning will be raised if trying to use KASAN on an hash 6xx. To support KASAN, this patch initialises that MMU mapings for accessing to the KASAN shadow area defined in a previous patch. An early mapping is set as soon as the kernel code has been relocated at its definitive place. Then the definitive mapping is set once paging is initialised. For modules, the shadow area is allocated at module_alloc(). Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/32: prepare shadow area for KASANChristophe Leroy2-0/+21
This patch prepares a shadow area for KASAN. The shadow area will be at the top of the kernel virtual memory space above the fixmap area and will occupy one eighth of the total kernel virtual memory space. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/32: make KVIRT_TOP dependent on FIXMAP_STARTChristophe Leroy2-6/+20
When we add KASAN shadow area, KVIRT_TOP can't be anymore fixed at 0xfe000000. This patch uses FIXADDR_START to define KVIRT_TOP. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc: prepare string/mem functions for KASANChristophe Leroy2-3/+44
CONFIG_KASAN implements wrappers for memcpy() memmove() and memset() Those wrappers are doing the verification then call respectively __memcpy() __memmove() and __memset(). The arches are therefore expected to rename their optimised functions that way. For files on which KASAN is inhibited, #defines are used to allow them to directly call optimised versions of the functions without going through the KASAN wrappers. See commit 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") for details. Other string / mem functions do not (yet) have kasan wrappers, we therefore have to fallback to the generic versions when KASAN is active, otherwise KASAN checks will be skipped. Signed-off-by: Christophe Leroy <[email protected]> [mpe: Fixups to keep selftests working] Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: refactor pgd_alloc() and pgd_free() on nohashChristophe Leroy3-22/+12
pgd_alloc() and pgd_free() are identical on nohash 32 and 64. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: refactor pmd_pgtable()Christophe Leroy5-11/+5
pmd_pgtable() is identical on the 4 subarches, refactor it. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: refactor pgtable freeing functions on nohashChristophe Leroy3-86/+44
pgtable_free() and others are identical on nohash/32 and 64, so move them into asm/nohash/pgalloc.h Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: Only keep one version of pmd_populate() functions on nohash/32Christophe Leroy1-20/+8
Use IS_ENABLED(CONFIG_BOOKE) to make single versions of pmd_populate() and pmd_populate_kernel() Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: refactor definition of pgtable_cache[]Christophe Leroy5-86/+21
pgtable_cache[] is the same for the 4 subarches, lets make it common. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: refactor pte_alloc_one() and pte_free() families definition.Christophe Leroy5-97/+25
Functions pte_alloc_one(), pte_alloc_one_kernel(), pte_free(), pte_free_kernel() are identical for the four subarches. This patch moves their definition in a common place. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: inline pte_alloc_one_kernel() and pte_alloc_one() on PPC32Christophe Leroy2-6/+24
pte_alloc_one_kernel() and pte_alloc_one() are simple calls to pte_fragment_alloc(), so they are good candidates for inlining as already done on PPC64. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: get rid of nohash/32/mmu.h and nohash/64/mmu.hChristophe Leroy4-33/+14
Those files have no real added values, especially the 64 bit which only includes the common book3e mmu.h which is also included from 32 bits side. So lets do the final inclusion directly from nohash/mmu.h Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: move pgtable_t in asm/mmu.hChristophe Leroy5-24/+3
pgtable_t is now identical for all subarches, move it to the top level asm/mmu.h Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: convert Book3E 64 to pte_fragmentChristophe Leroy3-28/+15
Book3E 64 is the only subarch not using pte_fragment. In order to allow refactorisation, this patch converts it to pte_fragment. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: drop __bad_pte()Christophe Leroy2-4/+0
This has never been called (since Kernel has been in git at least), drop it. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: cleanup HPAGE_SHIFT setupChristophe Leroy2-3/+10
Only book3s/64 may select default among several HPAGE_SHIFT at runtime. 8xx always defines 512K pages as default FSL_BOOK3E always defines 4M pages as default This patch limits HUGETLB_PAGE_SIZE_VARIABLE to book3s/64 moves the definitions in subarches files. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: move hugetlb_disabled into asm/hugetlb.hChristophe Leroy2-1/+2
No need to have this in asm/page.h, move it into asm/hugetlb.h Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: cleanup ifdef mess in add_huge_page_size()Christophe Leroy3-0/+40
Introduce a subarch specific helper check_and_get_huge_psize() to check the huge page sizes and cleanup the ifdef mess in add_huge_page_size() Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: add a helper to populate hugepdChristophe Leroy3-0/+19
This patchs adds a subarch helper to populate hugepd. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: split asm/hugetlb.h into dedicated subarch filesChristophe Leroy4-83/+106
Three subarches support hugepages: - fsl book3e - book3s/64 - 8xx This patch splits asm/hugetlb.h to reduce the #ifdef mess. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: make gup_hugepte() staticChristophe Leroy1-3/+0
gup_huge_pd() is the only user of gup_hugepte() and it is located in the same file. This patch moves gup_huge_pd() after gup_hugepte() and makes gup_hugepte() static. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/64: only book3s/64 supports CONFIG_PPC_64K_PAGESChristophe Leroy6-27/+5
CONFIG_PPC_64K_PAGES cannot be selected by nohash/64. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: define subarch SLB_ADDR_LIMIT_DEFAULTChristophe Leroy2-0/+4
This patch defines a subarch specific SLB_ADDR_LIMIT_DEFAULT to remove the #ifdefs around the setup of mm->context.slb_addr_limit It also generalises the use of mm_ctx_set_slb_addr_limit() helper. Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: define get_slice_psize() all the timeChristophe Leroy1-0/+5
get_slice_psize() can be defined regardless of CONFIG_PPC_MM_SLICES to avoid ifdefs Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/8xx: get rid of #ifdef CONFIG_HUGETLB_PAGE for slicesChristophe Leroy1-4/+1
The 8xx only selects CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is set. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: get rid of mm_ctx_slice_mask_xxx()Christophe Leroy2-45/+4
Now that slice_mask_for_size() is in mmu.h, the mm_ctx_slice_mask_xxx() are not needed anymore, so drop them. Note that the 8xx ones where not used anyway. Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: move slice_mask_for_size() into mmu.hChristophe Leroy2-13/+46
Move slice_mask_for_size() into subarch mmu.h Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> [mpe: Retain the BUG_ON()s, rather than converting to VM_BUG_ON()] Signed-off-by: Michael Ellerman <[email protected]>
2019-05-03powerpc/mm: no slice for nohash/64Christophe Leroy2-15/+1
Only nohash/32 and book3s/64 support mm slices. Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-02powerpc/nohash64: clean pgtable.hChristophe Leroy1-5/+0
TRANSPARENT_HUGEPAGE is only supported by book3s VMEMMAP_REGION_ID is never used Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-01powerpc/powernv/mce: Print additional information about MCE error.Mahesh Salgaonkar1-0/+10
Print more information about MCE error whether it is an hardware or software error. Some of the MCE errors can be easily categorized as hardware or software errors e.g. UEs are due to hardware error, where as error triggered due to invalid usage of tlbie is a pure software bug. But not all the MCE errors can be easily categorize into either software or hardware. There are errors like multihit errors which are usually result of a software bug, but in some rare cases a hardware failure can cause a multihit error. In past, we have seen case where after replacing faulty chip, multihit errors stopped occurring. Same with parity errors, which are usually due to faulty hardware but there are chances where multihit can also cause an parity error. Such errors are difficult to determine what really caused it. Hence this patch classifies MCE errors into following four categorize: 1. Hardware error: UE and Link timeout failure errors. 2. Probable hardware error (some chance of software cause) SLB/ERAT/TLB Parity errors. 3. Software error Invalid tlbie form. 4. Probable software error (some chance of hardware cause) SLB/ERAT/TLB Multihit errors. Sample output: MCE: CPU80: machine check (Warning) Guest SLB Multihit DAR: 000001001b6e0320 [Recovered] MCE: CPU80: PID: 24765 Comm: qemu-system-ppc Guest NIP: [00007fffa309dc60] MCE: CPU80: Probable Software error (some chance of hardware cause) Signed-off-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-01powerpc/powernv/mce: Print correct severity for MCE error.Mahesh Salgaonkar1-42/+44
Currently all machine check errors are printed as severe errors which isn't correct. Print soft errors as warning instead of severe errors. Signed-off-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-01powerpc/powernv/mce: Reduce MCE console logs to lesser lines.Mahesh Salgaonkar1-1/+1
Also add cpu number while displaying MCE log. This will help cleaner logs when MCE hits on multiple cpus simultaneously. Before the changes the MCE output was: Severe Machine check interrupt [Recovered] NIP [d00000000ba80280]: insert_slb_entry.constprop.0+0x278/0x2c0 [mcetest_slb] Initiator: CPU Error type: SLB [Multihit] Effective address: d00000000ba80280 After this patch series changes the MCE output will be: MCE: CPU80: machine check (Warning) Host SLB Multihit [Recovered] MCE: CPU80: NIP: [d00000000b550280] insert_slb_entry.constprop.0+0x278/0x2c0 [mcetest_slb] MCE: CPU80: Probable software error (some chance of hardware cause) UE in host application: MCE: CPU48: machine check (Severe) Host UE Load/Store DAR: 00007fffc6079a80 paddr: 0000000f8e260000 [Not recovered] MCE: CPU48: PID: 4584 Comm: find NIP: [0000000010023368] MCE: CPU48: Hardware error and for MCE in Guest: MCE: CPU80: machine check (Warning) Guest SLB Multihit DAR: 000001001b6e0320 [Recovered] MCE: CPU80: PID: 24765 Comm: qemu-system-ppc Guest NIP: [00007fffa309dc60] MCE: CPU80: Probable software error (some chance of hardware cause) Signed-off-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-01powerpc: Add doorbell tracepointsAnton Blanchard1-0/+16
When analysing sources of OS jitter, I noticed that doorbells cannot be traced. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-04-30Merge branch 'topic/ppc-kvm' into nextMichael Ellerman8-44/+68
Merge our topic branch shared with KVM. In particular this includes the rewrite of the idle code into C.
2019-04-30powerpc/64s: Reimplement book3s idle code in CNicholas Piggin4-41/+35
Reimplement Book3S idle code in C, moving POWER7/8/9 implementation speific HV idle code to the powernv platform code. Book3S assembly stubs are kept in common code and used only to save the stack frame and non-volatile GPRs before executing architected idle instructions, and restoring the stack and reloading GPRs then returning to C after waking from idle. The complex logic dealing with threads and subcores, locking, SPRs, HMIs, timebase resync, etc., is all done in C which makes it more maintainable. This is not a strict translation to C code, there are some significant differences: - Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs, but saves and restores them itself. - The optimisation where EC=ESL=0 idle modes did not have to save GPRs or change MSR is restored, because it's now simple to do. ESL=1 sleeps that do not lose GPRs can use this optimization too. - KVM secondary entry and cede is now more of a call/return style rather than branchy. nap_state_lost is not required because KVM always returns via NVGPR restoring path. - KVM secondary wakeup from offline sequence is moved entirely into the offline wakeup, which avoids a hwsync in the normal idle wakeup path. Performance measured with context switch ping-pong on different threads or cores, is possibly improved a small amount, 1-3% depending on stop state and core vs thread test for shallow states. Deep states it's in the noise compared with other latencies. KVM improvements: - Idle sleepers now always return to caller rather than branch out to KVM first. - This allows optimisations like very fast return to caller when no state has been lost. - KVM no longer requires nap_state_lost because it controls NVGPR save/restore itself on the way in and out. - The heavy idle wakeup KVM request check can be moved out of the normal host idle code and into the not-performance-critical offline code. - KVM nap code now returns from where it is called, which makes the flow a bit easier to follow. Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Squash the KVM changes in] Signed-off-by: Michael Ellerman <[email protected]>
2019-04-30KVM: PPC: Book3S HV: XIVE: Replace the 'destroy' method by a 'release' methodCédric Le Goater1-1/+5
When a P9 sPAPR VM boots, the CAS negotiation process determines which interrupt mode to use (XICS legacy or XIVE native) and invokes a machine reset to activate the chosen mode. We introduce 'release' methods for the XICS-on-XIVE and the XIVE native KVM devices which are called when the file descriptor of the device is closed after the TIMA and ESB pages have been unmapped. They perform the necessary cleanups : clear the vCPU interrupt presenters that could be attached and then destroy the device. The 'release' methods replace the 'destroy' methods as 'destroy' is not called anymore once 'release' is. Compatibility with older QEMU is nevertheless maintained. This is not considered as a safe operation as the vCPUs are still running and could be referencing the KVM device through their presenters. To protect the system from any breakage, the kvmppc_xive objects representing both KVM devices are now stored in an array under the VM. Allocation is performed on first usage and memory is freed only when the VM exits. [[email protected] - Moved freeing of xive structures to book3s.c, put it under #ifdef CONFIG_KVM_XICS.] Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30KVM: PPC: Book3S HV: XIVE: Add a mapping for the source ESB pagesCédric Le Goater1-0/+1
Each source is associated with an Event State Buffer (ESB) with a even/odd pair of pages which provides commands to manage the source: to trigger, to EOI, to turn off the source for instance. The custom VM fault handler will deduce the guest IRQ number from the offset of the fault, and the ESB page of the associated XIVE interrupt will be inserted into the VMA using the internal structure caching information on the interrupts. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30KVM: PPC: Book3S HV: XIVE: Add a TIMA mappingCédric Le Goater2-0/+3
Each thread has an associated Thread Interrupt Management context composed of a set of registers. These registers let the thread handle priority management and interrupt acknowledgment. The most important are : - Interrupt Pending Buffer (IPB) - Current Processor Priority (CPPR) - Notification Source Register (NSR) They are exposed to software in four different pages each proposing a view with a different privilege. The first page is for the physical thread context and the second for the hypervisor. Only the third (operating system) and the fourth (user level) are exposed the guest. A custom VM fault handler will populate the VMA with the appropriate pages, which should only be the OS page for now. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30KVM: PPC: Book3S HV: XIVE: Add get/set accessors for the VP XIVE stateCédric Le Goater2-0/+13
The state of the thread interrupt management registers needs to be collected for migration. These registers are cached under the 'xive_saved_state.w01' field of the VCPU when the VPCU context is pulled from the HW thread. An OPAL call retrieves the backup of the IPB register in the underlying XIVE NVT structure and merges it in the KVM state. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>