2019-02-22  powerpc/hash32: use physical address directly in hash handlers.  [Christophe Leroy; 2 files, -37/+31]

Since commit c62ce9ef97ba ("powerpc: remove remaining bits from CONFIG_APUS"), tophys() has become a pure constant operation. PAGE_OFFSET is known at compile time, so the physical address can be built in directly.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
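A hedged aside on why this works (a C rendering of the idea; the exact PAGE_OFFSET value and macro shape here are assumptions, not the patch itself): once the virtual-to-physical translation is a fixed compile-time offset, a handler can encode the physical address as an immediate instead of computing tophys() at run time.

  /* Sketch only: after the CONFIG_APUS removal, the ppc32 translation
   * is a constant offset known at build time. */
  #define PAGE_OFFSET  0xc0000000UL    /* typical ppc32 value (assumed) */
  #define tophys(va)   ((unsigned long)(va) - PAGE_OFFSET)
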
2019-02-22  powerpc/603: use physical address directly in TLB miss handlers.  [Christophe Leroy; 1 file, -9/+6]

Since commit c62ce9ef97ba ("powerpc: remove remaining bits from CONFIG_APUS"), tophys() has become a pure constant operation. PAGE_OFFSET is known at compile time, so the physical address can be built in directly.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/6xx: Store PGDIR physical address in a SPRG  [Christophe Leroy; 4 files, -15/+18]

Use SPRN_SPRG2 to store the current thread's PGDIR and avoid reading thread_struct.pgdir at every TLB miss.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/6xx: Don't use SPRN_SPRG2 for storing stack pointer while in RTAS  [Christophe Leroy; 5 files, -13/+21]

When calling RTAS, the stack pointer is stored in SPRN_SPRG2 in order to be able to restore it in case of machine check in RTAS. As machine check is not a performance critical path, this patch frees SPRN_SPRG2 by using a field in the thread struct instead.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc: simplify BDI switch  [Christophe Leroy; 5 files, -11/+9]

There is no reason to re-read the pointer at location 0xf0 each time, as it is fixed and known.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/83xx: Also save/restore SPRG4-7 during suspend  [Christophe Leroy; 1 file, -7/+27]

The 83xx has 8 SPRG registers and uses at least SPRG4 for DTLB LRU handling.

Fixes: 2319f1239592 ("powerpc/mm: e300c2/c3/c4 TLB errata workaround")
Cc: [email protected]
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/traps: fix recoverability of machine check handling on book3s/32  [Christophe Leroy; 1 file, -4/+4]

It looks like book3s/32 doesn't set RI on machine check, so checking RI before calling die() will always be fatal, although this is not an issue in most cases.

Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt")
Fixes: daf00ae71dad ("powerpc/traps: restore recoverability of machine_check interrupts")
Signed-off-by: Christophe Leroy <[email protected]>
Cc: [email protected]
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/32: Remove unnecessary MSR[RI] clearing for 8xx  [Christophe Leroy; 1 file, -3/+0]

MSR[RI] has already been cleared a few lines above.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/setup: display reason for not booting  [Christophe Leroy; 1 file, -1/+1]

When no machine description matches, display the reason clearly before looping forever.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/8xx: hide itlbie and dtlbie symbols  [Christophe Leroy; 1 file, -8/+8]

When disassembling InstructionTLBError we get the following messy code:

  c000138c: 7d 84 63 78   mr      r4,r12
  c0001390: 75 25 58 00   andis.  r5,r9,22528
  c0001394: 75 2a 40 00   andis.  r10,r9,16384
  c0001398: 41 a2 00 08   beq     c00013a0 <itlbie>
  c000139c: 7c 00 22 64   tlbie   r4,r0

  c00013a0 <itlbie>:
  c00013a0: 39 40 04 01   li      r10,1025
  c00013a4: 91 4b 00 b0   stw     r10,176(r11)
  c00013a8: 39 40 10 32   li      r10,4146
  c00013ac: 48 00 cc 59   bl      c000e004 <transfer_to_handler>

For a cleaner code dump, this patch replaces the itlbie and dtlbie symbols with local symbols:

  c000138c: 7d 84 63 78   mr      r4,r12
  c0001390: 75 25 58 00   andis.  r5,r9,22528
  c0001394: 75 2a 40 00   andis.  r10,r9,16384
  c0001398: 41 a2 00 08   beq     c00013a0 <InstructionTLBError+0xa0>
  c000139c: 7c 00 22 64   tlbie   r4,r0
  c00013a0: 39 40 04 01   li      r10,1025
  c00013a4: 91 4b 00 b0   stw     r10,176(r11)
  c00013a8: 39 40 10 32   li      r10,4146
  c00013ac: 48 00 cc 59   bl      c000e004 <transfer_to_handler>

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/selftest: fix type of mftb() in null_syscall  [Christophe Leroy; 1 file, -1/+1]

All callers of mftb() expect 'unsigned long', and the function itself only returns the lower part of the TB, so it really is 'unsigned long', not 'unsigned long long'.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
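As a hedged sketch (modelled on what such a selftest helper typically looks like; treat the exact asm as an assumption), the fixed helper reads the lower timebase into a general-purpose register, so the natural C type is register-sized:

  /* Sketch: mftb moves the (lower) Time Base into a GPR, i.e. an
   * unsigned long, not an unsigned long long. */
  static inline unsigned long mftb(void)
  {
          unsigned long tb;

          asm volatile("mftb %0" : "=r" (tb));
          return tb;
  }
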
2019-02-22  powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exit  [Paul Mackerras; 3 files, -25/+29]

Commit 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug", 2017-07-21) added two calls to opal_slw_set_reg() inside pnv_cpu_offline(), with the aim of changing the LPCR value in the SLW image to disable wakeups from the decrementer while a CPU is offline.

However, pnv_cpu_offline() gets called each time a secondary CPU thread is woken up to participate in running a KVM guest, that is, not just when a CPU is offlined. Since opal_slw_set_reg() is a very slow operation (with observed execution times around 20 milliseconds), an offline secondary CPU can often be busy doing the opal_slw_set_reg() call when the primary CPU wants to grab all the secondary threads so that it can run a KVM guest. This leads to messages like "KVM: couldn't grab CPU n" being printed, and to guest execution failing.

There is no need to reprogram the SLW image on every KVM guest entry and exit. Therefore this moves the calls to pnv_program_cpu_hotplug_lpcr() into pnv_smp_cpu_kill_self(), so that it is done only when a CPU is really transitioning between online and offline.

Fixes: 24be85a23d1f ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug")
Cc: [email protected] # v4.14+
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/64s: Fix logic when handling unknown CPU features  [Michael Ellerman; 1 file, -10/+7]

In cpufeatures_process_feature(), if a provided CPU feature is unknown and enable_unknown is false, we erroneously print that the feature is being enabled and return true, even though no feature has been enabled, and may also set feature bits based on the last entry in the match table.

Fix this so that we only set feature bits from the match table if we have actually enabled a feature from that table, and when failing to enable an unknown feature, always print the "not enabling" message and return false.

Coincidentally, some older gccs (<GCC 7), when invoked with -fsanitize-coverage=trace-pc, cause a spurious uninitialised variable warning in this function:

  arch/powerpc/kernel/dt_cpu_ftrs.c: In function ‘cpufeatures_process_feature’:
  arch/powerpc/kernel/dt_cpu_ftrs.c:686:7: warning: ‘m’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     if (m->cpu_ftr_bit_mask)

An upcoming patch will enable support for kcov, which requires this option. This patch avoids the warning.

Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features")
Reported-by: Segher Boessenkool <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
[ajd: add commit message]
Signed-off-by: Andrew Donnellan <[email protected]>
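A hedged sketch of the corrected control flow described above (the table and helper names here are simplified assumptions, not the real dt_cpu_ftrs.c code):

  /* Sketch: set feature bits only when a table entry actually matched;
   * unknown features log "not enabling" and return false. */
  static bool cpufeatures_process_feature(struct dt_cpu_feature *f)
  {
          const struct dt_cpu_feature_match *m;
          bool known = false;
          int i;

          for (i = 0; i < ARRAY_SIZE(match_table); i++) {  /* assumed name */
                  m = &match_table[i];
                  if (!strcmp(f->name, m->name)) {
                          known = true;
                          break;
                  }
          }

          if (!known && !enable_unknown) {
                  pr_info("not enabling: %s (unknown feature)\n", f->name);
                  return false;
          }

          /* Only touch feature bits when a table entry really matched. */
          if (known)
                  cur_cpu_spec->cpu_features |= m->cpu_ftr_bit_mask;

          return true;
  }
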
2019-02-22  powerpc/smp: Make __smp_send_nmi_ipi() static  [Nicholas Piggin; 1 file, -1/+2]

Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/smp: Fix NMI IPI xmon timeout  [Nicholas Piggin; 1 file, -64/+29]

The xmon debugger IPI handler waits in the callback function while xmon is still active. This means the handlers don't complete the IPI, and the initiator always times out waiting for them. Things manage to work after the timeout because there is some fallback logic to keep the NMI IPI state sane in case of the timeout, but this is a bit ugly.

This patch changes NMI IPI back to half-asynchronous (i.e., wait for everyone to call in, do not wait for the IPI function to complete), but the complexity is avoided by going one step further and allowing new IPIs to be issued before the IPI functions have all completed. If synchronization against that is required, it is left up to the caller, but current callers don't require it. In fact, with the timeout handling, callers must already be able to cope with this.

Fixes: 5b73151fff63 ("powerpc: NMI IPI make NMI IPIs fully sychronous")
Cc: [email protected] # v4.19+
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/smp: Fix NMI IPI timeout  [Nicholas Piggin; 1 file, -2/+3]

The NMI IPI timeout logic is broken: if __smp_send_nmi_ipi() times out on the first condition, delay_us will be zero, which sends it into the second spin loop with no timeout, so it will spin forever.

Fixes: 5b73151fff63 ("powerpc: NMI IPI make NMI IPIs fully sychronous")
Cc: [email protected] # v4.19+
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
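A hedged sketch of the failure mode and the fix (helper names assumed; the real code is structured differently):

  /* Sketch: decrement-style timeout. The broken version fell through to
   * the second wait with delay_us == 0, which that loop treated as
   * "no timeout", so it spun forever. Bailing out on expiry fixes it. */
  static int nmi_ipi_wait(u64 delay_us)
  {
          while (!all_cpus_called_in()) {     /* first condition (assumed) */
                  udelay(1);
                  if (delay_us) {
                          delay_us--;
                          if (!delay_us)
                                  return -ETIMEDOUT;  /* don't fall through */
                  }
          }
          while (!all_cpus_complete()) {      /* second condition (assumed) */
                  udelay(1);
                  if (delay_us) {
                          delay_us--;
                          if (!delay_us)
                                  return -ETIMEDOUT;
                  }
          }
          return 0;
  }
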
2019-02-22  powerpc: Make PPC_64K_PAGES depend on only 44x or PPC_BOOK3S_64  [Michael Ellerman; 1 file, -1/+1]

In commit 7820856a4fcd ("powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke") we dropped the 64K page size support from the 64-bit nohash (Book3E) code. But we didn't update the dependencies of the PPC_64K_PAGES option, meaning a randconfig can still trigger this code and cause a build breakage, eg:

  arch/powerpc/include/asm/nohash/64/pgtable.h:14:2: error: #error "Page size not supported"
  arch/powerpc/include/asm/nohash/mmu-book3e.h:275:2: error: #error Unsupported page size

So remove PPC_BOOK3E_64 from the dependencies. This also means we don't need to worry about PPC_FSL_BOOK3E, because that was just trying to prevent the PPC_BOOK3E_64=y && PPC_FSL_BOOK3E=y case.

Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/64: Make sys_switch_endian() traceable  [Michael Ellerman; 1 file, -1/+1]

We weren't using SYSCALL_DEFINE for sys_switch_endian(), which means it wasn't able to be traced by CONFIG_FTRACE_SYSCALLS. By using the macro we create the right metadata and the syscall is visible. eg:

  # cd /sys/kernel/debug/tracing
  # echo 1 | tee events/syscalls/sys_*_switch_endian/enable
  # ~/switch_endian_test
  # cat trace
  ...
  switch_endian_t-3604  [009] ....   315.175164: sys_switch_endian()
  switch_endian_t-3604  [009] ....   315.175167: sys_switch_endian -> 0x5555aaaa5555aaaa
  switch_endian_t-3604  [009] ....   315.175169: sys_switch_endian()
  switch_endian_t-3604  [009] ....   315.175169: sys_switch_endian -> 0x5555aaaa5555aaaa

Fixes: 529d235a0e19 ("powerpc: Add a proper syscall for switching endianness")
Signed-off-by: Michael Ellerman <[email protected]>
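A hedged sketch of the shape of the change (bodies elided; the point is the declaration style): defining the syscall via the SYSCALL_DEFINE macros emits the metadata that CONFIG_FTRACE_SYSCALLS keys off.

  #include <linux/syscalls.h>

  /* Before (sketch): a plain function, invisible to the syscall tracer. */
  long sys_switch_endian(void);

  /* After (sketch): SYSCALL_DEFINE0 also generates trace-event metadata. */
  SYSCALL_DEFINE0(switch_endian)
  {
          /* ... unchanged body ... */
  }
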
2019-02-22  powerpc/dts: Standardize DTS status assignments from "ok" to "okay"  [Robert P. J. Day; 4 files, -4/+4]

While the current kernel drivers/of/ code allows developers to be sloppy and use a DTS status value of "ok", the current DTSpec 0.1 makes it clear that the proper spelling is "okay", so fix the small number of PowerPC .dts files that do this.

Signed-off-by: Robert P. J. Day <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/book3s: Remove pgd/pud/pmd_set() interfaces  [Aneesh Kumar K.V; 2 files, -18/+4]

When updating page tables, we need to make sure we fill the page table entry valid bits. We do this by or'ing in one of PGD/PUD/PMD_VAL_BITS. The page table 'set' interfaces allow updating the raw value of page table entries without setting the valid bits, so remove those interfaces to avoid incorrect usage in future.

Signed-off-by: Aneesh Kumar K.V <[email protected]>
[mpe: Reword commit message based on mailing list discussion]
Signed-off-by: Michael Ellerman <[email protected]>
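As a hedged illustration of the distinction (simplified from the book3s64 headers; treat the exact macro names as assumptions), the populate-style helpers that remain always OR in the valid bits, which a raw set interface did not:

  /* Sketch: the kept interface includes the valid bits unconditionally. */
  static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
  {
          *pgd = __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
  }

  /* A raw pgd_set(pgd, val) would store val verbatim, so a caller could
   * silently install an entry with the valid bits missing. */
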
2019-02-22  powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest  [Mark Cave-Ayland; 1 file, -1/+1]

Commit 8792468da5e1 "powerpc: Add the ability to save FPU without giving it up" unexpectedly removed the MSR_FE0 and MSR_FE1 bits from the bitmask used to update the MSR of the previous thread in __giveup_fpu(), causing a KVM-PR MacOS guest to lock up and panic the host kernel.

Leaving FE0/1 enabled means unrelated processes might receive FPEs when they're not expecting them and crash. In particular, if this happens to init the host will then panic. eg (transcribed):

  qemu-system-ppc[837]: unhandled signal 8 at 12cc9ce4 nip 12cc9ce4 lr 12cc9ca4 code 0
  systemd[1]: unhandled signal 8 at 202f02e0 nip 202f02e0 lr 001003d4 code 0
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Reinstate these bits to the MSR bitmask to enable MacOS guests to run under 32-bit KVM-PR once again without issue.

Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up")
Cc: [email protected] # v4.6+
Signed-off-by: Mark Cave-Ayland <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
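A hedged sketch of the fix in __giveup_fpu() (context trimmed to the bitmask; the surrounding code is assumed):

  /* Sketch: when giving up the FPU, clear the FP-available bit and,
   * once again, the floating-point exception mode bits for the previous
   * thread, so it cannot receive unexpected FPEs. */
  msr = tsk->thread.regs->msr;
  msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1);
  tsk->thread.regs->msr = msr;
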
2019-02-22  powerpc/pseries: export timebase register sample in lparcfg  [Tyrel Datwyler; 1 file, -0/+1]

The Processor Utilization of Resource Registers (PURR) provide an estimate of resources used by a cpu thread. Section 7.6 in Book III of the ISA outlines how to calculate the percentage of shared resources for threads using the ratio of the PURR delta and the Timebase Register delta for a sampled period. This calculation is currently done erroneously by the lparstat tool from the powerpc-utils package. This patch exports the current timebase value after we sample the PURRs and exposes it to userspace accounting tools via /proc/ppc64/lparcfg.

Signed-off-by: Tyrel Datwyler <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/44x: Force PCI on for CURRITUCK  [Michael Ellerman; 1 file, -0/+1]

The recent rework of PCI kconfig symbols exposed an existing bug in the CURRITUCK kconfig logic. It selects PPC4xx_PCI_EXPRESS which depends on PCI, but PCI is user selectable and might be disabled, leading to a warning:

  WARNING: unmet direct dependencies detected for PPC4xx_PCI_EXPRESS
    Depends on [n]: PCI [=n] && 4xx [=y]
    Selected by [y]:
    - CURRITUCK [=y] && PPC_47x [=y]

Prior to commit eb01d42a7778 ("PCI: consolidate PCI config entry in drivers/pci") PCI was enabled by default for currituck_defconfig so we didn't see the warning. The bad logic was still there, it just required someone disabling PCI in their .config to hit it.

Fix it by forcing PCI on for CURRITUCK, which seems was always the expectation anyway.

Fixes: eb01d42a7778 ("PCI: consolidate PCI config entry in drivers/pci")
Reported-by: Randy Dunlap <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/eeh: Add eeh_force_recover to debugfs  [Oliver O'Halloran; 3 files, -10/+75]

This patch adds a debugfs interface to force scheduling a recovery event. This can be used to recover a specific PE, or to schedule a "special" recovery event that checks for errors at the PHB level.

To force a recovery of a normal PE, use:

  echo '<#pe>:<#phb>' > /sys/kernel/debug/powerpc/eeh_force_recover

To force a scan for broken PHBs:

  echo 'hwcheck' > /sys/kernel/debug/powerpc/eeh_force_recover

Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/eeh: Allow disabling recovery  [Oliver O'Halloran; 3 files, -0/+20]

Currently when we detect an error we automatically invoke the EEH recovery handler. This can be annoying when debugging EEH problems, or when working on EEH itself, so this patch adds a debugfs knob that prevents a recovery event from being queued up when an issue is detected.

Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
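A hedged sketch of such a knob (the file and variable names here are hypothetical, not necessarily those used by the patch):

  #include <linux/debugfs.h>
  #include <asm/debugfs.h>

  static bool eeh_no_recover;    /* hypothetical name */

  static int __init eeh_no_recover_init(void)
  {
          /* A debugfs-backed bool the error-detection path can consult
           * before queueing a recovery event. */
          debugfs_create_bool("eeh_disable_recovery", 0600,
                              powerpc_debugfs_root, &eeh_no_recover);
          return 0;
  }

  /* In the detection path (sketch):
   *     if (eeh_no_recover)
   *             return;   // skip queueing the recovery event
   */
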
2019-02-22  powerpc/pci: Add pci_find_controller_for_domain()  [Oliver O'Halloran; 2 files, -0/+13]

Add a helper to find the pci_controller structure based on the domain number / phb id.

Signed-off-by: Oliver O'Halloran <[email protected]>
Reviewed-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
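A hedged sketch of what such a helper can look like on powerpc (assuming the usual global hose_list of pci_controller structures; the matching field is an assumption):

  /* Sketch: walk the PHB list and match on the domain number. */
  struct pci_controller *pci_find_controller_for_domain(int domain_nr)
  {
          struct pci_controller *hose;

          list_for_each_entry(hose, &hose_list, list_node)
                  if (hose->global_number == domain_nr)
                          return hose;

          return NULL;
  }
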
2019-02-22  powerpc/eeh_cache: Bump log level of eeh_addr_cache_print()  [Oliver O'Halloran; 1 file, -1/+1]

To use this function at all, #define DEBUG needs to be set in eeh_cache.c. And printing at pr_debug is not all that useful anyway, since with dynamic_debug in use it adds the additional hurdle of requiring you to enable the debug print. This patch bumps it to pr_info.

Signed-off-by: Oliver O'Halloran <[email protected]>
Reviewed-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/eeh_cache: Add a way to dump the EEH address cache  [Oliver O'Halloran; 3 files, -4/+30]

Adds a debugfs file that can be read to view the contents of the EEH address cache. This is pretty similar to the existing eeh_addr_cache_print() function, but that function is intended for debugging issues inside the kernel, since it's #ifdef'ed out by default and writes into the kernel log.

Signed-off-by: Oliver O'Halloran <[email protected]>
Reviewed-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/eeh_cache: Add pr_debug() prints for insert/remove  [Oliver O'Halloran; 1 file, -2/+2]

The EEH address cache is used to map a physical MMIO address back to a PCI device. It's useful to know when it's being manipulated, but currently this requires recompiling with #define DEBUG set. This is pointless since we have dynamic_debug nowadays, so remove the #ifdef guard and add a pr_debug() for the remove case too.

Signed-off-by: Oliver O'Halloran <[email protected]>
Reviewed-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/eeh: Use debugfs_create_u32 for eeh_max_freezes  [Oliver O'Halloran; 2 files, -19/+4]

There's no need for the custom getter/setter functions, so remove them in favour of the generic ones. While we're here, change the type of eeh_max_freeze to u32 and print the value in decimal rather than hex, because printing it in hex makes no sense.

Signed-off-by: Oliver O'Halloran <[email protected]>
Reviewed-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
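A hedged sketch of the replacement (the parent dentry is assumed to be the powerpc debugfs root):

  #include <linux/debugfs.h>
  #include <asm/debugfs.h>

  static u32 eeh_max_freezes;

  static int __init eeh_max_freezes_init(void)
  {
          /* One generic helper replaces the custom getter/setter fops
           * and presents the value in decimal. */
          debugfs_create_u32("eeh_max_freezes", 0600,
                             powerpc_debugfs_root, &eeh_max_freezes);
          return 0;
  }
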
2019-02-22  powerpc: drop unused GENERIC_CSUM Kconfig item  [Christophe Leroy; 2 files, -7/+0]

Commit d4fde568a34a ("powerpc/64: Use optimized checksum routines on little-endian") converted the last powerpc user of GENERIC_CSUM. This patch does a final cleanup, dropping the GENERIC_CSUM Kconfig option, which is always 'n', and the associated piece of code in asm/checksum.h.

Fixes: d4fde568a34a ("powerpc/64: Use optimized checksum routines on little-endian")
Reported-by: Christoph Hellwig <[email protected]>
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  powerpc/64s/hash: Fix assert_slb_presence() use of the slbfee. instruction  [Nicholas Piggin; 1 file, -0/+5]

The slbfee. instruction must have bit 24 of RB clear; failure to do so can result in false negatives that result in incorrect assertions. This is not obvious from the ISA v3.0B document, which only says:

  The hardware ignores the contents of RB 36:38 40:63
                                                  -- p.1032

This patch fixes the bug and also clears all other bits from PPC bits 36-63, which is good practice when dealing with reserved or ignored bits.

Fixes: e15a4fea4dee ("powerpc/64s/hash: Add some SLB debugging tests")
Cc: [email protected] # v4.20+
Reported-by: Aneesh Kumar K.V <[email protected]>
Tested-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
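A hedged sketch of the masking (simplified; the assertion helper's surrounding feature checks are elided): PPC big-endian bit numbering counts from the most significant bit, so PPC bits 36-63 of a 64-bit value are the low 28 bits.

  /* Sketch: clear the low 28 bits (PPC bits 36-63) of the EA, which per
   * the description covers the bit slbfee. requires to be clear, before
   * issuing the instruction. */
  static void assert_slb_presence(bool present, unsigned long ea)
  {
          unsigned long tmp;

          ea &= ~((1UL << 28) - 1);
          asm volatile(__PPC_SLBFEE_DOT(%0, %1) : "=r" (tmp) : "r" (ea) : "cr0");

          WARN_ON(present == (tmp == 0));
  }
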
2019-02-22  powerpc/mm/hash: Increase vmalloc space to 512T with hash MMU  [Michael Ellerman; 1 file, -9/+23]

This patch updates the kernel non-linear virtual map to 512TB when we're built with 64K page size and are using the hash MMU. We allocate one context for the vmalloc region and hence the max virtual area size is limited by the context map size (512TB for 64K and 64TB for 4K page size).

This patch fixes boot failures with large amounts of system RAM where we need large vmalloc space to handle per cpu allocations.

Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Tested-by: Aneesh Kumar K.V <[email protected]>
2019-02-22  powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning  [Michael Ellerman; 1 file, -2/+8]

GCC 8 warns about the logic in vr_get/set(), which with -Werror breaks the build:

  In function ‘user_regset_copyin’,
      inlined from ‘vr_set’ at arch/powerpc/kernel/ptrace.c:628:9:
  include/linux/regset.h:295:4: error: ‘memcpy’ offset [-527, -529] is out of the bounds [0, 16] of object ‘vrsave’ with type ‘union <anonymous>’ [-Werror=array-bounds]
  arch/powerpc/kernel/ptrace.c: In function ‘vr_set’:
  arch/powerpc/kernel/ptrace.c:623:5: note: ‘vrsave’ declared here
     } vrsave;

This has been identified as a regression in GCC, see GCC bug 88273. However we can avoid the warning and also simplify the logic and make it more robust.

Currently we pass -1 as end_pos to user_regset_copyout(). This says "copy up to the end of the regset". The definition of the regset is:

  [REGSET_VMX] = {
          .core_note_type = NT_PPC_VMX, .n = 34,
          .size = sizeof(vector128), .align = sizeof(vector128),
          .active = vr_active, .get = vr_get, .set = vr_set
  },

The end is calculated as (n * size), ie. 34 * sizeof(vector128). In vr_get/set() we pass start_pos as 33 * sizeof(vector128), meaning we can copy up to sizeof(vector128) into/out-of vrsave. The on-stack vrsave is defined as:

  union {
          elf_vrreg_t reg;
          u32 word;
  } vrsave;

And elf_vrreg_t is:

  typedef __vector128 elf_vrreg_t;

So there is no bug, but we rely on all those sizes lining up; otherwise we would have a kernel stack exposure/overwrite on our hands. Rather than relying on that, we can pass an explicit end_pos based on the sizeof(vrsave). The result should be exactly the same, but it's more obviously not over-reading/writing the stack and it avoids the compiler warning.

Reported-by: Meelis Roos <[email protected]>
Reported-by: Mathieu Malaterre <[email protected]>
Cc: [email protected]
Tested-by: Mathieu Malaterre <[email protected]>
Tested-by: Meelis Roos <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
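A hedged sketch of the resulting call (per the regset copy API described above; surrounding context trimmed):

  /* Sketch: bound the copy by the size of the on-stack vrsave union
   * instead of passing -1 ("to the end of the regset"). */
  int start = 33 * sizeof(vector128);
  int end = start + sizeof(vrsave);

  ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
                            &vrsave, start, end);
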
2019-02-22  powerpc/powernv/npu: Remove redundant change_pte() hook  [Peter Xu; 1 file, -10/+0]

The change_pte() notifier was designed as a quick path to update secondary MMU PTEs on write permission changes or PFN changes. For KVM, it could reduce vm-exits when a vcpu faults on pages that were touched up by KSM. It's not meant for cache invalidations: the notifier is called before the real PTE update anyway (see set_pte_at_notify(), where set_pte_at() is called afterwards). All the necessary cache invalidation is already done in invalidate_range().

Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Reviewed-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  Merge branch 'topic/ppc-kvm' into next  [Michael Ellerman; 8 files, -91/+54]

Merge commits we're sharing with the kvm-ppc tree.
2019-02-21  powerpc/64s: Better printing of machine check info for guest MCEs  [Paul Mackerras; 4 files, -7/+9]

This adds an "in_guest" parameter to machine_check_print_event_info() so that we can avoid trying to translate guest NIP values into symbolic form using the host kernel's symbol table.

Reviewed-by: Aravinda Prasad <[email protected]>
Reviewed-by: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-21  KVM: PPC: Book3S HV: Simplify machine check handling  [Paul Mackerras; 5 files, -84/+42]

This makes the handling of machine check interrupts that occur inside a guest simpler and more robust, with less done in assembler code and in real mode.

Now, when a machine check occurs inside a guest, we always get the machine check event struct and put a copy in the vcpu struct for the vcpu where the machine check occurred. We no longer call machine_check_queue_event() from kvmppc_realmode_mc_power7(), because on POWER8, when a vcpu is running on an offline secondary thread and we call machine_check_queue_event(), that calls irq_work_queue(), which doesn't work because the CPU is offline, but instead triggers the WARN_ON(lazy_irq_pending()) in pnv_smp_cpu_kill_self() (which fires again and again because nothing clears the condition). All that machine_check_queue_event() actually does is to cause the event to be printed to the console. For a machine check occurring in the guest, we now print the event in kvmppc_handle_exit_hv() instead.

The assembly code at label machine_check_realmode now just calls C code and then continues exiting the guest. We no longer either synthesize a machine check for the guest in assembly code or return to the guest without a machine check.

The code in kvmppc_handle_exit_hv() is extended to handle the case where the guest is not FWNMI-capable. In that case we now always synthesize a machine check interrupt for the guest. Previously, if the host thought it had recovered the machine check fully, it would return to the guest without any notification that the machine check had occurred. If the machine check was caused by some action of the guest (such as creating duplicate SLB entries), it is much better to tell the guest that it has caused a problem. Therefore we now always generate a machine check interrupt for guests that are not FWNMI-capable.

Reviewed-by: Aravinda Prasad <[email protected]>
Reviewed-by: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-21  Merge branch 'topic/dma' into next  [Michael Ellerman; 48 files, -1158/+317]

Merge hch's big DMA rework series. This is in a topic branch in case he wants to merge it to minimise conflicts.
2019-02-21  KVM: PPC: Book3S HV: Context switch AMR on Power9  [Michael Ellerman; 1 file, -1/+4]

kvmhv_p9_guest_entry() implements a fast-path guest entry for Power9 when guest and host are both running with the Radix MMU. Currently in that path we don't save the host AMR (Authority Mask Register) value, and we always restore 0 on return to the host. That is OK at the moment because the AMR is not used for storage keys with the Radix MMU.

However we plan to start using the AMR on Radix to prevent the kernel from reading/writing to userspace outside of copy_to/from_user(). In order to make that work we need to save/restore the AMR value.

We only restore the value if it is different from the guest value, which is already in the register when we exit to the host. This should mean we rarely need to actually restore the value when running a modern Linux as a guest, because it will be using the same value as us.

Signed-off-by: Michael Ellerman <[email protected]>
Tested-by: Russell Currey <[email protected]>
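A hedged sketch of the save/restore as described (the entry/exit code in between is elided):

  /* Sketch: save the host AMR before entering the guest ... */
  unsigned long host_amr = mfspr(SPRN_AMR);

  /* ... run the guest; on exit the register holds the guest value ... */

  /* ... and write it back only when it differs, which should be rare
   * when host and guest are both a modern Linux using the same value. */
  if (host_amr != vcpu->arch.amr)
          mtspr(SPRN_AMR, host_amr);
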
2019-02-19  Merge branch 'fixes' into next  [Michael Ellerman; 15 files, -35/+60]

There are a few important fixes in our fixes branch, in particular the pgd/pud_present() one, so merge it now.
2019-02-18  powerpc/dma: trim the fat from <asm/dma-mapping.h>  [Christoph Hellwig; 6 files, -29/+14]

There is no need to provide anything but get_arch_dma_ops to <linux/dma-mapping.h>. Move the remaining declarations to <asm/iommu.h> and drop all the includes.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove set_dma_offset  [Christoph Hellwig; 8 files, -20/+11]

There is no good reason for this helper; just open-code it.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove get_dma_offset  [Christoph Hellwig; 2 files, -18/+6]

Just fold the calculation into __phys_to_dma/__dma_to_phys as those are the only places that should know about it.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
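A hedged sketch of the folding (the per-device offset field name follows powerpc's archdata; any NULL-dev handling the real code may need is elided):

  /* Sketch: apply the per-device offset directly in the converters
   * instead of going through a get_dma_offset() helper. */
  static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
          return paddr + dev->archdata.dma_offset;
  }

  static inline phys_addr_t __dma_to_phys(struct device *dev, dma_addr_t daddr)
  {
          return daddr - dev->archdata.dma_offset;
  }
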
2019-02-18  powerpc/dma: use the generic direct mapping bypass  [Christoph Hellwig; 25 files, -222/+16]

Now that we've switched all the powerpc nommu and swiotlb methods to use the generic dma_direct_* calls we can remove these ops vectors entirely and rely on the common direct mapping bypass that avoids indirect function calls entirely. This also allows removing a whole lot of boilerplate code related to setting up these operations.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: use the dma_direct mapping routines  [Christoph Hellwig; 5 files, -120/+32]

Switch the streaming DMA mapping and ownership transfer methods to the functionally identical dma_direct_ versions. Factor the cache maintenance helpers into the form expected by the common code for that.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: use the dma-direct allocator for coherent platforms  [Christoph Hellwig; 5 files, -92/+9]

The generic code allows a few nice things such as node local allocations and dipping into the CMA area. The lookup of the right zone for a given dma mask works a little differently, but the results should be the same.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  swiotlb: remove swiotlb_dma_supported  [Christoph Hellwig; 3 files, -16/+1]

The only user left is powerpc, but even there the generic dma-direct version works just as well, given that we guarantee that the swiotlb buffer must always be addressable.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove dma_nommu_dma_supported  [Christoph Hellwig; 3 files, -26/+2]

This function is largely identical to the generic version used everywhere else. Replace it with the generic version.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove dma_nommu_get_required_mask  [Christoph Hellwig; 3 files, -15/+2]

This function is identical to the generic dma_direct_get_required_mask, except that the generic version also takes the bus_dma_mask into account, the lack of which could lead to incorrect results in the powerpc version.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>