aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/sysdev/xive/spapr.c
AgeCommit message (Collapse)AuthorFilesLines
2020-07-30powerpc: fix function annotations to avoid section mismatch warnings with gcc-10Vladis Dronov1-1/+1
Certain warnings are emitted for powerpc code when building with a gcc-10 toolset: WARNING: modpost: vmlinux.o(.text.unlikely+0x377c): Section mismatch in reference from the function remove_pmd_table() to the function .meminit.text:split_kernel_mapping() The function remove_pmd_table() references the function __meminit split_kernel_mapping(). This is often because remove_pmd_table lacks a __meminit annotation or the annotation of split_kernel_mapping is wrong. Add the appropriate __init and __meminit annotations to make modpost not complain. In all the cases there are just a single callsite from another __init or __meminit function: __meminit remove_pagetable() -> remove_pud_table() -> remove_pmd_table() __init prom_init() -> setup_secure_guest() __init xive_spapr_init() -> xive_spapr_disabled() Signed-off-by: Vladis Dronov <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-28powerpc/xive: Share the event-queue page with the Hypervisor.Ram Pai1-0/+7
XIVE interrupt controller uses an Event Queue (EQ) to enqueue event notifications when an exception occurs. The EQ is a single memory page provided by the O/S defining a circular buffer, one per server and priority couple. On baremetal, the EQ page is configured with an OPAL call. On pseries, an extra hop is necessary and the guest OS uses the hcall H_INT_SET_QUEUE_CONFIG to configure the XIVE interrupt controller. The XIVE controller being Hypervisor privileged, it will not be allowed to enqueue event notifications for a Secure VM unless the EQ pages are shared by the Secure VM. Hypervisor/Ultravisor still requires support for the TIMA and ESB page fault handlers. Until this is complete, QEMU can use the emulated XIVE device for Secure VMs, option "kernel_irqchip=off" on the QEMU pseries machine. Signed-off-by: Ram Pai <[email protected]> Reviewed-by: Cedric Le Goater <[email protected]> Reviewed-by: Greg Kurz <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-03-27powerpc/xive: Add a debugfs file to dump internal XIVE stateCédric Le Goater1-0/+19
As does XMON, the debugfs file /sys/kernel/debug/powerpc/xive exposes the XIVE internal state of the machine CPUs and interrupts. Available on the PowerNV and sPAPR platforms. Signed-off-by: Cédric Le Goater <[email protected]> [mpe: Make the debugfs file 0400] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-03-27powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIsCédric Le Goater1-2/+2
When a CPU is brought up, an IPI number is allocated and recorded under the XIVE CPU structure. Invalid IPI numbers are tracked with interrupt number 0x0. On the PowerNV platform, the interrupt number space starts at 0x10 and this works fine. However, on the sPAPR platform, it is possible to allocate the interrupt number 0x0 and this raises an issue when CPU 0 is unplugged. The XIVE spapr driver tracks allocated interrupt numbers in a bitmask and it is not correctly updated when interrupt number 0x0 is freed. It stays allocated and it is then impossible to reallocate. Fix by using the XIVE_BAD_IRQ value instead of zero on both platforms. Reported-by: David Gibson <[email protected]> Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: [email protected] # v4.14+ Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Tested-by: David Gibson <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2019-12-05powerpc/xive: Skip ioremap() of ESB pages for LSI interruptsCédric Le Goater1-2/+10
The PCI INTx interrupts and other LSI interrupts are handled differently under a sPAPR platform. When the interrupt source characteristics are queried, the hypervisor returns an H_INT_ESB flag to inform the OS that it should be using the H_INT_ESB hcall for interrupt management and not loads and stores on the interrupt ESB pages. A default -1 value is returned for the addresses of the ESB pages. The driver ignores this condition today and performs a bogus IO mapping. Recent changes and the DEBUG_VM configuration option make the bug visible with : kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=1024 NUMA pSeries Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-0.rc6.git0.1.fc32.ppc64le #1 NIP: c000000000f63294 LR: c000000000f62e44 CTR: 0000000000000000 REGS: c0000000fa45f0d0 TRAP: 0700 Not tainted (5.4.0-0.rc6.git0.1.fc32.ppc64le) ... NIP ioremap_page_range+0x4c4/0x6e0 LR ioremap_page_range+0x74/0x6e0 Call Trace: ioremap_page_range+0x74/0x6e0 (unreliable) do_ioremap+0x8c/0x120 __ioremap_caller+0x128/0x140 ioremap+0x30/0x50 xive_spapr_populate_irq_data+0x170/0x260 xive_irq_domain_map+0x8c/0x170 irq_domain_associate+0xb4/0x2d0 irq_create_mapping+0x1e0/0x3b0 irq_create_fwspec_mapping+0x27c/0x3e0 irq_create_of_mapping+0x98/0xb0 of_irq_parse_and_map_pci+0x168/0x230 pcibios_setup_device+0x88/0x250 pcibios_setup_bus_devices+0x54/0x100 __of_scan_bus+0x160/0x310 pcibios_scan_phb+0x330/0x390 pcibios_init+0x8c/0x128 do_one_initcall+0x60/0x2c0 kernel_init_freeable+0x290/0x378 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x80 Fixes: bed81ee181dd ("powerpc/xive: introduce H_INT_ESB hcall") Cc: [email protected] # v4.14+ Signed-off-by: Cédric Le Goater <[email protected]> Tested-by: Daniel Axtens <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2019-08-19powerpc/xive: Fix dump of XIVE interrupt under pseriesCédric Le Goater1-0/+51
The xmon 'dxi' command calls OPAL to query the XIVE configuration of a interrupt. This can only be done on baremetal (PowerNV) and it will crash a pseries machine. Introduce a new XIVE get_irq_config() operation which implements a different query depending on the platform, PowerNV or pseries, and modify xmon to use a top level wrapper. Signed-off-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2019-08-15powerpc/xive: Add a check for memory allocation failureChristophe JAILLET1-0/+4
The result of this kzalloc is not checked. Add a check and corresponding error handling code. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Greg Kurz <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/cc53462734dfeaf15b6bad0e626b483de18656b4.1564647619.git.christophe.jaillet@wanadoo.fr
2019-08-15powerpc/xive: Use GFP_KERNEL instead of GFP_ATOMIC in 'xive_irq_bitmap_add()'Christophe JAILLET1-1/+1
There is no need to use GFP_ATOMIC here. GFP_KERNEL should be enough. GFP_KERNEL is also already used for another allocation just a few lines below. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Cédric Le Goater <[email protected]> Reviewed-by: Greg Kurz <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/85d5d247ce753befd6aa63c473f7823de6520ccd.1564647619.git.christophe.jaillet@wanadoo.fr
2019-07-13Merge tag 'powerpc-5.3-1' of ↵Linus Torvalds1-1/+51
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver, as well as some other functions only used by drivers that haven't (yet?) made it upstream. - A fix for a bug in our handling of hardware watchpoints (eg. perf record -e mem: ...) which could lead to register corruption and kernel crashes. - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for vmalloc when using the Radix MMU. - A large but incremental rewrite of our exception handling code to use gas macros rather than multiple levels of nested CPP macros. And the usual small fixes, cleanups and improvements. Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz, Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj Jitindar Singh, Thiago Jung Bauermann, YueHaibing" * tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits) powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state. powerpc/eeh: Handle hugepages in ioremap space ocxl: Update for AFU descriptor template version 1.1 powerpc/boot: pass CONFIG options in a simpler and more robust way powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h powerpc/irq: Don't WARN continuously in arch_local_irq_restore() powerpc/module64: Use symbolic instructions names. powerpc/module32: Use symbolic instructions names. powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h powerpc/module64: Fix comment in R_PPC64_ENTRY handling powerpc/boot: Add lzo support for uImage powerpc/boot: Add lzma support for uImage powerpc/boot: don't force gzipped uImage powerpc/8xx: Add microcode patch to move SMC parameter RAM. powerpc/8xx: Use IO accessors in microcode programming. powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c powerpc/8xx: refactor programming of microcode CPM params. powerpc/8xx: refactor printing of microcode patch name. powerpc/8xx: Refactor microcode write powerpc/8xx: refactor writing of CPM microcode arrays ...
2019-06-02powerpc/pseries: Fix xive=off command lineGreg Kurz1-1/+51
On POWER9, if the hypervisor supports XIVE exploitation mode, the guest OS will unconditionally requests for the XIVE interrupt mode even if XIVE was deactivated with the kernel command line xive=off. Later on, when the spapr XIVE init code handles xive=off, it disables XIVE and tries to fall back on the legacy mode XICS. This discrepency causes a kernel panic because the hypervisor is configured to provide the XIVE interrupt mode to the guest : kernel BUG at arch/powerpc/sysdev/xics/xics-common.c:135! ... NIP xics_smp_probe+0x38/0x98 LR xics_smp_probe+0x2c/0x98 Call Trace: xics_smp_probe+0x2c/0x98 (unreliable) pSeries_smp_probe+0x40/0xa0 smp_prepare_cpus+0x62c/0x6ec kernel_init_freeable+0x148/0x448 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x68 Look for xive=off during prom_init and don't ask for XIVE in this case. One exception though: if the host only supports XIVE, we still want to boot so we ignore xive=off. Similarly, have the spapr XIVE init code to looking at the interrupt mode negotiated during CAS, and ignore xive=off if the hypervisor only supports XIVE. Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: [email protected] # v4.20 Reported-by: Pavithra R. Prakash <[email protected]> Signed-off-by: Greg Kurz <[email protected]> Reviewed-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-05-10powerpc/xive: prepare all hcalls to support long busy delaysCédric Le Goater1-8/+28
This is not the case for the moment, but future releases of pHyp might need to introduce some synchronisation routines under the hood which would make the XIVE hcalls longer to complete. As this was done for H_INT_RESET, let's wrap the other hcalls in a loop catching the H_LONG_BUSY_* codes. Signed-off-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-05-10powerpc/xive: fix hcall H_INT_RESET to support long busy delaysCédric Le Goater1-5/+47
The hcall H_INT_RESET can take some time to complete and in such cases it returns H_LONG_BUSY_* codes requiring the machine to sleep for a while before retrying. Signed-off-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-02-15powerpc/xive: Use hw CPU ids when configuring the CPU queuesCédric Le Goater1-6/+10
The CPU event notification queues on sPAPR should be configured using a hardware CPU identifier. The problem did not show up on the Power Hypervisor because pHyp supports 8 threads per core which keeps CPU number contiguous. This is not the case on all sPAPR virtual machines, some use SMT=1. Also improve error logging by adding the CPU number. Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: [email protected] # v4.14+ Signed-off-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-10-04powerpc/xive: Fix IPI resetCédric Le Goater1-0/+4
When resetting an IPI, hw_ipi should also be set to zero. Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Signed-off-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-09-04powerpc/xive: Fix section __init warningCédric Le Goater1-1/+1
xive_spapr_init() is called from a __init routine and calls __init routines. Signed-off-by: Cédric Le Goater <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-09-02powerpc/xive: introduce H_INT_ESB hcallCédric Le Goater1-1/+43
The H_INT_ESB hcall() is used to issue a load or store to the ESB page instead of using the MMIO pages. This can be used as a workaround on some HW issues. The OS knows that this hcall should be used on an interrupt source when the ESB hcall flag is set to 1 in the hcall H_INT_GET_SOURCE_INFO. To maintain the frontier between the xive frontend and backend, we introduce a new xive operation 'esb_rw' to be used in the routines doing memory accesses on the ESBs. Signed-off-by: Cédric Le Goater <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-09-02powerpc/xive: add the HW IRQ number under xive_irq_dataCédric Le Goater1-0/+2
It will be required later by the H_INT_ESB hcall. Signed-off-by: Cédric Le Goater <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-09-02powerpc/xive: guest exploitation of the XIVE interrupt controllerCédric Le Goater1-0/+618
This is the framework for using XIVE in a PowerVM guest. The support is very similar to the native one in a much simpler form. Each source is associated with an Event State Buffer (ESB). This is a two bit state machine which is used to trigger events. The bits are named "P" (pending) and "Q" (queued) and can be controlled by MMIO. The Guest OS registers event (or notifications) queues on which the HW will post event data for a target to notify. Instead of OPAL calls, a set of Hypervisors call are used to configure the interrupt sources and the event/notification queues of the guest: - H_INT_GET_SOURCE_INFO used to obtain the address of the MMIO page of the Event State Buffer (PQ bits) entry associated with the source. - H_INT_SET_SOURCE_CONFIG assigns a source to a "target". - H_INT_GET_SOURCE_CONFIG determines to which "target" and "priority" is assigned to a source - H_INT_GET_QUEUE_INFO returns the address of the notification management page associated with the specified "target" and "priority". - H_INT_SET_QUEUE_CONFIG sets or resets the event queue for a given "target" and "priority". It is also used to set the notification config associated with the queue, only unconditional notification for the moment. Reset is performed with a queue size of 0 and queueing is disabled in that case. - H_INT_GET_QUEUE_CONFIG returns the queue settings for a given "target" and "priority". - H_INT_RESET resets all of the partition's interrupt exploitation structures to their initial state, losing all configuration set via the hcalls H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG. - H_INT_SYNC issue a synchronisation on a source to make sure sure all notifications have reached their queue. As for XICS, the XIVE interface for the guest is described in the device tree under the "interrupt-controller" node. A couple of new properties are specific to XIVE : - "reg" contains the base address and size of the thread interrupt managnement areas (TIMA), also called rings, for the User level and for the Guest OS level. Only the Guest OS level is taken into account today. - "ibm,xive-eq-sizes" the size of the event queues. One cell per size supported, contains log2 of size, in ascending order. - "ibm,xive-lisn-ranges" the interrupt numbers ranges assigned to the guest. These are allocated using a simple bitmap. and also : - "/ibm,plat-res-int-priorities" contains a list of priorities that the hypervisor has reserved for its own use. Tested with a QEMU XIVE model for pseries and with the Power hypervisor. Signed-off-by: Cédric Le Goater <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>