Age | Commit message | Author | Files | Lines
2017-02-03 | powerpc: Enable support for GCC plugins | Andrew Donnellan | 2 | -0/+9
Enable support for GCC plugins on powerpc. Add an additional version check in gcc-plugins-check to advise users to upgrade to gcc 5.2+ on powerpc to avoid issues with header files (gcc <= 4.6) or missing copies of rs6000-cpus.def (4.8 to 5.1 on 64-bit targets). Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-03 | powerpc: Correctly disable latent entropy GCC plugin on prom_init.o | Andrew Donnellan | 1 | -1/+1
Commit 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") excludes certain powerpc early boot code from the latent entropy plugin by adding appropriate CFLAGS. It looks like this was supposed to cover prom_init.o, but ended up saying init.o (which doesn't exist) instead. Fix the typo. Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") Signed-off-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-03 | gcc-plugins: Fix definition of DISABLE_LATENT_ENTROPY_PLUGIN | Andrew Donnellan | 1 | -1/+1
The variable DISABLE_LATENT_ENTROPY_PLUGIN is defined when CONFIG_PAX_LATENT_ENTROPY is set. That config symbol is left over from the original PaX version of the plugin code and doesn't actually exist. Change the condition to depend on CONFIG_GCC_PLUGIN_LATENT_ENTROPY instead. Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") Signed-off-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-03 | cxl: Fix build when CONFIG_DEBUG_FS=n | Andrew Donnellan | 2 | -5/+57
Stub out the debugfs functions so that the build doesn't break when CONFIG_DEBUG_FS=n. Reported-by: Michael Ellerman <[email protected]> Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
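The stubbing pattern described above can be sketched as follows. This is a minimal illustration, not the cxl driver's code: `struct demo_adapter` and `demo_debugfs_adapter_add()` are hypothetical names, and `CONFIG_DEBUG_FS` is treated as an ordinary preprocessor symbol.

```c
/*
 * Sketch of "stub out when the config option is off".
 * All demo_* names are hypothetical; they are not the cxl driver's API.
 */
struct demo_adapter {
	const char *name;
	int debugfs_ready;	/* set only by the real implementation */
};

#ifdef CONFIG_DEBUG_FS
/* Real variant: would create the debugfs entries for the adapter. */
static inline int demo_debugfs_adapter_add(struct demo_adapter *adapter)
{
	adapter->debugfs_ready = 1;
	return 0;
}
#else
/* Stub variant: same signature, does nothing, reports success. */
static inline int demo_debugfs_adapter_add(struct demo_adapter *adapter)
{
	(void)adapter;
	return 0;
}
#endif
```

Because both variants share one signature, callers need no `#ifdef` of their own, which is what keeps the build working with the option disabled.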
2017-02-02 | powerpc/pseries: Report DLPAR capabilities | Nathan Fontenot | 1 | -1/+7
As we add the ability to do DLPAR of additional devices through the sysfs interface we need to know which devices are supported. This adds the reporting of supported devices with a comma separated list reported in the existing /sys/kernel/dlpar. Signed-off-by: Nathan Fontenot <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/pseries: Update affinity for memory and cpus specified in a PRRN event | John Allen | 2 | -1/+40
Extend the existing PRRN infrastructure to perform the actual affinity updating for cpus and memory in addition to the device tree updating. For cpus, dynamic affinity updating already appears to exist in the kernel in the form of arch_update_cpu_topology(). For memory, we must place a READD operation on the hotplug queue for any phandle included in the PRRN event that is determined to be an LMB. Signed-off-by: John Allen <[email protected]> Reviewed-by: Nathan Fontenot <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/pseries: Introduce memory hotplug READD operation | John Allen | 2 | -0/+42
Currently, memory must be hot removed and subsequently re-added in order to dynamically update the affinity of LMBs specified by a PRRN event. Earlier implementations of the PRRN event handler ran into issues in which the hot remove would occur successfully, but a hotplug event would be initiated from another source and grab the hotplug lock preventing the hot add from occurring. To prevent this situation, this patch introduces the notion of a hot "readd" action for memory which atomizes a hot remove and a hot add into a single, serialized operation on the hotplug queue. Signed-off-by: John Allen <[email protected]> Reviewed-by: Nathan Fontenot <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/pseries: Make the acquire/release of the drc for memory a separate step | John Allen | 1 | -12/+22
When adding and removing LMBs we should make the acquire/release of the DRC a separate step to allow for a few improvements. First, this will ensure that LMBs removed during a remove-by-count operation are all available if an error occurs and we need to add them back. By first removing all the LMBs from the kernel before releasing their DRCs, the LMBs are available to add back should an error occur. Also, this will allow for faster re-add operations of memory for PRRN event handling, since we can skip the unneeded step of having to release the DRC and then acquire it back. Signed-off-by: Nathan Fontenot <[email protected]> Signed-off-by: John Allen <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/xmon: Cleanup to use is_kernel_addr macro | Madhavan Srinivasan | 1 | -4/+4
Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/boot: Update .gitignore | Michael Ellerman | 1 | -0/+4
Add a few things that have been missed from .gitignore over the years. Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/debug: PTDUMP should depend on DEBUG_FS | Michael Ellerman | 1 | -2/+1
CONFIG_PPC_PTDUMP currently selects CONFIG_DEBUG_FS. But CONFIG_DEBUG_FS is user-selectable, so we shouldn't select it. Instead depend on it. Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/64: Add BPF_JIT to powernv and pseries defconfigs | Anton Blanchard | 2 | -0/+2
Commit db9112173b18 ("powerpc: Turn on BPF_JIT in ppc64_defconfig") only added BPF_JIT to the ppc64 defconfig. Add it to our powernv and pseries defconfigs too. Signed-off-by: Anton Blanchard <[email protected]> Acked-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/64: Move HAVE_CONTEXT_TRACKING from pseries to common Kconfig | Anton Blanchard | 2 | -1/+1
We added support for HAVE_CONTEXT_TRACKING, but placed the option inside PPC_PSERIES. This has the undesirable effect that NO_HZ_FULL can be enabled on a kernel with both powernv and pseries support, but cannot on a kernel with powernv only support. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/sparse: Constify the address pointer in __get_user_nosleep() | Daniel Axtens | 1 | -1/+1
In __get_user_nosleep, we create an intermediate pointer for the user address we're about to fetch. We currently don't tag this pointer as const. Make it const, as we are simply dereferencing it, and its scope is limited to the __get_user_nosleep macro. Signed-off-by: Daniel Axtens <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/sparse: Constify the address pointer in __get_user_nocheck() | Daniel Axtens | 1 | -1/+1
In __get_user_nocheck, we create an intermediate pointer for the user address we're about to fetch. We currently don't tag this pointer as const. Make it const, as we are simply dereferencing it, and its scope is limited to the __get_user_nocheck macro. Signed-off-by: Daniel Axtens <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-02 | powerpc/sparse: Constify the address pointer in __get_user_check() | Daniel Axtens | 1 | -1/+1
In __get_user_check, we create an intermediate pointer for the user address we're about to fetch. We currently don't tag this pointer as const. Make it const, as we are simply dereferencing it, and its scope is limited to the __get_user_check macro. Signed-off-by: Daniel Axtens <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
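The idea behind the three constify patches above can be illustrated with a much-simplified macro. This is a sketch only: `demo_get_user()` is a hypothetical stand-in for the real `__get_user_*` macros, it does no real user-space access checking, and it relies on the GNU C statement-expression and `__typeof__` extensions.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical, simplified fetch macro. The intermediate pointer
 * __gu_addr is const-qualified because it is only dereferenced for
 * reading and never escapes the macro, so callers may pass pointers
 * to const data without a discarded-qualifier warning.
 */
#define demo_get_user(x, ptr)					\
({								\
	int __gu_err = -EFAULT;					\
	const __typeof__(*(ptr)) *__gu_addr = (ptr);		\
	if (__gu_addr) {					\
		(x) = *__gu_addr;				\
		__gu_err = 0;					\
	}							\
	__gu_err;						\
})
```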
2017-02-02 | powerpc/powernv: Fix section mismatch from opal_lpc_init() | Michael Ellerman | 1 | -1/+1
opal_lpc_init() is called from an __init routine, and calls other __init routines, so should also be __init, init? Fixes: 023b13a50183 ("powerpc/powernv: Add support for direct mapped LPC on POWER9") Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Enable radix guest support | Paul Mackerras | 5 | -27/+111
This adds a few last pieces of the support for radix guests:

* Implement the backends for the KVM_PPC_CONFIGURE_V3_MMU and KVM_PPC_GET_RMMU_INFO ioctls for radix guests.
* On POWER9, allow secondary threads to be on/off-lined while guests are running.
* Set up LPCR and the partition table entry for radix guests.
* Don't allocate the rmap array in the kvm_memory_slot structure on radix.
* Don't try to initialize the HPT for radix guests, since they don't have an HPT.
* Take out the code that prevents the HV KVM module from initializing on radix hosts.

At this stage, we only support radix guests if the host is running in radix mode, and only support HPT guests if the host is running in HPT mode. Thus a guest cannot switch from one mode to the other, which enables some simplifications.

Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Invalidate ERAT on guest entry/exit for POWER9 DD1 | Paul Mackerras | 1 | -0/+6
On POWER9 DD1, we need to invalidate the ERAT (effective to real address translation cache) when changing the PIDR register, which we do as part of guest entry and exit. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Allow guest exit path to have MMU on | Paul Mackerras | 3 | -17/+58
If we allow LPCR[AIL] to be set for radix guests, then interrupts from the guest to the host can be delivered by the hardware with relocation on, and thus the code path starting at kvmppc_interrupt_hv can be executed in virtual mode (MMU on) for radix guests (previously it was only ever executed in real mode). Most of the code is indifferent to whether the MMU is on or off, but the calls to OPAL that use the real-mode OPAL entry code need to be switched to use the virtual-mode code instead. The affected calls are the calls to the OPAL XICS emulation functions in kvmppc_read_one_intr() and related functions. We test the MSR[IR] bit to detect whether we are in real or virtual mode, and call the opal_rm_* or opal_* function as appropriate. The other place that depends on the MMU being off is the optimization where the guest exit code jumps to the external interrupt vector or hypervisor doorbell interrupt vector, or returns to its caller (which is __kvmppc_vcore_entry). If the MMU is on and we are returning to the caller, then we don't need to use an rfid instruction since the MMU is already on; a simple blr suffices. If there is an external or hypervisor doorbell interrupt to handle, we branch to the relocation-on version of the interrupt vector. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Invalidate TLB on radix guest vcpu movement | Paul Mackerras | 4 | -14/+82
With radix, the guest can do TLB invalidations itself using the tlbie (global) and tlbiel (local) TLB invalidation instructions. Linux guests use local TLB invalidations for translations that have only ever been accessed on one vcpu. However, that doesn't mean that the translations have only been accessed on one physical cpu (pcpu) since vcpus can move around from one pcpu to another. Thus a tlbiel might leave behind stale TLB entries on a pcpu where the vcpu previously ran, and if that task then moves back to that previous pcpu, it could see those stale TLB entries and thus access memory incorrectly. The usual symptom of this is random segfaults in userspace programs in the guest.

To cope with this, we detect when a vcpu is about to start executing on a thread in a core that is a different core from the last time it executed. If that is the case, then we mark the core as needing a TLB flush and then send an interrupt to any thread in the core that is currently running a vcpu from the same guest. This will get those vcpus out of the guest, and the first one to re-enter the guest will do the TLB flush. The reason for interrupting the vcpus executing on the old core is to cope with the following scenario:

    CPU 0                    CPU 1                    CPU 4
    (core 0)                 (core 0)                 (core 1)

    VCPU 0 runs task X       VCPU 1 runs
    core 0 TLB gets
    entries from task X
    VCPU 0 moves to CPU 4
                                                      VCPU 0 runs task X
                                                      Unmap pages of task X
                                                      tlbiel

                             (still VCPU 1)           task X moves to VCPU 1
                             task X runs
                             task X sees stale
                             TLB entries

That is, as soon as the VCPU starts executing on the new core, it could unmap and tlbiel some page table entries, and then the task could migrate to one of the VCPUs running on the old core and potentially see stale TLB entries.

Since the TLB is shared between all the threads in a core, we only use the bit of kvm->arch.need_tlb_flush corresponding to the first thread in the core.

To ensure that we don't have a window where we can miss a flush, this moves the clearing of the bit from before the actual flush to after it. This way, two threads might both do the flush, but we prevent the situation where one thread can enter the guest before the flush is finished. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
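The bookkeeping described in this commit message can be sketched in plain C. This is a simplified model under stated assumptions, not the KVM code: all `demo_*` names are hypothetical, four threads per core is assumed (matching the CPU 0/1/4 example), the interrupt sent to vcpus on the old core is only noted in a comment, and the real TLB flush is replaced by a counter.

```c
#include <stdbool.h>

#define DEMO_THREADS_PER_CORE	4	/* assumption for this sketch */
#define DEMO_NR_CPUS		16

struct demo_kvm {
	/* Only the entry for the first thread of each core is used. */
	bool need_tlb_flush[DEMO_NR_CPUS];
};

struct demo_vcpu {
	int prev_cpu;			/* -1 if the vcpu has never run */
};

static int demo_first_thread(int cpu)
{
	return cpu - (cpu % DEMO_THREADS_PER_CORE);
}

/* Called when a vcpu is about to start executing on physical cpu pcpu. */
static void demo_prepare_run(struct demo_kvm *kvm, struct demo_vcpu *vcpu,
			     int pcpu)
{
	int prev = vcpu->prev_cpu;

	if (prev >= 0 && demo_first_thread(prev) != demo_first_thread(pcpu)) {
		/* The vcpu changed core: the old core may hold stale entries. */
		kvm->need_tlb_flush[demo_first_thread(prev)] = true;
		/* Real code would also interrupt this guest's vcpus on the old core. */
	}
	vcpu->prev_cpu = pcpu;
}

/* Called on guest entry; returns true if a flush was performed. */
static bool demo_guest_entry(struct demo_kvm *kvm, int pcpu, int *flush_count)
{
	int first = demo_first_thread(pcpu);

	if (!kvm->need_tlb_flush[first])
		return false;

	(*flush_count)++;			/* stand-in for the TLB flush */
	kvm->need_tlb_flush[first] = false;	/* cleared only after the flush */
	return true;
}
```

Clearing the flag after the flush, rather than before, means two threads may occasionally both flush, but no thread can enter the guest while a flush it depends on is still in progress.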
2017-01-31 | KVM: PPC: Book3S HV: Make HPT-specific hypercalls return error in radix mode | Paul Mackerras | 1 | -0/+14
If the guest is in radix mode, then it doesn't have a hashed page table (HPT), so all of the hypercalls that manipulate the HPT can't work and should return an error. This adds checks to make them return H_FUNCTION ("function not supported"). Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Implement dirty page logging for radix guests | Paul Mackerras | 4 | -33/+144
This adds code to keep track of dirty pages when requested (that is, when memslot->dirty_bitmap is non-NULL) for radix guests. We use the dirty bits in the PTEs in the second-level (partition-scoped) page tables, together with a bitmap of pages that were dirty when their PTE was invalidated (e.g., when the page was paged out). This bitmap is stored in the first half of the memslot->dirty_bitmap area, and kvm_vm_ioctl_get_dirty_log_hv() now uses the second half for the bitmap that gets returned to userspace. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
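The two-halves layout of the dirty bitmap can be sketched as below. This is an illustration under assumptions, not the KVM implementation: the `demo_*` names are hypothetical, the caller is assumed to have allocated twice the words needed for `npages`, and the step that harvests dirty bits from the PTEs themselves is omitted.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

struct demo_memslot {
	size_t npages;
	/* 2 * demo_bitmap_words(npages) words; NULL means logging is off. */
	unsigned long *dirty_bitmap;
};

static size_t demo_bitmap_words(size_t npages)
{
	return (npages + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;
}

/* First half: pages that were dirty when their PTE was invalidated. */
static void demo_mark_dirty(struct demo_memslot *slot, size_t page)
{
	if (!slot->dirty_bitmap || page >= slot->npages)
		return;
	slot->dirty_bitmap[page / DEMO_BITS_PER_LONG] |=
		1UL << (page % DEMO_BITS_PER_LONG);
}

/*
 * Second half: the bitmap handed back to userspace. Here it is filled
 * from the first half, which is then cleared for the next round.
 */
static unsigned long *demo_get_dirty_log(struct demo_memslot *slot)
{
	size_t words = demo_bitmap_words(slot->npages);
	unsigned long *out;

	if (!slot->dirty_bitmap)
		return NULL;

	out = slot->dirty_bitmap + words;
	memcpy(out, slot->dirty_bitmap, words * sizeof(unsigned long));
	memset(slot->dirty_bitmap, 0, words * sizeof(unsigned long));
	return out;
}
```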
2017-01-31 | KVM: PPC: Book3S HV: MMU notifier callbacks for radix guests | Paul Mackerras | 3 | -21/+103
This adapts our implementations of the MMU notifier callbacks (unmap_hva, unmap_hva_range, age_hva, test_age_hva, set_spte_hva) to call radix functions when the guest is using radix. These implementations are much simpler than for HPT guests because we have only one PTE to deal with, so we don't need to traverse rmap chains. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Page table construction and page faults for radix guests | Paul Mackerras | 5 | -3/+415
This adds the code to construct the second-level ("partition-scoped" in architecturese) page tables for guests using the radix MMU. Apart from the PGD level, which is allocated when the guest is created, the rest of the tree is all constructed in response to hypervisor page faults. As well as hypervisor page faults for missing pages, we also get faults for reference/change (RC) bits needing to be set, as well as various other error conditions. For now, we only set the R or C bit in the guest page table if the same bit is set in the host PTE for the backing page. This code can take advantage of the guest being backed with either transparent or ordinary 2MB huge pages, and insert 2MB page entries into the guest page tables. There is no support for 1GB huge pages yet. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Modify guest entry/exit paths to handle radix guests | Paul Mackerras | 3 | -11/+49
This adds code to branch around the parts that radix guests don't need - clearing and loading the SLB with the guest SLB contents, saving the guest SLB contents on exit, and restoring the host SLB contents. Since the host is now using radix, we need to save and restore the host value for the PID register. On hypervisor data/instruction storage interrupts, we don't do the guest HPT lookup on radix, but just save the guest physical address for the fault (from the ASDR register) in the vcpu struct. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Add basic infrastructure for radix guests | Paul Mackerras | 6 | -3/+160
This adds a field in struct kvm_arch and an inline helper to indicate whether a guest is a radix guest or not, plus a new file to contain the radix MMU code, which currently contains just a translate function which knows how to traverse the guest page tables to translate an address. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Use ASDR for HPT guests on POWER9 | Paul Mackerras | 1 | -0/+8
POWER9 adds a register called ASDR (Access Segment Descriptor Register), which is set by hypervisor data/instruction storage interrupts to contain the segment descriptor for the address being accessed, assuming the guest is using HPT translation. (For radix guests, it contains the guest real address of the access.) Thus, for HPT guests on POWER9, we can use this register rather than looking up the SLB with the slbfee. instruction. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Set process table for HPT guests on POWER9 | Paul Mackerras | 3 | -5/+33
This adds the implementation of the KVM_PPC_CONFIGURE_V3_MMU ioctl for HPT guests on POWER9. With this, we can return 1 for the KVM_CAP_PPC_MMU_HASH_V3 capability. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | KVM: PPC: Book3S HV: Add userspace interfaces for POWER9 MMU | Paul Mackerras | 6 | -0/+156
This adds two capabilities and two ioctls to allow userspace to find out about and configure the POWER9 MMU in a guest. The two capabilities tell userspace whether KVM can support a guest using the radix MMU, or using the hashed page table (HPT) MMU with a process table and segment tables. (Note that the MMUs in the POWER9 processor cores do not use the process and segment tables when in HPT mode, but the nest MMU does). The KVM_PPC_CONFIGURE_V3_MMU ioctl allows userspace to specify whether a guest will use the radix MMU or the HPT MMU, and to specify the size and location (in guest space) of the process table. The KVM_PPC_GET_RMMU_INFO ioctl gives userspace information about the radix MMU. It returns a list of supported radix tree geometries (base page size and number of bits indexed at each level of the radix tree) and the encoding used to specify the various page sizes for the TLB invalidate entry instruction. Initially, both capabilities return 0 and the ioctls return -EINVAL, until the necessary infrastructure for them to operate correctly is added. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/64: Allow for relocation-on interrupts from guest to host | Paul Mackerras | 2 | -29/+34
With host and guest both using radix translation, it is feasible for the host to take interrupts that come from the guest with relocation on, and that is in fact what the POWER9 hardware will do when LPCR[AIL] = 3. All such interrupts use HSRR0/1 not SRR0/1 except for system call with LEV=1 (hcall). Therefore this adds the KVM tests to the _HV variants of the relocation-on interrupt handlers, and adds the KVM test to the relocation-on system call entry point. We also instantiate the relocation-on versions of the hypervisor data storage and instruction interrupt handlers, since these can occur with relocation on in radix guests. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/64: Make type of partition table flush depend on partition type | Paul Mackerras | 1 | -3/+13
When changing a partition table entry on POWER9, we do a particular form of the tlbie instruction which flushes all TLBs and caches of the partition table for a given logical partition ID (LPID). This instruction has a field in the instruction word, labelled R (radix), which should be 1 if the partition was previously a radix partition and 0 if it was a HPT partition. This implements that logic. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/64: Export pgtable_cache and pgtable_cache_add for KVM | Paul Mackerras | 1 | -1/+2
This exports the pgtable_cache array and the pgtable_cache_add function so that HV KVM can use them for allocating radix page tables for guests. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/64: More definitions for POWER9 | Paul Mackerras | 2 | -1/+15
This adds definitions for bits in the DSISR register which are used by POWER9 for various translation-related exception conditions, and for some more bits in the partition table entry that will be needed by KVM. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/64: Enable use of radix MMU under hypervisor on POWER9 | Paul Mackerras | 7 | -6/+81
To use radix as a guest, we first need to tell the hypervisor via the ibm,client-architecture-support call that we support POWER9 and architecture v3.00, and that we can do either radix or hash and that we would like to choose later using an hcall (the H_REGISTER_PROC_TBL hcall). Then we need to check whether the hypervisor agreed to us using radix. We need to do this very early on in the kernel boot process before any of the MMU initialization is done. If the hypervisor doesn't agree, we can't use radix and therefore clear the radix MMU feature bit. Later, when we have set up our process table, which points to the radix tree for each process, we need to install that using the H_REGISTER_PROC_TBL hcall. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/pseries: Fixes for the "ibm,architecture-vec-5" options | Paul Mackerras | 2 | -5/+5
This fixes the byte index values for some of the option bits in the "ibm,architecture-vec-5" property. The "platform facilities options" bits are in byte 17 not byte 14, so the upper 8 bits of their definitions need to be 0x11 not 0x0E. The "sub processor support" option is in byte 21 not byte 15. Note none of these options are actually looked up in "ibm,architecture-vec-5" at this time, so there is no bug. When checking whether option bits are set, we should check that the offset of the byte being checked is less than the vector length that we got from the hypervisor. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
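The encoding discussed here (byte index in the upper 8 bits, bit mask in the lower 8 bits) and the length check can be sketched as follows. The `DEMO_*` macros, the specific bit masks 0x80 and 0x01, and `demo_ov5_enabled()` are hypothetical; only the byte indexes 17 (0x11) and 21 (0x15) come from the commit message, and how the vector length is obtained from the property is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Upper 8 bits: byte index into the vector. Lower 8 bits: bit mask. */
#define DEMO_OV5_INDX(x)	((x) >> 8)
#define DEMO_OV5_FEAT(x)	((x) & 0xff)

/* Illustrative options; the bit masks are assumptions for this sketch. */
#define DEMO_OV5_PFO_OPT	0x1180	/* byte 17 (0x11) */
#define DEMO_OV5_SUB_PROC	0x1501	/* byte 21 (0x15) */

/*
 * Returns true if the option bit is set. An option whose byte index
 * lies at or beyond the vector length is treated as not set.
 */
static bool demo_ov5_enabled(const uint8_t *vec5, size_t vec5_len,
			     unsigned int option)
{
	size_t index = DEMO_OV5_INDX(option);

	if (!vec5 || index >= vec5_len)
		return false;

	return (vec5[index] & DEMO_OV5_FEAT(option)) != 0;
}
```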
2017-01-31 | powerpc/64: Don't try to use radix MMU under a hypervisor | Paul Mackerras | 1 | -0/+33
Currently, if the kernel is running on a POWER9 processor under a hypervisor, it will try to use the radix MMU even though it doesn't have the necessary code to use radix under a hypervisor (it doesn't negotiate use of radix, and it doesn't do the H_REGISTER_PROC_TBL hcall). The result is that the guest kernel will crash when it tries to turn on the MMU. This fixes it by looking for the /chosen/ibm,architecture-vec-5 property, and if it exists, clears the radix MMU feature bit, before we decide whether to initialize for radix or HPT. This property is created by the hypervisor as a result of the guest calling the ibm,client-architecture-support method to indicate its capabilities, so it will indicate whether the hypervisor agreed to us using radix. Systems without a hypervisor may have this property also (for example, skiboot creates it), so we check the HV bit in the MSR to see whether we are running as a guest or not. If we are in hypervisor mode, then we can do whatever we like including using the radix MMU. The reason for using this property is that in future, when we have support for using radix under a hypervisor, we will need to check this property to see whether the hypervisor agreed to us using radix. Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines") Cc: [email protected] # v4.7+ Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
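The decision logic described in this message can be summarised in a small sketch. It is a model of the reasoning only: `demo_use_radix()` and its parameters are hypothetical, the device-tree lookup is reduced to a boolean, and `DEMO_MSR_HV` is assumed to be bit 60 of the MSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MSR_HV	(1ULL << 60)	/* assumed position of MSR[HV] */

/*
 * cpu_has_radix:  the radix MMU feature bit is currently set
 * have_vec5_prop: /chosen/ibm,architecture-vec-5 exists in the device tree
 */
static bool demo_use_radix(uint64_t msr, bool cpu_has_radix,
			   bool have_vec5_prop)
{
	if (!cpu_has_radix)
		return false;

	/* Hypervisor mode: the property may exist (e.g. from skiboot), ignore it. */
	if (msr & DEMO_MSR_HV)
		return true;

	/* Running as a guest that negotiated via the property: no radix yet. */
	if (have_vec5_prop)
		return false;

	return true;
}
```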
2017-01-31 | KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE support for interrupts | Nicholas Piggin | 4 | -8/+58
64-bit Book3S exception handlers must find the dynamic kernel base to add to the target address when branching beyond __end_interrupts, in order to support a kernel running at a non-0 physical address. Support this in KVM by branching with CTR, similarly to regular interrupt handlers. The guest CTR is saved in HSTATE_SCRATCH1 and restored after the branch. Without this, the host kernel hangs and crashes randomly when it is running at a non-0 address and a KVM guest is started. Signed-off-by: Nicholas Piggin <[email protected]> Acked-by: Paul Mackerras <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/mm: unstub radix__vmemmap_remove_mapping() | Reza Arbab | 1 | -1/+28
Use remove_pagetable() and friends for radix vmemmap removal. We do not require the special-case handling of vmemmap done in the x86 versions of these functions. This is because vmemmap_free() has already freed the mapped pages, and calls us with an aligned address range. So, add a few failsafe WARNs, but otherwise the code to remove physical mappings is already sufficient for vmemmap. Signed-off-by: Reza Arbab <[email protected]> Acked-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/mm: add radix__remove_section_mapping() | Reza Arbab | 3 | -1/+135
Tear down and free the four-level page tables of physical mappings during memory hotremove. Borrow the basic structure of remove_pagetable() and friends from the identically-named x86 functions. Reduce the frequency of tlb flushes and page_table_lock spinlocks by only doing them in the outermost function. There was some question as to whether the locking is needed at all. Leave it for now, but we could consider dropping it. Memory must be offline to be removed, thus not in use. So there shouldn't be the sort of concurrent page walking activity here that might prompt us to use RCU. Signed-off-by: Reza Arbab <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/mm: add radix__create_section_mapping() | Reza Arbab | 3 | -1/+12
Wire up memory hotplug page mapping for radix. Share the mapping function already used by radix_init_pgtable(). Signed-off-by: Reza Arbab <[email protected]> Acked-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc/mm: refactor radix physical page mapping | Reza Arbab | 1 | -38/+50
Move the page mapping code in radix_init_pgtable() into a separate function that will also be used for memory hotplug. The current goto loop progressively decreases its mapping size as it covers the tail of a range whose end is unaligned. Change this to a for loop which can do the same for both ends of the range. Signed-off-by: Reza Arbab <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
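The for loop described above, which picks the largest mapping size that fits at each step and so handles unaligned ranges at both ends, might look roughly like this. It is a sketch under assumptions: the `demo_*` names are hypothetical, the sizes are assumed to be 1GB, 2MB and a 64KB base page, `start` and `end` are assumed to be base-page aligned, and the actual page table update is replaced by counters.

```c
#include <stdint.h>

#define DEMO_SZ_1G	(1ULL << 30)
#define DEMO_SZ_2M	(1ULL << 21)
#define DEMO_SZ_BASE	(1ULL << 16)	/* assumed 64KB base page */

#define DEMO_IS_ALIGNED(addr, size)	(((addr) & ((size) - 1)) == 0)

struct demo_map_stats {
	unsigned long n_1g, n_2m, n_base;
};

/* Map [start, end), choosing the largest size that is aligned and fits. */
static struct demo_map_stats demo_create_physical_mapping(uint64_t start,
							   uint64_t end)
{
	struct demo_map_stats stats = { 0, 0, 0 };
	uint64_t addr, gap, mapping_size;

	for (addr = start; addr < end; addr += mapping_size) {
		gap = end - addr;

		if (DEMO_IS_ALIGNED(addr, DEMO_SZ_1G) && gap >= DEMO_SZ_1G) {
			mapping_size = DEMO_SZ_1G;
			stats.n_1g++;
		} else if (DEMO_IS_ALIGNED(addr, DEMO_SZ_2M) && gap >= DEMO_SZ_2M) {
			mapping_size = DEMO_SZ_2M;
			stats.n_2m++;
		} else {
			mapping_size = DEMO_SZ_BASE;
			stats.n_base++;
		}
		/* Real code would install a mapping of mapping_size at addr here. */
	}
	return stats;
}
```

Because the size is re-evaluated on every iteration, small pages cover an unaligned head, larger pages take over once the address becomes aligned, and the size shrinks again for an unaligned tail.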
2017-01-31 | powerpc/powernv: Add support for direct mapped LPC on POWER9 | Benjamin Herrenschmidt | 2 | -6/+15
Use the new non-PCI ISA bridge support to expose the POWER9 LPC bus as direct mapped via the ISA IO port range. This enables direct access via drivers such as 8250. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc: Add support for non-PCI ISA bridges | Benjamin Herrenschmidt | 2 | -0/+92
The POWER9 chip supports an LPC bus that isn't hanging off a PCI bus, so let's add support for that, mapping it to the reserved space at ISA_IO_BASE. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powerpc: Move isa bridge definitions to separate include | Benjamin Herrenschmidt | 7 | -18/+33
We'll be adding non-PCI isa bridge support, so let's not have all the definitions in pci-bridge.h. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | Documentation:powerpc: Add device-tree bindings for power-mgt | Gautham R. Shenoy | 1 | -0/+118
Document the device-tree bindings defining the properties under the @power-mgt node in the device tree that describe the idle states for Linux running on baremetal POWER servers. These bindings are documented separately instead of using the common idle state bindings since the idle-states on POWER servers are exposed as property arrays, whereas the common idle state bindings expect idle-states to be described as nodes. Acked-by: Rob Herring <[email protected]> Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powernv: Pass PSSCR value and mask to power9_idle_stop | Gautham R. Shenoy | 7 | -43/+241
The power9_idle_stop method currently takes only the requested stop level as a parameter and picks up the rest of the PSSCR bits from a hand-coded macro. This is not a very flexible design, especially when the firmware has the capability to communicate the psscr value and the mask associated with a particular stop state via device tree. This patch modifies the power9_idle_stop API to take as parameters the PSSCR value and the PSSCR mask corresponding to the stop state that needs to be set. The PSSCR value and mask are obtained by parsing the "ibm,cpu-idle-state-psscr" and "ibm,cpu-idle-state-psscr-mask" fields, respectively, from the device tree. In addition to this, the patch adds support for handling stop states for which the ESL and EC bits in the PSSCR are zero. As per the architecture, a wakeup from these stop states resumes execution from the subsequent instruction as opposed to waking up at the System Vector. The older firmware sets only the Requested Level (RL) field in the psscr and psscr-mask exposed in the device tree. For older firmware where psscr-mask=0xf, this patch will set sane default values for the remaining PSSCR fields (i.e., PSLL, MTL, ESL, EC, and TR). For the new firmware, the patch will validate that the invariants required by the ISA for the psscr values are maintained by the firmware. The skiboot patch that exports fully populated PSSCR values and the mask for all the stop states can be found here: https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html [Optimize the number of instructions before entering STOP with ESL=EC=0, and validate that the PSSCR values provided by the firmware maintain the invariants required by the ISA, as suggested by Balbir Singh] Acked-by: Balbir Singh <[email protected]> Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
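The value/mask handling for older firmware can be sketched as follows. This is a hedged illustration, not the kernel's code: the `demo_*` functions are hypothetical, and the bit positions chosen for the PSSCR fields are assumptions made for this sketch rather than values checked against the ISA.

```c
#include <stdint.h>

/* Field layout assumed for this sketch only. */
#define DEMO_PSSCR_RL_MASK	0x0000000FULL	/* Requested Level */
#define DEMO_PSSCR_MTL_MASK	0x000000F0ULL
#define DEMO_PSSCR_TR_MASK	0x00000300ULL
#define DEMO_PSSCR_PSLL_MASK	0x000F0000ULL
#define DEMO_PSSCR_EC		0x00100000ULL
#define DEMO_PSSCR_ESL		0x00200000ULL

#define DEMO_PSSCR_DEFAULT_VAL	(DEMO_PSSCR_ESL | DEMO_PSSCR_EC |	\
				 DEMO_PSSCR_PSLL_MASK |			\
				 DEMO_PSSCR_TR_MASK | DEMO_PSSCR_MTL_MASK)
#define DEMO_PSSCR_DEFAULT_MASK	(DEMO_PSSCR_DEFAULT_VAL | DEMO_PSSCR_RL_MASK)

/*
 * Older firmware: the mask covers only RL, so fill in default values
 * for the remaining fields and widen the mask to cover them.
 */
static void demo_fixup_psscr(uint64_t *val, uint64_t *mask)
{
	if (*mask == DEMO_PSSCR_RL_MASK) {
		*val |= DEMO_PSSCR_DEFAULT_VAL;
		*mask = DEMO_PSSCR_DEFAULT_MASK;
	}
}

/* Replace only the masked fields of the current PSSCR contents. */
static uint64_t demo_apply_psscr(uint64_t cur, uint64_t val, uint64_t mask)
{
	return (cur & ~mask) | (val & mask);
}
```

Passing the value together with its mask lets the caller update only the fields the firmware described, leaving every other PSSCR bit as it was.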
2017-01-31 | cpuidle:powernv: Add helper function to populate powernv idle states. | Gautham R. Shenoy | 2 | -36/+54
In the current code for powernv_add_idle_states, there is a lot of code duplication while initializing an idle state in powernv_states table. Add an inline helper function to populate the powernv_states[] table for a given idle state. Invoke this for populating the "Nap", "Fastsleep" and the stop states in powernv_add_idle_states. Signed-off-by: Gautham R. Shenoy <[email protected]> Acked-by: Balbir Singh <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powernv:stop: Rename pnv_arch300_idle_init to pnv_power9_idle_init | Gautham R. Shenoy | 1 | -2/+2
Balbir pointed out that the name of the function pnv_arch300_idle_init was inconsistent with the names of the variables and functions pertaining to POWER9 features in book3s_idle.S. This patch renames pnv_arch300_idle_init to pnv_power9_idle_init. This patch does not change any behaviour. Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-31 | powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro | Gautham R. Shenoy | 3 | -9/+12
Currently all the low-power idle states are expected to wake up at reset vector 0x100, which is why the macro IDLE_STATE_ENTER_SEQ, which puts the CPU into an idle state, never returns. On ISA v3.0, when the ESL and EC bits in the PSSCR are zero, the CPU is expected to wake up at the instruction following the idle instruction. This patch adds a new macro named IDLE_STATE_ENTER_SEQ_NORET for the no-return variant and reuses the name IDLE_STATE_ENTER_SEQ for a variant that allows resuming operation at the instruction next to the idle-instruction. Acked-by: Balbir Singh <[email protected]> Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>