aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)AuthorFilesLines
2020-10-06powerpc/smp: Remove get_physical_package_idSrikar Dronamraju1-20/+0
Now that cpu_core_mask has been removed and topology_core_cpumask has been updated to use cpu_cpu_mask, we no more need get_physical_package_id. Signed-off-by: Srikar Dronamraju <[email protected]> Tested-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/smp: Stop updating cpu_core_maskSrikar Dronamraju1-26/+7
Anton Blanchard reported that his 4096 vcpu KVM guest took around 30 minutes to boot. He also analyzed it to the time taken to iterate while setting the cpu_core_mask. Further analysis shows that cpu_core_mask and cpu_cpu_mask for any CPU would be equal on Power. However updating cpu_core_mask took forever to update as its a per cpu cpumask variable. Instead cpu_cpu_mask was a per NODE /per DIE cpumask that was shared by all the respective CPUs. Also cpu_cpu_mask is needed from a scheduler perspective. However cpu_core_map is an exported symbol. Hence stop updating cpu_core_map and make it point to cpu_cpu_mask. Signed-off-by: Srikar Dronamraju <[email protected]> Tested-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/tm: Save and restore AMR on treclaim and trechkptGustavo Romero2-4/+32
Althought AMR is stashed in the checkpoint area, currently we don't save it to the per thread checkpoint struct after a treclaim and so we don't restore it either from that struct when we trechkpt. As a consequence when the transaction is later rolled back the kernel space AMR value when the trechkpt was done appears in userspace. That commit saves and restores AMR accordingly on treclaim and trechkpt. Since AMR value is also used in kernel space in other functions, it also takes care of stashing kernel live AMR into the stack before treclaim and before trechkpt, restoring it later, just before returning from tm_reclaim and __tm_recheckpoint. Is also fixes two nonrelated comments about CR and MSR. Signed-off-by: Gustavo Romero <[email protected]> Tested-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/eeh: Clean up PE addressingOliver O'Halloran2-41/+7
When support for EEH on PowerNV was added a lot of pseries specific code was made "generic" and some of the quirks of pseries EEH came along for the ride. One of the stranger quirks is eeh_pe containing two types of PE address: pe->addr and pe->config_addr. There reason for this appears to be historical baggage rather than any real requirements. On pseries EEH PEs are manipulated using RTAS calls. Each EEH RTAS call takes a "PE configuration address" as an input which is used to identify which EEH PE is being manipulated by the call. When initialising the EEH state for a device the first thing we need to do is determine the configuration address for the PE which contains the device so we can enable EEH on that PE. This process is outlined in PAPR which is the modern (i.e post-2003) FW specification for pseries. However, EEH support was first described in the pSeries RISC Platform Architecture (RPA) and although they are mostly compatible EEH is one of the areas where they are not. The major difference is that RPA doesn't actually have the concept of a PE. On RPA systems the EEH RTAS calls are done on a per-device basis using the same config_addr that would be passed to the RTAS functions to access PCI config space (e.g. ibm,read-pci-config). The config_addr is not identical since the function and config register offsets of the config_addr must be set to zero. EEH operations being done on a per-device basis doesn't make a whole lot of sense when you consider how EEH was implemented on legacy PCI systems. For legacy PCI(-X) systems EEH was implemented using special PCI-PCI bridges which contained logic to detect errors and freeze the secondary bus when one occurred. This means that the EEH enabled state is shared among all devices behind that EEH bridge. As a result there's no way to implement the per-device control required for the semantics specified by RPA. It can be made to work if we assume that a separate EEH bridge exists for each EEH capable PCI slot and there are no bridges behind those slots. However, RPA also specifies the ibm,configure-bridge RTAS call for re-initalising bridges behind EEH capable slots after they are reset due to an EEH event so that is probably not a valid assumption. This incoherence was fixed in later PAPR, which succeeded RPA. Unfortunately, since Linux EEH support seems to have been implemented based on the RPA spec some of the legacy assumptions were carried over (probably for POWER4 compatibility). The fix made in PAPR was the introduction of the "PE" concept and redefining the EEH RTAS calls (set-eeh-option, reset-slot, etc) to operate on a per-PE basis so all devices behind an EEH bride would share the same EEH state. The "config_addr" argument to the EEH RTAS calls became the "PE_config_addr" and the OS was required to use the ibm,get-config-addr-info RTAS call to find the correct PE address for the device. When support for the new interfaces was added to Linux it was implemented using something like: At probe time: pdn->eeh_config_addr = rtas_config_addr(pdn); pdn->eeh_pe_config_addr = rtas_get_config_addr_info(pdn); When performing an RTAS call: config_addr = pdn->eeh_config_addr; if (pdn->eeh_pe_config_addr) config_addr = pdn->eeh_pe_config_addr; rtas_call(..., config_addr, ...); In other words, if the ibm,get-config-addr-info RTAS call is implemented and returned a valid result we'd use that as the argument to the EEH RTAS calls. If not, Linux would fall back to using the device's config_addr. Over time these addresses have moved around going from pci_dn to eeh_dev and finally into eeh_pe. Today the users look like this: config_addr = pe->config_addr; if (pe->addr) config_addr = pe->addr; rtas_call(..., config_addr, ...); However, considering the EEH core always operates on a per-PE basis and even on pseries the only per-device operation is the initial call to ibm,set-eeh-option I'm not sure if any of this actually works on an RPA system today. It doesn't make much sense to have the fallback address in a generic structure either since the bulk of the code which reference it is in pseries anyway. The EEH core makes a token effort to support looking up a PE using the config_addr by having two arguments to eeh_pe_get(). However, a survey of all the callers to eeh_pe_get() shows that all bar one have the config_addr argument hard-coded to zero.The only caller that doesn't is in eeh_pe_tree_insert() which has: if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr) return -EINVAL; pe = eeh_pe_get(hose, edev->pe_config_addr, edev->bdfn); The third argument (config_addr) is only used if the second (pe->addr) argument is invalid. The preceding check ensures that the call to eeh_pe_get() will never happen if edev->pe_config_addr is invalid so there is no situation where eeh_pe_get() will search for a PE based on the 3rd argument. The check also means that we'll never insert a PE into the tree where pe_config_addr is zero since EEH_VALID_PE_ZERO is never set on pseries. All the users of the fallback address on pseries never actually use the fallback and all the only caller that supplies something for the config_addr argument to eeh_pe_get() never use it either. It's all dead code. This patch removes the fallback address from eeh_pe since nothing uses it. Specificly, we do this by: 1) Removing pe->config_addr 2) Removing the EEH_VALID_PE_ZERO flag 3) Removing the fallback address argument to eeh_pe_get(). 4) Removing all the checks for pe->addr being zero in the pseries EEH code. This leaves us with PE's only being identified by what's in their pe->addr field and the EEH core relying on the platform to ensure that eeh_dev's are only inserted into the EEH tree if they're actually inside a PE. No functional changes, I hope. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/eeh: Move EEH initialisation to an arch initcallOliver O'Halloran1-32/+32
The initialisation of EEH mostly happens in a core_initcall_sync initcall, followed by registering a bus notifier later on in an arch_initcall. Anything involving initcall dependecies is mostly incomprehensible unless you've spent a while staring at code so here's the full sequence: ppc_md.setup_arch <-- pci_controllers are created here ...time passes... core_initcall <-- pci_dns are created from DT nodes core_initcall_sync <-- platforms call eeh_init() postcore_initcall <-- PCI bus type is registered postcore_initcall_sync arch_initcall <-- EEH pci_bus notifier registered subsys_initcall <-- PHBs are scanned here There's no real requirement to do the EEH setup at the core_initcall_sync level. It just needs to be done after pci_dn's are created and before we start scanning PHBs. Simplify the flow a bit by moving the platform EEH inititalisation to an arch_initcall so we can fold the bus notifier registration into eeh_init(). Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/eeh: Delete eeh_ops->initOliver O'Halloran1-8/+0
No longer used since the platforms perform their EEH initialisation before calling eeh_init(). Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/eeh: Rework EEH initialisationOliver O'Halloran1-71/+16
Drop the EEH register / unregister ops thing and have the platform pass the ops structure into eeh_init() directly. This takes one initcall out of the EEH setup path and it means we're only doing EEH setup on the platforms which actually support it. It's also less code and generally easier to follow. No functional changes. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/64: make restore_interrupts 64e onlyNicholas Piggin1-18/+19
This is not used by 64s. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/64e: remove 64s specific interrupt soft-mask codeNicholas Piggin2-11/+1
Since the assembly soft-masking code was moved to 64e specific, there are some 64s specific interrupt types still there. Remove them. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/64e: remove PACA_IRQ_EE_EDGENicholas Piggin2-24/+0
This is not used anywhere. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/64: fix irq replay pt_regs->softe valueNicholas Piggin1-1/+1
Replayed interrupts get an "artificial" struct pt_regs constructed to pass to interrupt handler functions. This did not get the softe field set correctly, it's as though the interrupt has hit while irqs are disabled. It should be IRQS_ENABLED. This is possibly harmless, asynchronous handlers should not be testing if irqs were disabled, but it might be possible for example some code is shared with synchronous or NMI handlers, and it makes more sense if debug output looks at this. Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc/64: fix irq replay missing preemptNicholas Piggin1-0/+7
Prior to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C"), replayed interrupts returned by the regular interrupt exit code, which performs preemption in case an interrupt had set need_resched. This logic was missed by the conversion. Adding preempt_disable/enable around the interrupt replay and final irq enable will reschedule if needed. Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-06powerpc: untangle cputable mce includeNicholas Piggin2-0/+2
Having cputable.h include mce.h means it pulls in a bunch of low level headers (e.g., synch.h) which then can't use CPU_FTR_ definitions. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-03mm: remove compat_process_vm_{readv,writev}Christoph Hellwig1-2/+2
Now that import_iovec handles compat iovecs, the native syscalls can be used for the compat case as well. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2020-10-03fs: remove compat_sys_vmspliceChristoph Hellwig1-1/+1
Now that import_iovec handles compat iovecs, the native vmsplice syscall can be used for the compat case as well. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2020-10-03fs: remove the compat readv/writev syscallsChristoph Hellwig1-2/+2
Now that import_iovec handles compat iovecs, the native readv and writev syscalls can be used for the compat case as well. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2020-09-25dma-mapping: add a new dma_alloc_pages APIChristoph Hellwig1-0/+2
This API is the equivalent of alloc_pages, except that the returned memory is guaranteed to be DMA addressable by the passed in device. The implementation will also be used to provide a more sensible replacement for DMA_ATTR_NON_CONSISTENT flag. Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages as its backend. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Thomas Bogendoerfer <[email protected]> (MIPS part)
2020-09-25Merge branch 'master' of ↵Christoph Hellwig5-6/+5
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into dma-mapping-for-next Pull in the latest 5.9 tree for the commit to revert the V4L2_FLAG_MEMORY_NON_CONSISTENT uapi addition.
2020-09-25kbuild: preprocess module linker scriptMasahiro Yamada1-8/+0
There was a request to preprocess the module linker script like we do for the vmlinux one. (https://lkml.org/lkml/2020/8/21/512) The difference between vmlinux.lds and module.lds is that the latter is needed for external module builds, thus must be cleaned up by 'make mrproper' instead of 'make clean'. Also, it must be created by 'make modules_prepare'. You cannot put it in arch/$(SRCARCH)/kernel/, which is cleaned up by 'make clean'. I moved arch/$(SRCARCH)/kernel/module.lds to arch/$(SRCARCH)/include/asm/module.lds.h, which is included from scripts/module.lds.S. scripts/module.lds is fine because 'make clean' keeps all the build artifacts under scripts/. You can add arch-specific sections in <asm/module.lds.h>. Signed-off-by: Masahiro Yamada <[email protected]> Tested-by: Jessica Yu <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Acked-by: Palmer Dabbelt <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Jessica Yu <[email protected]>
2020-09-22fs: remove compat_sys_mountChristoph Hellwig1-1/+1
compat_sys_mount is identical to the regular sys_mount now, so remove it and use the native version everywhere. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2020-09-18powerpc/sysfs: Remove unused 'err' variable in sysfs_create_dscr_default()Cédric Le Goater1-2/+1
This fixes a compile error with W=1. arch/powerpc/kernel/sysfs.c: In function ‘sysfs_create_dscr_default’: arch/powerpc/kernel/sysfs.c:228:7: error: variable ‘err’ set but not used [-Werror=unused-but-set-variable] int err = 0; ^~~ cc1: all warnings being treated as errors Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-18powerpc/prom_init: Check display props exist before enabling btextMichael Ellerman1-4/+13
It's possible to enable CONFIG_PPC_EARLY_DEBUG_BOOTX for a pseries kernel (maybe it shouldn't be), which is then booted with qemu/slof. But if you do that the kernel crashes in draw_byte(), with a DAR pointing somewhere near INT_MAX. Adding some debug to prom_init we see that we're not able to read the "address" property from OF, so we're just using whatever junk value was on the stack. So check the properties can be read properly from OF, if not we bail out before initialising btext, which avoids the crash. Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-18powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()Michael Ellerman2-3/+5
We have smp_ops->cpu_die() and ppc_md.cpu_die(). One of them offlines the current CPU and one offlines another CPU, can you guess which is which? Also one is in smp_ops and one is in ppc_md? So rename ppc_md.cpu_die(), to cpu_offline_self(), because that's what it does. And move it into smp_ops where it belongs. Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-18powerpc/smp: Fold cpu_die() into its only callerMichael Ellerman1-4/+0
Avoid the eternal confusion between cpu_die() and __cpu_die() by removing the former, folding it into its only caller. Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-18powerpc: Move arch_cpu_idle_dead() into smp.cMichael Ellerman2-8/+6
arch_cpu_idle_dead() is in idle.c, which makes sense, but it's inside a CONFIG_HOTPLUG_CPU block. It would be more at home in smp.c, inside the existing CONFIG_HOTPLUG_CPU block. Note that CONFIG_HOTPLUG_CPU depends on CONFIG_SMP so even though smp.c is not built for SMP=n builds, that's fine. Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-18powerpc/process: Fix uninitialised variable errorMichael Ellerman1-1/+1
Clang, and GCC with -Wmaybe-uninitialized, can't see that val is unused in get_fpexec_mode(): arch/powerpc/kernel/process.c:1940:7: error: variable 'val' is used uninitialized whenever 'if' condition is true if (cpu_has_feature(CPU_FTR_SPE)) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ We know that CPU_FTR_SPE will only be true iff CONFIG_SPE is also true, but the compiler doesn't. Avoid it by initialising val to zero. Reported-by: kernel test robot <[email protected]> Fixes: 532ed1900d37 ("powerpc/process: Remove useless #ifdef CONFIG_SPE") Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Tested-by: Nick Desaulniers <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Create coregroup domainSrikar Dronamraju1-1/+53
Add percpu coregroup maps and masks to create coregroup domain. If a coregroup doesn't exist, the coregroup domain will be degenerated in favour of SMT/CACHE domain. Do note this patch is only creating stubs for cpu_to_coregroup_id. The actual cpu_to_coregroup_id implementation would be in a subsequent patch. Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Allocate cpumask only after searching thread groupSrikar Dronamraju1-4/+3
If allocated earlier and the search fails, then cpu_l1_cache_map cpumask is unnecessarily cleared. However cpu_l1_cache_map can be allocated / cleared after we search thread group. Please note CONFIG_CPUMASK_OFFSTACK is not set on Powerpc. Hence cpumask allocated by zalloc_cpumask_var_node is never freed. Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/numa: Detect support for coregroupSrikar Dronamraju1-0/+1
Add support for grouping cores based on the device-tree classification. - The last domain in the associativity domains always refers to the core. - If primary reference domain happens to be the penultimate domain in the associativity domains device-tree property, then there are no coregroups. However if its not a penultimate domain, then there are coregroups. There can be more than one coregroup. For now we would be interested in the last or the smallest coregroups, i.e one sub-group per DIE. Currently there are no firmwares that are exposing this grouping. Hence allow the basis for grouping to be abstract. Once the firmware starts using this grouping, code would be added to detect the type of grouping and adjust the sd domain flags accordingly. Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Optimize start_secondarySrikar Dronamraju1-6/+11
In start_secondary, even if shared_cache was already set, system does a redundant match for cpumask. This redundant check can be removed by checking if shared_cache is already set. While here, localize the sibling_mask variable to within the if condition. Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Dont assume l2-cache to be superset of siblingSrikar Dronamraju1-14/+29
Current code assumes that cpumask of cpus sharing a l2-cache mask will always be a superset of cpu_sibling_mask. Lets stop that assumption. cpu_l2_cache_mask is a superset of cpu_sibling_mask if and only if shared_caches is set. Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Srikar Dronamraju <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Move topology fixups into a new functionSrikar Dronamraju1-6/+11
Move topology fixup based on the platform attributes into its own function which is called just before set_sched_topology. Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Move powerpc_topology aboveSrikar Dronamraju1-52/+52
Just moving the powerpc_topology description above. This will help in using functions in this file and avoid declarations. No other functional changes Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Merge Power9 topology with Power topologySrikar Dronamraju1-22/+3
A new sched_domain_topology_level was added just for Power9. However the same can be achieved by merging powerpc_topology with power9_topology and makes the code more simpler especially when adding a new sched domain. Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-16powerpc/smp: Fix a warning under !NEED_MULTIPLE_NODESSrikar Dronamraju1-0/+2
Fix a build warning in a non CONFIG_NEED_MULTIPLE_NODES "error: _numa_cpu_lookup_table_ undeclared" Signed-off-by: Srikar Dronamraju <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-15powerpc/pci: unmap legacy INTx interrupts when a PHB is removedCédric Le Goater1-0/+114
When a passthrough IO adapter is removed from a pseries machine using hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the guest OS to clear all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations" and the removal of the IO adapter fails. This is because when the PHBs are scanned, Linux maps automatically the INTx interrupts in the Linux interrupt number space but these are never removed. To solve this problem, we introduce a PPC platform specific pcibios_remove_bus() routine which clears all interrupt mappings when the bus is removed. This also clears the associated page table entries of the ESB pages when using XIVE. For this purpose, we record the logical interrupt numbers of the mapped interrupt under the PHB structure and let pcibios_remove_bus() do the clean up. Since some PCI adapters, like GPUs, use the "interrupt-map" property to describe interrupt mappings other than the legacy INTx interrupts, we can not restrict the size of the mapping array to PCI_NUM_INTX. The number of interrupt mappings is computed from the "interrupt-map" property and the mapping array is allocated accordingly. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-15powerpc/powermac: Fix low_sleep_handler with KUAP and KUEPChristophe Leroy1-1/+1
low_sleep_handler() has an hardcoded restore of segment registers that doesn't take KUAP and KUEP into account. Use head_32's load_segment_registers() routine instead. Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection") Fixes: 31ed2b13c48d ("powerpc/32s: Implement Kernel Userspace Execution Prevention.") Cc: [email protected] Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/21b05f7298c1b18f73e6e5b4cd5005aafa24b6da.1599820109.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_PPC_FPUChristophe Leroy1-5/+4
Add a stub for __giveup_fpu() when CONFIG_PPC_FPU is not selected, as done for CONFIG_SPE and CONFIG_ALTIVEC. This allows to remove some #ifdef CONFIG_PPC_FPU. Also change one to IS_ENABLED(). Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/69c8b7954ceeccc6b849e52e1fa41b3a0f10f6c1.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_SPEChristophe Leroy1-14/+7
cpu_has_feature(CPU_FTR_SPE) returns false when CONFIG_SPE is not set. There is no need to enclose the test in an #ifdef CONFIG_SPE. Remove it. CPU_FTR_SPE only exists on 32 bits. Define it as 0 on 64 bits. We have a couple of places like: #ifdef CONFIG_SPE if (cpu_has_feature(CPU_FTR_SPE)) { do_something_that_requires_CONFIG_SPE } else { return -EINVAL; } #else return -EINVAL; #endif Replace them by a cleaner version: if (cpu_has_feature(CPU_FTR_SPE)) { #ifdef CONFIG_SPE do_something_that_requires_CONFIG_SPE #endif } else { return -EINVAL; } When CONFIG_SPE is not set, this resolves to an unconditional return of -EINVAL Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/698df8387555765b70ea42e4a7fa48141c309c1f.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_ALTIVECChristophe Leroy1-4/+0
cpu_has_feature(CPU_FTR_ALTIVEC) returns false when CONFIG_ALTIVEC is not set. There is no need to enclose the test in an #ifdef CONFIG_ALTIVEC. Remove it. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/03ba6b52344ca7c336df2bc6e3d31d736c804ae2.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Remove useless #ifdef CONFIG_VSXChristophe Leroy1-12/+1
cpu_has_feature(CPU_FTR_VSX) returns false when CONFIG_VSX is not set. There is no need to enclose the test in an #ifdef CONFIG_VSX. Remove it. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/0eb61cf0dc66d781d47deb2228498cd61d03a754.1597643221.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Tag an #endif to help locate the matching #ifdef.Christophe Leroy1-1/+1
That #endif is more than 100 lines after the matching #ifdef, and there are several #ifdef/#else/#endif inbetween. Tag it as /* CONFIG_PPC_BOOK3S_64 */ to help locate the matching #ifdef. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/3612a8f8aaca16de3fc414a7e66293319d6e213c.1597643147.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace #ifdef CONFIG_KALLSYMS by IS_ENABLED()Christophe Leroy1-4/+4
The #ifdef CONFIG_KALLSYMS encloses some printk which can compile in all cases. Replace by IS_ENABLED(). Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/2d89732a9062b2cf2651728804e4b8f6c9b9358e.1597643164.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace an #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) ↵Christophe Leroy1-6/+7
by IS_ENABLED() The #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE) encloses some printk which can be compiled in all cases. Replace by IS_ENABLED(). Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/a1b6ef3d657c8f249193442f56868fc358ea5b6c.1597643160.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace an #ifdef CONFIG_PPC_BOOK3S_64 by IS_ENABLED()Christophe Leroy1-3/+1
This #ifdef CONFIG_PPC_BOOK3S_64 calls preload_new_slb_context() when radix is not enabled. radix_enabled() is always defined, and the prototype for preload_new_slb_context() is always present, so the #ifdef is unneeded. Replace it by IS_ENABLED(). Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/d31506ca9bac9def68cf7424eded63fdc4fb6660.1597643167.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/process: Replace an #ifdef CONFIG_PPC_47x by IS_ENABLED()Christophe Leroy1-3/+2
isync() is always defined, no need for an #ifdef. Replace it by IS_ENABLED(CONFIG_PPC_47x). Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/ac8da0e3baa91dda805e1e492fd65aecd90c1fb5.1597643156.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/32: Fix vmap stack - Properly set r1 before activating MMUChristophe Leroy1-14/+29
We need r1 to be properly set before activating MMU, otherwise any new exception taken while saving registers into the stack in exception prologs will use the user stack, which is wrong and will even lockup or crash when KUAP is selected. Do that by switching the meaning of r11 and r1 until we have saved r1 to the stack: copy r1 into r11 and setup the new stack pointer in r1. To avoid complicating and impacting all generic and specific prolog code (and more), copy back r1 into r11 once r11 is save onto the stack. We could get rid of copying r1 back and forth at the cost of rewriting everything to use r1 instead of r11 all the way when CONFIG_VMAP_STACK is set, but the effort is probably not worth it. Fixes: 028474876f47 ("powerpc/32: prepare for CONFIG_VMAP_STACK") Cc: [email protected] Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/8f85e8752ac5af602db7237ef53d634f4f3d3892.1599486108.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/32: Fix vmap stack - Do not activate MMU before reading task structChristophe Leroy2-31/+6
We need r1 to be properly set before activating MMU, so reading task_struct->stack must be done with MMU off. This means we need an additional register to play with MSR bits while r11 now points to the stack. For that, move r10 back to CR (As is already done for hash MMU) and use r10. We still don't have r1 correct yet when we activate MMU. It is done in following patch. Fixes: 028474876f47 ("powerpc/32: prepare for CONFIG_VMAP_STACK") Cc: [email protected] Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/a027d447022a006c9c4958ac734128e577a3c5c1.1599486108.git.christophe.leroy@csgroup.eu
2020-09-15powerpc/tau: Disable TAU between measurementsFinn Thain1-41/+24
Enabling CONFIG_TAU_INT causes random crashes: Unrecoverable exception 1700 at c0009414 (msr=1000) Oops: Unrecoverable exception, sig: 6 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-pmac-00043-gd5f545e1a8593 #5 NIP: c0009414 LR: c0009414 CTR: c00116fc REGS: c0799eb8 TRAP: 1700 Not tainted (5.7.0-pmac-00043-gd5f545e1a8593) MSR: 00001000 <ME> CR: 22000228 XER: 00000100 GPR00: 00000000 c0799f70 c076e300 00800000 0291c0ac 00e00000 c076e300 00049032 GPR08: 00000001 c00116fc 00000000 dfbd3200 ffffffff 007f80a8 00000000 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 c075ce04 GPR24: c075ce04 dfff8880 c07b0000 c075ce04 00080000 00000001 c079ef98 c079ef5c NIP [c0009414] arch_cpu_idle+0x24/0x6c LR [c0009414] arch_cpu_idle+0x24/0x6c Call Trace: [c0799f70] [00000001] 0x1 (unreliable) [c0799f80] [c0060990] do_idle+0xd8/0x17c [c0799fa0] [c0060ba4] cpu_startup_entry+0x20/0x28 [c0799fb0] [c072d220] start_kernel+0x434/0x44c [c0799ff0] [00003860] 0x3860 Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX 3d20c07b XXXXXXXX XXXXXXXX XXXXXXXX 7c0802a6 XXXXXXXX XXXXXXXX XXXXXXXX 4e800421 XXXXXXXX XXXXXXXX XXXXXXXX 7d2000a6 ---[ end trace 3a0c9b5cb216db6b ]--- Resolve this problem by disabling each THRMn comparator when handling the associated THRMn interrupt and by disabling the TAU entirely when updating THRMn thresholds. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <[email protected]> Tested-by: Stan Johnson <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/5a0ba3dc5612c7aac596727331284a3676c08472.1599260540.git.fthain@telegraphics.com.au
2020-09-15powerpc/tau: Check processor type before enabling TAU interruptFinn Thain1-19/+14
According to Freescale's documentation, MPC74XX processors have an erratum that prevents the TAU interrupt from working, so don't try to use it when running on those processors. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <[email protected]> Tested-by: Stan Johnson <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/c281611544768e758bd58fe812cf702a5bd2d042.1599260540.git.fthain@telegraphics.com.au