aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/platforms/powernv
AgeCommit message (Collapse)AuthorFilesLines
2020-08-25powerpc/powernv: Include asm/powernv.h from the local powernv.hOliver O'Halloran2-0/+9
The asm/powernv.h header provides prototypes for functions which need to be called by non-powernv platform code. Also include it in the powernv.h that's local to the platform directory to squash some warnings about non-static functions missing prototypes. Also include powernv.h since from opal-memcons.c since it has the prototypes for the memcons wrangling functions which are used for the opal and ultravisor msglog. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-08-25powerpc/powernv/smp: Fix spurious DBG() warningOliver O'Halloran1-1/+1
When building with W=1 we get the following warning: arch/powerpc/platforms/powernv/smp.c: In function ‘pnv_smp_cpu_kill_self’: arch/powerpc/platforms/powernv/smp.c:276:16: error: suggest braces around empty body in an ‘if’ statement [-Werror=empty-body] 276 | cpu, srr1); | ^ cc1: all warnings being treated as errors The full context is this block: if (srr1 && !generic_check_cpu_restart(cpu)) DBG("CPU%d Unexpected exit while offline srr1=%lx!\n", cpu, srr1); When building with DEBUG undefined DBG() expands to nothing and GCC emits the warning due to the lack of braces around an empty statement. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-08-25powerpc/powernv: Remove set but not used variable 'parent'zhengbin1-8/+0
Fix gcc '-Wunused-but-set-variable' warning: arch/powerpc/platforms/powernv/pci-ioda.c: In function pnv_ioda_configure_pe: arch/powerpc/platforms/powernv/pci-ioda.c:867:18: warning: variable parent set but not used [-Wunused-but-set-variable] It is not used since commit b131a8425c34 ("powerpc/powernv: Set PELTV for compound PEs") Reported-by: Hulk Robot <[email protected]> Signed-off-by: zhengbin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-08-25ocxl: Remove custom service to allocate interruptsFrederic Barrat1-30/+0
We now allocate interrupts through xive directly. Signed-off-by: Frederic Barrat <[email protected]> Reviewed-by: Cédric Le Goater <[email protected]> Reviewed-by: Greg Kurz <[email protected]> Acked-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-08-20powerpc/powernv/pci: Fix possible crash when releasing DMA resourcesFrederic Barrat1-1/+1
Fix a typo introduced during recent code cleanup, which could lead to silently not freeing resources or an oops message (on PCI hotplug or CAPI reset). Only impacts ioda2, the code path for ioda1 is correct. Fixes: 01e12629af4e ("powerpc/powernv/pci: Add explicit tracking of the DMA setup state") Signed-off-by: Frederic Barrat <[email protected]> Reviewed-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-08-03powerpc/powernv/sriov: Fix use of uninitialised variableOliver O'Halloran1-3/+1
Initialising the value before using it is generally regarded as a good idea so do that. Fixes: 4c51f3e1e870 ("powerpc/powernv/sriov: Make single PE mode a per-BAR setting") Reported-by: Nathan Chancellor <[email protected]> Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-29powerpc/powernv/sriov: Remove unused but set variable 'phb'Wei Yongjun1-2/+0
Gcc report warning as follows: arch/powerpc/platforms/powernv/pci-sriov.c:602:25: warning: variable 'phb' set but not used [-Wunused-but-set-variable] 602 | struct pnv_phb *phb; | ^~~ This variable is not used, so this commit removing it. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Acked-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-29powerpc: Use fallthrough pseudo-keywordGustavo A. R. Silva1-1/+1
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/20200727224201.GA10133@embeddedor
2020-07-27powerpc/powernv/pci.h: delete duplicated wordRandy Dunlap1-1/+1
Drop the repeated word "for". Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Remove vfs_expandedOliver O'Halloran2-9/+3
Previously iov->vfs_expanded was used for two purposes. 1) To work out how much we need to multiple the per-VF BAR size to figure out the total space required for the IOV BAR. 2) To indicate that IOV is not usable with this device (vfs_expanded == 0). We don't really need the field for either since the multiple in 1) is always the number PEs supported by the PHB. Similarly, we don't really need it in 2) either since the IOV data field will be NULL if we can't use IOV with the device. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Make single PE mode a per-BAR settingOliver O'Halloran2-69/+73
Using single PE BARs to map an SR-IOV BAR is really a choice about what strategy to use when mapping a BAR. It doesn't make much sense for this to be a global setting since a device might have one large BAR which needs to be mapped with single PE windows and another smaller BAR that can be mapped with a regular segmented window. Make the segmented vs single decision a per-BAR setting and clean up the logic that decides which mode to use. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Refactor M64 BAR setupOliver O'Halloran1-30/+27
Split up the logic so that we have one branch that handles setting up a segmented window and another that handles setting up single PE windows for each VF. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Move M64 BAR allocation into a helperOliver O'Halloran1-11/+20
I want to refactor the loop this code is currently inside of. Hoist it on out. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: De-indent setup and teardownOliver O'Halloran2-41/+49
Remove the IODA2 PHB checks. We already assume IODA2 in several places so there's not much point in wrapping most of the setup and teardown process in an if block. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Drop iov->pe_num_map[]Oliver O'Halloran2-88/+28
Currently the iov->pe_num_map[] does one of two things depending on whether single PE mode is being used or not. When it is, this contains an array which maps a vf_index to the corresponding PE number. When single PE mode is not being used this contains a scalar which is the base PE for the set of enabled VFs (for for VFn is base + n). The array was necessary because when calling pnv_ioda_alloc_pe() there is no guarantee that the allocated PEs would be contigious. We can now allocate contigious blocks of PEs so this is no longer an issue. This allows us to drop the if (single_mode) {} .. else {} block scattered through the SR-IOV code which is a nice clean up. This also fixes a bug in pnv_pci_sriov_disable() which is the non-atomic bitmap_clear() to manipulate the PE allocation map. Other users of the map assume it will be accessed with atomic ops. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/pci: Refactor pnv_ioda_alloc_pe()Oliver O'Halloran2-9/+34
Rework the PE allocation logic to allow allocating blocks of PEs rather than individually. We'll use this to allocate contigious blocks of PEs for the SR-IOVs. This patch also adds code to pnv_ioda_alloc_pe() and pnv_ioda_reserve_pe() to use the existing, but unused, phb->pe_alloc_mutex. Currently these functions use atomic bit ops to release a currently allocated PE number. However, the pnv_ioda_alloc_pe() wants to have exclusive access to the bit map while scanning for hole large enough to accomodate the allocation size. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Factor out M64 BAR setupOliver O'Halloran1-29/+100
The sequence required to use the single PE BAR mode is kinda janky and requires a little explanation. The API was designed with P7-IOC style windows where the setup process is something like: 1. Configure the window start / end address 2. Enable the window 3. Map the segments of each window to the PE For Single PE BARs the process is: 1. Set the PE for segment zero on a disabled window 2. Set the range 3. Enable the window Move the OPAL calls into their own helper functions where the quirks can be contained. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Simplify used window trackingOliver O'Halloran2-36/+22
No need for the multi-dimensional arrays, just use a bitmap. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Rename truncate_iovOliver O'Halloran1-5/+6
This prevents SR-IOV being used by making the SR-IOV BAR resources unallocatable. Rename it to reflect what it actually does. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Explain how SR-IOV works on PowerNVOliver O'Halloran1-0/+130
SR-IOV support on PowerNV is a byzantine maze of hooks. I have no idea how anyone is supposed to know how it works except through a lot of stuffering. Write up some docs about the overall story to help out the next sucker^Wperson who needs to tinker with it. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/sriov: Move SR-IOV into a separate fileOliver O'Halloran4-655/+733
pci-ioda.c is getting a bit unwieldly due to the amount of stuff jammed in there. The SR-IOV support can be extracted easily enough and is mostly standalone, so move it into a separate file. This patch also moves the PowerNV SR-IOV specific fields from pci_dn and moves them into a platform specific structure. I'm not sure how they ended up in there in the first place, but leaking platform specifics into common code has proven to be a terrible idea so far so lets stop doing that. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/pci: Initialise M64 for IODA1 as a 1-1 windowOliver O'Halloran1-30/+25
We pre-configure the m64 window for IODA1 as a 1-1 segment-PE mapping, similar to PHB3. Currently the actual mapping of segments occurs in pnv_ioda_pick_m64_pe(), but we can move it into pnv_ioda1_init_m64() and drop the IODA1 specific code paths in the PE setup / teardown. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/pci: Add explicit tracking of the DMA setup stateOliver O'Halloran2-19/+36
There's an optimisation in the PE setup which skips performing DMA setup for a PE if we only have bridges in a PE. The assumption being that only "real" devices will DMA to system memory, which is probably fair. However, if we start off with only bridge devices in a PE then add a non-bridge device the new device won't be able to use DMA because we never configured it. Fix this (admittedly pretty weird) edge case by tracking whether we've done the DMA setup for the PE or not. If a non-bridge device is added to the PE (via rescan or hotplug, or whatever) we can set up DMA on demand. This also means the only remaining user of the old "DMA Weight" code is the IODA1 DMA setup code that it was originally added for, which is good. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/pci: Always tear down DMA windows on PE releaseOliver O'Halloran1-27/+3
Currently we have these two functions: pnv_pci_ioda2_release_dma_pe(), and pnv_pci_ioda2_release_pe_dma() The first is used when tearing down VF PEs and the other is used for normal devices. There's very little difference between the two though. The latter (non-VF) will skip a call to pnv_pci_ioda2_unset_window() unless CONFIG_IOMMU_API=y is set. There's no real point in doing this so fold the two together. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/powernv/pci: Add pci_bus_to_pnvhb() helperOliver O'Halloran3-74/+38
Add a helper to go from a pci_bus structure to the pnv_phb that hosts that bus. There's a lot of instances of the following pattern: struct pci_controller *hose = pci_bus_to_host(pdev->bus); struct pnv_phb *phb = hose->private_data; Without any other uses of the pci_controller inside the function. This is hard to read since it requires you to memorise the contents of the private data fields and kind of error prone since it involves blindly assigning a void pointer. Add a helper to make it more concise and explicit. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/eeh: Move PE tree setup into the platformOliver O'Halloran1-1/+26
The EEH core has a concept of a "PE tree" to support PowerNV. The PE tree follows the PCI bus structures because a reset asserted on an upstream bridge will be propagated to the downstream bridges. On pseries there's a 1-1 correspondence between what the guest sees are a PHB and a PE so the "tree" is really just a single node. Current the EEH core is reponsible for setting up this PE tree which it does by traversing the pci_dn tree. The structure of the pci_dn tree matches the bus tree on PowerNV which leads to the PE tree being "correct" this setup method doesn't make a whole lot of sense and it's actively confusing for the pseries case where it doesn't really do anything. We want to remove the dependence on pci_dn anyway so this patch move choosing where to insert a new PE into the platform code rather than being part of the generic EEH code. For PowerNV this simplifies the tree building logic and removes the use of pci_dn. For pseries we keep the existing logic. I'm not really convinced it does anything due to the 1-1 PE-to-PHB correspondence so every device under that PHB should be in the same PE, but I'd rather not remove it entirely until we've had a chance to look at it more deeply. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/eeh: Rename eeh_{add_to|remove_from}_parent_pe()Oliver O'Halloran1-1/+1
The naming of eeh_{add_to|remove_from}_parent_pe() doesn't really reflect what they actually do. If the PE referred to be edev->pe_config_addr already exists under that PHB then the edev is added to that PE. However, if the PE doesn't exist the a new one is created for the edev. The bulk of the implementation of eeh_add_to_parent_pe() covers that second case. Similarly, most of eeh_remove_from_parent_pe() is determining when it's safe to delete a PE. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/eeh: Remove class code field from edevOliver O'Halloran1-3/+2
The edev->class_code field is never referenced anywhere except for the platform specific probe functions. The same information is available in the pci_dev for PowerNV and in the pci_dn on pseries so we can remove the field. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/eeh: Pass eeh_dev to eeh_ops->{read|write}_config()Oliver O'Halloran1-19/+24
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/eeh: Pass eeh_dev to eeh_ops->restore_config()Oliver O'Halloran1-4/+2
Mechanical conversion of the eeh_ops interfaces to use eeh_dev to reference a specific device rather than pci_dn. No functional changes. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/eeh: Remove VF config space restorationOliver O'Halloran1-14/+6
There's a bunch of strange things about this code. First up is that none of the fields being written to are functional for a VF. The SR-IOV specification lists then as "Reserved, but OS should preserve" so writing new values to them doesn't do anything and is clearly wrong from a correctness perspective. However, since VFs are designed to be managed by the OS there is an argument to be made that we should be saving and restoring some parts of config space. We already sort of do that by saving the first 64 bytes of config space in the eeh_dev (see eeh_dev->config_space[]). This is inadequate since it doesn't even consider saving and restoring the PCI capability structures. However, this is a problem with EEH in general and that needs to be fixed for non-VF devices too. There's no real reason to keep around this around so delete it. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-26powerpc/eeh: Kill off eeh_ops->get_pe_addr()Oliver O'Halloran1-13/+0
This is used in precisely one place which is in pseries specific platform code. There's no need to have the callback in eeh_ops since the platform chooses the EEH PE addresses anyway. The PowerNV implementation has always been a stub too so remove it. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-23powerpc/powernv/idle: Exclude mfspr on HID1, 4, 5 on P9 and abovePratik Rajesh Sampat1-3/+3
POWER9 onwards the support for the registers HID1, HID4, HID5 has been receded. Although mfspr on the above registers worked in Power9, In Power10 simulator is unrecognized. Moving their assignment under the check for machines lower than Power9 Signed-off-by: Pratik Rajesh Sampat <[email protected]> Reviewed-by: Gautham R. Shenoy <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-23powerpc/powernv/idle: Rename pnv_first_spr_loss_level variablePratik Rajesh Sampat1-9/+9
Replace the variable name from using "pnv_first_spr_loss_level" to "deep_spr_loss_state". pnv_first_spr_loss_level is supposed to be the earliest state that has OPAL_PM_LOSE_FULL_CONTEXT set, in other places the kernel uses the "deep" states as terminology. Hence renaming the variable to be coherent to its semantics. Signed-off-by: Pratik Rajesh Sampat <[email protected]> Acked-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-23powerpc/powernv/idle: Replace CPU feature check with PVR checkPratik Rajesh Sampat1-1/+1
The POWER9 idle driver contains implementation-specific details that means it is not suitable to run on any processor that implements ISA v3.0 (e.g., POWER10), so only init the driver when running on a POWER9. Signed-off-by: Pratik Rajesh Sampat <[email protected]> [mpe: Use updated change log from Nick] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-22powerpc/perf: BHRB control to disable BHRB logic when not usedAthira Rajeev1-2/+20
PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB). BHRB disable is controlled via Monitor Mode Control Register A (MMCRA) bit, namely "BHRB Recording Disable (BHRBRD)". This field controls whether BHRB entries are written when BHRB recording is enabled by other bits. This patch implements support for this BHRB disable bit. By setting 0b1 to this bit will disable the BHRB and by setting 0b0 to this bit will have BHRB enabled. This addresses backward compatibility (for older OS), since this bit will be cleared and hardware will be writing to BHRB by default. This patch addresses changes to set MMCRA (BHRBRD) at boot for power10 (there by the core will run faster) and enable this feature only on runtime ie, on explicit need from user. Also save/restore MMCRA in the restore path of state-loss idle state to make sure we keep BHRB disabled if it was not enabled on request at runtime. Signed-off-by: Athira Rajeev <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-20powerpc/mm/radix: Create separate mappings for hot-plugged memoryAneesh Kumar K.V1-1/+9
To enable memory unplug without splitting kernel page table mapping, we force the max mapping size to the LMB size. LMB size is the unit in which hypervisor will do memory add/remove operation. Pseries systems supports max LMB size of 256MB. Hence on pseries, we now end up mapping memory with 2M page size instead of 1G. To improve that we want hypervisor to hint the kernel about the hotplug memory range. That was added that as part of commit b6eca183e23e ("powerpc/kernel: Enables memory hot-remove after reboot on pseries guests") But PowerVM doesn't provide that hint yet. Once we get PowerVM updated, we can then force the 2M mapping only to hot-pluggable memory region using memblock_is_hotpluggable(). Till then let's depend on LMB size for finding the mapping page size for linear range. With this change KVM guest will also be doing linear mapping with 2M page size. The actual TLB benefit of mapping guest page table entries with hugepage size can only be materialized if the partition scoped entries are also using the same or higher page size. A guest using 1G hugetlbfs backing guest memory can have a performance impact with the above change. Signed-off-by: Bharata B Rao <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]> [mpe: Fold in fix from Aneesh spotted by [email protected]] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-18Merge branch 'fixes' into nextMichael Ellerman1-1/+1
Merge our fixes branch, primarily to bring in the ebb selftests build fix and the pkey fix, which is a dependency for some future work.
2020-07-15powerpc/vas: Report proper error code for address translation failureHaren Myneni1-1/+1
P9 DD2 NX workbook (Table 4-36) says DMA controller uses CC=5 internally for translation fault handling. NX reserves CC=250 for OS to notify user space when NX encounters address translation failure on the request buffer. Not an issue in earlier releases as NX does not get faults on kernel addresses. This patch defines CSB_CC_FAULT_ADDRESS(250) and updates CSB.CC with this proper error code for user space. Fixes: c96c4436aba4 ("powerpc/vas: Update CSB and notify process for fault CRBs") Signed-off-by: Haren Myneni <[email protected]> [mpe: Added Fixes tag and fix typo in comment] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-15powerpc/powernv: Move pnv_ioda_setup_bus_dma under CONFIG_IOMMU_APIOliver O'Halloran1-13/+13
pnv_ioda_setup_bus_dma() is only used when a passed through PE is returned to the host. If the kernel is built without IOMMU support this is dead code. Move it under the #ifdef with the rest of the IOMMU API support. Reported-by: kernel test robot <[email protected]> Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-15powerpc/powernv: Make pnv_pci_sriov_enable() and friends staticOliver O'Halloran1-4/+4
The kernel test robot noticed these are non-static which causes Clang to print some warnings. These are called via ppc_md function pointers so there's no need for them to be non-static. Reported-by: kernel test robot <[email protected]> Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-22powerpc/powernv/ioda: Return correct error if TCE level allocation failedAlexey Kardashevskiy1-1/+1
The iommu_table_ops::xchg_no_kill() callback updates TCE. It is quite possible that not entire table is allocated if it is huge and multilevel so xchg may also allocate subtables. If failed, it returns H_HARDWARE for failed allocation and H_TOO_HARD if it needs it but cannot do because the alloc parameter is "false" (set when called with MMU=off to force retry with MMU=on). The problem is that having separate errors only matters in real mode (MMU=off) but the only caller with alloc="false" does not check the exact error code and simply returns H_TOO_HARD; and for every other mode alloc is "true". Also, the function is also called from the ioctl() handler of the VFIO SPAPR TCE IOMMU subdriver which does not expect hypervisor error codes (H_xxx) and will expose them to the userspace. This converts wrong error codes to -ENOMEM. Signed-off-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-10kernel: better document the use_mm/unuse_mm API contractChristoph Hellwig1-2/+2
Switch the function documentation to kerneldoc comments, and add WARN_ON_ONCE asserts that the calling thread is a kernel thread and does not have ->mm set (or has ->mm set in the case of unuse_mm). Also give the functions a kthread_ prefix to better document the use case. [[email protected]: fix a comment typo, cover the newly merged use_mm/unuse_mm caller in vfio] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: powerpc/vas: fix up for {un}use_mm() rename] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Jens Axboe <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Acked-by: Felix Kuehling <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> [usb] Acked-by: Haren Myneni <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Al Viro <[email protected]> Cc: Felipe Balbi <[email protected]> Cc: Jason Wang <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Zhenyu Wang <[email protected]> Cc: Zhi Wang <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-05Merge tag 'powerpc-5.8-1' of ↵Linus Torvalds15-316/+1230
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Support for userspace to send requests directly to the on-chip GZIP accelerator on Power9. - Rework of our lockless page table walking (__find_linux_pte()) to make it safe against parallel page table manipulations without relying on an IPI for serialisation. - A series of fixes & enhancements to make our machine check handling more robust. - Lots of plumbing to add support for "prefixed" (64-bit) instructions on Power10. - Support for using huge pages for the linear mapping on 8xx (32-bit). - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver. - Removal of some obsolete 40x platforms and associated cruft. - Initial support for booting on Power10. - Lots of other small features, cleanups & fixes. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram Sang, Xiongfeng Wang. * tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits) powerpc/pseries: Make vio and ibmebus initcalls pseries specific cxl: Remove dead Kconfig options powerpc: Add POWER10 architected mode powerpc/dt_cpu_ftrs: Add MMA feature powerpc/dt_cpu_ftrs: Enable Prefixed Instructions powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected powerpc: Add support for ISA v3.1 powerpc: Add new HWCAP bits powerpc/64s: Don't set FSCR bits in INIT_THREAD powerpc/64s: Save FSCR to init_task.thread.fscr after feature init powerpc/64s: Don't let DT CPU features set FSCR_DSCR powerpc/64s: Don't init FSCR_DSCR in __init_FSCR() powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations powerpc/module_64: Consolidate ftrace code powerpc/32: Disable KASAN with pages bigger than 16k powerpc/uaccess: Don't set KUEP by default on book3s/32 powerpc/uaccess: Don't set KUAP by default on book3s/32 powerpc/8xx: Reduce time spent in allow_user_access() and friends ...
2020-05-28powerpc/powernv/pci: Sprinkle around some WARN_ON()sOliver O'Halloran1-2/+2
pnv_pci_ioda_configure_bus() should now only ever be called when a device is added to the bus so add a WARN_ON() to the empty bus check. Similarly, pnv_pci_ioda_setup_bus_PE() should only ever be called for an unconfigured PE, so add a WARN_ON() for that case too. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-28powerpc/powernv/pci: Reserve the root bus PE during initOliver O'Halloran2-18/+9
Doing it once during boot rather than doing it on the fly and drop the janky populated logic. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-28powerpc/powernv/pci: Re-work bus PE configurationOliver O'Halloran1-51/+30
For normal PHBs IODA PEs are handled on a per-bus basis so all the devices on that bus will share a PE. Which PE specificly is determined by the location of the MMIO BARs for the devices on the bus so we can't actually configure the bus PEs until after MMIO resources are allocated. As a result PEs are currently configured by pcibios_setup_bridge(), which is called just before the bridge windows are programmed into the bus' parent bridge. Configuring the bus PE here causes a few problems: 1. The root bus doesn't have a parent bridge so setting up the PE for the root bus requires some hacks. 2. The PELT-V isn't setup correctly because pnv_ioda_set_peltv() assumes that PEs will be configured in root-to-leaf order. This assumption is broken because resource assignment is performed depth-first so the leaf bridges are setup before their parents are. The hack mentioned in 1) results in the "correct" PELT-V for busses immediately below the root port, but not for devices below a switch. 3. It's possible to break the sysfs PCI rescan feature by removing all the devices on a bus. When the last device is removed from a PE its will be de-configured. Rescanning the devices on a bus does not cause the bridge to be reconfigured rendering the devices on that bus unusable. We can address most of these problems by moving the PE setup out of pcibios_setup_bridge() and into pcibios_bus_add_device(). This fixes 1) and 2) because pcibios_bus_add_device() is called on each device in root-to-leaf order so PEs for parent buses will always be configured before their children. It also fixes 3) by ensuring the PE is configured before initialising DMA for the device. In the event the PE was de-configured due to removing all the devices in that PE it will now be reconfigured when a new device is added since there's no dependecy on the bridge_setup() hook being called. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-28powerpc/powernv/pci: Add helper to find ioda_pe from BDFNOliver O'Halloran2-0/+11
For each PHB we maintain a reverse-map that can be used to find the PE that a BDFN is currently mapped to. Add a helper for doing this lookup so we can check if a PE has been configured without looking at pdn->pe_number. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-28powerpc/powernv/pci: Add an explaination for PNV_IODA_PE_BUS_ALLOliver O'Halloran1-0/+18
It's pretty obsecure and confused me for a long time so I figured it's worth documenting properly. Signed-off-by: Oliver O'Halloran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-05-28powerpc/powernv: Add a print indicating when an IODA PE is releasedOliver O'Halloran1-0/+2
Quite useful to know in some cases. Signed-off-by: Oliver O'Halloran <[email protected]> Reviewed-by: Sam Bobroff <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]