path: root/arch/powerpc
Age  Commit message  Author  Files  Lines
2019-02-22  powerpc/mm/hash: Increase vmalloc space to 512T with hash MMU  (Michael Ellerman; 1 file, -9/+23)
This patch updates the kernel non-linear virtual map to 512TB when we're built with 64K page size and are using the hash MMU. We allocate one context for the vmalloc region and hence the max virtual area size is limited by the context map size (512TB for 64K and 64TB for 4K page size).

This patch fixes boot failures with large amounts of system RAM where we need large vmalloc space to handle per cpu allocations.

Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Tested-by: Aneesh Kumar K.V <[email protected]>
2019-02-22  powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning  (Michael Ellerman; 1 file, -2/+8)
GCC 8 warns about the logic in vr_get/set(), which with -Werror breaks the build:

  In function ‘user_regset_copyin’,
    inlined from ‘vr_set’ at arch/powerpc/kernel/ptrace.c:628:9:
  include/linux/regset.h:295:4: error: ‘memcpy’ offset [-527, -529] is out of the bounds [0, 16] of object ‘vrsave’ with type ‘union <anonymous>’ [-Werror=array-bounds]
  arch/powerpc/kernel/ptrace.c: In function ‘vr_set’:
  arch/powerpc/kernel/ptrace.c:623:5: note: ‘vrsave’ declared here
   } vrsave;

This has been identified as a regression in GCC, see GCC bug 88273.

However we can avoid the warning and also simplify the logic and make it more robust.

Currently we pass -1 as end_pos to user_regset_copyout(). This says "copy up to the end of the regset".

The definition of the regset is:

  [REGSET_VMX] = {
      .core_note_type = NT_PPC_VMX, .n = 34,
      .size = sizeof(vector128), .align = sizeof(vector128),
      .active = vr_active, .get = vr_get, .set = vr_set
  },

The end is calculated as (n * size), ie. 34 * sizeof(vector128).

In vr_get/set() we pass start_pos as 33 * sizeof(vector128), meaning we can copy up to sizeof(vector128) into/out-of vrsave.

The on-stack vrsave is defined as:

  union {
      elf_vrreg_t reg;
      u32 word;
  } vrsave;

And elf_vrreg_t is:

  typedef __vector128 elf_vrreg_t;

So there is no bug, but we rely on all those sizes lining up; otherwise we would have a kernel stack exposure/overwrite on our hands.

Rather than relying on that, we can pass an explicit end_pos based on the sizeof(vrsave). The result should be exactly the same, but it's more obviously not over-reading/writing the stack and it avoids the compiler warning.

Reported-by: Meelis Roos <[email protected]>
Reported-by: Mathieu Malaterre <[email protected]>
Cc: [email protected]
Tested-by: Mathieu Malaterre <[email protected]>
Tested-by: Meelis Roos <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
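A minimal sketch of the shape of the fix, using the copy helper quoted in the message (illustrative, not the exact diff):

  /* Bound the copy with an explicit end_pos derived from sizeof(vrsave)
   * instead of passing -1 ("copy up to the end of the regset"). */
  const int start = 33 * sizeof(vector128);
  const int end = start + sizeof(vrsave);

  ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
                           &vrsave, start, end);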
2019-02-22  powerpc/powernv/npu: Remove redundant change_pte() hook  (Peter Xu; 1 file, -10/+0)
The change_pte() notifier was designed to be used as a quick path to update secondary MMU PTEs on write permission changes or PFN changes. For KVM, it could reduce the vm-exits when a vcpu faults on pages that were touched up by KSM. It is not used to do cache invalidation; after all, the notifier is called before the real PTE update (see set_pte_at_notify(), where set_pte_at() is called later). All the necessary cache invalidation is already done in invalidate_range().

Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Reviewed-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-22  Merge branch 'topic/ppc-kvm' into next  (Michael Ellerman; 8 files, -91/+54)
Merge commits we're sharing with the kvm-ppc tree.
2019-02-21  powerpc/64s: Better printing of machine check info for guest MCEs  (Paul Mackerras; 4 files, -7/+9)
This adds an "in_guest" parameter to machine_check_print_event_info() so that we can avoid trying to translate guest NIP values into symbolic form using the host kernel's symbol table.

Reviewed-by: Aravinda Prasad <[email protected]>
Reviewed-by: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-21  KVM: PPC: Book3S HV: Simplify machine check handling  (Paul Mackerras; 5 files, -84/+42)
This makes the handling of machine check interrupts that occur inside a guest simpler and more robust, with less done in assembler code and in real mode.

Now, when a machine check occurs inside a guest, we always get the machine check event struct and put a copy in the vcpu struct for the vcpu where the machine check occurred.

We no longer call machine_check_queue_event() from kvmppc_realmode_mc_power7(), because on POWER8, when a vcpu is running on an offline secondary thread and we call machine_check_queue_event(), that calls irq_work_queue(), which doesn't work because the CPU is offline, but instead triggers the WARN_ON(lazy_irq_pending()) in pnv_smp_cpu_kill_self() (which fires again and again because nothing clears the condition). All that machine_check_queue_event() actually does is to cause the event to be printed to the console. For a machine check occurring in the guest, we now print the event in kvmppc_handle_exit_hv() instead.

The assembly code at label machine_check_realmode now just calls C code and then continues exiting the guest. We no longer either synthesize a machine check for the guest in assembly code or return to the guest without a machine check.

The code in kvmppc_handle_exit_hv() is extended to handle the case where the guest is not FWNMI-capable. In that case we now always synthesize a machine check interrupt for the guest. Previously, if the host thought it had recovered the machine check fully, it would return to the guest without any notification that the machine check had occurred. If the machine check was caused by some action of the guest (such as creating duplicate SLB entries), it is much better to tell the guest that it has caused a problem. Therefore we now always generate a machine check interrupt for guests that are not FWNMI-capable.

Reviewed-by: Aravinda Prasad <[email protected]>
Reviewed-by: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
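A loose sketch of the new exit-path behaviour (function names follow the message; mce_flags is a placeholder for however the event's flags reach this point):

  /* in kvmppc_handle_exit_hv(), on a machine check exit: */
  machine_check_print_event_info(&vcpu->arch.mce_evt, false, true);

  /* guests that are not FWNMI-capable always get a synthesized interrupt */
  if (!vcpu->kvm->arch.fwnmi_enabled)
      kvmppc_core_queue_machine_check(vcpu, mce_flags);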
2019-02-21  Merge branch 'topic/dma' into next  (Michael Ellerman; 41 files, -1135/+304)
Merge hch's big DMA rework series. This is in a topic branch in case he wants to merge it to minimise conflicts.
2019-02-21  KVM: PPC: Book3S HV: Context switch AMR on Power9  (Michael Ellerman; 1 file, -1/+4)
kvmhv_p9_guest_entry() implements a fast-path guest entry for Power9 when guest and host are both running with the Radix MMU.

Currently in that path we don't save the host AMR (Authority Mask Register) value, and we always restore 0 on return to the host. That is OK at the moment because the AMR is not used for storage keys with the Radix MMU.

However we plan to start using the AMR on Radix to prevent the kernel from reading/writing to userspace outside of copy_to/from_user(). In order to make that work we need to save/restore the AMR value.

We only restore the value if it is different from the guest value, which is already in the register when we exit to the host. This should mean we rarely need to actually restore the value when running a modern Linux as a guest, because it will be using the same value as us.

Signed-off-by: Michael Ellerman <[email protected]>
Tested-by: Russell Currey <[email protected]>
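Sketched, the entry/exit pairing looks roughly like this (standard SPR accessors; the surrounding fast-path code is elided):

  unsigned long host_amr = mfspr(SPRN_AMR);

  /* ... run the guest via the P9 fast path ... */

  /* the guest's AMR is still live in the register here, so
   * only write the SPR if the host value actually differs */
  if (host_amr != vcpu->arch.amr)
      mtspr(SPRN_AMR, host_amr);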
2019-02-20  KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start  (Nir Weiner; 1 file, -3/+2)
grow_halt_poll_ns() has strange behaviour when (vcpu->halt_poll_ns != 0) && (vcpu->halt_poll_ns < halt_poll_ns_grow_start). In this case, vcpu->halt_poll_ns is multiplied by the grow factor (halt_poll_ns_grow), which can require several grow iterations to reach a value bigger than halt_poll_ns_grow_start. Growing vcpu->halt_poll_ns from a value of 0 is thus faster than growing it from a positive value below halt_poll_ns_grow_start, which is inconsistent and misleading.

Fix the issue by changing grow_halt_poll_ns() to set vcpu->halt_poll_ns to halt_poll_ns_grow_start whenever (vcpu->halt_poll_ns < halt_poll_ns_grow_start), regardless of whether vcpu->halt_poll_ns is 0. Use READ_ONCE() to get a consistent number for all cases.

Reviewed-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Signed-off-by: Nir Weiner <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
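The clamp described above, sketched (variable names per the message):

  grow_start = READ_ONCE(halt_poll_ns_grow_start);
  val = vcpu->halt_poll_ns * READ_ONCE(halt_poll_ns_grow);
  if (val < grow_start)
      val = grow_start;    /* jump straight to the start value */
  vcpu->halt_poll_ns = val;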
2019-02-20  KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter  (Nir Weiner; 1 file, -2/+1)
The hard-coded value 10000 in grow_halt_poll_ns() stands for the initial start value when raising vcpu->halt_poll_ns; in effect it sets the timeout for the first polling session.

This value has a significant effect on how tolerant we are to outliers. In the standard case, a higher value is better: we spend more time in the polling busyloop, handle events/interrupts faster and get better performance. But on outliers it puts us in a busy loop that does nothing; even if the shrink factor is zero, we still waste time on the first iteration. The optimal value changes between different workloads, depending on the outlier rate and on polling-session lengths.

As this value has a significant effect on the dynamic halt-polling algorithm, it should be configurable and exposed.

Reviewed-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Signed-off-by: Nir Weiner <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
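A minimal sketch of the knob (the 10000 ns default mirrors the old hard-coded constant; exact permissions are a detail of the patch):

  static unsigned int halt_poll_ns_grow_start = 10000;  /* ns */
  module_param(halt_poll_ns_grow_start, uint, 0644);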
2019-02-20  KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns  (Nir Weiner; 1 file, -1/+4)
grow_halt_poll_ns() has strange behaviour when (halt_poll_ns_grow == 0) && (vcpu->halt_poll_ns != 0): vcpu->halt_poll_ns is set to zero, which results in shrinking instead of growing. Fix the issue by changing grow_halt_poll_ns() to not modify vcpu->halt_poll_ns when halt_poll_ns_grow is zero.

Reviewed-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Signed-off-by: Nir Weiner <[email protected]>
Suggested-by: Liran Alon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
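The guard, sketched:

  grow = READ_ONCE(halt_poll_ns_grow);
  if (!grow)
      return;    /* growth disabled: leave halt_poll_ns alone, never zero it */
  vcpu->halt_poll_ns *= grow;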
2019-02-20  KVM: Call kvm_arch_memslots_updated() before updating memslots  (Sean Christopherson; 1 file, -1/+1)
kvm_arch_memslots_updated() is at this point in time an x86-specific hook for handling MMIO generation wraparound. x86 stashes 19 bits of the memslots generation number in its MMIO sptes in order to avoid full page fault walks for repeat faults on emulated MMIO addresses. Because only 19 bits are used, wrapping the MMIO generation number is possible, if unlikely. kvm_arch_memslots_updated() alerts x86 that the generation has changed so that it can invalidate all MMIO sptes in case the effective MMIO generation has wrapped, so as to avoid using a stale spte, e.g. a (very) old spte that was created with generation==0.

Given that the purpose of kvm_arch_memslots_updated() is to prevent consuming stale entries, it needs to be called before the new generation is propagated to memslots. Invalidating the MMIO sptes after updating memslots means that there is a window where a vCPU could dereference the new memslots generation, e.g. 0, and incorrectly reuse an old MMIO spte that was created with (pre-wrap) generation==0.

Fixes: e59dbe09f8e6 ("KVM: Introduce kvm_arch_memslots_updated()")
Cc: <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
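The intended ordering, sketched loosely (simplified; the real update path also handles SRCU synchronization):

  gen = slots->generation;
  /* let arch code zap anything keyed to a stale generation... */
  kvm_arch_memslots_updated(kvm, gen);
  /* ...before any vCPU can observe the new memslots/generation */
  rcu_assign_pointer(kvm->memslots[as_id], slots);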
2019-02-20  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  (David S. Miller; 1 file, -2/+2)
Two easily resolvable overlapping change conflicts, one in TCP and one in the eBPF verifier.

Signed-off-by: David S. Miller <[email protected]>
2019-02-19  32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option  (Yury Norov; 1 file, -0/+1)
All new 32-bit architectures should have a 64-bit userspace off_t type, but existing architectures have 32-bit ones. To enforce the rule, a new config option is added to arch/Kconfig so that ARCH_32BIT_OFF_T defaults to disabled for new 32-bit architectures. All existing 32-bit architectures enable it explicitly.

The new option affects force_o_largefile() behaviour: if userspace off_t is 64 bits long, we have no reason to prevent users from opening big files. Note that even if an architecture has only a 64-bit off_t in the kernel (arc, c6x, h8300, hexagon, nios2, openrisc, and unicore32), a libc may use a 32-bit off_t, and therefore want to limit the file size to 4GB unless specified differently in the open flags.

Signed-off-by: Yury Norov <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
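The behavioural knob, sketched (this mirrors the kind of definition the series touches; treat it as illustrative rather than the exact diff):

  /* 64-bit-off_t userspace has no reason to be forced to O_LARGEFILE */
  #ifndef force_o_largefile
  #define force_o_largefile() (!IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T))
  #endif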
2019-02-19  Merge branch 'fixes' into next  (Michael Ellerman; 12 files, -34/+56)
There are a few important fixes in our fixes branch, in particular the pgd/pud_present() one, so merge it now.
2019-02-19  KVM: PPC: Book3S HV: Add KVM stat largepages_[2M/1G]  (Suraj Jitindar Singh; 3 files, -1/+19)
This adds an entry to the kvm_stats_debugfs directory which provides the number of large (2M or 1G) pages which have been used to set up the guest mappings, for radix guests.

Signed-off-by: Suraj Jitindar Singh <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
2019-02-19  KVM: PPC: Release all hardware TCE tables attached to a group  (Alexey Kardashevskiy; 1 file, -1/+0)
The SPAPR TCE KVM device references all hardware IOMMU tables assigned to some IOMMU group to ensure that in-kernel KVM acceleration of H_PUT_TCE can work. The tables are referenced when an IOMMU group gets registered with the VFIO KVM device by the KVM_DEV_VFIO_GROUP_ADD ioctl; KVM_DEV_VFIO_GROUP_DEL calls into the dereferencing code in kvm_spapr_tce_release_iommu_group(), which walks through the list of LIOBNs, finds a matching IOMMU table and calls kref_put() when found.

However, that code stops after the very first successful dereferencing, leaving the other tables referenced until the SPAPR TCE KVM device is destroyed, which normally happens on guest reboot or termination. So if we do hotplug and unplug in a loop, we are leaking IOMMU tables here.

This removes the premature return to let kvm_spapr_tce_release_iommu_group() find and dereference all attached tables.

Fixes: 121f80ba68f ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
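Sketched, the fixed loop keeps walking instead of returning on the first hit (structure names follow the TCE code; the release callback name is approximate):

  list_for_each_entry_rcu(stit, &stt->iommu_tables, next) {
      if (table_group->tables[i] != stit->tbl)
          continue;

      kref_put(&stit->kref, kvm_spapr_tce_liobn_put);
      /* no early return: later list entries may reference
       * tables from the same group too */
  }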
2019-02-19  KVM: PPC: Book3S HV: Optimise mmio emulation for devices on FAST_MMIO_BUS  (Suraj Jitindar Singh; 1 file, -0/+18)
Devices on the KVM_FAST_MMIO_BUS by definition have length zero and are thus used for notification purposes rather than data transfer; for example, eventfd for virtio devices. This means that when emulating mmio instructions which target devices on this bus we can immediately handle them and return without needing to load the instruction from guest memory.

For now we restrict this to stores, as that is the only use case at present. For a normal guest the effect is negligible; however, for a nested guest we save on the order of 5us per access.

Signed-off-by: Suraj Jitindar Singh <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
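Roughly, the fast path tries a zero-length write first (helper names are the generic KVM/book3s ones; the placement in the HV exit handler is illustrative):

  /* a hit on the fast-MMIO bus is an eventfd-style notification:
   * no data to transfer, so skip loading the guest instruction */
  if (kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL) == 0) {
      kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) + 4);  /* step over the store */
      return RESUME_GUEST;
  }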
2019-02-19  KVM: PPC: Book3S: Allow XICS emulation to work in nested hosts using XIVE  (Paul Mackerras; 7 files, -26/+47)
Currently, the KVM code assumes that if the host kernel is using the XIVE interrupt controller (the new interrupt controller that first appeared in POWER9 systems), then the in-kernel XICS emulation will use the XIVE hardware to deliver interrupts to the guest. However, this only works when the host is running in hypervisor mode and has full access to all of the XIVE functionality. It doesn't work in any nested virtualization scenario, either with PR KVM or nested-HV KVM, because the XICS-on-XIVE code calls directly into the native-XIVE routines, which are not initialized and cannot function correctly because they use OPAL calls, and OPAL is not available in a guest. This means that using the in-kernel XICS emulation in a nested hypervisor that is using XIVE as its interrupt controller will cause a (nested) host kernel crash.

To fix this, we change most of the places where the current code calls xive_enabled() to select between the XICS-on-XIVE emulation and the plain XICS emulation to call a new function, xics_on_xive(), which returns false in a guest.

However, there is a further twist. The plain XICS emulation has some functions which are used in real mode and access the underlying XICS controller (the interrupt controller of the host) directly. In the case of a nested hypervisor, this means doing XICS hypercalls directly. When the nested host is using XIVE as its interrupt controller, these hypercalls will fail. Therefore this also adds checks in the places where the XICS emulation wants to access the underlying interrupt controller directly, and if that is XIVE, makes the code use the virtual mode fallback paths, which call generic kernel infrastructure rather than doing direct XICS access.

Signed-off-by: Paul Mackerras <[email protected]>
Reviewed-by: Cédric Le Goater <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
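The selector, sketched under the constraint described above (the exact guard used to detect "not a guest" is a detail of the patch):

  static inline bool xics_on_xive(void)
  {
      /* XICS-on-XIVE needs full (OPAL-backed) XIVE access, which a
       * nested or PR host running as a guest does not have */
      return xive_enabled() && cpu_has_feature(CPU_FTR_HVMODE);
  }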
2019-02-19  KVM: PPC: Remove -I. header search paths  (Masahiro Yamada; 1 file, -5/+0)
The header search path -I. in kernel Makefiles is very suspicious; it allows the compiler to search for headers in the top of $(srctree), where obviously no header file exists.

Commit 46f43c6ee022 ("KVM: powerpc: convert marker probes to event trace") first added these options, but they are completely useless.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
2019-02-19  KVM: PPC: Book3S HV: Replace kmalloc_node+memset with kzalloc_node  (wangbo; 1 file, -3/+1)
Replace kmalloc_node and memset with kzalloc_node.

Signed-off-by: wangbo <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
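The transformation, in general form:

  /* before */
  ptr = kmalloc_node(size, GFP_KERNEL, node);
  memset(ptr, 0, size);

  /* after */
  ptr = kzalloc_node(size, GFP_KERNEL, node);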
2019-02-19  KVM: PPC: Book3S PR: Add emulation for slbfee. instruction  (Paul Mackerras; 4 files, -0/+34)
Recent kernels, since commit e15a4fea4dee ("powerpc/64s/hash: Add some SLB debugging tests", 2018-10-03), use the slbfee. instruction, which PR KVM currently does not have code to emulate. Consequently recent kernels fail to boot under PR KVM. This adds emulation of slbfee., enabling these kernels to boot successfully.

Signed-off-by: Paul Mackerras <[email protected]>
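Loosely, the new emulation case (names and constants approximate; per the ISA, slbfee. searches the SLB for the ESID in rB, returns the entry data in rT, and sets CR0.EQ on a match):

  case OP_31_XOP_SLBFEE:
      if (!vcpu->arch.mmu.slbfee) {
          emulated = EMULATE_FAIL;
      } else {
          ulong b, t;
          ulong cr = kvmppc_get_cr(vcpu) & ~CR0_MASK;

          b = kvmppc_get_gpr(vcpu, rb);
          if (!vcpu->arch.mmu.slbfee(vcpu, b, &t))
              cr |= 2 << CR0_SHIFT;    /* EQ: entry found */
          kvmppc_set_gpr(vcpu, rt, t);
          kvmppc_set_cr(vcpu, cr);
          emulated = EMULATE_DONE;
      }
      break;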
2019-02-19  powerpc/powernv/sriov: Register IOMMU groups for VFs  (Alexey Kardashevskiy; 2 files, -0/+4)
The compound IOMMU group rework moved iommu_register_group() into pnv_pci_ioda_setup_iommu_api() (which is a part of ppc_md.pcibios_fixup). As a result, pnv_ioda_setup_bus_iommu_group() does not create groups any more, it only adds devices to groups.

This works fine for boot-time devices. However, IOMMU groups for SRIOV's VFs were added by pnv_ioda_setup_bus_iommu_group(), so this got broken: pnv_tce_iommu_bus_notifier() expects a group to be registered for a VF and it is not.

This adds the missing group registration, and adds a NULL pointer check into the bus notifier so we won't crash if there is no group, although that is not expected to happen now because of the change above.

Example oops seen prior to this patch:

  $ echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs
  Unable to handle kernel paging request for data at address 0x00000030
  Faulting instruction address: 0xc0000000004a6018
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  CPU: 46 PID: 7006 Comm: bash Not tainted 4.15-ish
  NIP: c0000000004a6018 LR: c0000000004a6014 CTR: 0000000000000000
  REGS: c000008fc876b400 TRAP: 0300 Not tainted (4.15-ish)
  MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
  CFAR: c000000000d0be20 DAR: 0000000000000030 DSISR: 40000000 SOFTE: 1
  ...
  NIP sysfs_do_create_link_sd.isra.0+0x68/0x150
  LR  sysfs_do_create_link_sd.isra.0+0x64/0x150
  Call Trace:
    pci_dev_type+0x0/0x30 (unreliable)
    iommu_group_add_device+0x8c/0x600
    iommu_add_device+0xe8/0x180
    pnv_tce_iommu_bus_notifier+0xb0/0xf0
    notifier_call_chain+0x9c/0x110
    blocking_notifier_call_chain+0x64/0xa0
    device_add+0x524/0x7d0
    pci_device_add+0x248/0x450
    pci_iov_add_virtfn+0x294/0x3e0
    pci_enable_sriov+0x43c/0x580
    mlx5_core_sriov_configure+0x15c/0x2f0 [mlx5_core]
    sriov_numvfs_store+0x180/0x240
    dev_attr_store+0x3c/0x60
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1ac/0x240
    __vfs_write+0x3c/0x70
    vfs_write+0xd8/0x220
    SyS_write+0x6c/0x110
    system_call+0x58/0x6c

Fixes: 0bd971676e68 ("powerpc/powernv/npu: Add compound IOMMU groups")
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reported-by: Santwana Samantray <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
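The notifier guard, sketched (field access approximate; the device-registration call itself is elided):

  case BUS_NOTIFY_ADD_DEVICE:
      /* a VF can reach here with no IOMMU group registered for it;
       * bail out instead of dereferencing a NULL group */
      if (!dev->iommu_group)
          return 0;
      /* ... add the device to its group as before ... */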
2019-02-18  powerpc/dma: trim the fat from <asm/dma-mapping.h>  (Christoph Hellwig; 6 files, -29/+14)
There is no need to provide anything but get_arch_dma_ops to <linux/dma-mapping.h>. Move the remaining declarations to <asm/iommu.h> and drop all the includes.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove set_dma_offset  (Christoph Hellwig; 7 files, -19/+10)
There is no good reason for this helper; just open-code it.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
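In general form (the bypass-base example is illustrative):

  /* before */
  set_dma_offset(&pdev->dev, pe->tce_bypass_base);

  /* after (open-coded) */
  pdev->dev.archdata.dma_offset = pe->tce_bypass_base;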
2019-02-18  powerpc/dma: remove get_dma_offset  (Christoph Hellwig; 2 files, -18/+6)
Just fold the calculation into __phys_to_dma/__dma_to_phys, as those are the only places that should know about it.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: use the generic direct mapping bypass  (Christoph Hellwig; 24 files, -221/+16)
Now that we've switched all the powerpc nommu and swiotlb methods to use the generic dma_direct_* calls, we can remove these ops vectors entirely and rely on the common direct mapping bypass that avoids indirect function calls entirely. This also allows us to remove a whole lot of boilerplate code related to setting up these operations.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: use the dma_direct mapping routines  (Christoph Hellwig; 5 files, -120/+32)
Switch the streaming DMA mapping and ownership transfer methods to the functionally identical dma_direct_ versions. Factor the cache maintenance helpers into the form expected by the common code for that.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: use the dma-direct allocator for coherent platforms  (Christoph Hellwig; 5 files, -92/+9)
The generic code allows a few nice things such as node-local allocations and dipping into the CMA area. The lookup of the right zone for a given dma mask works a little differently, but the results should be the same.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  swiotlb: remove swiotlb_dma_supported  (Christoph Hellwig; 1 file, -1/+1)
The only user left is powerpc, but even there the generic dma-direct version works just as well, given that we guarantee that the swiotlb buffer must always be addressable.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove dma_nommu_dma_supported  (Christoph Hellwig; 3 files, -26/+2)
This function is largely identical to the generic version used everywhere else. Replace it with the generic version.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove dma_nommu_get_required_mask  (Christoph Hellwig; 3 files, -15/+2)
This function is identical to the generic dma_direct_get_required_mask, except that the generic version also takes the bus_dma_mask into account, so the powerpc version could produce incorrect results.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove dma_nommu_mmap_coherent  (Christoph Hellwig; 7 files, -28/+6)
The coherent cache version of this function is already functionally identical to the default version, and by defining the arch_dma_coherent_to_pfn hook the same is true for the noncoherent version as well.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: use phys_to_dma instead of get_dma_offset  (Christoph Hellwig; 1 file, -5/+5)
Use the standard portable helper instead of the powerpc-specific one, which is about to go away.

Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
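In general form:

  /* before (powerpc-specific) */
  dma_addr = paddr + get_dma_offset(dev);

  /* after (portable) */
  dma_addr = phys_to_dma(dev, paddr);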
2019-02-18  dma-mapping, powerpc: simplify the arch dma_set_mask override  (Christoph Hellwig; 7 files, -22/+16)
Instead of letting the architecture supply all of dma_set_mask, just give it an additional hook selected by Kconfig.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: fix an off-by-one in dma_capable  (Christoph Hellwig; 1 file, -6/+2)
We need to compare the last byte in the dma range, and not the one after it, for the bus_dma_mask, just like we do for the regular dma_mask. Fix this cleanly by merging the two comparisons into one.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
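Sketched, the merged check compares the last byte of the range once against both masks (bus_dma_mask as the struct device field was named at the time):

  static inline bool dma_capable(struct device *dev, dma_addr_t addr,
                                 size_t size)
  {
      dma_addr_t end = addr + size - 1;    /* last byte, not one past it */

      return end <= min_not_zero(*dev->dma_mask, dev->bus_dma_mask);
  }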
2019-02-18  powerpc/dma: remove max_direct_dma_addr  (Christoph Hellwig; 5 files, -31/+6)
The max_direct_dma_addr duplicates the bus_dma_mask field in struct device. Use the generic field instead.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: move pci_dma_dev_setup_swiotlb to fsl_pci.c  (Christoph Hellwig; 3 files, -13/+9)
pci_dma_dev_setup_swiotlb is only used by the fsl_pci code, and closely related to it, so fsl_pci.c seems like a better place for it.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove get_pci_dma_ops  (Christoph Hellwig; 3 files, -17/+8)
This function is only used by the Cell iommu code, which can just as well keep track internally of whether it is using the iommu.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: remove the iommu fallback for coherent allocations  (Christoph Hellwig; 2 files, -70/+2)
All iommu-capable platforms now always use the iommu code with the internal bypass, so there is no need for this magic anymore.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/pci: remove the dma_set_mask pci_controller ops methods  (Christoph Hellwig; 2 files, -9/+0)
Unused now.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dma: stop overriding dma_get_required_mask  (Christoph Hellwig; 5 files, -36/+0)
The ppc_md and pci_controller_ops methods are unused now and can be removed. The dma_nommu implementation is identical to the generic one, except for using max_pfn instead of calling into the memblock API, and all other dma_map_ops instances implement a method of their own.

Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/powernv: use the generic iommu bypass code  (Christoph Hellwig; 1 file, -70/+25)
Use the generic iommu bypass code instead of overriding set_dma_mask.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/powernv: remove pnv_npu_dma_set_mask  (Christoph Hellwig; 1 file, -9/+0)
These devices are not PCIe devices and do not have associated dma map ops, so this is just dead code.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/powernv: remove pnv_pci_ioda_pe_single_vendor  (Christoph Hellwig; 1 file, -26/+2)
This function is completely bogus: the fact that two PCIe devices come from the same vendor has absolutely nothing to say about their DMA capabilities and characteristics.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dart: use the generic iommu bypass code  (Christoph Hellwig; 1 file, -30/+17)
Use the generic iommu bypass code instead of overriding set_dma_mask.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/dart: remove dead cleanup code in iommu_init_early_dart  (Christoph Hellwig; 1 file, -10/+1)
If dart_init failed, we didn't have a chance to set up dma or controller ops yet, so there is no point in resetting them.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/cell: use the generic iommu bypass code  (Christoph Hellwig; 4 files, -135/+20)
This gets rid of a lot of clumsy code and finally allows us to mark dma_iommu_ops const.

Includes fixes from Michael Ellerman.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/cell: move dma direct window setup out of dma_configure  (Christoph Hellwig; 1 file, -9/+15)
Configure the dma settings at device setup time, and stop playing games with get_pci_dma_ops. This prepares for using the common dma_configure code later on.

Includes fixes from Michael Ellerman.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18  powerpc/pseries: use the generic iommu bypass code  (Christoph Hellwig; 1 file, -73/+27)
Use the generic iommu bypass code instead of overriding set_dma_mask.

Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>