Age | Commit message (Collapse) | Author | Files | Lines |
|
At the moment we create a small window only for 32bit devices, the window
maps 0..2GB of the PCI space only. For other devices we either use
a sketchy bypass or hardware bypass but the former can only work if
the amount of RAM is no bigger than the device's DMA mask and the latter
requires devices to support at least 59bit DMA.
This extends the default DMA window to the maximum size possible to allow
a wider DMA mask than just 32bit. The default window size is now limited
by the the iommu_table::it_map allocation bitmap which is a contiguous
array, 1 bit per an IOMMU page.
This increases the default IOMMU page size from hard coded 4K to
the system page size to allow wider DMA masks.
This increases the level number to not exceed the max order allocation
limit per TCE level. By the same time, this keeps minimal levels number
as 2 in order to save memory.
As the extended window now overlaps the 32bit MMIO region, this adds
an area reservation to iommu_init_table().
After this change the default window size is 0x80000000000==1<<43 so
devices limited to DMA mask smaller than the amount of system RAM can
still use more than just 2GB of memory for DMA.
This is an optimization and not a bug fix for DMA API usage.
With the on-demand allocation of indirect TCE table levels enabled and
2 levels, the first TCE level size is just
1<<ceil((log2(0x7ffffffffff+1)-16)/2)=16384 TCEs or 2 system pages.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- Removal of the NPU DMA code, used by the out-of-tree Nvidia driver,
as well as some other functions only used by drivers that haven't
(yet?) made it upstream.
- A fix for a bug in our handling of hardware watchpoints (eg. perf
record -e mem: ...) which could lead to register corruption and
kernel crashes.
- Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for
vmalloc when using the Radix MMU.
- A large but incremental rewrite of our exception handling code to
use gas macros rather than multiple levels of nested CPP macros.
And the usual small fixes, cleanups and improvements.
Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab,
Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann,
Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe
Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis
Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert
Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz,
Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro
Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N.
Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi
Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher
Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj
Jitindar Singh, Thiago Jung Bauermann, YueHaibing"
* tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits)
powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.
powerpc/eeh: Handle hugepages in ioremap space
ocxl: Update for AFU descriptor template version 1.1
powerpc/boot: pass CONFIG options in a simpler and more robust way
powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h
powerpc/irq: Don't WARN continuously in arch_local_irq_restore()
powerpc/module64: Use symbolic instructions names.
powerpc/module32: Use symbolic instructions names.
powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h
powerpc/module64: Fix comment in R_PPC64_ENTRY handling
powerpc/boot: Add lzo support for uImage
powerpc/boot: Add lzma support for uImage
powerpc/boot: don't force gzipped uImage
powerpc/8xx: Add microcode patch to move SMC parameter RAM.
powerpc/8xx: Use IO accessors in microcode programming.
powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c
powerpc/8xx: refactor programming of microcode CPM params.
powerpc/8xx: refactor printing of microcode patch name.
powerpc/8xx: Refactor microcode write
powerpc/8xx: refactor writing of CPM microcode arrays
...
|
|
On most arches having function flush_dcache_range(), including PPC32,
this function does a writeback and invalidation of the cache bloc.
On PPC64, flush_dcache_range() only does a writeback while
flush_inval_dcache_range() does the invalidation in addition.
In addition it looks like within arch/powerpc/, there are no PPC64
platforms using flush_dcache_range()
This patch drops the existing 64 bits version of flush_dcache_range()
and renames flush_inval_dcache_range() into flush_dcache_range().
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not write to the free software foundation inc
59 temple place suite 330 boston ma 02111 1307 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 1334 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Reviewed-by: Richard Fontana <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Make the memblock_phys_alloc() function an inline wrapper for
memblock_phys_alloc_range() and update the memblock_phys_alloc() callers
to check the returned value and panic in case of error.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Guo Ren <[email protected]> [c-sky]
Cc: Heiko Carstens <[email protected]>
Cc: Juergen Gross <[email protected]> [Xen]
Cc: Mark Salter <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Merge more updates from Andrew Morton:
- some of the rest of MM
- various misc things
- dynamic-debug updates
- checkpatch
- some epoll speedups
- autofs
- rapidio
- lib/, lib/lzo/ updates
* emailed patches from Andrew Morton <[email protected]>: (83 commits)
samples/mic/mpssd/mpssd.h: remove duplicate header
kernel/fork.c: remove duplicated include
include/linux/relay.h: fix percpu annotation in struct rchan
arch/nios2/mm/fault.c: remove duplicate include
unicore32: stop printing the virtual memory layout
MAINTAINERS: fix GTA02 entry and mark as orphan
mm: create the new vm_fault_t type
arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
arch: simplify several early memory allocations
openrisc: simplify pte_alloc_one_kernel()
sh: prefer memblock APIs returning virtual address
microblaze: prefer memblock API returning virtual address
powerpc: prefer memblock APIs returning virtual address
lib/lzo: separate lzo-rle from lzo
lib/lzo: implement run-length encoding
lib/lzo: fast 8-byte copy on arm64
lib/lzo: 64-bit CTZ on arm64
lib/lzo: tidy-up ifdefs
ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
ipc: annotate implicit fall through
...
|
|
Patch series "memblock: simplify several early memory allocation", v4.
These patches simplify some of the early memory allocations by replacing
usage of older memblock APIs with newer and shinier ones.
Quite a few places in the arch/ code allocated memory using a memblock
API that returns a physical address of the allocated area, then
converted this physical address to a virtual one and then used memset(0)
to clear the allocated range.
More recent memblock APIs do all the three steps in one call and their
usage simplifies the code.
It's important to note that regardless of API used, the core allocation
is nearly identical for any set of memblock allocators: first it tries
to find a free memory with all the constraints specified by the caller
and then falls back to the allocation with some or all constraints
disabled.
The first three patches perform the conversion of call sites that have
exact requirements for the node and the possible memory range.
The fourth patch is a bit one-off as it simplifies openrisc's
implementation of pte_alloc_one_kernel(), and not only the memblock
usage.
The fifth patch takes care of simpler cases when the allocation can be
satisfied with a simple call to memblock_alloc().
The sixth patch removes one-liner wrappers for memblock_alloc on arm and
unicore32, as suggested by Christoph.
This patch (of 6):
There are a several places that allocate memory using memblock APIs that
return a physical address, convert the returned address to the virtual
address and frequently also memset(0) the allocated range.
Update these places to use memblock allocators already returning a
virtual address. Use memblock functions that clear the allocated memory
instead of calling memset(0) where appropriate.
The calls to memblock_alloc_base() that were not followed by memset(0)
are replaced with memblock_alloc_try_nid_raw(). Since the latter does
not panic() when the allocation fails, the appropriate panic() calls are
added to the call sites.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Vincent Chen <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Michal Simek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is no good reason for this helper, just opencode it.
Signed-off-by: Christoph Hellwig <[email protected]>
Tested-by: Christian Zigotzky <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Use the generic iommu bypass code instead of overriding set_dma_mask.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
If dart_init failed we didn't have a chance to setup dma or controller
ops yet, so there is no point in resetting them.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Make it explicit that the caller gets a physical address rather than a
virtual one.
This will also allow using meblock_alloc prefix for memblock allocations
returning virtual address, which is done in the following patches.
The conversion is done using the following semantic patch:
@@
expression e1, e2, e3;
@@
(
- memblock_alloc(e1, e2)
+ memblock_phys_alloc(e1, e2)
|
- memblock_alloc_nid(e1, e2, e3)
+ memblock_phys_alloc_nid(e1, e2, e3)
|
- memblock_alloc_try_nid(e1, e2, e3)
+ memblock_phys_alloc_try_nid(e1, e2, e3)
)
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Richard Kuo <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Serge Semin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious
reason. It looks like it's only a convenience, so remove kmemleak.h
from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that
don't already #include it. Also remove <linux/kmemleak.h> from source
files that do not use it.
This is tested on i386 allmodconfig and x86_64 allmodconfig. It would
be good to run it through the 0day bot for other $ARCHes. I have
neither the horsepower nor the storage space for the other $ARCHes.
Update: This patch has been extensively build-tested by both the 0day
bot & kisskb/ozlabs build farms. Both of them reported 2 build failures
for which patches are included here (in v2).
[ slab.h is the second most used header file after module.h; kernel.h is
right there with slab.h. There could be some minor error in the
counting due to some #includes having comments after them and I didn't
combine all of those. ]
[[email protected]: security/keys/big_key.c needs vmalloc.h, per sfr]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://kisskb.ellerman.id.au/kisskb/head/13396/
Signed-off-by: Randy Dunlap <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Reported-by: Michael Ellerman <[email protected]> [2 build failures]
Reported-by: Fengguang Wu <[email protected]> [2 build failures]
Reviewed-by: Andrew Morton <[email protected]>
Cc: Wei Yongjun <[email protected]>
Cc: Luis R. Rodriguez <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Mimi Zohar <[email protected]>
Cc: John Johansen <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We want to use the dma_direct_ namespace for a generic implementation,
so rename powerpc to the second best choice: dma_nommu_.
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Vineet Gupta <[email protected]>
Acked-by: Robin Murphy <[email protected]>
Acked-by: Hans-Christian Noren Egtvedt <[email protected]>
Acked-by: Mark Salter <[email protected]> [c6x]
Acked-by: Jesper Nilsson <[email protected]> [cris]
Acked-by: Daniel Vetter <[email protected]> [drm]
Reviewed-by: Bart Van Assche <[email protected]>
Acked-by: Joerg Roedel <[email protected]> [iommu]
Acked-by: Fabien Dessenne <[email protected]> [bdisp]
Reviewed-by: Marek Szyprowski <[email protected]> [vb2-core]
Acked-by: David Vrabel <[email protected]> [xen]
Acked-by: Konrad Rzeszutek Wilk <[email protected]> [xen swiotlb]
Acked-by: Joerg Roedel <[email protected]> [iommu]
Acked-by: Richard Kuo <[email protected]> [hexagon]
Acked-by: Geert Uytterhoeven <[email protected]> [m68k]
Acked-by: Gerald Schaefer <[email protected]> [s390]
Acked-by: Bjorn Andersson <[email protected]>
Acked-by: Hans-Christian Noren Egtvedt <[email protected]> [avr32]
Acked-by: Vineet Gupta <[email protected]> [arc]
Acked-by: Robin Murphy <[email protected]> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Instead of punching a hole in the linear mapping, just use normal
cachable memory, and apply the flush sequence documented in the
CPC625 (aka U3) user manual.
This allows us to remove quite a bit of code related to the early
allocation of the DART and the hole in the linear mapping. We can
also get rid of the copy of the DART for suspend/resume as the
original memory can just be saved/restored now, as long as we
properly sync the caches.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
[mpe: Integrate dart_init() fix to return ENODEV when DART disabled]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Now that the table and the offset can co-exist, we no longer need
to flip/flop, we can just establish both once at boot time.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This adds a iommu_table_ops struct and puts pointer to it into
the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush
callbacks from ppc_md to the new struct where they really belong to.
This adds the requirement for @it_ops to be initialized before calling
iommu_init_table() to make sure that we do not leave any IOMMU table
with iommu_table_ops uninitialized. This is not a parameter of
iommu_init_table() though as there will be cases when iommu_init_table()
will not be called on TCE tables, for example - VFIO.
This does s/tce_build/set/, s/tce_free/clear/ and removes "tce_"
redundant prefixes.
This removes tce_xxx_rm handlers from ppc_md but does not add
them to iommu_table_ops as this will be done later if we decide to
support TCE hypercalls in real mode. This removes _vm callbacks as
only virtual mode is supported by now so this also removes @rm parameter.
For pSeries, this always uses tce_buildmulti_pSeriesLP/
tce_buildmulti_pSeriesLP. This changes multi callback to fall back to
tce_build_pSeriesLP/tce_free_pSeriesLP if FW_FEATURE_MULTITCE is not
present. The reason for this is we still have to support "multitce=off"
boot parameter in disable_multitce() and we do not want to walk through
all IOMMU tables in the system and replace "multi" callbacks with single
ones.
For powernv, this defines _ops per PHB type which are P5IOC2/IODA1/IODA2.
This makes the callbacks for them public. Later patches will extend
callbacks for IODA1/2.
No change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Now that we have ported the calls to iommu_init_early_dart to always
supply a pci_controller_ops struct, we can safely drop the check.
Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Remove shims, patch callsites to use pci_controller_ops
versions instead.
Also move back the probe mode defines, as explained in the patch
for pci_probe_mode.
Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
If a pci_controller_ops struct is provided to iommu_init_early_dart,
populate that with the DMA setup ops, rather than ppc_md. If NULL is
provided, populate ppc_md as before.
This also patches the call sites for Maple and Power Mac to pass
NULL, so existing behaviour is preserved.
The benefit of making this optional is that it means we don't have
to change dart, Maple and Power Mac over to the controller_ops
system in one fell swoop.
Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The DART table allocation is registered to kmemleak via the
memblock_alloc_base() call. However, the DART table is later unmapped
and dart_tablebase VA no longer accessible. This patch tells kmemleak
not to scan this block and avoid an unhandled paging request.
Signed-off-by: Catalin Marinas <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
Commit d084775738b746648d4102337163a04534a02982 switched the generic
powerpc iommu backend code to use the it_page_shift field to determine
page size. Commit 3a553170d35d69bea3877bffa508489dfa6f133d should have
initiliased this field for all platforms, however the DART iommu table
code was not updated.
This commit initialises the it_page_shift field to 4K for the DART
iommu.
Signed-off-by: Alistair Popple <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
There are a number of issues in the recent IOMMU pools code:
- On a preempt kernel we might switch CPUs in the middle of building
a scatter gather list. When this happens the handle hint passed in
no longer falls within the local CPU's pool. Check for this and
fall back to the pool hint.
- We were missing a spin_unlock/spin_lock in one spot where we
switch pools.
- We need to provide locking around dart_tlb_invalidate_all and
dart_tlb_invalidate_one now that the global lock is gone.
Reported-by: Alexander Graf <[email protected]>
Signed-off-by: Anton Blanchard <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
CC: <[email protected]> [v3.6]
|
|
These days they are just wrappers around __pa() and __va() respectively.
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
Several fixes as well where the +1 was missing.
Done via coccinelle scripts like:
@@
struct resource *ptr;
@@
- ptr->end - ptr->start + 1
+ resource_size(ptr)
and some grep and typing.
Mostly uncompiled, no cross-compilers.
Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
No need to set the device tree device_node pci node iommu pointer, its
only used for dlpar remove.
Signed-off-by: Milton Miller <[email protected]>
Signed-off-by: Nishanth Aravamudan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The PCI-Express bus off the U4/CPC945 bridge supports direct DMA to
all of memory, bypassing the DART iommu, for 64-bit capable devices.
This adds support for it on Bimini and Apple Quad G5's in order to
improve DMA performances of cards using that slot (the x16 graphics
slot). Tested with an Intel ixgbe 10GE card.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
via following scripts
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/lmb/memblock/g' \
-e 's/LMB/MEMBLOCK/g' \
$FILES
for N in $(find . -name lmb.[ch]); do
M=$(echo $N | sed 's/lmb/memblock/g')
mv $N $M
done
and remove some wrong change like lmbench and dlmb etc.
also move memblock.c from lib/ to mm/
Suggested-by: Ingo Molnar <[email protected]>
Acked-by: "H. Peter Anvin" <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <[email protected]>
Guess-its-ok-by: Christoph Lameter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Lee Schermerhorn <[email protected]>
|
|
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.
Signed-off-by: André Goddard Rosa <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
Sometimes this is used to hold a simple offset, and sometimes
it is used to hold a pointer. This patch changes it to a union containing
void * and dma_addr_t. get/set accessors are also provided, because it was
getting a bit ugly to get to the actual data.
Signed-off-by: Becky Bruce <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
To support Cooperative Memory Overcommitment (CMO), we need to check
for failure from some of the tce hcalls.
These changes for the pseries platform affect the powerpc architecture;
patches for the other affected platforms are included in this patch.
pSeries platform IOMMU code changes:
* platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and
return an error.
Architecture IOMMU code changes:
* Calls to ppc_md.tce_build need to check return values and return
DMA_MAPPING_ERROR for transient errors.
Architecture changes:
* struct machdep_calls for tce_build*_pSeriesLP functions need to change
to indicate failure.
* all other platforms will need updates to iommu functions to match the new
calling semantics; they will return 0 on success. The other platforms
default configs have been built, but no further testing was performed.
Signed-off-by: Robert Jennings <[email protected]>
Acked-by: Olof Johansson <[email protected]>
Acked-by: Paul Mackerras <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
Update iommu_alloc() to take the struct dma_attrs and pass them on to
tce_build(). This change propagates down to the tce_build functions of
all the platforms.
Signed-off-by: Mark Nelson <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
Signed-off-by: David S. Miller <[email protected]>
|
|
These functions are only called from __init functions.
WARNING: vmlinux.o(.text+0x398f4): Section mismatch: reference to .init.text:.lmb_alloc (between '.iommu_init_early_dart' and '.pci_dma_bus_setup_dart')
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
This implements save and restore hooks for IOMMUs and implements
it the DART iommu.
Signed-off-by: Johannes Berg <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Cc: Olof Johansson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
This will allow us to build without PCI easier.
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
This patch adds full cell iommu support (and iommu disabled mode).
It implements mapping/unmapping of iommu pages on demand using the
standard powerpc iommu framework. It also supports running with
iommu disabled for machines with less than 2GB of memory. (The
default is off in that case, though it can be forced on with the
kernel command line option iommu=force).
Signed-off-by: Jeremy Kerr <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
This patch completely refactors DMA operations for 64 bits powerpc. 32 bits
is untouched for now.
We use the new dev_archdata structure to add the dma operations pointer
and associated data to struct device. While at it, we also add the OF node
pointer and numa node. In the future, we might want to look into merging
that with pci_dn as well.
The old vio, pci-iommu and pci-direct DMA ops are gone. They are now replaced
by a set of generic iommu and direct DMA ops (non PCI specific) that can be
used by bus types. The toplevel implementation is now inline.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
The 10Gigabit ethernet device drivers appear to be able to chew
up all 256MB of TCE mappings on pSeries systems, as evidenced by
numerous error messages:
iommu_alloc failed, tbl c0000000010d5c48 vaddr c0000000d875eff0 npages 1
Some experimentation indicates that this is essentially because
one 1500 byte ethernet MTU gets mapped as a 64K DMA region when
the large 64K pages are enabled. Thus, it doesn't take much to
exhaust all of the available DMA mappings for a high-speed card.
This patch changes the iommu allocator to work with its own
unique, distinct page size. Although the patch is long, its
actually quite simple: it just #defines a distinct IOMMU_PAGE_SIZE
and then uses this in all the places that matter.
As a side effect, it also dramatically improves network performance
on platforms with H-calls on iommu translation inserts/removes (since
we no longer call it 16 times for a 1500 bytes packet when the iommu HW
is still 4k).
In the future, we might want to make the IOMMU_PAGE_SIZE a variable
in the iommu_table instance, thus allowing support for different HW
page sizes in the iommu itself.
Signed-off-by: Linas Vepstas <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Olof Johansson <[email protected]>
Acked-by: Stephen Rothwell <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
It seems that the occasional data corruption observed with the tg3
driver wasn't due to missing barriers after all, but rather seems to
be due to the DART (= IOMMU) in the U4 northbridge reading stale
IOMMU table entries from memory due to a race. This fixes it by
making the CPU read the entry back from memory before using it.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
Signed-off-by: Jörn Engel <[email protected]>
Signed-off-by: Adrian Bunk <[email protected]>
|
|
Better late than never...
Respin based on previous comment. Only remaining issue last time was an
extra mb() that I've taken out.
Signed-off-by: Paul Mackerras <[email protected]>
|
|
Allocate IOMMU tables local to the relevant node.
Signed-off-by: Anton Blanchard <[email protected]>
Acked-by: Olof Johansson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
Turn on the DART already at 1GB. This is needed because of crippled
devices in some systems, i.e. Airport Extreme cards, only supporting
30-bit DMA addresses.
Otherwise, users with between 1 and 2GB of memory will need to manually
enable it with iommu=force, and that's no good.
Some simple performance tests show that there's a slight impact of
enabling DART, but it's in the 1-3% range (kernel build with disk I/O
as well as over NFS).
iommu=off can still be used for those who don't want to deal with the
overhead (and don't need it for any devices).
Signed-off-by: Olof Johansson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
|
|
it's int __iomem *, not int * __iomem...
Signed-off-by: Al Viro <[email protected]>
|
|
Currently most callers of lmb_alloc() don't check if it worked or not, if it
ever does weird bad things will probably happen. The few callers who do check
just panic or BUG_ON.
So make lmb_alloc() panic internally, to catch bugs at the source. The few
callers who did check the result no longer need to.
The only caller that did anything interesting with the return result was
careful_allocation(). For it we create __lmb_alloc_base() which _doesn't_ panic
automatically, a little messy, but passable.
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
Rpn is assigned every time in the loop, no need to increase it too.
Signed-off-by: Olof Johansson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
The patch enabling the new G5's with U4 broke initialization of the DART
driver, causing it to trigger a BUG_ON for a case that is actually
valid. This patch fixes it:
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|