aboutsummaryrefslogtreecommitdiff
path: root/arch/arm/mm
AgeCommit message (Collapse)AuthorFilesLines
2016-03-17mm: remove VM_FAULT_MINORJan Kara1-1/+1
The define has a comment from Nick Piggin from 2007: /* For backwards compat. Remove me quickly. */ I guess 9 years should not be too hurried sense of 'quickly' even for kernel measures. Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-17mm: cleanup *pte_alloc* interfacesKirill A. Shutemov2-4/+4
There are few things about *pte_alloc*() helpers worth cleaning up: - 'vma' argument is unused, let's drop it; - most __pte_alloc() callers do speculative check for pmd_none(), before taking ptl: let's introduce pte_alloc() macro which does the check. The only direct user of __pte_alloc left is userfaultfd, which has different expectation about atomicity wrt pmd. - pte_alloc_map() and pte_alloc_map_lock() are redefined using pte_alloc(). [[email protected]: fix build for arm64 hugetlbpage] [[email protected]: fix arch/arm/mm/mmu.c some more] Signed-off-by: Kirill A. Shutemov <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Sudeep Holla <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-06Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds1-0/+3
Pull ARM fixes from Russell King: "Just two ARM fixes this time: one to fix the hyp-stub for older ARM CPUs, and another to fix the set_memory_xx() permission functions to deal with zero sizes correctly" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8544/1: set_memory_xx fixes ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
2016-03-04Merge branches 'amba', 'fixes', 'misc' and 'tauros2' into for-nextRussell King8-96/+259
2016-03-04ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0Rabin Vincent1-37/+128
Given a device which uses arm_coherent_dma_ops and on which dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA API with gfp=0 results in memory corruption and a memory leak. p = dma_alloc_coherent(dev, sz, &dma, 0); if (p) dma_free_coherent(dev, sz, p, dma); The memory leak is because the alloc allocates using __alloc_simple_buffer() but the free attempts dma_release_from_contiguous() which does not do free anything since the page is not in the CMA area. The memory corruption is because the free calls __dma_remap() on a page which is backed by only first level page tables. The apply_to_page_range() + __dma_update_pte() loop ends up interpreting the section mapping as an addresses to a second level page table and writing the new PTE to memory which is not used by page tables. We don't have access to the GFP flags used for allocation in the free function. Fix this by adding allocator backends and using this information in the free function so that we always use the correct release routine. Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA") Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-03-04ARM: 8547/1: dma-mapping: store buffer informationRabin Vincent1-1/+48
Keep a list of allocated DMA buffers so that we can store metadata in alloc() which we later need in free(). Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-03-04ARM: 8544/1: set_memory_xx fixesMika Penttilä1-0/+3
Allow zero size updates. This makes set_memory_xx() consistent with x86, s390 and arm64 and makes apply_to_page_range() not to BUG() when loading modules. Signed-off-by: Mika Penttilä [email protected] Signed-off-by: Russell King <[email protected]>
2016-02-27mm: ASLR: use get_random_long()Daniel Cashman1-1/+1
Replace calls to get_random_int() followed by a cast to (unsigned long) with calls to get_random_long(). Also address shifting bug which, in case of x86 removed entropy mask for mmap_rnd_bits values > 31 bits. Signed-off-by: Daniel Cashman <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: David S. Miller <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Al Viro <[email protected]> Cc: Nick Kralevich <[email protected]> Cc: Jeff Vander Stoep <[email protected]> Cc: Mark Salyzyn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-02-22ARM: 8535/1: mm: DEBUG_RODATA makes no sense with XIP_KERNELArnd Bergmann1-1/+1
When CONFIG_DEBUG_ALIGN_RODATA is set, we get a link error: arch/arm/mm/built-in.o:(.data+0x4bc): undefined reference to `__start_rodata_section_aligned' However, this combination is useless, as XIP_KERNEL implies that all the RODATA is already marked readonly, so both CONFIG_DEBUG_RODATA and CONFIG_DEBUG_ALIGN_RODATA (which depends on the other) are not needed with XIP_KERNEL, and this patches enforces that using a Kconfig dependency. Signed-off-by: Arnd Bergmann <[email protected]> Fixes: 25362dc496ed ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA") Acked-by: Ard Biesheuvel <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-02-17ARM: make the physical-relative calculation more obviousRussell King1-1/+1
The physical-relative calculation between the XIP text and data sections introduced by the previous patch was far from obvious. Let's simplify it by turning it into a macro which takes the two (virtual) addresses. This allows us to arrange the calculation in a more obvious manner - we can make it two sub-expressions which calculate the physical address for each symbol, and then takes the difference of those physical addresses. Signed-off-by: Russell King <[email protected]>
2016-02-16ARM: 8512/1: proc-v7.S: Adjust stack address when XIP_KERNELNicolas Pitre1-1/+1
When XIP_KERNEL is enabled, the virt to phys address translation for RAM is not the same as the virt to phys address translation for .text. The only way to know where physical RAM is located is to use PLAT_PHYS_OFFSET. The MACRO will be useful for other places where there is a similar problem. Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Chris Brandt <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-02-11ARM: 8502/1: mm: mark section-aligned portion of rodata NXKees Cook1-3/+4
When rodata is large enough that it crosses a section boundary after the kernel text, mark the rest NX. This is as close to full NX of rodata as we can get without splitting page tables or doing section alignment via CONFIG_DEBUG_ALIGN_RODATA. When the config is: CONFIG_DEBUG_RODATA=y # CONFIG_DEBUG_ALIGN_RODATA is not set Before: ---[ Kernel Mapping ]--- 0x80000000-0x80100000 1M RW NX SHD 0x80100000-0x80a00000 9M ro x SHD 0x80a00000-0xa0000000 502M RW NX SHD After: ---[ Kernel Mapping ]--- 0x80000000-0x80100000 1M RW NX SHD 0x80100000-0x80700000 6M ro x SHD 0x80700000-0x80a00000 3M ro NX SHD 0x80a00000-0xa0000000 502M RW NX SHD Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-02-11ARM: 8518/1: Use correct symbols for XIP_KERNELChris Brandt1-2/+6
For an XIP build, _etext does not represent the end of the binary image that needs to stay mapped into the MODULES_VADDR area. Years ago, data came before text in the memory map. However, now that the order is text/init/data, an XIP_KERNEL needs to map up to the data location in order to keep from cutting off parts of the kernel that are needed. We only map up to the beginning of data because data has already been copied, so there's no reason to keep it around anymore. A new symbol is created to make it clear what it is we are referring to. This fixes the bug where you might lose the end of your kernel area after page table setup is complete. Signed-off-by: Chris Brandt <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-02-11ARM: 8507/1: dma-mapping: Use DMA_ATTR_ALLOC_SINGLE_PAGES hint to optimize allocDoug Anderson1-0/+4
If we know that TLB efficiency will not be an issue when memory is accessed then it's not terribly important to allocate big chunks of memory. The whole point of allocating the big chunks was that it would make TLB usage efficient. As Marek Szyprowski indicated: Please note that mapping memory with larger pages significantly improves performance, especially when IOMMU has a little TLB cache. This can be easily observed when multimedia devices do processing of RGB data with 90/270 degree rotation Image rotation is distinctly an operation that needs to bounce around through memory, so it makes sense that TLB efficiency is important there. Video decoding, on the other hand, is a fairly sequential operation. During video decoding it's not expected that we'll be jumping all over memory. Decoding video is also pretty heavy and the TLB misses aren't a huge deal. Presumably most HW video acceleration users of dma-mapping will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES. Allocating big chunks of memory is quite expensive, especially if we're doing it repeadly and memory is full. In one (out of tree) usage model it is common that arm_iommu_alloc_attrs() is called 16 times in a row, each one trying to allocate 4 MB of memory. This is called whenever the system encounters a new video, which could easily happen while the memory system is stressed out. In fact, on certain social media websites that auto-play video and have infinite scrolling, it's quite common to see not just one of these 16x4MB allocations but 2 or 3 right after another. Asking the system even to do a small amount of extra work to give us big chunks in this case is just not a good use of time. Allocating big chunks of memory is also expensive indirectly. Even if we ask the system not to do ANY extra work to allocate _our_ memory, we're still potentially eating up all big chunks in the system. Presumably there are other users in the system that aren't quite as flexible and that actually need these big chunks. By eating all the big chunks we're causing extra work for the rest of the system. We also may start making other memory allocations fail. While the system may be robust to such failures (as is the case with dwc2 USB trying to allocate buffers for Ethernet data and with WiFi trying to allocate buffers for WiFi data), it is yet another big performance hit. Signed-off-by: Douglas Anderson <[email protected]> Acked-by: Marek Szyprowski <[email protected]> Tested-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-02-11ARM: 8505/1: dma-mapping: Optimize allocationDoug Anderson1-14/+20
The __iommu_alloc_buffer() is expected to be called to allocate pretty sizeable buffers. Upon simple tests of video I saw it trying to allocate 4,194,304 bytes. The function tries to allocate large chunks in order to optimize IOMMU TLB usage. The current function is very, very slow. One problem is the way it keeps trying and trying to allocate big chunks. Imagine a very fragmented memory that has 4M free but no contiguous pages at all. Further imagine allocating 4M (1024 pages). We'll do the following memory allocations: - For page 1: - Try to allocate order 10 (no retry) - Try to allocate order 9 (no retry) - ... - Try to allocate order 0 (with retry, but not needed) - For page 2: - Try to allocate order 9 (no retry) - Try to allocate order 8 (no retry) - ... - Try to allocate order 0 (with retry, but not needed) - ... - ... Total number of calls to alloc() calls for this case is: sum(int(math.log(i, 2)) + 1 for i in range(1, 1025)) => 9228 The above is obviously worse case, but given how slow alloc can be we really want to try to avoid even somewhat bad cases. I timed the old code with a device under memory pressure and it wasn't hard to see it take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing was done on kernel 3.14, so possibly mainline would behave differently). A second problem is that allocating big chunks under memory pressure when we don't need them is just not a great idea anyway unless we really need them. We can make due pretty well with smaller chunks so it's probably wise to leave bigger chunks for other users once memory pressure is on. Let's adjust the allocation like this: 1. If a big chunk fails, stop trying to hard and bump down to lower order allocations. 2. Don't try useless orders. The whole point of big chunks is to optimize the TLB and it can really only make use of 2M, 1M, 64K and 4K sizes. We'll still tend to eat up a bunch of big chunks, but that might be the right answer for some users. A future patch could possibly add a new DMA_ATTR that would let the caller decide that TLB optimization isn't important and that we should use smaller chunks. Presumably this would be a sane strategy for some callers. Signed-off-by: Douglas Anderson <[email protected]> Acked-by: Marek Szyprowski <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Reviewed-by: Tomasz Figa <[email protected]> Tested-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-02-08ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATAKees Cook2-25/+28
The use of CONFIG_DEBUG_RODATA is generally seen as an essential part of kernel self-protection: http://www.openwall.com/lists/kernel-hardening/2015/11/30/13 Additionally, its name has grown to mean things beyond just rodata. To get ARM closer to this, we ought to rearrange the names of the configs that control how the kernel protects its memory. What was called CONFIG_ARM_KERNMEM_PERMS is realy doing the work that other architectures call CONFIG_DEBUG_RODATA. This redefines CONFIG_DEBUG_RODATA to actually do the bulk of the ROing (and NXing). In the place of the old CONFIG_DEBUG_RODATA, use CONFIG_DEBUG_ALIGN_RODATA, since that's what the option does: adds section alignment for making rodata explicitly NX, as arm does not split the page tables like arm64 does without _ALIGN_RODATA. Also adds human readable names to the sections so I could more easily debug my typos, and makes CONFIG_DEBUG_RODATA default "y" for CPU_V7. Results in /sys/kernel/debug/kernel_page_tables for each config state: # CONFIG_DEBUG_RODATA is not set # CONFIG_DEBUG_ALIGN_RODATA is not set ---[ Kernel Mapping ]--- 0x80000000-0x80900000 9M RW x SHD 0x80900000-0xa0000000 503M RW NX SHD CONFIG_DEBUG_RODATA=y CONFIG_DEBUG_ALIGN_RODATA=y ---[ Kernel Mapping ]--- 0x80000000-0x80100000 1M RW NX SHD 0x80100000-0x80700000 6M ro x SHD 0x80700000-0x80a00000 3M ro NX SHD 0x80a00000-0xa0000000 502M RW NX SHD CONFIG_DEBUG_RODATA=y # CONFIG_DEBUG_ALIGN_RODATA is not set ---[ Kernel Mapping ]--- 0x80000000-0x80100000 1M RW NX SHD 0x80100000-0x80a00000 9M ro x SHD 0x80a00000-0xa0000000 502M RW NX SHD Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Laura Abbott <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-02-08ARM: make virt_to_idmap() return unsigned longRussell King1-1/+1
Make virt_to_idmap() return an unsigned long rather than phys_addr_t. Returning phys_addr_t here makes no sense, because the definition of virt_to_idmap() is that it shall return a physical address which maps identically with the virtual address. Since virtual addresses are limited to 32-bit, identity mapped physical addresses are as well. Almost all users already had an implicit narrowing cast to unsigned long so let's make this official and part of this interface. Tested-by: Grygorii Strashko <[email protected]> Signed-off-by: Russell King <[email protected]>
2016-01-22tree wide: use kvfree() than conditional kfree()/vfree()Tetsuo Handa1-9/+2
There are many locations that do if (memory_was_allocated_by_vmalloc) vfree(ptr); else kfree(ptr); but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory using is_vmalloc_addr(). Unless callers have special reasons, we can replace this branch with kvfree(). Please check and reply if you found problems. Signed-off-by: Tetsuo Handa <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Jan Kara <[email protected]> Acked-by: Russell King <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: Oleg Drokin <[email protected]> Cc: Boris Petkov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-01-20Merge tag 'armsoc-multiplatform' of ↵Linus Torvalds4-19/+19
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC multiplatform code updates from Arnd Bergmann: "This branch is the culmination of 5 years of effort to bring the ARMv6 and ARMv7 platforms together such that they can all be enabled and boot the same kernel. It has been a tremendous amount of cleanup and refactoring by a huge number of people, and creation of several new (and major) subsystems to better abstract out all the platform details in an appropriate manner. The bulk of this branch is a large patchset from Arnd that brings several of the more minor and older platforms we have closer to multiplatform support. Among these are MMP, S3C64xx, Orion5x, mv78xx0 and realview Much of this is moving around header files from old mach directories, but there are also some cleanup patches of debug_ll (lowlevel debug per-platform options) and other parts. Linus Walleij also has some patchs to clean up the older ARM Realview platforms by finally introducing DT support, and Rob Herring has some for ARM Versatile which is now DT-only. Both of these platforms are now multiplatform. Finally, a couple of patches from Russell for Dove PMU, and a fix from Valentin Rothberg for Exynos ADC, which were rebased on top of the series to avoid conflicts" * tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (75 commits) ARM: realview: don't select SMP_ON_UP for UP builds ARM: s3c: simplify s3c_irqwake_{e,}intallow definition ARM: s3c64xx: fix pm-debug compilation iio: exynos-adc: fix irqf_oneshot.cocci warnings ARM: realview: build realview-dt SMP support only when used ARM: realview: select apropriate targets ARM: realview: clean up header files ARM: realview: make all header files local ARM: no longer make CPU targets visible separately ARM: integrator: use explicit core module options ARM: realview: enable multiplatform ARM: make default platform work for NOMMU ARM: debug-ll: move DEBUG_LL_UART_EFM32 to correct Kconfig location ARM: defconfig: use correct debug_ll settings ARM: versatile: convert to multi-platform ARM: versatile: merge mach code into a single file ARM: versatile: switch to DT only booting and remove legacy code ARM: versatile: add DT based PCI detection ARM: pxa: mark ezx structures as __maybe_unused ARM: pxa: mark raumfeld init functions as __maybe_unused ...
2016-01-15mm: differentiate page_mapped() from page_mapcount() for compound pagesKirill A. Shutemov1-1/+1
Let's define page_mapped() to be true for compound pages if any sub-pages of the compound page is mapped (with PMD or PTE). On other hand page_mapcount() return mapcount for this particular small page. This will make cases like page_get_anon_vma() behave correctly once we allow huge pages to be mapped with PTE. Most users outside core-mm should use page_mapcount() instead of page_mapped(). Signed-off-by: Kirill A. Shutemov <[email protected]> Tested-by: Sasha Levin <[email protected]> Tested-by: Aneesh Kumar K.V <[email protected]> Acked-by: Jerome Marchand <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Steve Capper <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-01-15arm, thp: remove infrastructure for handling splitting PMDsKirill A. Shutemov1-15/+0
With new refcounting we don't need to mark PMDs splitting. Let's drop code to handle this. pmdp_splitting_flush() is not needed too: on splitting PMD we will do pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as needed for fast_gup. [[email protected]: fix unterminated ifdef in header file] Signed-off-by: Kirill A. Shutemov <[email protected]> Cc: Sasha Levin <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Jerome Marchand <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Steve Capper <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-01-14arm: mm: support ARCH_MMAP_RND_BITSDaniel Cashman1-2/+1
arm: arch_mmap_rnd() uses a hard-code value of 8 to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep 8 as the minimum acceptable value. [[email protected]: ARM: avoid ARCH_MMAP_RND_BITS for NOMMU] Signed-off-by: Daniel Cashman <[email protected]> Cc: Russell King <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Don Zickus <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Heinrich Schuchardt <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: David Rientjes <[email protected]> Cc: Mark Salyzyn <[email protected]> Cc: Jeff Vander Stoep <[email protected]> Cc: Nick Kralevich <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hector Marco-Gisbert <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-01-12Merge branch 'devel-stable' into for-linusRussell King3-42/+100
2016-01-05Merge branches 'misc' and 'misc-rc6' into for-linusRussell King5-32/+53
2016-01-04ARM: 8494/1: mm: Enable PXN when running non-LPAE kernel on LPAE processorJungseung Lee1-1/+1
The VMSA field of MMFR0 (bottom 4 bits) is incremented for each added feature. PXN is supported if the value is >= 4 and LPAE is supported if it is >= 5. In case a kernel with CONFIG_ARM_LPAE disabled is used on a processor that supports LPAE, we can still use PXN in short descriptors. So check for >= 4 not == 4. Signed-off-by: Jungseung Lee <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-22ARM: 8482/1: l2x0: make it possible to disable outer sync from DTLinus Walleij1-3/+10
According to commit 2503a5ecd86c002506001eba432c524ea009fe7f "ARM: 6201/1: RealView: Do not use outer_sync() on ARM11MPCore boards with L220" Some PB11MPCore RealView core tiles have broken outer_sync. We got rid of the custom barriers from the machine by disabling outer sync, but that was just for the boardfile case. We have to be able to do the same in the device tree case. Since __l2c_init() is cloning and copying the L2C vtable, we pass an argument to this function to optionally numb the outer sync operation if desired, before initializing the cache. After this we can set up the cache correctly on the RealView PB11MPCore. This was tested on a PB11MPCore known to have the issue. Before this, spurious crashes would occur if we try to set up the cache properly, after this it boots rock solid. Cc: Arnd Bergmann <[email protected]> Cc: [email protected] Acked-by: Rob Herring <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-18Merge tag 'realview-multiplatform-tag' of ↵Arnd Bergmann1-15/+15
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/multiplatform Pull "Multiplatform support for the RealView" from Linus Walleij: Here is the result of my application of the second part of Arnds patchset, actually enabling multiplatform and getting the RealView off the ground as a multiplatform target. It is dependent on an outstanding patch to the irqchips tree bumping the number of GICs to 2 for the RealView platform. I cannot say I will be sleepless if these go in side by side: each branch will compile but will not boot until both trees have been pulled hurting bisectability a bit. - Tested on the ARM PB11MPCore - Tested with boardfile boot - Tested with DeviceTree boot * tag 'realview-multiplatform-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator: ARM: realview: select apropriate targets ARM: realview: clean up header files ARM: realview: make all header files local ARM: no longer make CPU targets visible separately ARM: integrator: use explicit core module options ARM: realview: enable multiplatform
2015-12-18ARM: no longer make CPU targets visible separatelyArnd Bergmann1-15/+15
Now that realview and integrator always select the correct CPU type themselves based on the core tiles, there is no need to still have them user-visible in arch/arm/mm/Kconfig. The ARM925T symbol has been selected by the only user for many years, so that can be removed along with the realview and integrator specific ones. This also solves randconfig build problems on realview. Signed-off-by: Arnd Bergmann <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
2015-12-17Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug. These are tagged for -stable. The scatterlist impact is potentially corrupted dma addresses on HIGHMEM enabled platforms. - A minor locking fix for the NFIT hot-add implementation that is new in 4.4-rc. This would only trigger in the case a hot-add raced driver removal. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dma-debug: Fix dma_debug_entry offset calculation Revert "scatterlist: use sg_phys()" nfit: acpi_nfit_notify(): Do not leave device locked
2015-12-17ARM: 8453/2: proc-v7.S: don't locate temporary stack space in .text sectionNicolas Pitre1-7/+16
The proc-v7.S code uses a small temporary stack to preserve register content in its setup code. This stack is located in the .text section which is normally meant to be read-only. Move that temporary stack to the .bss section and get its address in a position independent way, similarly to what we do in other parts of the kernel. While at it, one comments was updated to reflect reality, and the list of saved registers in the proc-v7.S case is updated to match the comment next to it for coherency. Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-16Merge tag 'realview-base-armsoc-1-tag' of ↵Arnd Bergmann1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/multiplatform Merge "Realview multiplatform support" from Linus Walleij: The board and infrastructure changes for RealView multiplatform and extended DT support. * tag 'realview-base-armsoc-1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator: ARM: realview: add an DT SMP boot method ARM: realview: select SP810 and ICST for the DT variant soc: versatile: add support for the PB11MPCore clk: versatile-icst: add device tree support clk: versatile-icst: refactor to allocate regmap separately clk: versatile-icst: convert to use regmap ARM: realview: remove private barrier implementation ARM: no longer force unbuffered DMA for realview clk/realview: stop using machine headers ARM: realview: don't map undefined PCI registers ARM: realview: remove sparsemem hack Conflicts: drivers/clk/versatile/Kconfig
2015-12-15Revert "scatterlist: use sg_phys()"Dan Williams1-1/+1
commit db0fa0cb0157 "scatterlist: use sg_phys()" did replacements of the form: phys_addr_t phys = page_to_phys(sg_page(s)); phys_addr_t phys = sg_phys(s) & PAGE_MASK; However, this breaks platforms where sizeof(phys_addr_t) > sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a combined helper in 4.5. Cc: <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Russell King <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Andrew Morton <[email protected]> Fixes: db0fa0cb0157 ("scatterlist: use sg_phys()") Suggested-by: Joerg Roedel <[email protected]> Reported-by: Vitaly Lavrov <[email protected]> Signed-off-by: Dan Williams <[email protected]>
2015-12-15ARM: 8471/1: need to save/restore arm register(r11) when it is corruptedAnson Huang1-2/+2
In cpu_v7_do_suspend routine, r11 is used while it is NOT saved/restored, different compiler may have different usage of ARM general registers, so it may cause issues during calling cpu_v7_do_suspend. We meet kernel fault occurs when using GCC 4.8.3, r11 contains valid value before calling into cpu_v7_do_suspend, but when returned from this routine, r11 is corrupted and lead to kernel fault. Doing save/restore for those corrupted registers is a must in assemble code. Signed-off-by: Anson Huang <[email protected]> Reviewed-by: Nicolas Pitre <[email protected]> Cc: <[email protected]> # v3.3+ Signed-off-by: Russell King <[email protected]>
2015-12-15ARM: no longer force unbuffered DMA for realviewArnd Bergmann1-2/+0
Commit 42c4dafe803dca ("ARM: 6202/1: Do not ARM_DMA_MEM_BUFFERABLE on RealView boards with L210/L220") changed the generic setting for ARM_DMA_MEM_BUFFERABLE to be disabled on any Realview kernel that includes support for any of the ARM11 variations. Doing this was required to allow doing DMA without a lockup in the l2x0 cache controller on the Realview platform. Unfortunately, in a kernel that also contains support for any ARMv7 based machine, the same change makes it impossible to do DMA on ARMv7, which gets in the way of enabling multiplatform support on Realview. As confirmed by Catalin Marinas and Linus Walleij, the current code for Realview that we have in the kernel does not actually perform any DMA, and this is unlikely to change in the future. Therefore we can revert 42c4dafe803dca without introducing regressions, but we must never start using DMA on this platform in the future. Signed-off-by: Arnd Bergmann <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
2015-12-13ARM: only consider memblocks with NOMAP cleared for linear mappingArd Biesheuvel2-1/+7
Take the new memblock attribute MEMBLOCK_NOMAP into account when deciding whether a certain region is or should be covered by the kernel direct mapping. Tested-by: Ryan Harkin <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2015-12-13ARM: implement create_mapping_late() for EFI useArd Biesheuvel1-0/+20
This implements create_mapping_late(), which we will use to populate the UEFI Runtime Services page tables. Tested-by: Ryan Harkin <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2015-12-13ARM: add support for non-global kernel mappingsArd Biesheuvel1-15/+20
Add support to the kernel translation table population routines for creating non-global mappings. This will be used by the UEFI runtime services, which will use temporary mappings in the userland range. Reviewed-by: Matt Fleming <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2015-12-13ARM: factor out allocation routine from __create_mapping()Ard Biesheuvel1-11/+23
To allow __create_mapping() to be used for populating UEFI Runtime Services page tables, factor out the allocation routine 'early_alloc' and pass it down as a function pointer into alloc_init_[pud|pmd|pte]. This way, new users of __create_mapping() can supply another allocation function. Tested-by: Ryan Harkin <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2015-12-13ARM: split off core mapping logic from create_mappingArd Biesheuvel1-25/+31
In order to be able to reuse the core mapping logic of create_mapping for mapping the UEFI Runtime Services into a private set of page tables, split it off from create_mapping() into a separate function __create_mapping which we will wire up in a subsequent patch. Tested-by: Ryan Harkin <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2015-12-13ARM: add support for generic early_ioremap/early_memremapArd Biesheuvel2-1/+10
This enables the generic early_ioremap implementation for ARM. It uses the fixmap region reserved for kmap. Since early_ioremap is only supported before paging_init(), and kmap is only supported afterwards, this is guaranteed not to cause any clashes. Tested-by: Ryan Harkin <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2015-12-04ARM: 8464/1: Update all mm structures with section adjustmentsLaura Abbott1-30/+62
Currently, when updating section permissions to mark areas RO or NX, the only mm updated is current->mm. This is working off the assumption that there are no additional mm structures at the time. This may not always hold true. (Example: calling modprobe early will trigger a fork/exec). Ensure all mm structres get updated with the new section information. Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Laura Abbott <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-03ARM: 8462/1: cache-uniphier: use common API to find the next level cacheMasahiro Yamada1-12/+1
The function uniphier_cache_get_next_level_node() does the same thing as of_find_next_cache_node(). Drop the former and stick to the common API. Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-02ARM: 8465/1: mm: keep reserved ASIDs in sync with mm after multiple rolloversWill Deacon1-12/+26
Under some unusual context-switching patterns, it is possible to end up with multiple threads from the same mm running concurrently with different ASIDs: 1. CPU x schedules task t with mm p containing ASID a and generation g This task doesn't block and the CPU doesn't context switch. So: * per_cpu(active_asid, x) = {g,a} * p->context.id = {g,a} 2. Some other CPU generates an ASID rollover. The global generation is now (g + 1). CPU x is still running t, with no context switch and so per_cpu(reserved_asid, x) = {g,a} 3. CPU y schedules task t', which shares mm p with t. The generation mismatches, so we take the slowpath and hit the reserved ASID from CPU x. p is then updated so that p->context.id = {g + 1,a} 4. CPU y schedules some other task u, which has an mm != p. 5. Some other CPU generates *another* CPU rollover. The global generation is now (g + 2). CPU x is still running t, with no context switch and so per_cpu(reserved_asid, x) = {g,a}. 6. CPU y once again schedules task t', but now *fails* to hit the reserved ASID from CPU x because of the generation mismatch. This results in a new ASID being allocated, despite the fact that t is still running on CPU x with the same mm. Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised between the two threads. This patch fixes the problem by updating all of the matching reserved ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps the reserved ASIDs in-sync with the mm and avoids the problem. Cc: <[email protected]> Reported-by: Tony Thompson <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-01ARM: mohawk: allow building with MMU disabledArnd Bergmann1-0/+2
It is in principle possible to build an MMP kernel for the mohawk CPU with the MMU code disabled, except for one simple build error: proc-mohawk.S:345: Error: invalid operands (*UND* and *UND* sections) for `|' proc-mohawk.S:345: Error: invalid operands (*ABS* and *UND* sections) for `|' proc-mohawk.S:345: Error: invalid operands (*UND* and *UND* sections) for `|' proc-mohawk.S:345: Error: invalid operands (*UND* and *UND* sections) for `|' proc-mohawk.S:345: Error: undefined symbol L_PTE_USER used as an immediate value This patch changes the proc-mohawk code to do the same as the other CPUs and not try to actually do anything for the cpu_mohawk_set_pte_ext function, which won't be used anyway. Signed-off-by: Arnd Bergmann <[email protected]>
2015-12-01ARM: make xscale iwmmxt code multiplatform awareArnd Bergmann2-2/+2
In a multiplatform configuration, we may end up building a kernel for both Marvell PJ1 and an ARMv4 CPU implementation. In that case, the xscale-cp0 code is built with gcc -march=armv4{,t}, which results in a build error from the coprocessor instructions. Since we know this code will only have to run on an actual xscale processor, we can simply build the entire file for ARMv5TE. Related to this, we need to handle the iWMMXT initialization sequence differently during boot, to ensure we don't try to touch xscale specific registers on other CPUs from the xscale_cp0_init initcall. cpu_is_xscale() used to be hardcoded to '1' in any configuration that enables any XScale-compatible core, but this breaks once we can have a combined kernel with MMP1 and something else. In this patch, I replace the existing cpu_is_xscale() macro with a new cpu_is_xscale_family() macro that evaluates true for xscale, xsc3 and mohawk, which makes the behavior more deterministic. The two existing users of cpu_is_xscale() are modified accordingly, but slightly change behavior for kernels that enable CPU_MOHAWK without also enabling CPU_XSCALE or CPU_XSC3. Previously, these would leave leave PMD_BIT4 in the page tables untouched, now they clear it as we've always done for kernels that enable both MOHAWK and the support for the older CPU types. Since the previous behavior was inconsistent, I assume it was unintentional. Signed-off-by: Arnd Bergmann <[email protected]>
2015-11-26ARM: l2c: tauros2: use descriptive definitions for register bitsRussell King1-5/+10
Use descriptive definitions for the Tauros2 register bits, and while we're here, clean up the "Tauros2: %s line fill burt8." message. Signed-off-by: Russell King <[email protected]>
2015-11-26ARM: l2c: tauros2: fix OF-enabled non-DT bootRussell King1-9/+8
Signed-off-by: Russell King <[email protected]>
2015-11-16ARM: 8451/1: v7-M: Set an early stack for __v7m_setupEzequiel Garcia1-9/+5
On ARM v7-M, when PROCINFO_INITFUNC (__v7m_setup) is called, a stack is needed before calling the supervisor call (SVC), which is used by the supervisor call to save the context. Currently, __v7m_setup() prepares a temporary stack in the .text.init section, which is is broken if the kernel is executing directly from read-only memory. In particular, this is the case for LPC43xx, which allows to execute the kernel in-place from a serial flash through its SPIFI controller. This commit fixes the issue by seting an early stack to its usual location. Also, __v7m_setup() is currently saving and restoring the previous stack. That was bogus, because there's no stack previously set, so this commit removes it. Acked-by: Uwe Kleine-König <[email protected]> Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-11-16ARM: 8448/1: add some L220 DT settingsLinus Walleij1-0/+20
The RealView ARM11MPCore enables parity, eventmon and shared override in the cache controller through its current boardfile, but the code and DT bindings for the ARM L220 is currently lacking the ability to set this up from DT. Add the required bool parameters for parity and shared override, but keep eventmon out of it: this should be enabled by the event monitor code. Cc: [email protected] Acked-by: Rob Herring <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-11-10Merge tag 'armsoc-soc' of ↵Linus Torvalds3-0/+566
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC platform updates from Olof Johansson: "New and/or improved SoC support for this release: Marvell Berlin: - Enable standard DT-based cpufreq - Add CPU hotplug support Freescale: - Ethernet init for i.MX7D - Suspend/resume support for i.MX6UL Allwinner: - Support for R8 chipset (used on NTC's $9 C.H.I.P board) Mediatek: - SMP support for some platforms Uniphier: - L2 support - Cleaned up SMP support, etc. plus a handful of other patches around above functionality, and a few other smaller changes" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (42 commits) ARM: uniphier: rework SMP operations to use trampoline code ARM: uniphier: add outer cache support Documentation: EXYNOS: Update bootloader interface on exynos542x ARM: mvebu: add broken-idle option ARM: orion5x: use mac_pton() helper ARM: at91: pm: at91_pm_suspend_in_sram() must be 8-byte aligned ARM: sunxi: Add R8 support ARM: digicolor: select pinctrl/gpio driver arm: berlin: add CPU hotplug support arm: berlin: use non-self-cleared reset register to reset cpu ARM: mediatek: add smp bringup code ARM: mediatek: enable gpt6 on boot up to make arch timer working soc: mediatek: Fix random hang up issue while kernel init soc: ti: qmss: make acc queue support optional in the driver soc: ti: add firmware file name as part of the driver Documentation: dt: soc: Add description for knav qmss driver ARM: S3C64XX: Use PWM lookup table for mach-smartq ARM: S3C64XX: Use PWM lookup table for mach-hmt ARM: S3C64XX: Use PWM lookup table for mach-crag6410 ARM: S3C64XX: Use PWM lookup table for smdk6410 ...