Age  Commit message  (Author; files changed, lines -/+)
2015-08-24  arm64: makefile: fix perf_callchain.o kconfig dependency  (Will Deacon; 1 file, -2/+2)
Commit 4b3dc9679cf7 ("arm64: force CONFIG_SMP=y and remove redundant #ifdefs") incorrectly resolved a conflict on arch/arm64/kernel/Makefile which resulted in a partial revert of 52da443ec4d0 ("arm64: perf: factor out callchain code"), leading to perf_callchain.o depending on CONFIG_HW_PERF_EVENTS instead of CONFIG_PERF_EVENTS. This patch restores the kconfig dependency for perf_callchain.o. Reported-by: Mark Rutland <[email protected]> Signed-off-by: Will Deacon <[email protected]>
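For illustration, a minimal sketch of the restored rule in arch/arm64/kernel/Makefile (surrounding rules elided, exact variable layout assumed):

    arm64-obj-$(CONFIG_PERF_EVENTS)    += perf_callchain.o
    arm64-obj-$(CONFIG_HW_PERF_EVENTS) += perf_event.o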
2015-08-24  arm64: set MAX_MEMBLOCK_ADDR according to linear region size  (Ard Biesheuvel; 1 file, -0/+8)
The linear region size of a 39-bit VA kernel is only 256 GB, which may be insufficient to cover all of system RAM, even on platforms that have much less than 256 GB of memory but whose memory is laid out very sparsely. So make sure we clip the memory we will not be able to map before installing it into the memblock memory table, by setting MAX_MEMBLOCK_ADDR accordingly. Reviewed-by: Catalin Marinas <[email protected]> Tested-by: Stuart Yoder <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Will Deacon <[email protected]>
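A rough sketch of the idea, assuming the linear region spans half of the kernel VA space starting at PHYS_OFFSET (the exact expression used in arch/arm64/include/asm/memory.h may differ):

    /* Illustrative: highest physical address the linear mapping can cover. */
    #define MAX_MEMBLOCK_ADDR	(PHYS_OFFSET + (1UL << (VA_BITS - 1)) - 1)

For a 39-bit VA kernel this limits memblock to PHYS_OFFSET plus 256 GB, matching the linear region size quoted above.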
2015-08-24  of/fdt: make memblock maximum physical address arch configurable  (Ard Biesheuvel; 1 file, -5/+7)
When parsing the memory nodes to populate the memblock memory table, we check against high and low limits and clip any memory that exceeds either one of them. However, for arm64, the high limit of (phys_addr_t)~0 is not very meaningful, since phys_addr_t is 64 bits (i.e., no limit) but there may be other constraints that limit the memory ranges that we can support. So rename MAX_PHYS_ADDR to MAX_MEMBLOCK_ADDR (for clarity) and only define it if the arch does not supply a definition of its own. Acked-by: Rob Herring <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Tested-by: Stuart Yoder <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Will Deacon <[email protected]>
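The resulting pattern in drivers/of/fdt.c looks roughly like this (sketch):

    /* Architectures may override this to reflect their mapping constraints. */
    #ifndef MAX_MEMBLOCK_ADDR
    #define MAX_MEMBLOCK_ADDR	((phys_addr_t)~0)
    #endif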
2015-08-24  arm64: Fix source code file path in comments  (Alexander Kuleshov; 1 file, -1/+1)
Architecture-specific code for i386 and x86_64 was unified and merged into arch/x86. This patch fixes the old x86 path in a comment in arch/arm64/include/asm/fixmap.h. Signed-off-by: Alexander Kuleshov <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-08-21  arm64: entry: always restore x0 from the stack on syscall return  (Will Deacon; 1 file, -11/+6)
We have a micro-optimisation on the fast syscall return path where we take care to keep x0 live with the return value from the syscall so that we can avoid restoring it from the stack. The benefit of doing this is fairly suspect, since we will be restoring x1 from the stack anyway (which lives adjacent in the pt_regs structure) and the only additional cost is saving x0 back to pt_regs after the syscall handler, which could be seen as a poor man's prefetch. More importantly, this causes issues with the context tracking code. The ct_user_enter macro ends up branching into C code, which is free to use x0 as a scratch register and consequently leads to us returning junk back to userspace as the syscall return value. Rather than special case the context-tracking code, this patch removes the questionable optimisation entirely. Cc: <[email protected]> Cc: Larry Bassel <[email protected]> Cc: Kevin Hilman <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reported-by: Hanjun Guo <[email protected]> Tested-by: Hanjun Guo <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-08-20  arm64: mdscr_el1: avoid exposing DCC to userspace  (Will Deacon; 1 file, -1/+2)
We don't want to expose the DCC to userspace, particularly as there is a kernel console driver for it. This patch resets mdscr_el1 to disable userspace access to the DCC registers on the cold boot path. Signed-off-by: Will Deacon <[email protected]>
2015-08-19  arm64: kconfig: Move LIST_POISON to a safe value  (Jeff Vander Stoep; 1 file, -0/+4)
Move the poison pointer offset to 0xdead000000000000, a recognized value that is not mappable by user-space exploits. Cc: <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Thierry Strudel <[email protected]> Signed-off-by: Jeff Vander Stoep <[email protected]> Signed-off-by: Will Deacon <[email protected]>
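The change is of this general shape (a sketch, assuming the generic ILLEGAL_POINTER_VALUE hook consumed by include/linux/poison.h):

    config ILLEGAL_POINTER_VALUE
            hex
            default 0xdead000000000000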
2015-08-12  arm64: Add __exception_irq_entry definition for function graph  (Jungseok Lee; 2 files, -2/+27)
gic_handle_irq() is defined with the __exception_irq_entry attribute. The only remaining work is to add its definition, as ARM did. Below shows how the function graph data changes with these hunks. A prologue of an interrupt handler is drawn as follows.

- current status
 0)   0.208 us    |  cpuidle_not_available();
 0)               |  default_idle_call() {
 0)               |    arch_cpu_idle() {
 0)               |      __handle_domain_irq() {
 0)               |        irq_enter() {
 0)   0.313 us    |          rcu_irq_enter();
 0)   0.261 us    |          __local_bh_disable_ip();

- with this change
 0)   0.625 us    |  cpuidle_not_available();
 0)               |  default_idle_call() {
 0)               |    arch_cpu_idle() {
 0)   ==========> |
 0)               |      gic_handle_irq() {
 0)               |        __handle_domain_irq() {
 0)               |          irq_enter() {
 0)   0.885 us    |            rcu_irq_enter();
 0)   0.781 us    |            __local_bh_disable_ip();

An epilogue of an interrupt handler is recorded as follows.

- current status
 0)   0.261 us    |          idle_cpu();
 0)               |          rcu_irq_exit() {
 0)   0.521 us    |            rcu_eqs_enter_common.isra.46();
 0)   2.552 us    |          }
 0) ! 322.448 us  |        }
 0) ! 583.437 us  |      }
 0) # 1656.041 us |    }
 0) # 1658.073 us |  }

- with this change
 0)   0.677 us    |          idle_cpu();
 0)               |          rcu_irq_exit() {
 0)   1.770 us    |            rcu_eqs_enter_common.isra.46();
 0)   7.968 us    |          }
 0) # 1803.541 us |        }
 0) # 2626.667 us |      }
 0) # 2632.969 us |    }
 0)   <========== |
 0) # 14425.00 us |  }
 0) # 14430.98 us |  }

Cc: AKASHI Takahiro <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Rabin Vincent <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Jungseok Lee <[email protected]> Signed-off-by: Will Deacon <[email protected]>
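For reference, the ARM definition being mirrored is roughly (sketch of the asm/exception.h hunk):

    #ifdef CONFIG_FUNCTION_GRAPH_TRACER
    #define __exception_irq_entry	__irq_entry
    #else
    #define __exception_irq_entry	__exception
    #endif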
2015-08-05  Merge branch 'aarch64/psci/drivers' into aarch64/for-next/core  (Will Deacon; 10 files, -390/+453)
Move our PSCI implementation out into drivers/firmware/ where it can be shared with arch/arm/. Conflicts: arch/arm64/kernel/psci.c
2015-08-05  arm64: mm: ensure patched kernel text is fetched from PoU  (Will Deacon; 3 files, -1/+16)
The arm64 booting document requires that the bootloader has cleaned the kernel image to the PoC. However, when a CPU re-enters the kernel due to either a CPU hotplug "on" event or resuming from a low-power state (e.g. cpuidle), the kernel text may in-fact be dirty at the PoU due to things like alternative patching or even module loading. Thanks to I-cache speculation with the MMU off, stale instructions could be fetched prior to enabling the MMU, potentially leading to crashes when executing regions of code that have been modified at runtime. This patch addresses the issue by ensuring that the local I-cache is invalidated immediately after a CPU has enabled its MMU but before jumping out of the identity mapping. Any stale instructions fetched from the PoC will then be discarded and refetched correctly from the PoU. Patching kernel text executed prior to the MMU being enabled is prohibited, so the early entry code will always be clean. Reviewed-by: Mark Rutland <[email protected]> Tested-by: Mark Rutland <[email protected]> Signed-off-by: Will Deacon <[email protected]>
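Conceptually, the sequence executed right after the MMU comes up looks something like this (illustrative, not the exact head.S hunk):

    	ic	iallu			// invalidate the local I-cache to the PoU
    	dsb	nsh			// wait for the invalidation to complete
    	isb				// then refetch instructions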
2015-08-04  arm64: alternatives: ensure secondary CPUs execute ISB after patching  (Will Deacon; 1 file, -0/+1)
In order to guarantee that the patched instruction stream is visible to a CPU, that CPU must execute an isb instruction after any related cache maintenance has completed. The instruction patching routines in kernel/insn.c get this right for things like jump labels and ftrace, but the alternatives patching omits it entirely, leaving secondary cores in a potential limbo between the old and the new code. This patch adds an isb following the secondary polling loop in the alternatives patching. Signed-off-by: Will Deacon <[email protected]>
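In pseudocode, the secondary side of the patching now ends as follows (sketch; the flag name is illustrative):

    	/* secondary CPUs: wait for the boot CPU to finish patching */
    	while (!READ_ONCE(patching_done))
    		cpu_relax();
    	isb();	/* discard any stale instructions fetched before the patch */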
2015-08-04  arm64: make ll/sc __cmpxchg_case_##name asm consistent  (Mark Rutland; 1 file, -1/+1)
The ll/sc __cmpxchg_case_##name assembly mostly uses symbolic names for operands, but in a single case uses %2 to refer to what is otherwise known as %[v]. This makes the code more painful to read than is necessary. Use %[v] instead. Signed-off-by: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-08-03  arm64: dma-mapping: Simplify pgprot handling  (Robin Murphy; 1 file, -3/+2)
Since __get_dma_pgprot() does The Right Thing(TM) in the non-coherent case, and the non-cacheable alias for DMA buffers is private to the kernel anyway, we can simplify things slightly and make the code more readable by just using PAGE_KERNEL as the base pgprot. Suggested-by: Catalin Marinas <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-08-03  MAINTAINERS: add PSCI entry  (Mark Rutland; 1 file, -0/+9)
Add myself and Lorenzo as maintainers of the PSCI client code. Signed-off-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-08-03  drivers: psci: support native SMC{32,64} calls  (Mark Rutland; 1 file, -6/+19)
A 32-bit OS cannot make calls with SMC64 IDs, while a 64-bit OS must invoke some PSCI functions with SMC64 IDs. This patch introduces and makes use of a new macro to choose the appropriate IDs based on the register width of the OS, which will allow 32-bit callers to use the PSCI client code. Signed-off-by: Mark Rutland <[email protected]> Tested-by: Hanjun Guo <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Signed-off-by: Will Deacon <[email protected]>
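The selection macro is along these lines (sketch; the exact naming in the PSCI driver is assumed):

    #ifdef CONFIG_64BIT
    #define PSCI_FN_NATIVE(version, name)	PSCI_##version##_FN64_##name
    #else
    #define PSCI_FN_NATIVE(version, name)	PSCI_##version##_FN_##name
    #endif

so that, for example, PSCI_FN_NATIVE(0_2, CPU_ON) resolves to the SMC64 CPU_ON ID on a 64-bit OS and to the SMC32 ID on a 32-bit one.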
2015-08-03  arm64: psci: factor invocation code to drivers  (Mark Rutland; 9 files, -390/+431)
To enable sharing with arm, move the core PSCI framework code to drivers/firmware. This results in a minor gain in lines of code, but this will quickly be amortised by the removal of code currently duplicated in arch/arm. Signed-off-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Reviewed-by: Hanjun Guo <[email protected]> Tested-by: Hanjun Guo <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-31  arm64: restore cpu suspend/resume functionality  (Sudeep Holla; 1 file, -1/+0)
Commit 4b3dc9679cf7 ("arm64: force CONFIG_SMP=y and remove redundant #ifdefs") accidentally retained code for !CONFIG_SMP in the cpu_resume function. This resulted in the hash index being zeroed in x7 after proper computation, which is then used to get the cpu context pointer while resuming. This patch removes the remnant code and restores the cpu suspend/resume functionality. Fixes: 4b3dc9679cf7 ("arm64: force CONFIG_SMP=y and remove redundant #ifdefs") Signed-off-by: Sudeep Holla <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-30  ARM64: PCI: do not enable resources on PROBE_ONLY systems  (Lorenzo Pieralisi; 1 file, -0/+13)
On ARM64 PROBE_ONLY PCI systems resources are not currently claimed, therefore they can't be enabled since they do not have a valid parent pointer; this in turn prevents enabling PCI devices on ARM64 PROBE_ONLY systems, causing PCI device initialization to fail. To solve this issue, resources must be claimed when devices are added on PROBE_ONLY systems, which ensures that the resource hierarchy is validated and the resource tree is sane, but this requires changes in the ARM64 resource management that can adversely affect existing PCI set-ups (claiming resources on !PROBE_ONLY systems might break existing ARM64 PCI platform implementations). As a temporary solution in preparation for a proper resource claiming implementation in the ARM64 core, to enable PCI PROBE_ONLY systems on ARM64, this patch adds a pcibios_enable_device() arch implementation that simply prevents enabling resources on PROBE_ONLY systems (mirroring ARM behaviour). This is always a safe thing to do because on PROBE_ONLY systems the configuration space set-up can be considered immutable, and it is in preparation of proper resource claiming that would finally validate the PCI resource tree in the ARM64 arch implementation on PROBE_ONLY systems. For !PROBE_ONLY systems resource enablement in pcibios_enable_device() on ARM64 is implemented as in the current PCI core, leaving the behaviour unchanged. Signed-off-by: Lorenzo Pieralisi <[email protected]> Cc: Will Deacon <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
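Mirroring the ARM behaviour, the arch hook is essentially (sketch):

    int pcibios_enable_device(struct pci_dev *dev, int mask)
    {
    	if (pci_has_flag(PCI_PROBE_ONLY))
    		return 0;	/* config space is immutable; nothing to enable */

    	return pci_enable_resources(dev, mask);
    }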
2015-07-30  arm64: cmpxchg: truncate sub-word signed types before comparison  (Will Deacon; 1 file, -4/+4)
When performing a cmpxchg operation on a signed sub-word type (e.g. s8), we need to ensure that the upper register bits of the "old" value used for comparison are zeroed, otherwise we may erroneously fail the cmpxchg, which may even be interpreted as success by the caller (if the compiler performs the truncation as part of its check). This has been observed in mod_state, where negative values were causing problems with this_cpu_cmpxchg. This patch fixes the issue by explicitly casting 8-bit and 16-bit "old" values using unsigned types in our cmpxchg wrappers. 32-bit types can be left alone, since the underlying asm makes use of W registers in this case. Reported-by: Mark Rutland <[email protected]> Signed-off-by: Will Deacon <[email protected]>
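A small user-space illustration (hypothetical, outside the kernel) of why the upper bits matter when the "old" value is a signed sub-word type:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
    	int8_t old = -1;				/* s8 value handed to cmpxchg */
    	uint64_t reg_old = (uint64_t)(int64_t)old;	/* sign-extended: 0xffff...ff */
    	uint64_t mem_val = 0xff;			/* what an 8-bit load returns */

    	printf("compare without cast: %d\n", reg_old == mem_val);		/* 0: spurious failure */
    	printf("compare with (uint8_t) cast: %d\n",
    	       (uint64_t)(uint8_t)old == mem_val);				/* 1: correct */
    	return 0;
    }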
2015-07-30  arm64: alternative: put secondary CPUs into polling loop during patch  (Will Deacon; 2 files, -6/+26)
When patching the kernel text with alternatives, we may end up patching parts of the stop_machine state machine (e.g. atomic_dec_and_test in ack_state) and consequently corrupt the instruction stream of any secondary CPUs. This patch passes the cpu_online_mask to stop_machine, forcing all of the CPUs into our own callback which can place the secondary cores into a dumb (but safe!) polling loop whilst the patching is carried out. Signed-off-by: Will Deacon <[email protected]>
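Schematically (names are illustrative, not the exact kernel code), the patching is now driven like this:

    static int __apply_alternatives_multi_stop(void *unused)
    {
    	static int patching_done;	/* illustrative completion flag */

    	if (smp_processor_id()) {
    		/* secondaries spin here, safely outside the code being patched */
    		while (!READ_ONCE(patching_done))
    			cpu_relax();
    		isb();
    	} else {
    		do_the_patching();	/* hypothetical helper applying the alternatives */
    		WRITE_ONCE(patching_done, 1);
    	}
    	return 0;
    }

    	/* run the callback on every online CPU, not just the boot CPU */
    	stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);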
2015-07-29  arm64/Documentation: clarify wording regarding memory below the Image  (Ard Biesheuvel; 1 file, -4/+7)
Clarify that the memory below the start of the image but inside the region covered by the linear mapping has no special significance to the kernel, and may be used by the firmware provided that it is marked as reserved. Also, fix up some whitespace errors. Acked-by: Mark Rutland <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-29  arm64: lse: fix lse cmpxchg code indentation  (Will Deacon; 1 file, -3/+3)
For some reason, the ll/sc cmpxchg asm is all off to the left and awkward to read in conjunction with the following (correctly indented) LSE version. This patch shifts the ll/sc code back to where it should be. Signed-off-by: Will Deacon <[email protected]>
2015-07-29  arm64: remove redundant object file list  (Jonas Rabenstein; 1 file, -1/+0)
Commit 4b3dc9679cf7 ("arm64: force CONFIG_SMP=y and remove redundant #ifdefs") forces SMP on arm64. To build the necessary objects for SMP, they were added to the arm64-obj-y rule in arch/arm64/kernel/Makefile, without removing the arm64-obj-$(CONFIG_SMP) rule. Remove redundant object file list depending on always-yes CONFIG_SMP in arch/arm64/kernel/Makefile. Signed-off-by: Jonas Rabenstein <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-29  arm64: remove dead-code depending on CONFIG_UP_LATE_INIT  (Jonas Rabenstein; 3 files, -28/+14)
Commit 4b3dc9679cf7 ("arm64: force CONFIG_SMP=y and remove redundant #ifdefs") forces SMP on arm64, so CONFIG_UP_LATE_INIT cannot be selected anymore. Remove the dead #ifdef block depending on UP_LATE_INIT in arch/arm64/kernel/setup.c. Signed-off-by: Jonas Rabenstein <[email protected]> [will: kill do_post_cpus_up_work altogether] Signed-off-by: Will Deacon <[email protected]>
2015-07-28  arm64: pgtable: fix definition of pte_valid  (Will Deacon; 1 file, -1/+1)
pte_valid should check if the PTE_VALID bit (1 << 0) is set in the pte, so fix the macro definition to use bitwise & instead of logical &&. Signed-off-by: Will Deacon <[email protected]>
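The corrected macro tests the bit rather than the truthiness of the whole value:

    #define pte_valid(pte)		(!!(pte_val(pte) & PTE_VALID))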
2015-07-28  arm64: spinlock: fix ll/sc unlock on big-endian systems  (Will Deacon; 1 file, -1/+1)
When unlocking a spinlock, we perform a read-modify-write on the owner ticket in order to increment it and store it back with release semantics. In the LL/SC case, we load the 16-bit ticket using a 32-bit load and therefore store back the wrong halfword on a big-endian system, corrupting the lock after the first unlock and killing the system dead. This patch fixes the unlock code to use 16-bit accessors consistently. Signed-off-by: Will Deacon <[email protected]>
2015-07-28  arm64: Use last level TLBI for user pte changes  (Catalin Marinas; 2 files, -6/+22)
The flush_tlb_page() function is used on user address ranges when PTEs (or PMDs/PUDs for huge pages) were changed (attributes or clearing). For such cases, it is more efficient to invalidate only the last level of the TLB with the "tlbi vale1is" instruction. In the TLB shoot-down case, the TLB caching of the intermediate page table levels (pmd, pud, pgd) is handled by __flush_tlb_pgtable() via the __(pte|pmd|pud)_free_tlb() functions and it is not deferred to tlb_finish_mmu() (as of commit 285994a62c80 - "arm64: Invalidate the TLB corresponding to intermediate page table levels"). The tlb_flush() function only needs to invalidate the TLB for the last level of page tables; the __flush_tlb_range() function gains a fourth argument for last level TLBI. Signed-off-by: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-28  arm64: Clean up __flush_tlb(_kernel)_range functions  (Catalin Marinas; 1 file, -26/+21)
This patch moves the MAX_TLB_RANGE check into the flush_tlb(_kernel)_range functions directly to avoid the underscore-prefixed definitions (and for consistency with a subsequent patch). Signed-off-by: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-28  arm64: mm: mark create_mapping as __init  (Mark Rutland; 1 file, -1/+1)
Currently create_mapping is marked with __ref, apparently because it refers to early_alloc. However, create_mapping has no logic to prevent erroneous use of early_alloc after it has been freed, and is only ever called by __init functions anyway. Thus the __ref marker is misleading and unnecessary. Instead, this patch marks create_mapping as __init, resulting in warnings if it is used from a non-__init function, and allowing its memory to be reclaimed. Signed-off-by: Mark Rutland <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: debug: rename enum debug_el to avoid symbol collision  (Will Deacon; 3 files, -8/+8)
lib/list_sort.c defines a 'struct debug_el', where "el" is assumedly a contraction of "element". This conflicts with 'enum debug_el' in our asm/debug-monitors.h header file, where "el" stands for Exception Level. The result is a build failure when targeting allmodconfig, so rename our enum to 'dbg_active_el' to be slightly more explicit about what it is. Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: mm: add __init section marker to free_initrd_mem  (Wang Long; 1 file, -2/+2)
free_initrd_mem() is not needed after booting, so this patch moves it to the __init section. This patch also makes keep_initrd __initdata, to reduce kernel size. Signed-off-by: Wang Long <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: elf: use cpuid_feature_extract_field for hwcap detection  (Will Deacon; 1 file, -20/+15)
cpuid_feature_extract_field takes care of the fiddly ID register field sign-extension, so use that instead of rolling our own version. Signed-off-by: Will Deacon <[email protected]>
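For reference, the helper performs the signed 4-bit field extraction roughly as follows (sketch):

    static inline int __attribute_const__
    cpuid_feature_extract_field(u64 features, int field)
    {
    	/* shift the field to the top, then arithmetic-shift back down so that
    	 * "not implemented" (negative) values sign-extend correctly */
    	return (s64)(features << (64 - 4 - field)) >> (64 - 4);
    }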
2015-07-27  arm64: lse: use generic cpufeature detection for LSE atomics  (Will Deacon; 2 files, -20/+21)
Rework the cpufeature detection to support ISAR0 and use that for detecting the presence of LSE atomics. Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: kconfig: group the v8.1 features together  (Will Deacon; 1 file, -43/+47)
ARMv8 CPUs do not support any of the v8.1 features, so group them together in Kconfig to make it clear that they're part of 8.1 and not relevant to older cores. Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: lse: rename ARM64_CPU_FEAT_LSE_ATOMICS for consistency  (Will Deacon; 3 files, -4/+4)
Other CPU features follow an 'ARM64_HAS_*' naming scheme, so do the same for the LSE atomics. Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: kconfig: select HAVE_CMPXCHG_LOCAL  (Will Deacon; 1 file, -0/+1)
We implement an optimised cmpxchg_local macro, so let the kernel know. Reviewed-by: Steve Capper <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: atomic64_dec_if_positive: fix incorrect branch condition  (Will Deacon; 2 files, -2/+2)
If we attempt to atomic64_dec_if_positive on INT_MIN, we will underflow and incorrectly decide that the original parameter was positive. This patch fixes the broken condition code so that we handle this corner case correctly. Reviewed-by: Steve Capper <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: atomics: implement atomic{,64}_cmpxchg using cmpxchg  (Will Deacon; 3 files, -89/+2)
We don't need duplicate cmpxchg implementations, so use cmpxchg to implement atomic{,64}_cmpxchg, like we do for xchg already. Reviewed-by: Steve Capper <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: atomics: prefetch the destination word for write prior to stxr  (Will Deacon; 4 files, -0/+21)
The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch makes use of prfm to prefetch cachelines for write prior to ldxr/stxr loops when using the ll/sc atomic routines. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
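The shape of the resulting ll/sc loop, sketched for an atomic add (register allocation is illustrative):

    	prfm	pstl1strm, [x2]		// prefetch for store before the exclusive pair
    1:	ldxr	w0, [x2]
    	add	w0, w0, w1
    	stxr	w3, w0, [x2]
    	cbnz	w3, 1b			// retry if the exclusive store failed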
2015-07-27  arm64: atomics: tidy up common atomic{,64}_* macros  (Will Deacon; 1 file, -59/+40)
The common (i.e. identical for ll/sc and lse) atomic macros in atomic.h are needlessly different for atomic_t and atomic64_t. This patch tidies up the definitions to make them consistent across the two atomic types and factors out common code such as the add_unless implementation based on cmpxchg. Reviewed-by: Steve Capper <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: cmpxchg: avoid memory barrier on comparison failure  (Will Deacon; 1 file, -26/+22)
cmpxchg doesn't require memory barrier semantics when the value comparison fails, so make the barrier conditional on success. Reviewed-by: Steve Capper <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27arm64: cmpxchg: avoid "cc" clobber in ll/sc routinesWill Deacon2-10/+8
We can perform the cmpxchg comparison using eor and cbnz which avoids the "cc" clobber for the ll/sc case and consequently for the LSE case where we may have to fall-back on the ll/sc code at runtime. Reviewed-by: Steve Capper <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
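The flag-free comparison looks roughly like this (sketch; register allocation is illustrative):

    1:	ldxr	x0, [x2]		// load current value
    	eor	x3, x0, x4		// x3 == 0 iff current == expected
    	cbnz	x3, 2f			// mismatch: bail out without touching the flags
    	stlxr	w5, x6, [x2]		// try to store the new value
    	cbnz	w5, 1b			// retry if the exclusive store failed
    2: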
2015-07-27  arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU  (Will Deacon; 3 files, -51/+94)
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our cmpxchg_double primitives so that the LSE casp instruction is used instead. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: cmpxchg: patch in lse instructions when supported by the CPU  (Will Deacon; 4 files, -66/+98)
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our cmpxchg primitives so that the LSE cas instruction is used instead. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: xchg: patch in lse instructions when supported by the CPU  (Will Deacon; 1 file, -5/+33)
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our xchg primitives so that the LSE swp instruction (yes, you read right!) is used instead. Reviewed-by: Steve Capper <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: bitops: patch in lse instructions when supported by the CPU  (Will Deacon; 2 files, -21/+45)
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our bitops functions so that LSE atomic instructions are used instead. Reviewed-by: Steve Capper <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: locks: patch in lse instructions when supported by the CPU  (Will Deacon; 1 file, -29/+108)
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of our locking functions so that LSE atomic instructions are used for spinlocks and rwlocks. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: atomics: patch in lse instructions when supported by the CPU  (Will Deacon; 6 files, -124/+342)
On CPUs which support the LSE atomic instructions introduced in ARMv8.1, it makes sense to use them in preference to ll/sc sequences. This patch introduces runtime patching of atomic_t and atomic64_t routines so that the call-site for the out-of-line ll/sc sequences is patched with an LSE atomic instruction when we detect that the CPU supports it. If binutils is not recent enough to assemble the LSE instructions, then the ll/sc sequences are inlined as though CONFIG_ARM64_LSE_ATOMICS=n. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
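Conceptually, each call-site is rewritten with the alternatives framework, something like the following sketch (macro and symbol names here are illustrative; the real header differs):

    static inline void atomic_add(int i, atomic_t *v)
    {
    	register int w0 asm ("w0") = i;
    	register atomic_t *x1 asm ("x1") = v;

    	asm volatile(ALTERNATIVE(
    		/* default: branch to the out-of-line ll/sc implementation */
    		"bl	__ll_sc_atomic_add",
    		/* patched in when the CPU has LSE atomics */
    		"stadd	%w[i], %[v]",
    		ARM64_HAS_LSE_ATOMICS)
    	: [i] "+r" (w0), [v] "+Q" (v->counter)
    	: "r" (x1)
    	: "x30");	/* bl clobbers the link register */
    }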
2015-07-27  arm64: introduce CONFIG_ARM64_LSE_ATOMICS as fallback to ll/sc atomics  (Will Deacon; 6 files, -2/+224)
In order to patch in the new atomic instructions at runtime, we need to generate wrappers around the out-of-line exclusive load/store atomics. This patch adds a new Kconfig option, CONFIG_ARM64_LSE_ATOMICS, which causes our atomic functions to branch to the out-of-line ll/sc implementations. To avoid the register spill overhead of the PCS, the out-of-line functions are compiled with specific compiler flags to force out-of-line save/restore of any registers that are usually caller-saved. Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2015-07-27  arm64: alternatives: add cpu feature for lse atomics  (Will Deacon; 2 files, -1/+3)
Add a CPU feature for the LSE atomic instructions, so that they can be patched in at runtime when we detect that they are supported. Reviewed-by: Steve Capper <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]>