path: root/arch/mips/include
2019-10-23  MIPS: arc: use function argument for passing argc/argv to prom_init_cmdline  [Thomas Bogendoerfer, 1 file, -8/+1]
prom_argc and prom_argv are only used by prom_init_cmdline(), so pass them directly as function arguments. Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Paul Burton <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected]
2019-10-23  MIPS: arc: remove unused stuff  [Thomas Bogendoerfer, 1 file, -2/+1]
Remove the unused _prom_envp and prom_argc macros. Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Paul Burton <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected]
2019-10-23  MIPS: bmips: mark exception vectors as char arrays  [Jonas Gorski, 1 file, -5/+5]
The vectors span more than one byte, so mark them as arrays. This fixes the following build error when building with GCC 8.3:

  In file included from ./include/linux/string.h:19,
                   from ./include/linux/bitmap.h:9,
                   from ./include/linux/cpumask.h:12,
                   from ./arch/mips/include/asm/processor.h:15,
                   from ./arch/mips/include/asm/thread_info.h:16,
                   from ./include/linux/thread_info.h:38,
                   from ./include/asm-generic/preempt.h:5,
                   from ./arch/mips/include/generated/asm/preempt.h:1,
                   from ./include/linux/preempt.h:81,
                   from ./include/linux/spinlock.h:51,
                   from ./include/linux/mmzone.h:8,
                   from ./include/linux/bootmem.h:8,
                   from arch/mips/bcm63xx/prom.c:10:
  arch/mips/bcm63xx/prom.c: In function 'prom_init':
  ./arch/mips/include/asm/string.h:162:11: error: '__builtin_memcpy' forming offset [2, 32] is out of the bounds [0, 1] of object 'bmips_smp_movevec' with type 'char' [-Werror=array-bounds]
    __ret = __builtin_memcpy((dst), (src), __len); \
  arch/mips/bcm63xx/prom.c:97:3: note: in expansion of macro 'memcpy'
    memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20);
  In file included from arch/mips/bcm63xx/prom.c:14:
  ./arch/mips/include/asm/bmips.h:80:13: note: 'bmips_smp_movevec' declared here
   extern char bmips_smp_movevec;

Fixes: 18a1eef92dcd ("MIPS: BMIPS: Introduce bmips.h") Signed-off-by: Jonas Gorski <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]>
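For reference, the declaration change amounts to turning the single-char extern into an unsized array declaration; a sketch based on the commit description (only bmips_smp_movevec is quoted in the error above, the other vectors follow the same pattern):

  /* arch/mips/include/asm/bmips.h (sketch) */

  /* before: declared as a single char, so memcpy's bounds check trips */
  extern char bmips_smp_movevec;

  /* after: declared as an array spanning the whole vector */
  extern char bmips_smp_movevec[];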
2019-10-23  MIPS: Loongson: Fix GENMASK misuse  [Rikard Falkeborn, 1 file, -1/+1]
Arguments are supposed to be ordered high then low. Fixes: 6a6f9b7dafd50efc1b2 ("MIPS: Loongson: Add CFUCFG&CSR support") Signed-off-by: Rikard Falkeborn <[email protected]> Reviewed-by: Huacai Chen <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected]
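As a reminder of the convention (the field name below is made up purely for illustration), GENMASK() takes the high bit first:

  #include <linux/bits.h>

  /* A hypothetical 8-bit field occupying bits 15..8 of a register: */
  #define EXAMPLE_FIELD_MASK	GENMASK(15, 8)	/* 0x0000ff00 */
  /* GENMASK(8, 15) would be the kind of misuse fixed by this commit. */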
2019-10-18  mips: vdso: Fix __arch_get_hw_counter()  [Vincenzo Frascino, 1 file, -1/+3]
On some MIPS variants (e.g. MIPS r1), the vDSO clock_mode is set to VDSO_CLOCK_NONE. When VDSO_CLOCK_NONE is set, the expected kernel behavior is to fall back on syscalls; to trigger that, the generic vDSO library expects ULLONG_MAX as the return value of __arch_get_hw_counter(). Fix __arch_get_hw_counter() on MIPS by defining a __VDSO_USE_SYSCALL case that handles the described scenario. Reported-by: Maxime Bizon <[email protected]> Signed-off-by: Vincenzo Frascino <[email protected]> Tested-by: Maxime Bizon <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: [email protected]
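A minimal sketch of the idea, not the literal patch (the real MIPS vDSO code also handles the R4K and GIC counter cases, elided here):

  static __always_inline u64 __arch_get_hw_counter(s32 clock_mode)
  {
  	/* ... R4K / GIC counter reads elided ... */

  	/* No usable hardware counter on this variant: return ULLONG_MAX
  	 * (the __VDSO_USE_SYSCALL case) so the generic vDSO code falls
  	 * back to the clock_gettime() syscall. */
  	return ULLONG_MAX;
  }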
2019-10-10  mips: Fix unroll macro when building with Clang  [Nathan Chancellor, 1 file, -1/+2]
Building with Clang fails after commit 6baaeadae911 ("MIPS: Provide unroll() macro, use it for cache ops"), since the GCC_VERSION macro is defined in include/linux/compiler-gcc.h, which is only included by compiler.h when using GCC:

  In file included from arch/mips/kernel/mips-mt.c:20:
  ./arch/mips/include/asm/r4kcache.h:254:1: error: use of undeclared identifier 'GCC_VERSION'; did you mean 'S_VERSION'?
  __BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 32, )
  ./arch/mips/include/asm/r4kcache.h:219:4: note: expanded from macro '__BUILD_BLAST_CACHE'
     cache_unroll(32, kernel_cache, indexop,
  ./arch/mips/include/asm/r4kcache.h:203:2: note: expanded from macro 'cache_unroll'
    unroll(times, _cache_op, insn, op, (addr) + (i++ * (lsize)));
  ./arch/mips/include/asm/unroll.h:28:15: note: expanded from macro 'unroll'
    BUILD_BUG_ON(GCC_VERSION >= 40700 && \

Use CONFIG_GCC_VERSION, which will always be set by Kconfig. Additionally, Clang 8 had improvements around __builtin_constant_p, so use that as the lower limit for this check with Clang (although MIPS wasn't buildable until Clang 9); building a kernel with Clang 9.0.0 has no issues after this change. Fixes: 6baaeadae911 ("MIPS: Provide unroll() macro, use it for cache ops") Link: https://github.com/ClangBuiltLinux/linux/issues/736 Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Nick Desaulniers <[email protected]>
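The check inside unroll() ends up looking roughly like this sketch (the GCC threshold comes from the error output above; encoding Clang 8 as 80000 is an assumption that Clang versions use the same major*10000 scheme):

  #include <linux/build_bug.h>

  /* Inside unroll(): CONFIG_GCC_VERSION and CONFIG_CLANG_VERSION are
   * always defined by Kconfig (zero for the compiler not in use),
   * unlike GCC_VERSION, which only exists when building with GCC. */
  BUILD_BUG_ON((CONFIG_GCC_VERSION >= 40700 ||
  	      CONFIG_CLANG_VERSION >= 80000) &&
  	     !__builtin_constant_p(times));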
2019-10-10  MIPS: elf_hwcap: Export userspace ASEs  [Jiaxun Yang, 1 file, -0/+11]
A Golang developer reported that the MIPS hwcap isn't reflecting instructions the processor actually supports, so programs can't select optimized code paths at runtime. Thus, export the ASEs that can be used by userspace programs. Reported-by: Meng Zhuo <[email protected]> Signed-off-by: Jiaxun Yang <[email protected]> Cc: [email protected] Cc: Paul Burton <[email protected]> Cc: <[email protected]> # 4.14+ Signed-off-by: Paul Burton <[email protected]>
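The user-visible side is a set of HWCAP_* bits in the uapi hwcap header that get set from the detected CPU options; a shortened sketch (the bit positions shown are illustrative, not the exact values):

  /* arch/mips/include/uapi/asm/hwcap.h (excerpt, values illustrative) */
  #define HWCAP_MIPS_MIPS16	(1 << 3)
  #define HWCAP_MIPS_DSP	(1 << 6)

  /* In the CPU probe code: advertise only what was actually detected. */
  if (cpu_has_mips16)
  	elf_hwcap |= HWCAP_MIPS_MIPS16;
  if (cpu_has_dsp)
  	elf_hwcap |= HWCAP_MIPS_DSP;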
2019-10-09  MIPS: SGI-IP22/28: Use PROM for memory detection  [Thomas Bogendoerfer, 2 files, -9/+1]
EARLY_PRINTK uses ArcWrite (via prom_putchar) on IP22/28, which must not mess up the PROM's data structures. The ARC PROM provides a list of memory chunks, indicating which are used and which are free. This fixes early printk, which was not working before. By using XKPHYS spaces, more than 256MB of memory now works on Indigo2 R4k machines, too. Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected]
2019-10-09  MIPS: SGI-IP22: set PHYS_OFFSET to memory start  [Thomas Bogendoerfer, 1 file, -2/+1]
IP22 memory starts at physical address 0x08000000. To avoid wasting memory on page structs for the hole below that address, set PHYS_OFFSET accordingly. Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected]
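The change boils down to a platform spaces.h override along these lines (a sketch; the mach-ip22 header location is assumed, the value comes from the commit text):

  /* arch/mips/include/asm/mach-ip22/spaces.h (sketch) */
  #define PHYS_OFFSET	_AC(0x08000000, UL)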
2019-10-09  MIPS: fw: arc: use call_o32 to call ARC prom from 64bit kernel  [Thomas Bogendoerfer, 1 file, -73/+30]
When using a 64-bit kernel with the generic spaces setup, the stack is also placed in XKPHYS, which the 32-bit PROM can't handle. By using call_o32 for ARC_CALLs, a stack placed in KSEG0 is used when calling the PROM. Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected]
2019-10-09  MIPS: fw: arc: remove unused ARC code  [Thomas Bogendoerfer, 1 file, -12/+0]
The current kernel uses only a few ARC calls. Drop all the unused ARC functions. Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected]
2019-10-09  MIPS: Drop 32-bit asm string functions  [Paul Burton, 1 file, -123/+0]
We have assembly implementations of strcpy(), strncpy(), strcmp() & strncmp() which:

- Are simple byte-at-a-time loops with no particular optimizations. As a comment in the code describes, they're "rather naive".
- Offer no clear performance advantage over the generic C implementations - in microbenchmarks performed by Alexander Lobakin the asm functions sometimes win & sometimes lose, but generally not by large margins in either direction.
- Don't support 64-bit kernels, where we already make use of the generic C implementations.
- Tend to bloat kernel code size due to inlining.
- Don't support CONFIG_FORTIFY_SOURCE.
- Won't support nanoMIPS without rework.

For all of these reasons, delete the asm implementations & make use of the generic C implementations for 32-bit kernels just like we already do for 64-bit kernels. Signed-off-by: Paul Burton <[email protected]> URL: https://lore.kernel.org/linux-mips/[email protected]/ Cc: Alexander Lobakin <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Cc: [email protected]
2019-10-09  MIPS: Provide unroll() macro, use it for cache ops  [Paul Burton, 2 files, -339/+95]
Currently we have a lot of duplication in asm/r4kcache.h to handle manually unrolling loops of cache ops for various line sizes, and we have to explicitly handle the difference in cache op immediate width between MIPSr6 & earlier ISA revisions with further duplication. Introduce an unroll() macro in asm/unroll.h which expands to a switch statement which is used to call a function or expand a preprocessor macro a compile-time constant number of times in a row - effectively explicitly unrolling a loop. We make use of this here to remove the cache op duplication & will use it further in later patches. A nice side effect of this is that calculating the cache op offset immediate is now the compiler's responsibility, so we're no longer sensitive to the width change of that immediate in MIPSr6 & will be similarly agnostic to immediate width in any future supported ISA. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected]
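In outline, unroll() is a switch over the compile-time-constant count with fall-through cases, one call per case; a trimmed sketch (the real macro supports larger counts and adds a BUILD_BUG_ON sanity check on the count):

  /* Expand fn(...) 'times' times in a row, 'times' being a
   * compile-time constant. Falling through from case 'times'
   * down to case 1 yields exactly 'times' expansions. */
  #define unroll(times, fn, ...) do {			\
  	switch (times) {				\
  	case 4: fn(__VA_ARGS__); /* fall through */	\
  	case 3: fn(__VA_ARGS__); /* fall through */	\
  	case 2: fn(__VA_ARGS__); /* fall through */	\
  	case 1: fn(__VA_ARGS__); /* fall through */	\
  	case 0: break;					\
  	default: BUILD_BUG();				\
  	}						\
  } while (0)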
2019-10-09  MIPS: include: Mark __xchg as __always_inline  [Thomas Bogendoerfer, 1 file, -2/+2]
Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") allows the compiler to uninline functions marked as 'inline'. In the case of __xchg this would create a reference to __xchg_called_with_bad_pointer(), which is an error case for catching bugs and will not happen for correct code if __xchg is inlined. Signed-off-by: Thomas Bogendoerfer <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected]
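In practice this is just a change of annotation; a sketch of the shape of the function (size cases elided, exact body as in asm/cmpxchg.h):

  static __always_inline
  unsigned long __xchg(volatile void *ptr, unsigned long x, int size)
  {
  	switch (size) {
  	/* ... 1/2/4/8 byte cases elided ... */
  	default:
  		/* Deliberately unresolved reference: it can only survive
  		 * to link time (and fail there) if this function is not
  		 * inlined with a constant size, i.e. on buggy callers. */
  		return __xchg_called_with_bad_pointer();
  	}
  }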
2019-10-07  MIPS: futex: Restore \n after sync instructions  [Paul Burton, 1 file, -3/+3]
Commit 3c1d3f097972 ("MIPS: futex: Emit Loongson3 sync workarounds within asm") inadvertently removed the newlines following __WEAK_LLSC_MB, which causes build failures for configurations in which __WEAK_LLSC_MB expands to a sync instruction:

  {standard input}: Assembler messages:
  {standard input}:9346: Error: symbol `sync3' is already defined
  {standard input}:9380: Error: symbol `sync3' is already defined
  ...

Fix this by restoring the newlines to separate the sync instruction from anything following it (such as the 3: label), preventing inadvertent concatenation. Signed-off-by: Paul Burton <[email protected]> Fixes: 3c1d3f097972 ("MIPS: futex: Emit Loongson3 sync workarounds within asm")
2019-10-07  MIPS: PCI: use information from 1-wire PROM for IOC3 detection  [Thomas Bogendoerfer, 2 files, -0/+10]
IOC3 chips in SGI systems are connected to a bridge ASIC, which has a 1-wire PROM attached containing part number information. This changeset uses that information to create PCI subsystem information, which the MFD driver uses for further platform device setup. Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: Lee Jones <[email protected]> Cc: David S. Miller <[email protected]> Cc: Srinivas Kandagatla <[email protected]> Cc: Alessandro Zummo <[email protected]> Cc: Alexandre Belloni <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected]
2019-10-07  mips: Kconfig: Add ARCH_HAS_FORTIFY_SOURCE  [Dmitry Korotin, 1 file, -0/+2]
FORTIFY_SOURCE detects various overflows at compile and run time (see 6974f0c4555e ("include/linux/string.h: add the option of fortified string.h functions")). ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and run with CONFIG_FORTIFY_SOURCE. Since mips can be built and run with that flag, select ARCH_HAS_FORTIFY_SOURCE by default. Signed-off-by: Dmitry Korotin <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: [email protected]
2019-10-07  MIPS: Loongson: Add Loongson-3A R4 basic support  [Huacai Chen, 2 files, -7/+25]
All Loongson-3 CPU family:

  Code-name         Brand-name        PRId
  Loongson-3A R1    Loongson-3A1000   0x6305
  Loongson-3A R2    Loongson-3A2000   0x6308
  Loongson-3A R2.1  Loongson-3A2000   0x630c
  Loongson-3A R3    Loongson-3A3000   0x6309
  Loongson-3A R3.1  Loongson-3A3000   0x630d
  Loongson-3A R4    Loongson-3A4000   0xc000
  Loongson-3B R1    Loongson-3B1000   0x6306
  Loongson-3B R2    Loongson-3B1500   0x6307

Features of the R4 revision of Loongson-3A:

- All R2/R3 features, including SFB, V-Cache, FTLB, RIXI, DSP, etc.
- Support for variable ASID bits.
- Support for the MSA and VZ extensions.
- Support for the CPUCFG (CPU config) and CSR (Control and Status Register) extensions.
- 64 entries of VTLB (classic TLB), 2048 entries of FTLB (8-way set-associative).

64-bit Loongson processors now have three types of PRID.IMP: 0x6300 is the classic one, so we call it PRID_IMP_LOONGSON_64C (e.g., Loongson-2E/2F/3A1000/3B1000/3B1500/3A2000/3A3000); 0x6100 is for some processors with reduced capabilities, so we call it PRID_IMP_LOONGSON_64R (e.g., Loongson-2K); 0xc000 is supposed to cover all new processors in general (e.g., Loongson-3A4000+), so we call it PRID_IMP_LOONGSON_64G. Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Fuxin Zhang <[email protected]> Cc: Zhangjin Wu <[email protected]> Cc: Huacai Chen <[email protected]>
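The PRID.IMP values above map onto definitions along these lines (values taken from the table; placing them in asm/cpu.h is an assumption):

  #define PRID_IMP_LOONGSON_64R	0x6100	/* reduced capabilities, e.g. Loongson-2K */
  #define PRID_IMP_LOONGSON_64C	0x6300	/* classic, e.g. 2E/2F/3A1000/3B1500/3A3000 */
  #define PRID_IMP_LOONGSON_64G	0xc000	/* generic, e.g. Loongson-3A4000+ */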
2019-10-07  MIPS: Loongson: Add CFUCFG&CSR support  [Huacai Chen, 1 file, -0/+227]
Loongson-3A R4+ (Loongson-3A4000 and newer) has CPUCFG (CPU config) and CSR (Control and Status Register) extensions. This patch adds read/write functionality for them. Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Paul Burton <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: James Hogan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Fuxin Zhang <[email protected]> Cc: Zhangjin Wu <[email protected]> Cc: Huacai Chen <[email protected]>
2019-10-07  MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3  [Paul Burton, 1 file, -1/+11]
Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already emit a full completion barrier as part of the inline assembly containing LL/SC loops for atomic operations. As such the barrier emitted by __smp_mb__before_atomic() is redundant, and we can remove it. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
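The effect is roughly the following conditional definition (a sketch, not the literal patch; reducing to barrier() keeps a compiler barrier while dropping the redundant sync):

  /* Loongson3 LL/SC-workaround kernels already emit a completion
   * barrier immediately before the LL inside every LL/SC loop, so no
   * extra sync is needed ahead of atomics. */
  #ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS
  # define __smp_mb__before_atomic()	barrier()
  #else
  # define __smp_mb__before_atomic()	__smp_mb__before_llsc()
  #endif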
2019-10-07  MIPS: barrier: Remove loongson_llsc_mb()  [Paul Burton, 1 file, -40/+0]
The loongson_llsc_mb() macro is no longer used - instead barriers are emitted as part of inline asm using the __SYNC() macro. Remove the now-defunct loongson_llsc_mb() macro. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: futex: Emit Loongson3 sync workarounds within asm  [Paul Burton, 2 files, -14/+14]
Generate the sync instructions required to work around Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C, where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: cmpxchg: Omit redundant barriers for Loongson3  [Paul Burton, 1 file, -3/+23]
When building a kernel configured to support Loongson3 LL/SC workarounds (ie. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) the inline assembly in __xchg_asm() & __cmpxchg_asm() already emits completion barriers, and as such we don't need to emit extra barriers from the xchg() or cmpxchg() macros. Add compile-time constant checks causing us to omit the redundant memory barriers. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
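The shape of the change is a compile-time check around the leading barrier in the xchg()/cmpxchg() wrappers, roughly as below (a sketch; __SYNC_loongson3_war is assumed to be the zero/non-zero flag exported by asm/sync.h for this workaround):

  #define xchg(ptr, x)						\
  ({								\
  	__typeof__(*(ptr)) __res;				\
  								\
  	/* __xchg_asm() already begins with a completion	\
  	 * barrier when the Loongson3 workaround is enabled,	\
  	 * so only emit one here when it does not. */		\
  	if (!__SYNC_loongson3_war)				\
  		smp_mb__before_llsc();				\
  								\
  	__res = (__typeof__(*(ptr)))				\
  		__xchg((ptr), (unsigned long)(x), sizeof(*(ptr))); \
  								\
  	smp_llsc_mb();						\
  								\
  	__res;							\
  })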
2019-10-07  MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm  [Paul Burton, 1 file, -7/+6]
Generate the sync instructions required to work around Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C, where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Use smp_mb__before_atomic in test_* ops  [Paul Burton, 1 file, -3/+3]
Use smp_mb__before_atomic() rather than smp_mb__before_llsc() in test_and_set_bit(), test_and_clear_bit() & test_and_change_bit(). The _atomic() versions make semantic sense in these cases, and will allow a later patch to omit redundant barriers for Loongson3 systems that already include a barrier within __test_bit_op(). Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Emit Loongson3 sync workarounds within asm  [Paul Burton, 1 file, -9/+2]
Generate the sync instructions required to work around Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C, where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG  [Paul Burton, 2 files, -16/+12]
Rather than using custom SZLONG_LOG & SZLONG_MASK macros to shift & mask a bit index to form word & bit offsets respectively, make use of the standard BIT_WORD() & BITS_PER_LONG macros for the same purpose. volatile is added to the definition of pointers to the long-sized word we'll operate on, in order to prevent the compiler complaining that we cast away the volatile qualifier of the addr argument. This should have no effect on generated code, which in the LL/SC case is inline asm anyway & in the non-LLSC case access is constrained by compiler barriers provided by raw_local_irq_{save,restore}(). Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
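For example, locating the word and bit to operate on now reads roughly like this (a generic sketch rather than the exact kernel lines; note the volatile qualifier mentioned above):

  /* Word containing bit 'nr', and the bit's position within it. */
  volatile unsigned long *m = (volatile unsigned long *)addr + BIT_WORD(nr);
  int bit = nr % BITS_PER_LONG;

  /* The LL/SC loop (or the IRQ-disabled fallback) then operates on *m
   * using BIT(bit) as the mask. */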
2019-10-07  MIPS: bitops: Abstract LL/SC loops  [Paul Burton, 1 file, -204/+63]
Introduce __bit_op() & __test_bit_op() macros which abstract away the implementation of LL/SC loops. This cuts down on a lot of duplicate boilerplate code, and also allows R10000_LLSC_WAR to be handled outside of the individual bitop functions. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Avoid redundant zero-comparison for non-LLSC  [Paul Burton, 1 file, -6/+12]
The IRQ-disabling non-LLSC fallbacks for bitops on UP systems already return a zero or one, so there's no need to perform another comparison against zero. Move these comparisons into the LLSC paths to avoid the redundant work. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Use the BIT() macro  [Paul Burton, 1 file, -15/+16]
Use the BIT() macro in asm/bitops.h rather than open-coding its equivalent. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit  [Paul Burton, 1 file, -6/+6]
The logical operations or & xor used in the test_and_set_bit_lock(), test_and_clear_bit() & test_and_change_bit() functions currently force the value 1<<bit to be placed in a register. If the bit is compile-time constant & fits within the immediate field of an or/xor instruction (ie. 16 bits) then we can make use of the ori/xori instruction variants & avoid the use of an extra register. Add the extra "i" constraints in order to allow use of these immediate encodings. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant  [Paul Burton, 1 file, -53/+13]
The only difference between test_and_set_bit() & test_and_set_bit_lock() is memory ordering barrier semantics - the former provides a full barrier whilst the latter only provides acquire semantics. We can therefore implement test_and_set_bit() in terms of test_and_set_bit_lock() with the addition of the extra memory barrier. Do this in order to avoid duplicating logic. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
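The resulting definition is essentially the following (a sketch of the pattern described above):

  static inline int test_and_set_bit(unsigned long nr,
  				   volatile unsigned long *addr)
  {
  	/* Upgrade the acquire semantics of the _lock variant to a full
  	 * barrier by emitting the leading barrier ourselves. */
  	smp_mb__before_atomic();
  	return test_and_set_bit_lock(nr, addr);
  }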
2019-10-07  MIPS: bitops: ins start position is always an immediate  [Paul Burton, 1 file, -3/+3]
The start position for an ins instruction is always encoded as an immediate, so allowing registers to be used by the inline asm makes no sense. It should never happen anyway since a bit index should always be small enough to be treated as an immediate, but remove the nonsensical "r" for sanity. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs  [Paul Burton, 1 file, -9/+4]
Rather than #ifdef on CONFIG_CPU_* to determine whether the ins instruction is supported we can simply check MIPS_ISA_REV to discover whether we're targeting MIPSr2 or higher. Do so in order to clean up the code. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Only use ins for bit 16 or higher  [Paul Burton, 1 file, -1/+1]
set_bit() can set bits 0-15 using an ori instruction, rather than loading the value -1 into a register & then using an ins instruction. That is, rather than the following:

  li   t0, -1
  ll   t1, 0(t2)
  ins  t1, t0, 4, 1
  sc   t1, 0(t2)

We can have the simpler:

  ll   t1, 0(t2)
  ori  t1, t1, 0x10
  sc   t1, 0(t2)

The or path already allows immediates to be used, so simply restricting the ins path to bits that don't fit in immediates is sufficient to take advantage of this. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: bitops: Handle !kernel_uses_llsc first  [Paul Burton, 1 file, -108/+105]
Reorder conditions in our various bitops functions that check kernel_uses_llsc such that they handle the !kernel_uses_llsc case first. This allows us to avoid the need to duplicate the kernel_uses_llsc check in all the other cases. For functions that don't involve barriers common to the various implementations, we switch to returning from within each if block making each case easier to read in isolation. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg  [Paul Burton, 1 file, -43/+27]
Remove the remaining duplication between 32b & 64b in asm/atomic.h by making use of an ATOMIC_OPS() macro to generate: - atomic_read()/atomic64_read() - atomic_set()/atomic64_set() - atomic_cmpxchg()/atomic64_cmpxchg() - atomic_xchg()/atomic64_xchg() This is consistent with the way all other functions in asm/atomic.h are generated, and ensures consistency between the 32b & 64b functions. Of note is that this results in the above now being static inline functions rather than macros. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
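A condensed sketch of the generator (only read & set shown here; the macro described above also covers cmpxchg & xchg):

  #define ATOMIC_OPS(pfx, type)					\
  static __always_inline type pfx##_read(const pfx##_t *v)	\
  {								\
  	return READ_ONCE(v->counter);				\
  }								\
  								\
  static __always_inline void pfx##_set(pfx##_t *v, type i)	\
  {								\
  	WRITE_ONCE(v->counter, i);				\
  }

  ATOMIC_OPS(atomic, int)
  #ifdef CONFIG_64BIT
  ATOMIC_OPS(atomic64, s64)
  #endif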
2019-10-07  MIPS: atomic: Unify 32b & 64b sub_if_positive  [Paul Burton, 1 file, -106/+58]
Unify the definitions of atomic_sub_if_positive() & atomic64_sub_if_positive() using a macro like we do for most other atomic functions. This allows us to share the implementation ensuring consistency between the two. Notably this provides the appropriate loongson3_war barriers in the atomic64_sub_if_positive() case which were previously missing. The code is rearranged a little to handle the !kernel_uses_llsc case first in order to de-indent the LL/SC case & allow us not to go over 80 characters per line. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()  [Paul Burton, 1 file, -2/+2]
Use smp_mb__before_atomic() & smp_mb__after_atomic() in atomic_sub_if_positive() rather than the equivalent smp_mb__before_llsc() & smp_llsc_mb(). The former are more standard & this preps us for avoiding redundant duplicate barriers on Loongson3 in a later patch. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: atomic: Emit Loongson3 sync workarounds within asm  [Paul Burton, 1 file, -6/+14]
Generate the sync instructions required to work around Loongson3 LL/SC errata within inline asm blocks, which feels a little safer than doing it from C, where strictly speaking the compiler would be well within its rights to insert a memory access between the separate asm statements we previously had, containing sync & ll instructions respectively. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: atomic: Use one macro to generate 32b & 64b functions  [Paul Burton, 1 file, -151/+45]
Cut down on duplication by generalizing the ATOMIC_OP(), ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros to work for both 32b & 64b atomics, and removing the ATOMIC64_ variants. This ensures consistency between our atomic_* & atomic64_* functions. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: atomic: Handle !kernel_uses_llsc first  [Paul Burton, 1 file, -50/+49]
Handle the !kernel_uses_llsc path first in our ATOMIC_OP(), ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros & return from within the block. This allows us to de-indent the kernel_uses_llsc path by one level which will be useful when making further changes. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: atomic: Fix whitespace in ATOMIC_OP macros  [Paul Burton, 1 file, -92/+92]
We define macros in asm/atomic.h which end each line with space characters before a backslash to continue on the next line. Remove the space characters leaving tabs as the whitespace used for conformity with coding convention. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: barrier: Clean up sync_ginv()  [Paul Burton, 1 file, -1/+1]
Use the new __SYNC() infrastructure to implement sync_ginv(), for consistency with much of the rest of asm/barrier.h. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: barrier: Clean up __sync() definition  [Paul Burton, 1 file, -14/+4]
Implement __sync() using the new __SYNC() infrastructure, which will take care of not emitting an instruction for old R3k CPUs that don't support it. The only behavioral difference is that __sync() will now provide a compiler barrier on these old CPUs, but that seems like reasonable behavior anyway. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery  [Paul Burton, 1 file, -2/+2]
The definition of fast_mb() is the same in both the Octeon & non-Octeon cases, so remove the duplication & define it only once. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: barrier: Clean up __smp_mb() definition  [Paul Burton, 1 file, -8/+4]
We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in both cases. Remove the #ifdef & simply expand to the __sync() macro. Whilst here indent the strong ordering case definitions to match the indentation of the weak ordering ones, helping readability. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: barrier: Clean up rmb() & wmb() definitions  [Paul Burton, 1 file, -14/+14]
Simplify our definitions of rmb() & wmb() using the new __SYNC() infrastructure. The fast_rmb() & fast_wmb() macros are removed, since they only provided a level of indirection that made the code less readable & weren't directly used anywhere in the kernel tree. The Octeon #ifdef'ery is removed, since the "syncw" instruction previously used is merely an alias for "sync 4" which __SYNC() will emit for the wmb sync type when the kernel is configured for an Octeon CPU. Similarly __SYNC() will emit nothing for the rmb sync type in Octeon configurations. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
2019-10-07  MIPS: barrier: Add __SYNC() infrastructure  [Paul Burton, 2 files, -111/+209]
Introduce an asm/sync.h header which provides infrastructure that can be used to generate sync instructions of various types, and for various reasons. For example if we need a sync instruction that provides a full completion barrier but only on systems which have weak memory ordering, we can generate the appropriate assembly code using: __SYNC(full, weak_ordering) When the kernel is configured to run on systems with weak memory ordering (ie. CONFIG_WEAK_ORDERING is selected) we'll emit a sync instruction. When the kernel is configured to run on systems with strong memory ordering (ie. CONFIG_WEAK_ORDERING is not selected) we'll emit nothing. The caller doesn't need to know which happened - it simply says what it needs & when, with no concern for checking the kernel configuration. There are some scenarios in which we may want to emit code only when we *didn't* emit a sync instruction. For example, some Loongson3 CPUs suffer from a bug that requires us to emit a sync instruction prior to each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In cases where this bug workaround is enabled, it's wasteful to then have more generic code emit another sync instruction to provide barriers we need in general. A __SYNC_ELSE() macro allows for this, providing an extra argument that contains code to be assembled only in cases where the sync instruction was not emitted. For example if we have a scenario in which we generally want to emit a release barrier but for affected Loongson3 configurations upgrade that to a full completion barrier, we can do that like so: __SYNC_ELSE(full, loongson3_war, __SYNC(rl, always)) The assembly generated by these macros can be used either as inline assembly or in assembly source files. Differing types of sync as provided by MIPSr6 are defined, but currently they all generate a full completion barrier except in kernels configured for Cavium Octeon systems. There the wmb sync-type is used, and rmb syncs are omitted, as has been the case since commit 6b07d38aaa52 ("MIPS: Octeon: Use optimized memory barrier primitives."). Using __SYNC() with the wmb or rmb types will abstract away the Octeon specific behavior and allow us to later clean up asm/barrier.h code that currently includes a plethora of #ifdef's. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
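As a usage illustration (a sketch, not a quote from the patch, using the sync type and reason named above), the macro drops straight into inline-asm strings:

  /* Emit a completion barrier only on weakly-ordered systems; on
   * strongly-ordered configurations __SYNC() expands to nothing. */
  __asm__ __volatile__(
  	__SYNC(full, weak_ordering)
  	::: "memory");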
2019-10-07  MIPS: Use compact branch for LL/SC loops on MIPSr6+  [Paul Burton, 1 file, -0/+4]
When targeting MIPSr6 or higher make use of a compact branch in LL/SC loops, preventing the insertion of a delay slot nop that only serves to waste space. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Cc: Huacai Chen <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: [email protected]
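The selection can be sketched as a branch-mnemonic macro shared by the LL/SC loops (the macro name is assumed for illustration; beqzc is the MIPSr6 compact branch, which has no delay slot):

  #if MIPS_ISA_REV >= 6
  # define __SC_BEQZ	"beqzc	"	/* compact branch: no delay slot */
  #else
  # define __SC_BEQZ	"beqz	"	/* classic branch: needs a delay slot */
  #endif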