aboutsummaryrefslogtreecommitdiff
path: root/arch/riscv/lib/Makefile
AgeCommit message (Collapse)AuthorFilesLines
2024-09-19riscv: Omit optimized string routines when using KASANSamuel Holland1-0/+2
The optimized string routines are implemented in assembly, so they are not instrumented for use with KASAN. Fall back to the C version of the routines in order to improve KASAN coverage. This fixes the kasan_strings() unit test. Signed-off-by: Samuel Holland <[email protected]> Reviewed-by: Alexandre Ghiti <[email protected]> Tested-by: Alexandre Ghiti <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2024-07-10riscv: Optimize crc32 with Zbc extensionXiao Wang1-0/+1
As suggested by the B-ext spec, the Zbc (carry-less multiplication) instructions can be used to accelerate CRC calculations. Currently, the crc32 is the most widely used crc function inside kernel, so this patch focuses on the optimization of just the crc32 APIs. Compared with the current table-lookup based optimization, Zbc based optimization can also achieve large stride during CRC calculation loop, meantime, it avoids the memory access latency of the table-lookup based implementation and it reduces memory footprint. If Zbc feature is not supported in a runtime environment, then the table-lookup based implementation would serve as fallback via alternative mechanism. By inspecting the vmlinux built by gcc v12.2.0 with default optimization level (-O2), we can see below instruction count change for each 8-byte stride in the CRC32 loop: rv64: crc32_be (54->31), crc32_le (54->13), __crc32c_le (54->13) rv32: crc32_be (50->32), crc32_le (50->16), __crc32c_le (50->16) The compile target CPU is little endian, extra effort is needed for byte swapping for the crc32_be API, thus, the instruction count change is not as significant as that in the *_le cases. This patch is tested on QEMU VM with the kernel CRC32 selftest for both rv64 and rv32. Running the CRC32 selftest on a real hardware (SpacemiT K1) with Zbc extension shows 65% and 125% performance improvement respectively on crc32_test() and crc32c_test(). Signed-off-by: Xiao Wang <[email protected]> Reviewed-by: Charlie Jenkins <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2024-01-17Merge patch series "riscv: Add fine-tuned checksum functions"Palmer Dabbelt1-1/+2
Charlie Jenkins <[email protected]> says: Each architecture generally implements fine-tuned checksum functions to leverage the instruction set. This patch adds the main checksum functions that are used in networking. Tested on QEMU, this series allows the CHECKSUM_KUNIT tests to complete an average of 50.9% faster. This patch takes heavy use of the Zbb extension using alternatives patching. To test this patch, enable the configs for KUNIT, then CHECKSUM_KUNIT. I have attempted to make these functions as optimal as possible, but I have not ran anything on actual riscv hardware. My performance testing has been limited to inspecting the assembly, running the algorithms on x86 hardware, and running in QEMU. ip_fast_csum is a relatively small function so even though it is possible to read 64 bits at a time on compatible hardware, the bottleneck becomes the clean up and setup code so loading 32 bits at a time is actually faster. * b4-shazam-merge: kunit: Add tests for csum_ipv6_magic and ip_fast_csum riscv: Add checksum library riscv: Add checksum header riscv: Add static key for misaligned accesses asm-generic: Improve csum_fold Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2024-01-17riscv: Add checksum libraryCharlie Jenkins1-0/+1
Provide a 32 and 64 bit version of do_csum. When compiled for 32-bit will load from the buffer in groups of 32 bits, and when compiled for 64-bit will load in groups of 64 bits. Additionally provide riscv optimized implementation of csum_ipv6_magic. Signed-off-by: Charlie Jenkins <[email protected]> Acked-by: Conor Dooley <[email protected]> Reviewed-by: Xiao Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2024-01-16riscv: lib: vectorize copy_to_user/copy_from_userAndy Chiu1-1/+5
This patch utilizes Vector to perform copy_to_user/copy_from_user. If Vector is available and the size of copy is large enough for Vector to perform better than scalar, then direct the kernel to do Vector copies for userspace. Though the best programming practice for users is to reduce the copy, this provides a faster variant when copies are inevitable. The optimal size for using Vector, copy_to_user_thres, is only a heuristic for now. We can add DT parsing if people feel the need of customizing it. The exception fixup code of the __asm_vector_usercopy must fallback to the scalar one because accessing user pages might fault, and must be sleepable. Current kernel-mode Vector does not allow tasks to be preemptible, so we must disactivate Vector and perform a scalar fallback in such case. The original implementation of Vector operations comes from https://github.com/sifive/sifive-libc, which we agree to contribute to Linux kernel. Co-developed-by: Jerry Shih <[email protected]> Signed-off-by: Jerry Shih <[email protected]> Co-developed-by: Nick Knight <[email protected]> Signed-off-by: Nick Knight <[email protected]> Suggested-by: Guo Ren <[email protected]> Signed-off-by: Andy Chiu <[email protected]> Tested-by: Björn Töpel <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2024-01-16riscv: Add vector extension XOR implementationGreentime Hu1-0/+1
This patch adds support for vector optimized XOR and it is tested in qemu. Co-developed-by: Han-Kuan Chen <[email protected]> Signed-off-by: Han-Kuan Chen <[email protected]> Signed-off-by: Greentime Hu <[email protected]> Signed-off-by: Andy Chiu <[email protected]> Tested-by: Björn Töpel <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2023-03-14RISC-V: Use Zicboz in clear_page when availableAndrew Jones1-0/+1
Using memset() to zero a 4K page takes 563 total instructions, where 20 are branches. clear_page(), with Zicboz and a 64 byte block size, takes 169 total instructions, where 4 are branches and 33 are nops. Even though the block size is a variable, thanks to alternatives, we can still implement a Duff device without having to do any preliminary calculations. This is achieved by using the alternatives' cpufeature value (the upper 16 bits of patch_id). The value used is the maximum zicboz block size order accepted at the patch site. This enables us to stop patching / unrolling when 4K bytes have been zeroed (we would loop and continue after 4K if the page size would be larger) For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to only loop a few times and larger block sizes to not loop at all. Since cbo.zero doesn't take an offset, we also need an 'add' after each instruction, making the loop body 112 to 160 bytes. Hopefully this is small enough to not cause icache misses. Signed-off-by: Andrew Jones <[email protected]> Acked-by: Conor Dooley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2023-01-31RISC-V: add infrastructure to allow different str* implementationsHeiko Stuebner1-0/+3
Depending on supported extensions on specific RISC-V cores, optimized str* functions might make sense. This adds basic infrastructure to allow patching the function calls via alternatives later on. The Linux kernel provides standard implementations for string functions but when architectures want to extend them, they need to provide their own. The added generic string functions are done in assembler (taken from disassembling the main-kernel functions for now) to allow us to control the used registers and extend them with optimized variants. This doesn't override the compiler's use of builtin replacements. So still first of all the compiler will select if a builtin will be better suitable i.e. for known strings. For all regular cases we will want to later select possible optimized variants and in the worst case fall back to the generic implemention added with this change. Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Heiko Stuebner <[email protected]> Reviewed-by: Conor Dooley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2021-01-14riscv: Add support for function error injectionGuo Ren1-0/+2
Inspired by the commit 42d038c4fb00 ("arm64: Add support for function error injection"), this patch supports function error injection for riscv. This patch mainly support two functions: one is regs_set_return_value() which is used to overwrite the return value; the another function is override_function_with_return() which is to override the probed function returning and jump to its caller. Test log: cd /sys/kernel/debug/fail_function echo sys_clone > inject echo 100 > probability echo 1 > interval ls / [ 313.176875] FAULT_INJECTION: forcing a failure. [ 313.176875] name fail_function, interval 1, probability 100, space 0, times 1 [ 313.184357] CPU: 0 PID: 87 Comm: sh Not tainted 5.8.0-rc5-00007-g6a758cc #117 [ 313.187616] Call Trace: [ 313.189100] [<ffffffe0002036b6>] walk_stackframe+0x0/0xc2 [ 313.191626] [<ffffffe00020395c>] show_stack+0x40/0x4c [ 313.193927] [<ffffffe000556c60>] dump_stack+0x7c/0x96 [ 313.194795] [<ffffffe0005522e8>] should_fail+0x140/0x142 [ 313.195923] [<ffffffe000299ffc>] fei_kprobe_handler+0x2c/0x5a [ 313.197687] [<ffffffe0009e2ec4>] kprobe_breakpoint_handler+0xb4/0x18a [ 313.200054] [<ffffffe00020357e>] do_trap_break+0x36/0xca [ 313.202147] [<ffffffe000201bca>] ret_from_exception+0x0/0xc [ 313.204556] [<ffffffe000201bbc>] ret_from_syscall+0x0/0x2 -sh: can't fork: Invalid argument Signed-off-by: Guo Ren <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2020-12-10riscv: provide memmove implementationNylon Chen1-0/+1
The memmove used by the kernel feature like KASAN. Signed-off-by: Nick Hu <[email protected]> Signed-off-by: Nick Hu <[email protected]> Signed-off-by: Nylon Chen <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2020-10-04riscv: use memcpy based uaccess for nommu againChristoph Hellwig1-1/+1
This reverts commit adccfb1a805ea84d2db38eb53032533279bdaa97. Now that the generic uaccess by mempcy code handles unaligned addresses the generic code can be used for all RISC-V CPUs. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2020-03-18riscv: uaccess should be used in nommu modeGreentime Hu1-1/+1
It might have the unaligned access exception when trying to exchange data with user space program. In this case, it failed in tty_ioctl(). Therefore we should enable uaccess.S for NOMMU mode since the generic code doesn't handle the unaligned access cases. 0x8013a212 <tty_ioctl+462>: ld a5,460(s1) [ 0.115279] Oops - load address misaligned [#1] [ 0.115284] CPU: 0 PID: 29 Comm: sh Not tainted 5.4.0-rc5-00020-gb4c27160d562-dirty #36 [ 0.115294] epc: 000000008013a212 ra : 000000008013a212 sp : 000000008f48dd50 [ 0.115303] gp : 00000000801cac28 tp : 000000008fb80000 t0 : 00000000000000e8 [ 0.115312] t1 : 000000008f58f108 t2 : 0000000000000009 s0 : 000000008f48ddf0 [ 0.115321] s1 : 000000008f8c6220 a0 : 0000000000000001 a1 : 000000008f48dd28 [ 0.115330] a2 : 000000008fb80000 a3 : 00000000801a7398 a4 : 0000000000000000 [ 0.115339] a5 : 0000000000000000 a6 : 000000008f58f0c6 a7 : 000000000000001d [ 0.115348] s2 : 000000008f8c6308 s3 : 000000008f78b7c8 s4 : 000000008fb834c0 [ 0.115357] s5 : 0000000000005413 s6 : 0000000000000000 s7 : 000000008f58f2b0 [ 0.115366] s8 : 000000008f858008 s9 : 000000008f776818 s10: 000000008f776830 [ 0.115375] s11: 000000008fb840a8 t3 : 1999999999999999 t4 : 000000008f78704c [ 0.115384] t5 : 0000000000000005 t6 : 0000000000000002 [ 0.115391] status: 0000000200001880 badaddr: 000000008f8c63ec cause: 0000000000000004 [ 0.115401] ---[ end trace 00d490c6a8b6c9ac ]--- This failure could be fixed after this patch applied. [ 0.002282] Run /init as init process Initializing random number generator... [ 0.005573] random: dd: uninitialized urandom read (512 bytes read) done. Welcome to Buildroot buildroot login: root Password: Jan 1 00:00:00 login[62]: root login on 'ttySIF0' ~ # Signed-off-by: Greentime Hu <[email protected]> Reviewed-by: Palmer Dabbelt <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2019-11-17riscv: add nommu supportChristoph Hellwig1-6/+5
The kernel runs in M-mode without using page tables, and thus can't run bare metal without help from additional firmware. Most of the patch is just stubbing out code not needed without page tables, but there is an interesting detail in the signals implementation: - The normal RISC-V syscall ABI only implements rt_sigreturn as VDSO entry point, but the ELF VDSO is not supported for nommu Linux. We instead copy the code to call the syscall onto the stack. In addition to enabling the nommu code a new defconfig for a small kernel image that can run in nommu mode on qemu is also provided, to run a kernel in qemu you can use the following command line: qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \ -kernel arch/riscv/boot/loader \ -drive file=rootfs.ext2,format=raw,id=hd0 \ -device virtio-blk-device,drive=hd0 Contains contributions from Damien Le Moal <[email protected]>. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Anup Patel <[email protected]> [[email protected]: updated to apply; add CONFIG_MMU guards around PCI_IOBASE definition to fix build issues; fixed checkpatch issues; move the PCI_IO_* and VMEMMAP address space macros along with the others; resolve sparse warning] Signed-off-by: Paul Walmsley <[email protected]>
2019-08-08RISC-V: Remove udivdi3Palmer Dabbelt1-2/+0
This should never have landed in the first place: it was added as part of 64-bit divide support for 32-bit systems, but the kernel doesn't allow this sort of division. I must have forgotten to remove it. This patch removes the support. Since this routine only worked on 64-bit platforms but was only built on 32-bit platforms, it's essentially just nonsense anyway. Signed-off-by: Palmer Dabbelt <[email protected]> Acked-by: Nicolas Pitre <[email protected]> Link: https://lore.kernel.org/linux-riscv/[email protected]/T/#t Reported-by: Eric Lin <[email protected]> Signed-off-by: Paul Walmsley <[email protected]>
2019-05-21treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner1-0/+1
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-11-12RISC-V: lib: Fix build error for 64-bitOlof Johansson1-1/+1
Fixes the following build error from tinyconfig: riscv64-unknown-linux-gnu-ld: kernel/sched/fair.o: in function `.L8': fair.c:(.text+0x70): undefined reference to `__lshrti3' riscv64-unknown-linux-gnu-ld: kernel/time/clocksource.o: in function `.L0 ': clocksource.c:(.text+0x334): undefined reference to `__lshrti3' Fixes: 7f47c73b355f ("RISC-V: Build tishift only on 64-bit") Signed-off-by: Olof Johansson <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-10-22RISC-V: Build tishift only on 64-bitZong Li1-1/+2
Only RV64 supports 128 integer size. Signed-off-by: Zong Li <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-08-13RISC-V: implement __lshrti3.Alex Guo1-0/+1
Signed-off-by: Alex Guo <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2017-09-26RISC-V: Build InfrastructurePalmer Dabbelt1-0/+6
This patch contains all the build infrastructure that actually enables the RISC-V port. This includes Makefiles, linker scripts, and Kconfig files. It also contains the only top-level change, which adds RISC-V to the list of architectures that need a sed run to produce the ARCH variable when building locally. Signed-off-by: Palmer Dabbelt <[email protected]>