diff options
author | Paolo Abeni <[email protected]> | 2017-06-29 15:55:58 +0200 |
---|---|---|
committer | Ingo Molnar <[email protected]> | 2017-06-30 09:52:51 +0200 |
commit | 236222d39347e0e486010f10c1493e83dbbdfba8 (patch) | |
tree | b2f0b92d331561d7b38b717569355b2ffe92fa83 | |
parent | 4d8a991d460d4fa4829beaffdcba45a217ca0fa7 (diff) |
x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings
According to the Intel datasheet, the REP MOVSB instruction
exposes a pretty heavy setup cost (50 ticks), which hurts
short string copy operations.
This change tries to avoid this cost by calling the explicit
loop available in the unrolled code for strings shorter
than 64 bytes.
The 64 bytes cutoff value is arbitrary from the code logic
point of view - it has been selected based on measurements,
as the largest value that still ensures a measurable gain.
Micro benchmarks of the __copy_from_user() function with
lengths in the [0-63] range show this performance gain
(shorter the string, larger the gain):
- in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4
- in the [72%-9%] range on Intel Core i7-4810MQ
Other tested CPUs - namely Intel Atom S1260 and AMD Opteron
8216 - show no difference, because they do not expose the
ERMS feature bit.
Signed-off-by: Paolo Abeni <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Alan Cox <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Hannes Frederic Sowa <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com
[ Clarified the changelog. ]
Signed-off-by: Ingo Molnar <[email protected]>
-rw-r--r-- | arch/x86/lib/copy_user_64.S | 7 |
1 files changed, 5 insertions, 2 deletions
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S index c5959576c315..020f75cc8cf6 100644 --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@ -37,7 +37,7 @@ ENTRY(copy_user_generic_unrolled) movl %edx,%ecx andl $63,%edx shrl $6,%ecx - jz 17f + jz .L_copy_short_string 1: movq (%rsi),%r8 2: movq 1*8(%rsi),%r9 3: movq 2*8(%rsi),%r10 @@ -58,7 +58,8 @@ ENTRY(copy_user_generic_unrolled) leaq 64(%rdi),%rdi decl %ecx jnz 1b -17: movl %edx,%ecx +.L_copy_short_string: + movl %edx,%ecx andl $7,%edx shrl $3,%ecx jz 20f @@ -174,6 +175,8 @@ EXPORT_SYMBOL(copy_user_generic_string) */ ENTRY(copy_user_enhanced_fast_string) ASM_STAC + cmpl $64,%edx + jb .L_copy_short_string /* less then 64 bytes, avoid the costly 'rep' */ movl %edx,%ecx 1: rep movsb |