| author | Ard Biesheuvel <[email protected]> | 2020-12-31 17:41:54 +0100 |
|---|---|---|
| committer | Herbert Xu <[email protected]> | 2021-01-08 15:39:47 +1100 |
| commit | 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 (patch) | |
| tree | d0f396b57e398f50604e9e9fb20e793a02b9ccf0 /tools/perf/scripts/python/bin | |
| parent | fecff3b931a52c8d5263fb1537161f0214acb44a (diff) | |
crypto: x86/aes-ni-xts - use direct calls to and 4-way stride

The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.
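The change in call structure described above can be sketched in plain C. This is only an illustration, not the kernel code: the real core transforms are written in assembly, and every name here (`toy_enc_4way`, `toy_dec_4way`, `old_8way`, `new_enc`, `new_dec`) is hypothetical. The toy transforms add or subtract 1 per byte and have nothing to do with AES or XTS.

```c
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_SIZE  16                 /* AES block size in bytes */
#define STRIDE_4WAY (4 * BLOCK_SIZE)   /* 4 blocks = 64 bytes */

/* Toy stand-ins for the 4-way core transforms (NOT AES, NOT XTS). */
static void toy_enc_4way(unsigned char *dst, const unsigned char *src)
{
    for (size_t i = 0; i < STRIDE_4WAY; i++)
        dst[i] = (unsigned char)(src[i] + 1);
}

static void toy_dec_4way(unsigned char *dst, const unsigned char *src)
{
    for (size_t i = 0; i < STRIDE_4WAY; i++)
        dst[i] = (unsigned char)(src[i] - 1);
}

typedef void (*core_fn)(unsigned char *dst, const unsigned char *src);

/* Old shape: fixed 8-way helper; the core is picked at run time from a
 * boolean and reached through a function pointer. */
static void old_8way(unsigned char *dst, const unsigned char *src, bool enc)
{
    core_fn fn = enc ? toy_enc_4way : toy_dec_4way;

    fn(dst, src);                              /* indirect call */
    fn(dst + STRIDE_4WAY, src + STRIDE_4WAY);  /* indirect call */
}

/* New shape: one routine per direction, calling the core directly and
 * looping over any multiple of 4 blocks. */
static void new_enc(unsigned char *dst, const unsigned char *src, size_t nbytes)
{
    for (; nbytes >= STRIDE_4WAY; nbytes -= STRIDE_4WAY) {
        toy_enc_4way(dst, src);                /* direct call */
        dst += STRIDE_4WAY;
        src += STRIDE_4WAY;
    }
}

static void new_dec(unsigned char *dst, const unsigned char *src, size_t nbytes)
{
    for (; nbytes >= STRIDE_4WAY; nbytes -= STRIDE_4WAY) {
        toy_dec_4way(dst, src);                /* direct call */
        dst += STRIDE_4WAY;
        src += STRIDE_4WAY;
    }
}
```

In this sketch, `old_8way(dst, src, true)` and `new_enc(dst, src, 128)` produce the same 128 output bytes; only the way the core is reached differs.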

As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on an Intel(R) Core(TM) i7-8650U CPU).

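The call-count arithmetic can be checked with a small C sketch. The function names are hypothetical, and the only inputs are the two rates quoted above (3 indirect calls per 64 bytes before, 1 per 512 bytes after); lengths are assumed to be multiples of 512 bytes. The 65% figure is a measurement and is not derived from these counts.

```c
#include <stddef.h>

/* Indirect calls before the change: 3 per 64 bytes of in/output. */
static size_t indirect_calls_before(size_t nbytes)
{
    return 3 * (nbytes / 64);
}

/* Indirect calls after the change: 1 per 512 bytes of in/output.
 * Assumes nbytes is a multiple of 512. */
static size_t indirect_calls_after(size_t nbytes)
{
    return nbytes / 512;
}
```

For a 1 KB block this gives 48 indirect calls before and 2 after; for a 4 KB block, 192 before and 8 after.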
Fixes: 9697fa39efd3f ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <[email protected]> # x86_64
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>