boringssl/crypto/fipsmodule/ec
David Benjamin 638a408cd2 Add a tuned variable-time P-256 multiplication function.
This reuses wnaf.c's window scheduling, but has access to the tuned
field arithmetic and precomputed base point table. Unlike wnaf.c, we
do not make the points affine as it's not worth it for a single table.
(We already precomputed the base point table.)
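
For readers unfamiliar with the window scheduling mentioned above, here is a
minimal sketch of textbook width-w NAF recoding in Python. It is not the code
in wnaf.c: the names wnaf and scalar_mul are hypothetical, the digit convention
is the textbook one and may differ from wnaf.c's, and plain integers stand in
for curve points (add is +, double is 2*, negate is unary minus).

```python
def wnaf(k, w):
    """Return the width-w NAF digits of k >= 0, least significant first.

    Every nonzero digit is odd with absolute value below 2**(w - 1), and
    each nonzero digit is followed by at least w - 1 zero digits.
    """
    if k < 0 or w < 2:
        raise ValueError("need k >= 0 and w >= 2")
    digits = []
    while k > 0:
        if k & 1:
            d = k % (1 << w)          # low w bits of k
            if d >= 1 << (w - 1):     # map into the signed range
                d -= 1 << w
            k -= d                    # k is now divisible by 2**w
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits


def scalar_mul(k, point, w=4):
    """Compute k * point by double-and-add over the wNAF digits.

    Integers stand in for curve points here. The branch on each digit is
    data-dependent, so this schedule is variable-time.
    """
    # Table of odd multiples 1*P, 3*P, ..., (2**(w-1) - 1)*P.
    table = {d: d * point for d in range(1, 1 << (w - 1), 2)}
    acc = 0
    for d in reversed(wnaf(k, w)):
        acc = 2 * acc                 # one doubling per digit
        if d > 0:
            acc += table[d]           # add a precomputed odd multiple
        elif d < 0:
            acc -= table[-d]          # subtract it for negative digits
    return acc


print(wnaf(7, 2))  # [-1, 0, 0, 1], i.e. 7 = -1 + 8
```

Because the sequence of additions depends on the scalar, a schedule like this
only suits public inputs, such as the scalars in ECDSA verification.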

Annoyingly, 32-bit x86 gets slower by a bit, but the other platforms are
faster. My guess is that the generic code gets to use the
bn_mul_mont assembly and the compiler, faced with the increased 32-bit
register pressure and the extremely register-poor x86, is making
bad decisions on the otherwise P-256-tuned C code. The three platforms
that see much larger gains are significantly more important than 32-bit
x86 at this point, so go with this change.

armv7a (Nexus 5X) before/after [+14.4%]:
Did 2703 ECDSA P-256 verify operations in 5034539us (536.9 ops/sec)
Did 3127 ECDSA P-256 verify operations in 5091379us (614.2 ops/sec)

aarch64 (Nexus 5X) before/after [+9.2%]:
Did 6783 ECDSA P-256 verify operations in 5031324us (1348.2 ops/sec)
Did 7410 ECDSA P-256 verify operations in 5033291us (1472.2 ops/sec)

x86 before/after [-3.7%]:
Did 8961 ECDSA P-256 verify operations in 10075901us (889.3 ops/sec)
Did 8568 ECDSA P-256 verify operations in 10003001us (856.5 ops/sec)

x86_64 before/after [+8.6%]:
Did 29808 ECDSA P-256 verify operations in 10008662us (2978.2 ops/sec)
Did 32528 ECDSA P-256 verify operations in 10057137us (3234.3 ops/sec)

Change-Id: I5fa643149f5bfbbda9533e3008baadfee9979b93
Reviewed-on: https://boringssl-review.googlesource.com/25684
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-12 22:00:48 +00:00
asm Merge Intel copyright notice into standard 2018-02-12 21:44:27 +00:00
ec_key.c Make BN_cmp constant-time. 2018-02-06 03:10:44 +00:00
ec_montgomery.c Add a function which folds BN_MONT_CTX_{new,set} together. 2018-02-02 20:23:25 +00:00
ec_test.cc Rename bn->top to bn->width. 2018-02-05 23:44:24 +00:00
ec.c Don't crash when failing to set affine coordinates when the generator is missing. 2018-02-07 23:08:17 +00:00
internal.h Add a tuned variable-time P-256 multiplication function. 2018-02-12 22:00:48 +00:00
oct.c Make BN_mod_*_quick constant-time. 2018-02-06 01:16:04 +00:00
p224-64.c Align various point_get_affine_coordinates implementations. 2018-01-08 20:03:42 +00:00
p256-x86_64_test.cc Add a function which folds BN_MONT_CTX_{new,set} together. 2018-02-02 20:23:25 +00:00
p256-x86_64_tests.txt
p256-x86_64-table.h Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
p256-x86_64.c Merge Intel copyright notice into standard 2018-02-12 21:44:27 +00:00
p256-x86_64.h Merge Intel copyright notice into standard 2018-02-12 21:44:27 +00:00
simple.c Make BN_mod_*_quick constant-time. 2018-02-06 01:16:04 +00:00
util.c ec/p256.c: fiat-crypto field arithmetic (64, 32) 2017-12-11 17:55:46 +00:00
wnaf.c Add a tuned variable-time P-256 multiplication function. 2018-02-12 22:00:48 +00:00