32e0d10069
This introduces EC_FELEM, which is analogous to EC_SCALAR. It is used for EC_POINT's representation in the generic EC_METHOD, as well as for assorted operations on tuned EC_METHODs that are still implemented generically.

Unlike EC_SCALAR, EC_FELEM's exact representation is awkwardly specific to the EC_METHOD, analogous to how the old values were BIGNUMs but may or may not have been in Montgomery form. This is kind of a nuisance, but no more than before. (If p224-64.c were easily convertible to Montgomery form, we could say |EC_FELEM| is always in Montgomery form. If we exposed the internal add and double implementations in each of the curves, we could give |EC_POINT| an |EC_METHOD|-specific representation and make |EC_FELEM| purely an |EC_GFp_mont_method| type. I'll leave this for later.)

The generic add and doubling formulas are aligned with the formulas proved in fiat-crypto. Those only applied to a = -3, so I've proved a generic one in https://github.com/mit-plv/fiat-crypto/pull/356, in case someone uses a custom curve. The new formulas are verified, constant-time, and swap a multiply for a square. As expressed in fiat-crypto they do use more temporaries, but this seems to be fine with stack-allocated EC_FELEMs. (We can try to help the compiler later, but the benchmarks below suggest this isn't necessary.)

Unlike BIGNUM, EC_FELEM can be stack-allocated. It also captures the bounds in the type system and, in particular, that the width is correct, which will make it easier to select a point in constant time in the future. (Indeed the old code did not always have the correct width. Its point formula involved halving and implemented this in variable time and variable width.)
Before:
Did 77274 ECDH P-256 operations in 10046087us (7692.0 ops/sec)
Did 5959 ECDH P-384 operations in 10031701us (594.0 ops/sec)
Did 10815 ECDSA P-384 signing operations in 10087892us (1072.1 ops/sec)
Did 8976 ECDSA P-384 verify operations in 10071038us (891.3 ops/sec)
Did 2600 ECDH P-521 operations in 10091688us (257.6 ops/sec)
Did 4590 ECDSA P-521 signing operations in 10055195us (456.5 ops/sec)
Did 3811 ECDSA P-521 verify operations in 10003574us (381.0 ops/sec)

After:
Did 77736 ECDH P-256 operations in 10029858us (7750.5 ops/sec) [+0.8%]
Did 7519 ECDH P-384 operations in 10068076us (746.8 ops/sec) [+25.7%]
Did 13335 ECDSA P-384 signing operations in 10029962us (1329.5 ops/sec) [+24.0%]
Did 11021 ECDSA P-384 verify operations in 10088600us (1092.4 ops/sec) [+22.6%]
Did 2912 ECDH P-521 operations in 10001325us (291.2 ops/sec) [+13.0%]
Did 5150 ECDSA P-521 signing operations in 10027462us (513.6 ops/sec) [+12.5%]
Did 4264 ECDSA P-521 verify operations in 10069694us (423.4 ops/sec) [+11.1%]

This more than pays for removing points_make_affine previously and even speeds up ECDH P-256 slightly. (The point-on-curve check uses the generic code.)

Next is to push the stack-allocating up to ec_wNAF_mul, followed by a constant-time single-point multiplication.

Bug: 239
Change-Id: I44a2dff7c52522e491d0f8cffff64c4ab5cd353c
Reviewed-on: https://boringssl-review.googlesource.com/27668
Reviewed-by: Adam Langley <agl@google.com>
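
The remark that a fixed width "will make it easier to select a point in constant-time" can be illustrated with a small sketch. This is not the BoringSSL code: the names `ct_select`, `WORDS`, and `MASK64` are hypothetical, and Python integers give no timing guarantees, so the block only shows the masking logic that a fixed-width representation makes possible.

```python
# Hypothetical illustration of mask-based selection over fixed-width limbs.
# Python offers no constant-time guarantees; this only shows the logic.

MASK64 = (1 << 64) - 1
WORDS = 4  # assumed width: four 64-bit limbs, as for a 256-bit field


def ct_select(bit, a, b):
    """Return a if bit == 1, else b, without branching on bit."""
    assert len(a) == WORDS and len(b) == WORDS  # both inputs share one width
    mask = (-bit) & MASK64          # all ones if bit == 1, all zeros if bit == 0
    not_mask = ~mask & MASK64
    return [(mask & x) | (not_mask & y) for x, y in zip(a, b)]
```

Because every element has the same number of limbs, the loop touches the same amount of data regardless of the values, which is the property a variable-width BIGNUM cannot offer.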

BUILD.gn
CMakeLists.txt
curve25519_tables.h
curve25519.c
internal.h
LICENSE
make_curve25519_tables.py
METADATA
p256.c
README.chromium
README.md

Fiat
Some of the code in this directory is generated by Fiat and thus these files are licensed under the MIT license. (See LICENSE file.)
Curve25519
To generate the field arithmetic procedures in curve25519.c from a fiat-crypto
checkout (as of 7892c66d5e0e5770c79463ce551193ceef870641), run
make src/Specific/solinas32_2e255m19_10limbs/femul.c (replacing femul with
the desired field operation). The "source" file specifying the finite field and
referencing the desired implementation strategy is
src/Specific/solinas32_2e255m19_10limbs/CurveParameters.v, specifying roughly
"unsaturated arithmetic modulo 2^255-19 using 10 limbs of radix 2^25.5 in 32-bit
unsigned integers with a single carry chain and two wraparound carries" where
only the prime is considered normative and everything else is treated as
"compiler hints".
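
As a rough illustration of what "10 limbs of radix 2^25.5" means, the sketch below splits a 255-bit value into limbs whose bit offsets are ceil(25.5 * i), so the limb widths alternate between 26 and 25 bits. The helper names (`to_limbs`, `from_limbs`, `OFFSETS`, `WIDTHS`) are hypothetical and are not taken from curve25519.c or fiat-crypto; this shows only the layout, not the generated arithmetic.

```python
# Sketch of the radix-2^25.5 limb layout for arithmetic modulo 2^255 - 19.
# All names here are illustrative, not identifiers from the generated code.

P = 2**255 - 19
NLIMBS = 10

# Limb i starts at bit ceil(25.5 * i), computed with integers only.
OFFSETS = [(51 * i + 1) // 2 for i in range(NLIMBS)]
# Each limb spans up to the next offset; the last limb ends at bit 255.
WIDTHS = [(OFFSETS[i + 1] if i + 1 < NLIMBS else 255) - OFFSETS[i]
          for i in range(NLIMBS)]


def to_limbs(x):
    """Split 0 <= x < 2^255 into 10 limbs of alternating 26/25 bits."""
    return [(x >> OFFSETS[i]) & ((1 << WIDTHS[i]) - 1) for i in range(NLIMBS)]


def from_limbs(limbs):
    """Recombine limbs into an integer (limbs may be loosely reduced)."""
    return sum(v << o for v, o in zip(limbs, OFFSETS))
```

Each limb fits comfortably in a 32-bit unsigned integer, leaving headroom for additions before a carry is needed, which is what "unsaturated" refers to.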
The 64-bit implementation uses 5 limbs of radix 2^51 with instruction scheduling
taken from curve25519-donna-c64. It is found in
src/Specific/solinas64_2e255m19_5limbs_donna.
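
To make the 5-limb, radix-2^51 representation concrete, here is a minimal sketch of one carry pass with a wraparound carry. It relies only on the fact that 2^255 is congruent to 19 modulo 2^255 - 19, so overflow from the top limb is folded back into the bottom limb multiplied by 19. The function names are hypothetical, and this does not reproduce the donna instruction scheduling or the exact carry order used by the generated code.

```python
# Sketch of a carry chain with one wraparound carry for 5 limbs of radix 2^51.
# Names are illustrative; this is not the generated fiat-crypto code.

P = 2**255 - 19
MASK51 = (1 << 51) - 1


def value(limbs):
    """Integer represented by 5 limbs of radix 2^51."""
    return sum(v << (51 * i) for i, v in enumerate(limbs))


def carry(limbs):
    """Propagate carries upward, then wrap the top carry around times 19."""
    out = list(limbs)
    for i in range(4):
        c = out[i] >> 51
        out[i] &= MASK51
        out[i + 1] += c
    c = out[4] >> 51
    out[4] &= MASK51
    out[0] += 19 * c  # 2^255 = 19 (mod P), so the top carry re-enters at limb 0
    return out
```

After one pass the upper four limbs are below 2^51, while the lowest limb may slightly exceed it; a real implementation chooses how many passes or extra carries to perform based on the bounds it needs.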
P256
To generate the field arithmetic procedures in p256.c from a fiat-crypto
checkout, run
make src/Specific/montgomery64_2e256m2e224p2e192p2e96m1_4limbs/femul.c.
The corresponding "source" file is
src/Specific/montgomery64_2e256m2e224p2e192p2e96m1_4limbs/CurveParameters.v,
specifying roughly "64-bit saturated word-by-word Montgomery reduction modulo
2^256 - 2^224 + 2^192 + 2^96 - 1". Again, everything except for the prime is
untrusted. There is currently a known issue where fesub.c for p256 does not
manage to complete the build (specialization) within a week on Coq 8.7.0.
https://github.com/JasonGross/fiat-crypto/tree/3e6851ddecaac70d0feb484a75360d57f6e41244/src/Specific/montgomery64_2e256m2e224p2e192p2e96m1_4limbs
does manage to build that file, but the work on that branch was never finished
(the correctness proofs of implementation templates still apply, but the
now-abandoned prototype specialization facilities there are unverified).
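
For readers unfamiliar with the phrase "word-by-word Montgomery reduction", the following is a minimal textbook-style sketch for the prime above, processing one 64-bit word of the first operand per iteration. It uses Python big integers rather than four saturated 64-bit limbs, is not constant-time, and its names (`mont_mul`, `to_mont`, `from_mont`, `N_PRIME`) are hypothetical; it is not the generated p256.c code. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
# Textbook word-by-word Montgomery multiplication modulo the P-256 prime.
# Illustrative only: big-int arithmetic, not constant-time, names are made up.

P = 2**256 - 2**224 + 2**192 + 2**96 - 1
WORD = 64
NWORDS = 4
MASK = (1 << WORD) - 1
R = 1 << (WORD * NWORDS)                    # Montgomery radix 2^256
N_PRIME = (-pow(P, -1, 1 << WORD)) & MASK   # -P^{-1} mod 2^64


def mont_mul(a, b):
    """Return a * b * R^{-1} mod P for 0 <= a, b < P."""
    t = 0
    for i in range(NWORDS):
        a_i = (a >> (WORD * i)) & MASK
        t += a_i * b
        m = (t * N_PRIME) & MASK   # chosen so that t + m * P is divisible by 2^64
        t = (t + m * P) >> WORD
    # t < 2P here, so a single conditional subtraction fully reduces it.
    return t - P if t >= P else t


def to_mont(x):
    return (x * R) % P


def from_mont(x):
    return mont_mul(x, 1)
```

Since this prime is congruent to -1 modulo 2^64, `N_PRIME` works out to 1, which is one reason the per-word reduction step is cheap for P-256.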
Working With Fiat Crypto Field Arithmetic
The fiat-crypto readme (https://github.com/mit-plv/fiat-crypto#arithmetic-core) contains an overview of the implementation templates followed by a tour of the specialization machinery. It may be helpful to first read about the less messy parts of the system in chapter 3 of http://adam.chlipala.net/theses/andreser.pdf. Work is ongoing to replace the entire specialization mechanism with something much more principled (https://github.com/mit-plv/fiat-crypto/projects/4).