boringssl/third_party/fiat
David Benjamin 32e0d10069 Add EC_FELEM for EC_POINTs and related temporaries.
This introduces EC_FELEM, which is analogous to EC_SCALAR. It is used
for EC_POINT's representation in the generic EC_METHOD, as well as
random operations on tuned EC_METHODs that are still implemented
generically.

Unlike EC_SCALAR, EC_FELEM's exact representation is awkwardly specific
to the EC_METHOD, analogous to how the old values were BIGNUMs but may
or may not have been in Montgomery form. This is kind of a nuisance, but
no more than before. (If p224-64.c were easily convertible to Montgomery
form, we could say |EC_FELEM| is always in Montgomery form. If we
exposed the internal add and double implementations in each of the
curves, we could give |EC_POINT| an |EC_METHOD|-specific representation
and make |EC_FELEM| purely an |EC_GFp_mont_method| type. I'll leave this
for later.)

The generic add and doubling formulas are aligned with the formulas
proved in fiat-crypto. Those only applied to a = -3, so I've proved a
generic one in https://github.com/mit-plv/fiat-crypto/pull/356, in case
someone uses a custom curve.  The new formulas are verified,
constant-time, and swap a multiply for a square. As expressed in
fiat-crypto they do use more temporaries, but this seems to be fine with
stack-allocated EC_FELEMs. (We can try to help the compiler later,
but the benchmarks below suggest this isn't necessary.)

Unlike BIGNUM, EC_FELEM can be stack-allocated. It also captures the
bounds in the type system and, in particular, that the width is correct,
which will make it easier to select a point in constant-time in the
future. (Indeed the old code did not always have the correct width. Its
point formula involved halving and implemented this in variable time and
variable width.)
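
As a rough illustration of why a fixed width helps with constant-time
selection, here is a minimal sketch. The names felem_sketch,
FELEM_SKETCH_WORDS, felem_select_sketch and mask_from_bit_sketch are
hypothetical and are not the definitions in this change; the word count
is an assumption chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width field element; not BoringSSL's EC_FELEM. */
#define FELEM_SKETCH_WORDS 4

typedef struct {
  uint64_t words[FELEM_SKETCH_WORDS];
} felem_sketch;

/* Turns a 0/1 bit into an all-zeros/all-ones mask without branching. */
static uint64_t mask_from_bit_sketch(uint64_t bit) {
  return (uint64_t)0 - (bit & 1);
}

/* Returns a if mask is all ones and b if mask is all zeros. The caller must
 * pass one of those two values. Every word of both inputs is read whatever
 * the mask is, so the memory access pattern does not depend on the choice.
 * This only works because both inputs are known to have the same width. */
static felem_sketch felem_select_sketch(uint64_t mask, const felem_sketch *a,
                                        const felem_sketch *b) {
  felem_sketch out;
  for (size_t i = 0; i < FELEM_SKETCH_WORDS; i++) {
    out.words[i] = (mask & a->words[i]) | (~mask & b->words[i]);
  }
  return out;
}
```

With a variable-width representation, the loop bound itself would depend
on the value, which is what the fixed-width type rules out.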

Before:
Did 77274 ECDH P-256 operations in 10046087us (7692.0 ops/sec)
Did 5959 ECDH P-384 operations in 10031701us (594.0 ops/sec)
Did 10815 ECDSA P-384 signing operations in 10087892us (1072.1 ops/sec)
Did 8976 ECDSA P-384 verify operations in 10071038us (891.3 ops/sec)
Did 2600 ECDH P-521 operations in 10091688us (257.6 ops/sec)
Did 4590 ECDSA P-521 signing operations in 10055195us (456.5 ops/sec)
Did 3811 ECDSA P-521 verify operations in 10003574us (381.0 ops/sec)

After:
Did 77736 ECDH P-256 operations in 10029858us (7750.5 ops/sec) [+0.8%]
Did 7519 ECDH P-384 operations in 10068076us (746.8 ops/sec) [+25.7%]
Did 13335 ECDSA P-384 signing operations in 10029962us (1329.5 ops/sec) [+24.0%]
Did 11021 ECDSA P-384 verify operations in 10088600us (1092.4 ops/sec) [+22.6%]
Did 2912 ECDH P-521 operations in 10001325us (291.2 ops/sec) [+13.0%]
Did 5150 ECDSA P-521 signing operations in 10027462us (513.6 ops/sec) [+12.5%]
Did 4264 ECDSA P-521 verify operations in 10069694us (423.4 ops/sec) [+11.1%]

This more than pays for removing points_make_affine previously and even
speeds up ECDH P-256 slightly. (The point-on-curve check uses the
generic code.)

Next is to push the stack-allocating up to ec_wNAF_mul, followed by a
constant-time single-point multiplication.

Bug: 239
Change-Id: I44a2dff7c52522e491d0f8cffff64c4ab5cd353c
Reviewed-on: https://boringssl-review.googlesource.com/27668
Reviewed-by: Adam Langley <agl@google.com>
2018-04-25 16:39:58 +00:00
BUILD.gn Add files in third_party/fiat for Chromium to pick up. 2018-01-10 22:02:03 +00:00
CMakeLists.txt
curve25519_tables.h Use 51-bit limbs from fiat-crypto in 64-bit. 2018-01-23 22:25:07 +00:00
curve25519.c Remove redundant assertion in fe_mul_121666_impl. 2018-02-27 23:50:02 +00:00
internal.h Remove x86_64 x25519 assembly. 2018-02-01 21:44:58 +00:00
LICENSE
make_curve25519_tables.py Use 51-bit limbs from fiat-crypto in 64-bit. 2018-01-23 22:25:07 +00:00
METADATA third_party: re-format METATADA files 2018-02-27 19:57:12 +00:00
p256.c Add EC_FELEM for EC_POINTs and related temporaries. 2018-04-25 16:39:58 +00:00
README.chromium Add files in third_party/fiat for Chromium to pick up. 2018-01-10 22:02:03 +00:00
README.md Use 51-bit limbs from fiat-crypto in 64-bit. 2018-01-23 22:25:07 +00:00

Fiat

Some of the code in this directory is generated by Fiat and thus these files are licensed under the MIT license. (See LICENSE file.)

Curve25519

To generate the field arithmetic procedures in curve25519.c from a fiat-crypto checkout (as of 7892c66d5e0e5770c79463ce551193ceef870641), run make src/Specific/solinas32_2e255m19_10limbs/femul.c (replacing femul with the desired field operation). The "source" file specifying the finite field and referencing the desired implementation strategy is src/Specific/solinas32_2e255m19_10limbs/CurveParameters.v, specifying roughly "unsaturated arithmetic modulo 2^255-19 using 10 limbs of radix 2^25.5 in 32-bit unsigned integers with a single carry chain and two wraparound carries" where only the prime is considered normative and everything else is treated as "compiler hints".
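
To make "10 limbs of radix 2^25.5" concrete: limb i starts at bit ceil(25.5 * i), so the limbs alternate between 26 and 25 bits and together cover 255 bits. The sketch below computes that layout and splits a 32-byte little-endian value into limbs. It is an illustration only; the function names are hypothetical, and the bit-by-bit loop is written for clarity rather than taken from curve25519.c.

```c
#include <stdint.h>

/* Bit offset of limb i in radix 2^25.5: ceil(25.5 * i) == (51 * i + 1) / 2. */
static unsigned limb_offset_sketch(unsigned i) {
  return (51 * i + 1) / 2;
}

/* Width of limb i in bits: alternates 26, 25, 26, 25, ... */
static unsigned limb_width_sketch(unsigned i) {
  return limb_offset_sketch(i + 1) - limb_offset_sketch(i);
}

/* Splits a 255-bit little-endian value into 10 limbs. Bit 255 of the input
 * is ignored. Hypothetical helper, written bit by bit for readability. */
static void fe_unpack_sketch(uint32_t out[10], const uint8_t in[32]) {
  for (unsigned i = 0; i < 10; i++) {
    unsigned start = limb_offset_sketch(i);
    unsigned width = limb_width_sketch(i);
    uint32_t limb = 0;
    for (unsigned j = 0; j < width; j++) {
      unsigned bit = start + j;
      limb |= (uint32_t)((in[bit / 8] >> (bit % 8)) & 1) << j;
    }
    out[i] = limb;
  }
}
```

Because each limb holds at most 26 bits in a 32-bit word, there is headroom for a few additions before carries must be propagated, which is the sense in which the arithmetic is "unsaturated".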

The 64-bit implementation uses 5 limbs of radix 2^51 with instruction scheduling taken from curve25519-donna-c64. It is found in src/Specific/solinas64_2e255m19_5limbs_donna.
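
As a rough sketch of what the 5-limb, radix 2^51 representation looks like in use, the following propagates carries through the limbs and folds the final carry back into limb 0, using 2^255 = 19 (mod 2^255 - 19). The function name is hypothetical, the limb order is assumed to be least significant first, and this is neither the generated code nor the donna instruction schedule.

```c
#include <stdint.h>

#define LIMB_BITS_SKETCH 51
#define LIMB_MASK_SKETCH ((UINT64_C(1) << LIMB_BITS_SKETCH) - 1)

/* Propagates carries through five radix-2^51 limbs (least significant
 * first). Assumes every input limb is below 2^63 so the additions cannot
 * overflow. The result represents the same value modulo 2^255 - 19, but it
 * is only loosely reduced: limb 0 may end slightly above 2^51. */
static void fe51_carry_sketch(uint64_t h[5]) {
  uint64_t carry = 0;
  for (int i = 0; i < 5; i++) {
    h[i] += carry;
    carry = h[i] >> LIMB_BITS_SKETCH;
    h[i] &= LIMB_MASK_SKETCH;
  }
  /* The carry out of limb 4 has weight 2^255, which is 19 modulo the prime. */
  h[0] += 19 * carry;
}
```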

P256

To generate the field arithmetic procedures in p256.c from a fiat-crypto checkout, run make src/Specific/montgomery64_2e256m2e224p2e192p2e96m1_4limbs/femul.c. The corresponding "source" file is src/Specific/montgomery64_2e256m2e224p2e192p2e96m1_4limbs/CurveParameters.v, specifying roughly "64-bit saturated word-by-word Montgomery reduction modulo 2^256 - 2^224 + 2^192 + 2^96 - 1". Again, everything except for the prime is untrusted. There is currently a known issue: the build (specialization) of fesub.c for p256 does not complete within a week on Coq 8.7.0. The branch at https://github.com/JasonGross/fiat-crypto/tree/3e6851ddecaac70d0feb484a75360d57f6e41244/src/Specific/montgomery64_2e256m2e224p2e192p2e96m1_4limbs does manage to build that file, but the work on that branch was never finished: the correctness proofs of the implementation templates still apply, but the now-abandoned prototype specialization facilities there are unverified.
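
For readers unfamiliar with Montgomery reduction, here is a minimal single-word sketch of the idea. It uses a 32-bit word and a modulus below 2^31 so that everything fits in uint64_t; p256.c instead works with four 64-bit limbs and repeats a reduction step like this once per word. All names are hypothetical, and the final subtraction uses a plain branch for readability, which a constant-time implementation would avoid.

```c
#include <assert.h>
#include <stdint.h>

/* Computes -n^{-1} mod 2^32 for odd n using Newton iteration. */
static uint32_t mont_neg_inv_sketch(uint32_t n) {
  assert(n & 1);
  uint32_t inv = n; /* n * n == 1 (mod 8), so 3 low bits are correct */
  for (int i = 0; i < 5; i++) {
    inv *= 2 - n * inv; /* each step doubles the number of correct bits */
  }
  return (uint32_t)0 - inv;
}

/* Returns a * b * 2^{-32} mod n. Requires n odd, n < 2^31, and a, b < n. */
static uint32_t mont_mul_sketch(uint32_t a, uint32_t b, uint32_t n,
                                uint32_t n_neg_inv) {
  uint64_t t = (uint64_t)a * b;
  /* Choose m so that t + m * n is divisible by 2^32. */
  uint32_t m = (uint32_t)t * n_neg_inv;
  uint64_t u = (t + (uint64_t)m * n) >> 32; /* u < 2n, no overflow */
  if (u >= n) {
    u -= n;
  }
  return (uint32_t)u;
}

/* Converts into Montgomery form: a * 2^32 mod n. */
static uint32_t to_mont_sketch(uint32_t a, uint32_t n) {
  return (uint32_t)(((uint64_t)a << 32) % n);
}

/* Converts out of Montgomery form by multiplying by 1. */
static uint32_t from_mont_sketch(uint32_t a, uint32_t n, uint32_t n_neg_inv) {
  return mont_mul_sketch(a, 1, n, n_neg_inv);
}
```

Values stay in Montgomery form across a chain of multiplications, so the conversions are paid only at the start and end of a computation.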

Working With Fiat Crypto Field Arithmetic

The fiat-crypto readme https://github.com/mit-plv/fiat-crypto#arithmetic-core contains an overview of the implementation templates followed by a tour of the specialization machinery. It may be helpful to first read about the less messy parts of the system in chapter 3 of http://adam.chlipala.net/theses/andreser.pdf. Work is ongoing to replace the entire specialization mechanism with something much more principled: https://github.com/mit-plv/fiat-crypto/projects/4.