Commit Graph

82 Commits

Author SHA1 Message Date
4d03fe12e5 WIP
Change-Id: Iefa0bfc1317f19c8bf2a42a652d35a85319459a3
2019-04-19 11:28:56 +01:00
70c0f56b3b WIP
Change-Id: I30e8140fadaeaad071cd5ce2fb2017b0fb454210
2019-04-19 01:08:08 +01:00
7f9c1fafc4 WIP
Change-Id: I94e84093316a908782178505a0094b27e21e7f67
2019-04-18 21:38:44 +01:00
87571d41b1 WIP
Change-Id: I153c653fd9d33ae89ef71f035f23b253f412e7eb
2019-04-18 17:37:27 +01:00
9b1cfd9893 PerlASM
Change-Id: I8be2c37eafb68d84b09852eb50aff2c356d77e39
2019-04-18 17:35:07 +01:00
eb43eca5a8 Add support for SIKE/p503 post-quantum KEM
Based on Microsoft's implementation available on github:
Source: https://github.com/Microsoft/PQCrypto-SIDH
Commit: 77044b76181eb61c744ac8eb7ddc7a8fe72f6919

Following changes has been applied

* In intel assembly, use MOV instead of MOVQ:
  Intel instruction reference in the Intel Software Developer's Manual
  volume 2A, the MOVQ has 4 forms. None of them mentions moving
  literal to GPR, hence "movq $rax, 0x0" is wrong. Instead, on 64bit
  system, MOV can be used.

* Some variables were wrongly zero-initialized (as per C99 spec)

* Move constant values to .RODATA segment, as keeping them in .TEXT
  segment is not compatible with XOM.

* Fixes issue in arm64 code related to the fact that compiler doesn't
  reserve enough space for the linker to relocate address of a global
  variable when used by 'ldr' instructions. Solution is to use 'adrp'
  followed by 'add' instruction. Relocations for 'adrp' and 'add'
  instructions is generated by prefixing the label with :pg_hi21:
  and :lo12: respectively.

* Enable MULX and ADX. Code from MS doesn't support PIC. MULX can't
  reference global variable directly. Instead RIP-relative addressing
  can be used. This improves performance around 10%-13% on SkyLake

* Check if CPU supports BMI2 and ADOX instruction at runtime. On AMD64
  optimized implementation of montgomery multiplication and reduction
  have 2 implementations - faster one takes advantage of BMI2
  instruction set introduced in Haswell and ADOX introduced in
  Broadwell. Thanks to OPENSSL_ia32cap_P it can be decided at runtime
  which implementation to choose. As CPU configuration is static by
  nature, branch predictor will be correct most of the time and hence
  this check very often has no cost.

* Reuse some utilities from boringssl instead of reimplementing them.
  This includes things like:
  * definition of a limb size (use crypto_word_t instead of digit_t)
  * use functions for checking in constant time if value is 0 and/or
    less then
  * #define's used for conditional compilation

* Use SSE2 for conditional swap on vector registers. Improves
  performance a little bit.

* Fix f2elm_t definition. Code imported from MSR defines f2elm_t type as
  a array of arrays. This decays to a pointer to an array (when passing
  as an argument). In C, one can't assign const pointer to an array with
  non-const pointer to an array. Seems it violates 6.7.3/8 from C99
  (same for C11). This problem occures in GCC 6, only when -pedantic
  flag is specified and it occures always in GCC 4.9 (debian jessie).

* Fix definition of eval_3_isog. Second argument in eval_3_isog mustn't be
  const. Similar reason as above.

* Use HMAC-SHA256 instead of cSHAKE-256 to avoid upstreaming cSHAKE
  and SHA3 code.

* Add speed and unit tests for SIKE.

Change-Id: I22f0bb1f9edff314a35cd74b48e8c4962568e330
2019-04-12 11:26:23 -07:00
David Benjamin
be7006adac Update third_party/googletest.
The new version of googletest deprecates INSTANTIATE_TEST_CASE_P in
favor of INSTANTIATE_TEST_SUITE_P, so apply the change.

This requires blacklisting C4628 on MSVC 2015 which says about digraphs
given foo<::std::tuple<...>>. Disable that warning. Digraphs are not
useful and C++11 apparently explicitly disambiguates that.

It also requires applying
https://github.com/google/googletest/pull/2226, to deal with a warning
in older MSVC.

Update-Note: Consumers using BoringSSL with their own copy of googletest
must ensure googletest was updated to a version from 2019-01-03 or
later for INSTANTIATE_TEST_SUITE_P to work. (I believe all relevant
consumers are fine here. If anyone can't update googletest and is
building BoringSSL tests, building with
-DINSTANTIATE_TEST_SUITE_P=INSTANTIATE_TEST_CASE_P would work as
workaround.)

Bug: chromium:936651
Change-Id: I23ada8de34a53131cab88a36a88d3185ab085c64
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35504
Reviewed-by: Adam Langley <agl@google.com>
2019-04-10 22:09:43 +00:00
David Benjamin
f109f20873 Clear out a bunch of -Wextra-semi warnings.
Unfortunately, it's not enough to be able to turn it on thanks to the
PURE_VIRTUAL macro. But it gets us most of the way there.

Change-Id: Ie6ad5119fcfd420115fa49d7312f3586890244f4
Reviewed-on: https://boringssl-review.googlesource.com/c/34949
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-02-21 19:12:39 +00:00
David Benjamin
9847cdd785 Fix signed left-shifts in curve25519.c.
Due to a language flaw in C, left-shifts on signed integers are
undefined for negative numbers. This makes them all but useless. Cast to
the unsigned type, left-shift, and cast back (casts are defined to wrap)
to silence UBSan.

Change-Id: I8fbe739aee1c99cf553462b675863e6d68c2b302
Reviewed-on: https://boringssl-review.googlesource.com/c/34446
Reviewed-by: Adam Langley <agl@google.com>
2019-01-22 23:27:34 +00:00
Adam Langley
899835fad4 Rename Fiat include files to end in .h
Otherwise generate_build_files.py thinks that they're top-level source
files.

Fixes grpc/grpc#17780

Change-Id: I9f14a816a5045c1101841a2ef7ef9868abcd5d12
Reviewed-on: https://boringssl-review.googlesource.com/c/34364
Reviewed-by: Adam Langley <agl@google.com>
2019-01-21 17:29:45 +00:00
David Benjamin
32e59d2d32 Switch to new fiat pipeline.
This new version makes it much easier to tell which code is handwritten
and which is verified. For some reason, it also is *dramatically* faster
for 32-bit x86 GCC. Clang x86_64, however, does take a small hit.
Benchmarks below.

x86, GCC 7.3.0, OPENSSL_SMALL
(For some reason, GCC used to be really bad at compiling the 32-bit curve25519
code. The new one fixes this. I'm not sure what changed.)
Before:
Did 17135 Ed25519 key generation operations in 10026402us (1709.0 ops/sec)
Did 17170 Ed25519 signing operations in 10074192us (1704.4 ops/sec)
Did 9180 Ed25519 verify operations in 10034025us (914.9 ops/sec)
Did 17271 Curve25519 base-point multiplication operations in 10050837us (1718.4 ops/sec)
Did 10605 Curve25519 arbitrary point multiplication operations in 10047714us (1055.5 ops/sec)
Did 7800 ECDH P-256 operations in 10018331us (778.6 ops/sec)
Did 24308 ECDSA P-256 signing operations in 10019241us (2426.1 ops/sec)
Did 9191 ECDSA P-256 verify operations in 10081639us (911.7 ops/sec)
After:
Did 99873 Ed25519 key generation operations in 10021810us (9965.6 ops/sec) [+483.1%]
Did 99960 Ed25519 signing operations in 10052236us (9944.1 ops/sec) [+483.4%]
Did 53676 Ed25519 verify operations in 10009078us (5362.7 ops/sec) [+486.2%]
Did 102000 Curve25519 base-point multiplication operations in 10039764us (10159.6 ops/sec) [+491.2%]
Did 60802 Curve25519 arbitrary point multiplication operations in 10056897us (6045.8 ops/sec) [+472.8%]
Did 7900 ECDH P-256 operations in 10054509us (785.7 ops/sec) [+0.9%]
Did 24926 ECDSA P-256 signing operations in 10050919us (2480.0 ops/sec) [+2.2%]
Did 9494 ECDSA P-256 verify operations in 10064659us (943.3 ops/sec) [+3.5%]

x86, Clang 8.0.0 trunk 349417, OPENSSL_SMALL
Before:
Did 82750 Ed25519 key generation operations in 10051177us (8232.9 ops/sec)
Did 82400 Ed25519 signing operations in 10035806us (8210.6 ops/sec)
Did 41511 Ed25519 verify operations in 10048919us (4130.9 ops/sec)
Did 83300 Curve25519 base-point multiplication operations in 10044283us (8293.3 ops/sec)
Did 49700 Curve25519 arbitrary point multiplication operations in 10007005us (4966.5 ops/sec)
Did 14039 ECDH P-256 operations in 10093929us (1390.8 ops/sec)
Did 40950 ECDSA P-256 signing operations in 10006757us (4092.2 ops/sec)
Did 16068 ECDSA P-256 verify operations in 10095996us (1591.5 ops/sec)
After:
Did 80476 Ed25519 key generation operations in 10048648us (8008.6 ops/sec) [-2.7%]
Did 79050 Ed25519 signing operations in 10049180us (7866.3 ops/sec) [-4.2%]
Did 40501 Ed25519 verify operations in 10048347us (4030.6 ops/sec) [-2.4%]
Did 81300 Curve25519 base-point multiplication operations in 10017480us (8115.8 ops/sec) [-2.1%]
Did 48278 Curve25519 arbitrary point multiplication operations in 10092500us (4783.6 ops/sec) [-3.7%]
Did 15402 ECDH P-256 operations in 10096705us (1525.4 ops/sec) [+9.7%]
Did 44200 ECDSA P-256 signing operations in 10037715us (4403.4 ops/sec) [+7.6%]
Did 17000 ECDSA P-256 verify operations in 10008813us (1698.5 ops/sec) [+6.7%]

x86_64, GCC 7.3.0
(Note these P-256 numbers are not affected by this change. Included to get a
sense of noise.)
Before:
Did 557000 Ed25519 key generation operations in 10011721us (55634.8 ops/sec)
Did 550000 Ed25519 signing operations in 10016449us (54909.7 ops/sec)
Did 190000 Ed25519 verify operations in 10014565us (18972.4 ops/sec)
Did 587000 Curve25519 base-point multiplication operations in 10015402us (58609.7 ops/sec)
Did 230000 Curve25519 arbitrary point multiplication operations in 10023827us (22945.3 ops/sec)
Did 179000 ECDH P-256 operations in 10016294us (17870.9 ops/sec)
Did 557000 ECDSA P-256 signing operations in 10014158us (55621.3 ops/sec)
Did 198000 ECDSA P-256 verify operations in 10036694us (19727.6 ops/sec)
After:
Did 569000 Ed25519 key generation operations in 10004965us (56871.8 ops/sec) [+2.2%]
Did 563000 Ed25519 signing operations in 10000064us (56299.6 ops/sec) [+2.5%]
Did 196000 Ed25519 verify operations in 10025650us (19549.9 ops/sec) [+3.0%]
Did 596000 Curve25519 base-point multiplication operations in 10008666us (59548.4 ops/sec) [+1.6%]
Did 229000 Curve25519 arbitrary point multiplication operations in 10028921us (22834.0 ops/sec) [-0.5%]
Did 182910 ECDH P-256 operations in 10014905us (18263.8 ops/sec) [+2.2%]
Did 562000 ECDSA P-256 signing operations in 10011944us (56133.0 ops/sec) [+0.9%]
Did 202000 ECDSA P-256 verify operations in 10046901us (20105.7 ops/sec) [+1.9%]

x86_64, GCC 7.3.0, OPENSSL_SMALL
Before:
Did 350000 Ed25519 key generation operations in 10002540us (34991.1 ops/sec)
Did 344000 Ed25519 signing operations in 10010420us (34364.2 ops/sec)
Did 197000 Ed25519 verify operations in 10030593us (19639.9 ops/sec)
Did 362000 Curve25519 base-point multiplication operations in 10004615us (36183.3 ops/sec)
Did 235000 Curve25519 arbitrary point multiplication operations in 10025951us (23439.2 ops/sec)
Did 32032 ECDH P-256 operations in 10056486us (3185.2 ops/sec)
Did 96354 ECDSA P-256 signing operations in 10007297us (9628.4 ops/sec)
Did 37774 ECDSA P-256 verify operations in 10044892us (3760.5 ops/sec)
After:
Did 343000 Ed25519 key generation operations in 10025108us (34214.1 ops/sec) [-2.2%]
Did 340000 Ed25519 signing operations in 10014870us (33949.5 ops/sec) [-1.2%]
Did 192000 Ed25519 verify operations in 10025082us (19152.0 ops/sec) [-2.5%]
Did 355000 Curve25519 base-point multiplication operations in 10013220us (35453.1 ops/sec) [-2.0%]
Did 231000 Curve25519 arbitrary point multiplication operations in 10010775us (23075.1 ops/sec) [-1.6%]
Did 31540 ECDH P-256 operations in 10009664us (3151.0 ops/sec) [-1.1%]
Did 99012 ECDSA P-256 signing operations in 10090296us (9812.6 ops/sec) [+1.9%]
Did 37695 ECDSA P-256 verify operations in 10092859us (3734.8 ops/sec) [-0.7%]

x86_64, Clang 8.0.0 trunk 349417
(Note these P-256 numbers are not affected by this change. Included to get a
sense of noise.)
Before:
Did 600000 Ed25519 key generation operations in 10000278us (59998.3 ops/sec)
Did 595000 Ed25519 signing operations in 10010375us (59438.3 ops/sec)
Did 184000 Ed25519 verify operations in 10013984us (18374.3 ops/sec)
Did 636000 Curve25519 base-point multiplication operations in 10005250us (63566.6 ops/sec)
Did 229000 Curve25519 arbitrary point multiplication operations in 10006059us (22886.1 ops/sec)
Did 179250 ECDH P-256 operations in 10026354us (17877.9 ops/sec)
Did 547000 ECDSA P-256 signing operations in 10017585us (54604.0 ops/sec)
Did 197000 ECDSA P-256 verify operations in 10013020us (19674.4 ops/sec)
After:
Did 560000 Ed25519 key generation operations in 10009295us (55948.0 ops/sec) [-6.8%]
Did 548000 Ed25519 signing operations in 10007912us (54756.7 ops/sec) [-7.9%]
Did 170000 Ed25519 verify operations in 10056948us (16903.7 ops/sec) [-8.0%]
Did 592000 Curve25519 base-point multiplication operations in 10016818us (59100.6 ops/sec) [-7.0%]
Did 214000 Curve25519 arbitrary point multiplication operations in 10043918us (21306.4 ops/sec) [-6.9%]
Did 180000 ECDH P-256 operations in 10026019us (17953.3 ops/sec) [+0.4%]
Did 550000 ECDSA P-256 signing operations in 10004943us (54972.8 ops/sec) [+0.7%]
Did 198000 ECDSA P-256 verify operations in 10021714us (19757.1 ops/sec) [+0.4%]

x86_64, Clang 8.0.0 trunk 349417, OPENSSL_SMALL
Before:
Did 326000 Ed25519 key generation operations in 10003266us (32589.4 ops/sec)
Did 322000 Ed25519 signing operations in 10026783us (32114.0 ops/sec)
Did 181000 Ed25519 verify operations in 10015635us (18071.7 ops/sec)
Did 335000 Curve25519 base-point multiplication operations in 10000359us (33498.8 ops/sec)
Did 224000 Curve25519 arbitrary point multiplication operations in 10027245us (22339.1 ops/sec)
Did 68552 ECDH P-256 operations in 10018900us (6842.3 ops/sec)
Did 184000 ECDSA P-256 signing operations in 10014516us (18373.3 ops/sec)
Did 76020 ECDSA P-256 verify operations in 10016891us (7589.2 ops/sec)
After:
Did 310000 Ed25519 key generation operations in 10022086us (30931.7 ops/sec) [-5.1%]
Did 308000 Ed25519 signing operations in 10007543us (30776.8 ops/sec) [-4.2%]
Did 173000 Ed25519 verify operations in 10005829us (17289.9 ops/sec) [-4.3%]
Did 321000 Curve25519 base-point multiplication operations in 10027058us (32013.4 ops/sec) [-4.4%]
Did 212000 Curve25519 arbitrary point multiplication operations in 10015203us (21167.8 ops/sec) [-5.2%]
Did 64059 ECDH P-256 operations in 10042781us (6378.6 ops/sec) [-6.8%]
Did 170000 ECDSA P-256 signing operations in 10030896us (16947.6 ops/sec) [-7.8%]
Did 72176 ECDSA P-256 verify operations in 10075369us (7163.6 ops/sec) [-5.6%]

Bug: 254
Change-Id: Ib04c773f01b542bcb8611cceb582466bfa6f6d52
Reviewed-on: https://boringssl-review.googlesource.com/c/34306
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-18 00:24:03 +00:00
David Benjamin
5590c715e2 Mark some unmarked array sizes in curve25519.c.
Change-Id: I92589f5d5e89c836cff3c26739b43eb65de67836
Reviewed-on: https://boringssl-review.googlesource.com/c/34304
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-16 20:49:29 +00:00
David Benjamin
43e636a2e4 Remove bundled copy of android-cmake.
I don't believe we use this anymore. People using it should upgrade to a newer
NDK (or, worst case, download android-cmake themselves).

Change-Id: Ia99d7b19d6f2ec3f4ffe90795813b00480dc2d60
Reviewed-on: https://boringssl-review.googlesource.com/c/34004
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-03 16:28:10 +00:00
David Benjamin
5ecfb10d54 Modernize OPENSSL_COMPILE_ASSERT, part 2.
The change seems to have stuck, so bring us closer to C/++11 static asserts.

(If we later find we need to support worse toolchains, we can always use
__LINE__ or __COUNTER__ to avoid duplicate typedef names and just punt on
embedding the message into the type name.)

Change-Id: I0e5bb1106405066f07740728e19ebe13cae3e0ee
Reviewed-on: https://boringssl-review.googlesource.com/c/33145
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-11-14 16:06:37 +00:00
David Benjamin
8618f2bfe0 Optimize EC_GFp_mont_method's cmp_x_coordinate.
For simplicity, punt order > field or width mismatches. Analogous
optimizations are possible, but the generic path works fine and no
commonly-used curve looks hits those cases.

Before:
Did 5888 ECDSA P-384 verify operations in 3094535us (1902.7 ops/sec)
After [+6.7%]:
Did 6107 ECDSA P-384 verify operations in 3007515us (2030.6 ops/sec)

Also we can fill in p - order generically and avoid extra copies of some
constants.

Change-Id: I38e1b6d51b28ed4f8cb74697b00a4f0fbc5efc3c
Reviewed-on: https://boringssl-review.googlesource.com/c/33068
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-11-13 01:48:21 +00:00
David Benjamin
0b3f497bcd Optimize EC_GFp_nistp256_method's cmp_x_coordinate.
Before:
Did 35496 ECDSA P-256 verify operations in 10027999us (3539.7 ops/sec)
After [+6.9%]:
Did 38170 ECDSA P-256 verify operations in 10090160us (3782.9 ops/sec)

Change-Id: Ib272d19954f46d96efc2b6d5dd480b5b85a34523
Reviewed-on: https://boringssl-review.googlesource.com/c/33067
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-11-13 00:52:18 +00:00
David Benjamin
fa3aadcd40 Push BIGNUM out of EC_METHOD's affine coordinates hook.
This is in preparation for removing the BIGNUM from cmp_x_coordinate.

Change-Id: Id8394248e3019a4897c238289f039f436a13679d
Reviewed-on: https://boringssl-review.googlesource.com/c/33064
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-11-12 21:32:53 +00:00
Adam Langley
9edbc7ff9f Revert "Revert "Speed up ECDSA verify on x86-64.""
This reverts commit e907ed4c4b. CPUID
checks have been added so hopefully this time sticks.

Change-Id: I5e0e5b87427c1230132681f936b3c70bac8263b8
Reviewed-on: https://boringssl-review.googlesource.com/c/32924
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-11-07 23:57:22 +00:00
Adam Langley
e907ed4c4b Revert "Speed up ECDSA verify on x86-64."
This reverts commit 3d450d2844. It fails
SDE, looks like a missing CPUID check before using vector instructions.

Change-Id: I6b7dd71d9e5b1f509d2e018bd8be38c973476b4e
Reviewed-on: https://boringssl-review.googlesource.com/c/32864
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2018-11-06 00:29:15 +00:00
David Benjamin
cfd50c63a1 Route the tuned add/dbl implementations out of EC_METHOD.
Some consumer stumbled upon EC_POINT_{add,dbl} being faster with a
"custom" P-224 curve than the built-in one and made "custom" clones to
work around this. Before the EC_FELEM refactor, EC_GFp_nistp224_method
used BN_mod_mul for all reductions in fallback point arithmetic (we
primarily support the multiplication functions and keep the low-level
point arithmetic for legacy reasons) which took quite a performance hit.

EC_FELEM fixed this, but standalone felem_{mul,sqr} calls out of
nistp224 perform a lot of reductions, rather than batching them up as
that implementation is intended. So it is still slightly faster to use a
"custom" curve.

Custom curves are the last thing we want to encourage, so just route the
tuned implementations out of EC_METHOD to close this gap. Now the
built-in implementation is always solidly faster than (or identical to)
the custom clone.  This also reduces the number of places where we mix
up tuned vs. generic implementation, which gets us closer to making
EC_POINT's representation EC_METHOD-specific.

Change-Id: I843e1101a6208eaabb56d29d342e886e523c78b4
Reviewed-on: https://boringssl-review.googlesource.com/c/32848
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-11-06 00:17:19 +00:00
Nir Drucker
3d450d2844 Speed up ECDSA verify on x86-64.
This commit improves the performance of ECDSA signature verification
(over NIST P-256 curve) for x86 platforms. The speedup is by a factor of 1.15x.
It does so by:
  1) Leveraging the fact that the verification does not need
     to run in constant time. To this end, we implemented:
    a) the function ecp_nistz256_points_mul_public in a similar way to
       the current ecp_nistz256_points_mul function by removing its constant
       time features.
    b) the Binary Extended Euclidean Algorithm (BEEU) in x86 assembly to
       replace the current modular inverse function used for the inversion.
  2) The last step in the ECDSA_verify function compares the (x) affine
     coordinate with the signature (r) value. Converting x from the Jacobian's
     representation to the affine coordinate requires to perform one inversions
     (x_affine = x * z^(-2)). We save this inversion and speed up the computations
     by instead bringing r to x (r_jacobian = r*z^2) which is faster.

The measured results are:
Before (on a Kaby Lake desktop with gcc-5):
Did 26000 ECDSA P-224 signing operations in 1002372us (25938.5 ops/sec)
Did 11000 ECDSA P-224 verify operations in 1043821us (10538.2 ops/sec)
Did 55000 ECDSA P-256 signing operations in 1017560us (54050.9 ops/sec)
Did 17000 ECDSA P-256 verify operations in 1051280us (16170.8 ops/sec)

After (on a Kaby Lake desktop with gcc-5):
Did 27000 ECDSA P-224 signing operations in 1011287us (26698.7 ops/sec)
Did 11640 ECDSA P-224 verify operations in 1076698us (10810.8 ops/sec)
Did 55000 ECDSA P-256 signing operations in 1016880us (54087.0 ops/sec)
Did 20000 ECDSA P-256 verify operations in 1038736us (19254.2 ops/sec)

Before (on a Skylake server platform with gcc-5):
Did 25000 ECDSA P-224 signing operations in 1021651us (24470.2 ops/sec)
Did 10373 ECDSA P-224 verify operations in 1046563us (9911.5 ops/sec)
Did 50000 ECDSA P-256 signing operations in 1002774us (49861.7 ops/sec)
Did 15000 ECDSA P-256 verify operations in 1006471us (14903.6 ops/sec)

After (on a Skylake server platform with gcc-5):
Did 25000 ECDSA P-224 signing operations in 1020958us (24486.8 ops/sec)
Did 10373 ECDSA P-224 verify operations in 1046359us (9913.4 ops/sec)
Did 50000 ECDSA P-256 signing operations in 1003996us (49801.0 ops/sec)
Did 18000 ECDSA P-256 verify operations in 1021604us (17619.4 ops/sec)

Developers and authors:
***************************************************************************
Nir Drucker (1,2), Shay Gueron (1,2)
(1) Amazon Web Services Inc.
(2) University of Haifa, Israel
***************************************************************************

Change-Id: Idd42a7bc40626bce974ea000b61fdb5bad33851c
Reviewed-on: https://boringssl-review.googlesource.com/c/31304
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2018-11-05 23:48:07 +00:00
Joshua Liebow-Feeser
8c7c6356e6 Support symbol prefixes
- In base.h, if BORINGSSL_PREFIX is defined, include
  boringssl_prefix_symbols.h
- In all .S files, if BORINGSSL_PREFIX is defined, include
  boringssl_prefix_symbols_asm.h
- In base.h, BSSL_NAMESPACE_BEGIN and BSSL_NAMESPACE_END are
  defined with appropriate values depending on whether
  BORINGSSL_PREFIX is defined; these macros are used in place
  of 'namespace bssl {' and '}'
- Add util/make_prefix_headers.go, which takes a list of symbols
  and auto-generates the header files mentioned above
- In CMakeLists.txt, if BORINGSSL_PREFIX and BORINGSSL_PREFIX_SYMBOLS
  are defined, run util/make_prefix_headers.go to generate header
  files
- In various CMakeLists.txt files, add "global_target" that all
  targets depend on to give us a place to hook logic that must run
  before all other targets (in particular, the header file generation
  logic)
- Document this in BUILDING.md, including the fact that it is
  the caller's responsibility to provide the symbol list and keep it
  up to date
- Note that this scheme has not been tested on Windows, and likely
  does not work on it; Windows support will need to be added in a
  future commit

Change-Id: If66a7157f46b5b66230ef91e15826b910cf979a2
Reviewed-on: https://boringssl-review.googlesource.com/31364
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-09-06 20:07:52 +00:00
Joshua Liebow-Feeser
67e64342c1 Document that ED25519_sign only fails on allocation failure
Change-Id: I45866c3a4aa98ebac51d4e554a22eb5add45002f
Reviewed-on: https://boringssl-review.googlesource.com/31404
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-08-29 18:35:12 +00:00
David Benjamin
bdc409801f Add new curve/hash ECDSA combinations from Wycheproof.
Change-Id: I7bb36c4e4108a2b7d9481ab2cafc245ea31927c0
Reviewed-on: https://boringssl-review.googlesource.com/30847
Reviewed-by: Adam Langley <agl@google.com>
2018-08-10 18:26:06 +00:00
David Benjamin
af37f84840 Add RSA-PSS tests from Wycheproof.
Along the way, split up the EVPTest Wycheproof tests into separate tests (they
shard better when running in parallel).

Change-Id: I5ee919f7ec7c35a7f2e0cc2af4142991a808a9db
Reviewed-on: https://boringssl-review.googlesource.com/30846
Reviewed-by: Adam Langley <agl@google.com>
2018-08-10 18:26:00 +00:00
David Benjamin
f84c0dad7a Use newly-sharded ECDH tests.
Also remove some transition step for a recent format change. Together, this
removes the curve hacks in the converter, which can now be purely syntactic.
The RSA ones are still a bit all over the place in terms of sharded vs
combined, so leaving that alone for now.

Change-Id: I721d6b0de388a53a39543725e366dc5b52e83561
Reviewed-on: https://boringssl-review.googlesource.com/30845
Reviewed-by: Adam Langley <agl@google.com>
2018-08-10 18:25:51 +00:00
David Benjamin
a711b53e0b Update Wycheproof test vectors.
This only updates the repository. We'll catch up with the new tests in a
subsequent commit.

Change-Id: I074a041479159ce1141af3241e7158599b648365
Reviewed-on: https://boringssl-review.googlesource.com/30844
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2018-08-10 17:56:29 +00:00
David Benjamin
42ea84b317 Update Wycheproof test vectors.
They've since added new files that split up ECDH and RSA. The former especially
could be useful. A later commit will switch to those. Along the way, fix the
aes_cmac_test.json entry in the convert_wycheproof.go which got lost at some
point.

Change-Id: I9c4a2e5fc5f3e0935482f583c5466c1b64fe325e
Reviewed-on: https://boringssl-review.googlesource.com/29686
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-07-13 20:46:20 +00:00
Adam Langley
576b637861 Move convert_wycheproof.go to util/
This file is not part of the Wycheproof project and consumers of
BoringSSL who wish to provide Wycheproof themselves (and not have
third_party/wycheproof_testvectors) need it in another location.

Change-Id: I730fe294f46a9aac77b858a91a03ee64fb8ea579
Reviewed-on: https://boringssl-review.googlesource.com/28704
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-22 17:16:36 +00:00
David Benjamin
62abcebb01 Add a driver for Wycheproof CMAC tests.
Change-Id: Iafe81d22647c99167ab27a5345cfa970755112ac
Reviewed-on: https://boringssl-review.googlesource.com/28465
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-05-15 21:19:42 +00:00
Martin Kreichgauer
044f637fef reformat third_party/wycheproof_testvectors/METADATA
Change-Id: Ib12f41dec023e20dfd1182513bf11571950d7c85
Reviewed-on: https://boringssl-review.googlesource.com/28245
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-08 19:00:35 +00:00
David Benjamin
bf33114b51 Rename third_party/wycheproof to satisfy a bureaucrat.
Make it clear this is not a pristine full copy of all of Wycheproof as a
library.

Change-Id: I1aa5253a1d7c696e69b2e8d7897924f15303d9ac
Reviewed-on: https://boringssl-review.googlesource.com/28188
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Martin Kreichgauer <martinkr@google.com>
Reviewed-by: Martin Kreichgauer <martinkr@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-07 18:33:50 +00:00
David Benjamin
179c4e257a Update Wycheproof, add keywrap tests, and fix a bug.
The bug, courtesy of Wycheproof, is that AES key wrap requires the input
be at least two blocks, not one. This also matches the OpenSSL behavior
of those two APIs.

Update-Note: AES_wrap_key with in_len = 8 and AES_unwrap_key with
in_len = 16 will no longer work.

Change-Id: I5fc63ebc16920c2f9fd488afe8c544e0647d7507
Reviewed-on: https://boringssl-review.googlesource.com/27925
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-05-04 17:08:44 +00:00
David Benjamin
8e75ae4880 Add a Wycheproof driver for AES-CBC.
Change-Id: I782ea51e1db8d05f552832a7c6910954fa2dda5f
Reviewed-on: https://boringssl-review.googlesource.com/27924
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-05-02 19:41:48 +00:00
David Benjamin
302bb3964a Small curve25519 cleanups.
Per Brian, x25519_ge_frombytes_vartime does not match the usual
BoringSSL return value convention, and we're slightly inconsistent about
whether to mask the last byte with 63 or 127. (It then gets ANDed with
64, so it doesn't matter which.) Use 127 to align with the curve25519
RFC. Finally, when we invert the transformation, use the same constants
inverted so that they're parallel.

Bug: 243, 244
Change-Id: I0e3aca0433ead210446c58d86b2f57526bde1eac
Reviewed-on: https://boringssl-review.googlesource.com/27984
Reviewed-by: Adam Langley <agl@google.com>
2018-05-02 19:24:00 +00:00
David Benjamin
3f944674b2 Add an ECDH Wycheproof driver.
Unfortunately, this driver suffers a lot from Wycheproof's Java
heritgate, but so it goes. Their test formats bake in a lot of Java API
mistakes.

Change-Id: I3299e85efb58e99e4fa34841709c3bea6518968d
Reviewed-on: https://boringssl-review.googlesource.com/27865
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-01 19:38:07 +00:00
David Benjamin
7760af4bce Print tcId in converted Wycheproof files.
This is to make it easier to correlate the two.

Change-Id: I62aa381499d67ae279bbe86eebeb9a5bc9ef5266
Reviewed-on: https://boringssl-review.googlesource.com/27864
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-01 19:09:16 +00:00
David Benjamin
5505328633 Add AEAD Wycheproof drivers.
Change-Id: I840863c445fd9dac3fd60ac4b1c572ea7d924c9c
Reviewed-on: https://boringssl-review.googlesource.com/27826
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-01 18:36:00 +00:00
David Benjamin
c596415ec6 Add a DSA Wycheproof driver.
DSA is deprecated and will ultimately be removed but, in the
meantime, it still ought to be tested.

Change-Id: I75af25430b8937a43b11dced1543a98f7a6fbbd3
Reviewed-on: https://boringssl-review.googlesource.com/27825
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-04-30 16:04:31 +00:00
David Benjamin
5707274214 Add Ed25519 Wycheproof driver.
This works with basically no modifications.

Change-Id: I92f4d90f3c0ec8170d532cf7872754fadb36644d
Reviewed-on: https://boringssl-review.googlesource.com/27824
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-04-30 15:29:01 +00:00
David Benjamin
041dd68cec Clear mallocs in ec_wNAF_mul.
EC_POINT is split into the existing public EC_POINT (where the caller is
sanity-checked about group mismatches) and the low-level EC_RAW_POINT
(which, like EC_FELEM and EC_SCALAR, assume that is your problem and is
a plain old struct). Having both EC_POINT and EC_RAW_POINT is a little
silly, but we're going to want different type signatures for functions
which return void anyway (my plan is to lift a non-BIGNUM
get_affine_coordinates up through the ECDSA and ECDH code), so I think
it's fine.

This wasn't strictly necessary, but wnaf.c is a lot tidier now. Perf is
a wash; once we get up to this layer, it's only 8 entries in the table
so not particularly interesting.

Bug: 239
Change-Id: I8ace749393d359f42649a5bb0734597bb7c07a2e
Reviewed-on: https://boringssl-review.googlesource.com/27706
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-04-27 19:44:58 +00:00
David Benjamin
e14e4a7ee3 Remove ec_compute_wNAF's failure cases.
Replace them with asserts and better justify why each of the internal
cases are not reachable. Also change the loop to count up to bits+1 so
it is obvious there is no memory error. (The previous loop shape made
more sense when ec_compute_wNAF would return a variable length
schedule.)

Change-Id: I9c7df6abac4290b7a3e545e3d4aa1462108e239e
Reviewed-on: https://boringssl-review.googlesource.com/27705
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-04-27 19:24:58 +00:00
David Benjamin
40d76f4f7d Add ECDSA and RSA verify Wycheproof drivers.
Along the way, add some utility functions for getting common things
(curves, hashes, etc.) in the names Wycheproof uses.

Change-Id: I09c11ea2970cf2c8a11a8c2a861d85396efda125
Reviewed-on: https://boringssl-review.googlesource.com/27786
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-04-27 18:58:38 +00:00
David Benjamin
5509bc06d8 Add a test driver for Wycheproof's x25519_test.json.
FileTest and Wycheproof express more-or-less the same things, so I've
just written a script to mechanically convert them. Saves writing a JSON
parser.

I've also left a TODO with other files that are worth converting. Per
Thai, the webcrypto variants of the files are just a different format
and will later be consolidated, so I've ignored those. The
curve/hash-specific ECDSA files and the combined one are intended to be
the same, so I've ignored the combined one. (Just by test counts, there
are some discrepancies, but Thai says he'll fix that and we can update
when that happens.)

Change-Id: I5fcbd5cb0e1bea32964b09fb469cb43410f53c2d
Reviewed-on: https://boringssl-review.googlesource.com/27785
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-04-27 18:55:38 +00:00
David Benjamin
2d10c3688c Check in a copy of Project Wycheproof test vectors.
This is just a pristine copy of the JSON files for now. It's not hooked
up to anything yet.

Change-Id: I608b4b0368578f159cad23950d70578ff4c23da3
Reviewed-on: https://boringssl-review.googlesource.com/27784
Reviewed-by: Adam Langley <agl@google.com>
2018-04-26 23:07:29 +00:00
David Benjamin
32e0d10069 Add EC_FELEM for EC_POINTs and related temporaries.
This introduces EC_FELEM, which is analogous to EC_SCALAR. It is used
for EC_POINT's representation in the generic EC_METHOD, as well as
random operations on tuned EC_METHODs that still are implemented
genericly.

Unlike EC_SCALAR, EC_FELEM's exact representation is awkwardly specific
to the EC_METHOD, analogous to how the old values were BIGNUMs but may
or may not have been in Montgomery form. This is kind of a nuisance, but
no more than before. (If p224-64.c were easily convertable to Montgomery
form, we could say |EC_FELEM| is always in Montgomery form. If we
exposed the internal add and double implementations in each of the
curves, we could give |EC_POINT| an |EC_METHOD|-specific representation
and |EC_FELEM| is purely a |EC_GFp_mont_method| type. I'll leave this
for later.)

The generic add and doubling formulas are aligned with the formulas
proved in fiat-crypto. Those only applied to a = -3, so I've proved a
generic one in https://github.com/mit-plv/fiat-crypto/pull/356, in case
someone uses a custom curve.  The new formulas are verified,
constant-time, and swap a multiply for a square. As expressed in
fiat-crypto they do use more temporaries, but this seems to be fine with
stack-allocated EC_FELEMs. (We can try to help the compiler later,
but benchamrks below suggest this isn't necessary.)

Unlike BIGNUM, EC_FELEM can be stack-allocated. It also captures the
bounds in the type system and, in particular, that the width is correct,
which will make it easier to select a point in constant-time in the
future. (Indeed the old code did not always have the correct width. Its
point formula involved halving and implemented this in variable time and
variable width.)

Before:
Did 77274 ECDH P-256 operations in 10046087us (7692.0 ops/sec)
Did 5959 ECDH P-384 operations in 10031701us (594.0 ops/sec)
Did 10815 ECDSA P-384 signing operations in 10087892us (1072.1 ops/sec)
Did 8976 ECDSA P-384 verify operations in 10071038us (891.3 ops/sec)
Did 2600 ECDH P-521 operations in 10091688us (257.6 ops/sec)
Did 4590 ECDSA P-521 signing operations in 10055195us (456.5 ops/sec)
Did 3811 ECDSA P-521 verify operations in 10003574us (381.0 ops/sec)

After:
Did 77736 ECDH P-256 operations in 10029858us (7750.5 ops/sec) [+0.8%]
Did 7519 ECDH P-384 operations in 10068076us (746.8 ops/sec) [+25.7%]
Did 13335 ECDSA P-384 signing operations in 10029962us (1329.5 ops/sec) [+24.0%]
Did 11021 ECDSA P-384 verify operations in 10088600us (1092.4 ops/sec) [+22.6%]
Did 2912 ECDH P-521 operations in 10001325us (291.2 ops/sec) [+13.0%]
Did 5150 ECDSA P-521 signing operations in 10027462us (513.6 ops/sec) [+12.5%]
Did 4264 ECDSA P-521 verify operations in 10069694us (423.4 ops/sec) [+11.1%]

This more than pays for removing points_make_affine previously and even
speeds up ECDH P-256 slightly. (The point-on-curve check uses the
generic code.)

Next is to push the stack-allocating up to ec_wNAF_mul, followed by a
constant-time single-point multiplication.

Bug: 239
Change-Id: I44a2dff7c52522e491d0f8cffff64c4ab5cd353c
Reviewed-on: https://boringssl-review.googlesource.com/27668
Reviewed-by: Adam Langley <agl@google.com>
2018-04-25 16:39:58 +00:00
David Benjamin
364a51ec3a Abstract scalar inversion in EC_METHOD.
This introduces a hook for the OpenSSL assembly.

Change-Id: I35e0588f0ed5bed375b12f738d16c9f46ceedeea
Reviewed-on: https://boringssl-review.googlesource.com/27592
Reviewed-by: Adam Langley <alangley@gmail.com>
2018-04-24 16:13:24 +00:00
David Benjamin
5fca613918 Fix typo in point_add.
Rather than writing the answer into the output, it wrote it into some
awkwardly-named temporaries. Thanks to Daniel Hirche for reporting this
issue!

Bug: chromium:825273
Change-Id: I5def4be045cd1925453c9873218e5449bf25e3f5
Reviewed-on: https://boringssl-review.googlesource.com/26785
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-03-23 21:12:29 +00:00
Daniel Hirche
8d4f7e5421 Remove redundant assertion in fe_mul_121666_impl.
Change-Id: Ie2368dc9f6be791b7c3ad1c610dcd603634be6e4
Reviewed-on: https://boringssl-review.googlesource.com/26244
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-27 23:50:02 +00:00
Martin Kreichgauer
8041d8c40e third_party: re-format METATADA files
Change-Id: Ic2e9f54f5ced053c1463d5c09a74db5b2a3ea098
Reviewed-on: https://boringssl-review.googlesource.com/26224
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-27 19:57:12 +00:00