Commit Graph

5067 Commits

Author SHA1 Message Date
Adam Langley
45210dd4e2 Tidy up |ec_GFp_simple_point2oct| and friend.
(Just happened to see these as I went by.)

Change-Id: I348b163e6986bfca8b58e56885c35a813efe28f6
Reviewed-on: https://boringssl-review.googlesource.com/25725
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-05 04:40:59 +00:00
Adam Langley
2044181e01 Set output point to the generator when not on the curve.
Processing off-curve points is sufficiently dangerous to worry about
code that doesn't check the return value of
|EC_POINT_set_affine_coordinates| and |EC_POINT_oct2point|. While we
have integrated on-curve checks into these functions, code that ignores
the return value will still be able to work with an invalid point
because it's already been installed in the output by the time the check
is done.

Instead, in the event of an off-curve point, set the output point to the
generator, which is certainly on the curve and hopefully safe.

Change-Id: Ibc73dceb2d8d21920e07c4f6def2c8249cb78ca0
Reviewed-on: https://boringssl-review.googlesource.com/25724
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-05 02:03:29 +00:00
Adam Langley
a312391050 cavp_tlskdf_test.cc: include errno.h since errno is referenced.
Change-Id: Id2d9923b3f0984be995a8057f60e714946f0f0b2
Reviewed-on: https://boringssl-review.googlesource.com/25664
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-02 22:50:27 +00:00
Adam Langley
091b455f09 Support running CAVP tests on an Android device.
This change allows run_cavp.go to execute tests on a connected Android
device and collect the results.

Change-Id: Ica83239c58d83907b82c591c4873a3de4ba0b3c0
Reviewed-on: https://boringssl-review.googlesource.com/25604
Reviewed-by: David Benjamin <davidben@google.com>
2018-02-02 22:34:17 +00:00
Adam Langley
472ba2c2dd Require that Ed25519 |s| values be < order.
https://tools.ietf.org/html/rfc8032#section-5.1.7 adds this requirement
to prevent signature malleability.

Change-Id: Iac9a3649d97fc69e6efb4aea1ab1e002768fadc9
Reviewed-on: https://boringssl-review.googlesource.com/25564
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-02 20:45:08 +00:00
David Benjamin
f4b708cc1e Add a function which folds BN_MONT_CTX_{new,set} together.
These empty states aren't any use to either caller or implementor.

Change-Id: If0b748afeeb79e4a1386182e61c5b5ecf838de62
Reviewed-on: https://boringssl-review.googlesource.com/25254
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 20:23:25 +00:00
David Benjamin
feffb87168 Make BN_bn2bin_padded work with non-minimal BIGNUMs.
Checking the excess words for zero doesn't need to be in constant time,
but it's free. BN_bn2bin_padded is a little silly as read_word_padded
only exists to work around bn->top being minimal. Once non-minimal
BIGNUMs are turned on and the RSA code works right, we can simplify
BN_bn2bin_padded.

Bug: 232
Change-Id: Ib81e30ca1e5a8ea90ab3278bf4ded219bac481ac
Reviewed-on: https://boringssl-review.googlesource.com/25253
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 20:16:50 +00:00
David Benjamin
385e4e9d98 Handle directive arguments with * in them.
Some of the CFI directives from upstream include expressions such as:

   .cfi_adjust_cfa_offset 32*5+8

(Also the latest version of peg moves the go generate line to
delocate.peg.go.)

Change-Id: I21bdf9ae44f81e4eca7b3565c4581a670f621a80
Reviewed-on: https://boringssl-review.googlesource.com/25624
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 19:59:06 +00:00
David Benjamin
6c41465548 Remove redundant bn->top computation.
One less to worry about.

Bug: 232
Change-Id: Ib7d38e18fee02590088d76363e17f774cfefa59b
Reviewed-on: https://boringssl-review.googlesource.com/25252
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:54:09 +00:00
David Benjamin
7979dbede2 Use bn_resize_words in BN_from_montgomery_word.
Saves a bit of work, and we get a width sanity-check.

Bug: 232
Change-Id: I1c6bc376c9d8aaf60a078fdc39f35b6f44a688c6
Reviewed-on: https://boringssl-review.googlesource.com/25251
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:52:49 +00:00
David Benjamin
76ce04bec8 Fix up BN_MONT_CTX_set with non-minimal values.
Give a non-minimal modulus, there are two possible values of R we might
pick: 2^(BN_BITS2 * width) or 2^(BN_BITS2 * bn_minimal_width).
Potentially secret moduli would make the former attractive and things
might even work, but our only secret moduli (RSA) have public bit
widths. It's more cases to test and the usual BIGNUM invariant is that
widths do not affect numerical output.

Thus, settle on minimizing mont->N for now. With the top explicitly made
minimal, computing |lgBigR| is also a little simpler.

This CL also abstracts out the < R check in the RSA code, and implements
it in a width-agnostic way.

Bug: 232
Change-Id: I354643df30530db7866bb7820e34241d7614f3c2
Reviewed-on: https://boringssl-review.googlesource.com/25250
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:52:15 +00:00
David Benjamin
0758b6837e Reject negative numbers in BN_{mod_mul,to,from}_montgomery.
These functions already require their inputs to be reduced mod N (or, in
some cases, bounded by R or N*R), so negative numbers are nonsense.  The
code still attempted to account for them by working on the absolute
value and fiddling with the sign bit. (The output would be in range (-N,
N) instead of [0, N).)

This complicates relaxing bn_correct_top because bn_correct_top is also
used to prevent storing a negative zero. Instead, just reject negative
inputs.

Upgrade-Note: These functions are public API, so some callers may
    notice. Code search suggests there is only one caller outside
    BoringSSL, and it looks fine.

Bug: 232
Change-Id: Ieba3acbb36b0ff6b72b8ed2b14882ec9b88e4665
Reviewed-on: https://boringssl-review.googlesource.com/25249
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:44:54 +00:00
David Benjamin
9a5bfc0350 Tidy up BN_mod_mul_montgomery.
This matches bn_mod_mul_montgomery_small and removes a bit of
unnecessary stuttering.

Change-Id: Ife249c6e8754aef23c144dbfdea5daaf7ed9f48a
Reviewed-on: https://boringssl-review.googlesource.com/25248
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:44:01 +00:00
David Benjamin
2ccdf584aa Factor out BN_to_montgomery(1) optimization.
This cuts down on a duplicated place where we mess with bn->top. It also
also better abstracts away what determines the value of R.

(I ordered this wrong and rebasing will be annoying. Specifically, the
question is what happens if the modulus is non-minimal. In
https://boringssl-review.googlesource.com/c/boringssl/+/25250/, R will
be determined by the stored width of mont->N, so we want to use mont's
copy of the modulus. Though, one way or another, the important part is
that it's inside the Montgomery abstraction.)

Bug: 232
Change-Id: I74212e094c8a47f396b87982039e49048a130916
Reviewed-on: https://boringssl-review.googlesource.com/25247
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:42:39 +00:00
David Benjamin
dc8b1abb75 Do RSA sqrt(2) business in BIGNUM.
This is actually a bit more complicated (the mismatching widths cases
will never actually happen in RSA), but it's easier to think about and
removes more width-sensitive logic.

Bug: 232
Change-Id: I85fe6e706be1f7d14ffaf587958e930f47f85b3c
Reviewed-on: https://boringssl-review.googlesource.com/25246
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:32:32 +00:00
David Benjamin
43cf27e7d7 Add bn_copy_words.
This makes it easier going to and from non-minimal BIGNUMs and words
without worrying about the widths which are ultimately to become less
friendly.

Bug: 232
Change-Id: Ia57cb29164c560b600573c27b112ad9375a86aad
Reviewed-on: https://boringssl-review.googlesource.com/25245
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:24:39 +00:00
David Benjamin
ad5cfdf541 Add initial support for non-minimal BIGNUMs.
Thanks to Andres Erbsen for extremely helpful suggestions on how finally
plug this long-standing hole!

OpenSSL BIGNUMs are currently minimal-width, which means they cannot be
constant-time. We'll need to either excise BIGNUM from RSA and EC or
somehow fix BIGNUM. EC_SCALAR and later EC_FELEM work will excise it
from EC, but RSA's BIGNUMs are more transparent.  Teaching BIGNUM to
handle non-minimal word widths is probably simpler.

The main constraint is BIGNUM's large "calculator" API surface. One
could, in theory, do arbitrary math on RSA components, which means all
public functions must tolerate non-minimal inputs. This is also useful
for EC; https://boringssl-review.googlesource.com/c/boringssl/+/24445 is
silly.

As a first step, fix comparison-type functions that were assuming
minimal BIGNUMs. I've also added bn_resize_words, but it is testing-only
until the rest of the library is fixed.

bn->top is now a loose upper bound we carry around. It does not affect
numerical results, only performance and secrecy. This is a departure
from the original meaning, and compiler help in auditing everything is
nice, so the final change in this series will rename bn->top to
bn->width. Thus these new functions are named per "width", not "top".

Looking further ahead, how are output BIGNUM widths determined? There's
three notions of correctness here:

1. Do I compute the right answer for all widths?

2. Do I handle secret data in constant time?

3. Does my memory usage not balloon absurdly?

For (1), a BIGNUM function must give the same answer for all input
widths. BN_mod_add_quick may assume |a| < |m|, but |a| may still be
wider than |m| by way of leading zeres. The simplest approach is to
write code in a width-agnostic way and rely on functions to accept all
widths. Where functions need to look at bn->d, we'll a few helper
functions to smooth over funny widths.

For (2), (1) is little cumbersome. Consider constant-time modular
addition. A sane type system would guarantee input widths match. But C
is weak here, and bifurcating the internals is a lot of work. Thus, at
least for now, I do not propose we move RSA's internal computation out
of BIGNUM. (EC_SCALAR/EC_FELEM are valuable for EC because we get to
stack-allocate, curves were already specialized, and EC only has two
types with many operations on those types. None of these apply to RSA.
We've got numbers mod n, mod p, mod q, and their corresponding
exponents, each of which is used for basically one operation.)

Instead, constant-time BIGNUM functions will output non-minimal widths.
This is trivial for BN_bin2bn or modular arithmetic. But for BN_mul,
constant-time[*] would dictate r->top = a->top + b->top. A calculator
repeatedly multiplying by one would then run out of memory.  Those we'll
split into a private BN_mul_fixed for crypto, leaving BN_mul for
calculators. BN_mul is just BN_mul_fixed followed by bn_correct_top.

[*] BN_mul is not constant-time for other reasons, but that will be
fixed separately.

Bug: 232
Change-Id: Ide2258ae8c09a9a41bb71d6777908d1c27917069
Reviewed-on: https://boringssl-review.googlesource.com/25244
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:03:46 +00:00
David Benjamin
884086e0e2 Remove x86_64 x25519 assembly.
Now that we have 64-bit C code, courtesy of fiat-crypto, the tradeoff
for carrying the assembly changes:

Assembly:
Did 16000 Curve25519 base-point multiplication operations in 1059932us (15095.3 ops/sec)
Did 16000 Curve25519 arbitrary point multiplication operations in 1060023us (15094.0 ops/sec)

fiat64:
Did 39000 Curve25519 base-point multiplication operations in 1004712us (38817.1 ops/sec)
Did 14000 Curve25519 arbitrary point multiplication operations in 1006827us (13905.1 ops/sec)

The assembly is still about 9% faster than fiat64, but fiat64 gets to
use the Ed25519 tables for the base point multiplication, so overall it
is actually faster to disable the assembly:

>>> 1/(1/15094.0 + 1/15095.3)
7547.324986004976
>>> 1/(1/38817.1 + 1/13905.1)
10237.73016319501

(At the cost of touching a 30kB table.)

The assembly implementation is no longer pulling its weight. Remove it
and use the fiat code in all build configurations.

Change-Id: Id736873177d5568bb16ea06994b9fcb1af104e33
Reviewed-on: https://boringssl-review.googlesource.com/25524
Reviewed-by: Adam Langley <agl@google.com>
2018-02-01 21:44:58 +00:00
David Benjamin
fa65113400 Push an error if custom private keys fail.
The private key callback may not push one of its own (it's possible to
register a custom error library and whatnot, but this is tedious). If
the callback does not push any, we report SSL_ERROR_SYSCALL. This is not
completely wrong, as "syscall" really means "I don't know, something you
gave me, probably the BIO, failed so I assume you know what happened",
but most callers just check errno. And indeed cert_cb pushes its own
error, so this probably should as well.

Update-Note: Custom private key callbacks which push an error code on
    failure will report both that error followed by
    SSL_R_PRIVATE_KEY_OPERATION_FAILED. Callbacks which did not push any
    error will switch from SSL_ERROR_SYSCALL to SSL_ERROR_SSL with
    SSL_R_PRIVATE_KEY_OPERATION_FAILED.

Change-Id: I7e90cd327fe0cbcff395470381a3591364a82c74
Reviewed-on: https://boringssl-review.googlesource.com/25544
Reviewed-by: Adam Langley <agl@google.com>
2018-02-01 21:43:42 +00:00
David Benjamin
48669209b7 Fix fuzzer mode suppressions.
All the patterns need to account for a possible "-Split" version now.

Change-Id: Ie1b38ce10777d61d70a4d5a8bb2d44cdc98e4bfb
Reviewed-on: https://boringssl-review.googlesource.com/25504
Reviewed-by: Adam Langley <agl@google.com>
2018-01-31 22:57:51 +00:00
Adam Langley
ddb57cfb51 Add tests for split handshakes.
This change adds a couple of focused tests to ssl_test.cc, but also
programmically duplicates many runner tests in a split-handshake mode.

Change-Id: I9dafc8a394581e5daf1318722e1015de82117fd9
Reviewed-on: https://boringssl-review.googlesource.com/25388
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-31 22:33:42 +00:00
Adam Langley
3fe8fa74ac Add initial, experimental support for split handshakes.
Split handshakes allows the handshaking of a TLS connection to be
performed remotely. This encompasses not just the private-key and ticket
operations – support for that was already available – but also things
such as selecting the certificates and cipher suites.

The the comment block in ssl.h for details. This is highly experimental
and will change significantly before its settled.

Change-Id: I337bdfa4c3262169e9b79dd4e70b57f0d380fcad
Reviewed-on: https://boringssl-review.googlesource.com/25387
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2018-01-31 22:24:17 +00:00
Steven Valdez
7e5dd25d47 Remove draft22 and experiment2.
Change-Id: I2486dc810ea842c534015fc04917712daa26cfde
Update-Note: Now that tls13_experiment2 is gone, the server should remove the set_tls13_variant call. To avoid further churn, we'll make the server default for future variants to be what we'd like to deploy.
Reviewed-on: https://boringssl-review.googlesource.com/25104
Commit-Queue: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-01-31 18:07:53 +00:00
Nick Harper
3c034b2cf3 Add support for QUIC transport params.
This adds support for sending the quic_transport_parameters
(draft-ietf-quic-tls) in ClientHello and EncryptedExtensions, as well as
reading the value sent by the peer.

Bug: boringssl:224
Change-Id: Ied633f557cb13ac87454d634f2bd81ab156f5399
Reviewed-on: https://boringssl-review.googlesource.com/24464
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-01-30 23:54:40 +00:00
David Benjamin
a62dbf88d8 Move OPENSSL_FALLTHROUGH to internal headers.
Having it in base.h pollutes the global namespace a bit and, in
particular, causes clang to give unhelpful suggestions in consuming
projects.

Change-Id: I6ca1a88bdd1701f0c49192a0df56ac0953c7067c
Reviewed-on: https://boringssl-review.googlesource.com/25464
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-01-29 18:17:57 +00:00
Matthew Braithwaite
5301c10c53 ssl_verify_peer_cert: implement |SSL_VERIFY_NONE| as advertised.
Since SSL{,_CTX}_set_custom_verify take a |mode| parameter that may be
|SSL_VERIFY_NONE|, it should do what it says on the tin, which is to
perform verification and ignore the result.

Change-Id: I0d8490111fb199c6b325cc167cf205316ecd4b49
Reviewed-on: https://boringssl-review.googlesource.com/25224
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-01-26 22:42:17 +00:00
Adam Langley
e8d2439cd3 Expose ssl_session_serialize to libssl.
This function can serialise a session to a |CBB|.

Change-Id: Icdb7aef900f03f947c3fa4625dd218401eb8eafc
Reviewed-on: https://boringssl-review.googlesource.com/25385
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-01-26 22:31:47 +00:00
David Benjamin
0ab3f0ca25 Notice earlier if a server echoes the TLS 1.3 compatibility session ID.
Mono's legacy TLS 1.0 stack, as a server, does not implement any form of
resumption, but blindly echos the ClientHello session ID in the
ServerHello for no particularly good reason.

This is invalid, but due to quirks of how our client checked session ID
equality, we only noticed on the second connection, rather than the
first. Flaky failures do no one any good, so break deterministically on
the first connection, when we realize something strange is going on.

Bug: chromium:796910
Change-Id: I1f255e915fcdffeafb80be481f6c0acb3c628846
Reviewed-on: https://boringssl-review.googlesource.com/25424
Commit-Queue: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Steven Valdez <svaldez@google.com>
2018-01-26 21:53:27 +00:00
Adam Langley
0ab86cf6f9 Require only that the nonce be strictly monotonic in TLS's AES-GCM
Previously we required that the calls to TLS's AES-GCM use an
incrementing nonce. This change relaxes that requirement so that nonces
need only be strictly monotonic (i.e. values can now be skipped). This
still meets the uniqueness requirements of a nonce.

Change-Id: Ib649a58bb93bf4dc0e081de8a5971daefffe9c70
Reviewed-on: https://boringssl-review.googlesource.com/25384
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-01-26 20:09:44 +00:00
Adam Langley
449a9e6a9e Make the gdb window larger.
Running can spawn gdb in an xterm, but the default xterm is rather
small. We could have everyone set their .Xdefaults, I presume, to solve
this, but very few people are running the old xterm these days.

Change-Id: I46eb3ff22f292eb44ce8c5124e83f1ab8aef9547
Reviewed-on: https://boringssl-review.googlesource.com/24846
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-01-26 19:59:23 +00:00
Adam Langley
ab5a947d8e Reslice TLS AEAD setup.
This change reslices how the functions that generate the key block and
initialise the TLS AEADs are cut. This makes future changes easier.

Change-Id: I7e0f7327375301bed96f33c195b80156db83ce6d
Reviewed-on: https://boringssl-review.googlesource.com/24845
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-01-26 19:48:03 +00:00
Adam Langley
c61b577197 Add some more utility functions to bytestring.
Change-Id: I7932258890b0b2226ff6841af45926e1b11979ba
Reviewed-on: https://boringssl-review.googlesource.com/24844
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-01-25 23:51:36 +00:00
David Benjamin
5a869aa3e8 Documentation typo.
Change-Id: Ie2e90cba642f416d3845171c96a3743846817657
Reviewed-on: https://boringssl-review.googlesource.com/25264
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-01-25 14:47:06 +00:00
David Benjamin
610cdbb102 Switch some ints to bools and Spans.
Change-Id: I505b29ae20fb660229900c4e046a0b1e5606d02c
Reviewed-on: https://boringssl-review.googlesource.com/25164
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Steven Valdez <svaldez@google.com>
2018-01-24 19:24:07 +00:00
David Benjamin
32b5940267 Don't leak the exponent bit width in BN_mod_exp_mont_consttime.
(See also https://github.com/openssl/openssl/pull/5154.)

The exponent here is one of d, dmp1, or dmq1 for RSA. This value and its
bit length are both secret. The only public upper bound is the bit width
of the corresponding modulus (RSA n, p, and q, respectively).

Although BN_num_bits is constant-time (sort of; see bn_correct_top notes
in preceding patch), this does not fix the root problem, which is that
the windows are based on the minimal bit width, not the upper bound. We
could use BN_num_bits(m), but BN_mod_exp_mont_consttime is public API
and may be called with larger exponents. Instead, use all top*BN_BITS2
bits in the BIGNUM. This is still sensitive to the long-standing
bn_correct_top leak, but we need to fix that regardless.

This may cause us to do a handful of extra multiplications for RSA keys
which are just above a whole number of words, but that is not a standard
RSA key size.

Change-Id: I5e2f12b70c303b27c597a7e513b7bf7288f7b0e3
Reviewed-on: https://boringssl-review.googlesource.com/25185
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 22:27:37 +00:00
David Benjamin
cb1ad205d0 Use 51-bit limbs from fiat-crypto in 64-bit.
Our 64-bit performance was much lower than it could have been, since we
weren't using the 64-bit multipliers. Fortunately, fiat-crypto is
awesome, so this is just a matter of synthesizing new code and
integration work.

Functions without the signature fiat-crypto curly braces were written by
hand and warrant more review. (It's just redistributing some bits.)

These use the donna variants which takes (and proves) some of the
instruction scheduling from donna as that's significantly faster.
Glancing over things, I suspect but have not confirmed the gap is due to
this:
https://github.com/mit-plv/fiat-crypto/pull/295#issuecomment-356892413

Clang without OPENSSL_SMALL (ECDH omitted since that uses assembly and
is unaffected by this CL).

Before:
Did 105149 Ed25519 key generation operations in 5025208us (20924.3 ops/sec)
Did 125000 Ed25519 signing operations in 5024003us (24880.6 ops/sec)
Did 37642 Ed25519 verify operations in 5072539us (7420.7 ops/sec)

After:
Did 206000 Ed25519 key generation operations in 5020547us (41031.4 ops/sec)
Did 227000 Ed25519 signing operations in 5005232us (45352.5 ops/sec)
Did 69840 Ed25519 verify operations in 5004769us (13954.7 ops/sec)

Clang + OPENSSL_SMALL:

Before:
Did 68598 Ed25519 key generation operations in 5024629us (13652.4 ops/sec)
Did 73000 Ed25519 signing operations in 5067837us (14404.6 ops/sec)
Did 36765 Ed25519 verify operations in 5078684us (7239.1 ops/sec)
Did 74000 Curve25519 base-point multiplication operations in 5016465us (14751.4 ops/sec)
Did 45600 Curve25519 arbitrary point multiplication operations in 5034680us (9057.2 ops/sec)

After:
Did 117315 Ed25519 key generation operations in 5021860us (23360.9 ops/sec)
Did 126000 Ed25519 signing operations in 5003521us (25182.3 ops/sec)
Did 64974 Ed25519 verify operations in 5047790us (12871.8 ops/sec)
Did 134000 Curve25519 base-point multiplication operations in 5058946us (26487.7 ops/sec)
Did 86000 Curve25519 arbitrary point multiplication operations in 5050478us (17028.1 ops/sec)

GCC without OPENSSL_SMALL (ECDH omitted since that uses assembly and
is unaffected by this CL).

Before:
Did 35552 Ed25519 key generation operations in 5030756us (7066.9 ops/sec)
Did 38286 Ed25519 signing operations in 5001648us (7654.7 ops/sec)
Did 10584 Ed25519 verify operations in 5068158us (2088.3 ops/sec)

After:
Did 92158 Ed25519 key generation operations in 5024021us (18343.5 ops/sec)
Did 99000 Ed25519 signing operations in 5011908us (19753.0 ops/sec)
Did 31122 Ed25519 verify operations in 5069878us (6138.6 ops/sec)

Change-Id: Ic0c24d50b4ee2bbc408b94965e9d63319936107d
Reviewed-on: https://boringssl-review.googlesource.com/24805
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 22:25:07 +00:00
David Benjamin
a1bc1ba47c Fix up CTR_DRBG_update comment.
The original comment was a little confusing. Also lowercase
CTR_DRBG_update to make our usual naming for static functions.

Bug: 227
Change-Id: I381c7ba12b788452d54520b7bc3b13bba8a59f2d
Reviewed-on: https://boringssl-review.googlesource.com/25204
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 22:19:03 +00:00
David Benjamin
8017cdde38 Make BN_num_bits_word constant-time.
(The BN_num_bits_word implementation was originally written by Andy
Polyakov for OpenSSL. See also
https://github.com/openssl/openssl/pull/5154.)

BN_num_bits, by way of BN_num_bits_word, currently leaks the
most-significant word of its argument via branching and memory access
pattern.

BN_num_bits is called on RSA prime factors in various places. These have
public bit lengths, but all bits beyond the high bit are secret. This
fully resolves those cases.

There are a few places where BN_num_bits is called on an input where
the bit length is also secret. The two left in BoringSSL are:

- BN_mod_exp_mont_consttime calls it on the RSA private exponent.

- The timing "fix" to add the order to k in DSA.

This does *not* fully resolve those cases as we still only look at the
top word. Today, that is guaranteed to be non-zero, but only because of
the long-standing bn_correct_top timing leak. Once that is fixed (I hope
to have patches soon), a constant-time BN_num_bits on such inputs must
count bits on each word.

Instead, those cases should not call BN_num_bits at all. The former uses
the bit width to pick windows, but it should be using the maximum bit
width. The next patch will fix this.  The latter is the same "fix" we
excised from ECDSA in a838f9dc7e.  That
should be excised from DSA after the bn_correct_top bug is fixed.

Thanks to Dinghao Wu, Danfeng Zhang, Shuai Wang, Pei Wang, and Xiao Liu
for reporting this issue.

Change-Id: Idc3da518cc5ec18bd8688b95f959b15300a57c14
Reviewed-on: https://boringssl-review.googlesource.com/25184
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 22:14:54 +00:00
David Benjamin
b9f30bb6fe Unwind total_num from wNAF_mul.
The EC_POINTs are still allocated (for now), but everything else fits on
the stack nicely, which saves a lot of fiddling with cleanup and
allocations.

Change-Id: Ib8480737ecc97e6b40b2c05f217cd8d3dc82cb72
Reviewed-on: https://boringssl-review.googlesource.com/25150
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 22:04:58 +00:00
David Benjamin
d86c0d2889 Pull the malloc out of compute_wNAF.
This is to simplify clearing unnecessary mallocs out of ec_wNAF_mul, and
perhaps to use it in tuned variable-time multiplication functions.

Change-Id: Ic390d2e8e20d0ee50f3643830a582e94baebba95
Reviewed-on: https://boringssl-review.googlesource.com/25149
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:53:58 +00:00
David Benjamin
6ca09409cc Always compute the maximum-length wNAF.
This cuts out another total_num-length array and simplifies things.
Leading zeros at the front of the schedule don't do anything, so it's
easier to just produce a fixed-length one. (I'm also hoping to
ultimately reuse this function in //third_party/fiat/p256.c and get the
best of both worlds for ECDSA verification; tuned field arithmetic
operations, precomputed table, and variable-time multiply.)

Change-Id: I771f4ff7dcfdc3ee0eff8d9038d6dc9a0be3d4e0
Reviewed-on: https://boringssl-review.googlesource.com/25148
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:51:25 +00:00
David Benjamin
a42d7bee85 Reorganize curve25519.c slightly.
Adding 51-bit limbs will require two implementations of most of the
field operations. Group them together to make this more manageable. Also
move the representation-independent functions to the end.

Change-Id: I264e8ac64318a1d5fa72e6ad6f7ccf2f0a2c2be9
Reviewed-on: https://boringssl-review.googlesource.com/24804
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:42:14 +00:00
David Benjamin
0c1eafc6fe Add additional constants to make_curve25519_tables.py.
These are also constants that depend on the field representation.

Change-Id: I22333c099352ad64eb27fe15ffdc38c6ae7c07ff
Reviewed-on: https://boringssl-review.googlesource.com/24746
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:35:03 +00:00
David Benjamin
522ad7e8fc Use EC_SCALAR for compute_wNAF.
Note this switches from walking BN_num_bits to the full bit length of
the scalar. But that can only cause it to add a few extra zeros to the
front of the schedule, which r_is_at_infinity will skip over.

Change-Id: I91e087c9c03505566b68f75fb37dfb53db467652
Reviewed-on: https://boringssl-review.googlesource.com/25147
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:34:50 +00:00
David Benjamin
338eeb0c4f Remove r_is_inverted logic.
This appears to be pointless. Before, we would have a 50% chance of
doing an inversion at each non-zero bit but the first
(r_is_at_infinity), plus a 50% chance of doing an inversion at the end.
Now we would have a 50% chance of doing an inversion at each non-zero
bit. That's the same number of coin flips.

Change-Id: I8158fd48601cb041188826d4f68ac1a31a6fbbbc
Reviewed-on: https://boringssl-review.googlesource.com/25146
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-01-23 21:29:13 +00:00
David Benjamin
2d77d4084a Generate curve25519 tables with a script.
This is to make it easier to add new field element representations. The
Ed25519 logic in the script is partially adapted from RFC 8032's Python
code, but I replaced the point addition logic with the naive textbook
formula since this script only cares about being obviously correct.

Change-Id: I0b90bf470993c177070fd1010ac5865fedb46c82
Reviewed-on: https://boringssl-review.googlesource.com/24745
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:21:54 +00:00
David Benjamin
042b49cf3c Extract curve25519 tables into a separate header.
This is in preparation for writing a script to generate them. I'm
manually moving the existing tables over so it will be easier to confirm
the script didn't change the values.

Change-Id: Id83e95c80d981e19d1179d45bf47559b3e1fc86e
Reviewed-on: https://boringssl-review.googlesource.com/24744
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:08:49 +00:00
David Benjamin
5d9408714c Remove unnecessary window size cases.
The optimization for wsize = 1 only kicks in for 19-bit primes. The
cases for b >= 800 and cannot happen due to EC_MAX_SCALAR_BYTES.

Change-Id: If5ca908563f027172cdf31c9a22342152fecd12f
Reviewed-on: https://boringssl-review.googlesource.com/25145
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:08:39 +00:00
David Benjamin
4111dd2fc2 Don't compute a per-scalar window size in wNAF code.
Simplify things slightly. The probability of the scalar being small
enough to go down a window size is astronomically small. (2^-186 for
P-256 and 2^-84 for P-384.)

Change-Id: Ie879f0b06bcfd1e6e6e3bf3f54e0d7d6567525a4
Reviewed-on: https://boringssl-review.googlesource.com/25144
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 21:06:42 +00:00
David Benjamin
186df3a655 Implement fe_sq2_tt with fe_sq_tt.
fiat-crypto only generates fe_mul and fe_sq, but the original Ed25519
implementation we had also had fe_sq2 for computing 2*f^2. Previously,
we inlined a version of fe_mul.

Instead, we could implement it with fe_sq and fe_add. Performance-wise,
this seems to not regress. If anything, it makes it faster?

Before (clang, run for 10 seconds):
Did 243000 Ed25519 key generation operations in 10025910us (24237.2 ops/sec)
Did 250000 Ed25519 signing operations in 10035580us (24911.4 ops/sec)
Did 73305 Ed25519 verify operations in 10071101us (7278.7 ops/sec)
Did 184000 Curve25519 base-point multiplication operations in 10040138us (18326.4 ops/sec)
Did 186000 Curve25519 arbitrary point multiplication operations in 10052721us (18502.5 ops/sec)

After (clang, run for 10 seconds):
Did 242424 Ed25519 key generation operations in 10013117us (24210.6 ops/sec)
Did 253000 Ed25519 signing operations in 10011744us (25270.3 ops/sec)
Did 73899 Ed25519 verify operations in 10048040us (7354.6 ops/sec)
Did 194000 Curve25519 base-point multiplication operations in 10005389us (19389.6 ops/sec)
Did 195000 Curve25519 arbitrary point multiplication operations in 10028443us (19444.7 ops/sec)

Before (clang + OPENSSL_SMALL, run for 10 seconds):
Did 144000 Ed25519 key generation operations in 10019344us (14372.2 ops/sec)
Did 146000 Ed25519 signing operations in 10011653us (14583.0 ops/sec)
Did 74052 Ed25519 verify operations in 10005789us (7400.9 ops/sec)
Did 150000 Curve25519 base-point multiplication operations in 10007468us (14988.8 ops/sec)
Did 91392 Curve25519 arbitrary point multiplication operations in 10057678us (9086.8 ops/sec)

After (clang + OPENSSL_SMALL, run for 10 seconds):
Did 144000 Ed25519 key generation operations in 10066724us (14304.6 ops/sec)
Did 148000 Ed25519 signing operations in 10062043us (14708.7 ops/sec)
Did 74820 Ed25519 verify operations in 10058557us (7438.4 ops/sec)
Did 151000 Curve25519 base-point multiplication operations in 10063492us (15004.7 ops/sec)
Did 90402 Curve25519 arbitrary point multiplication operations in 10049141us (8996.0 ops/sec)

Change-Id: I31e9f61833492c3ff2dfd78e1dee5e06f43c850f
Reviewed-on: https://boringssl-review.googlesource.com/24724
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 20:49:50 +00:00