Commit Graph

2267 Commits

Author SHA1 Message Date
David Benjamin
f8058d4114 Add M=8 L=2 AES-128-CCM as well.
The Bluetooth Mesh spec uses both apparently. Also extract a pile of
test vectors from that document (thanks to Kyle Lund for showing me
which to extract).

Change-Id: I04a04fafb7386ca28adfe1446fa388e841778931
Reviewed-on: https://boringssl-review.googlesource.com/26324
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-03-02 18:45:06 +00:00
Adam Langley
c01786403f Update link to CMVP certificate.
NIST redid their website and broke all the old links.

Change-Id: I5b7cba878404bb63e49f221f6203c8e1e6545af4
Reviewed-on: https://boringssl-review.googlesource.com/26204
Reviewed-by: Adam Langley <agl@google.com>
2018-02-26 22:14:35 +00:00
David Benjamin
672f6fc248 Always use adr with __thumb2__.
Thumb2 addresses are a bit a mess, depending on whether a label is
interpreted as a function pointer value (for use with BX and BLX) or as
a program counter value (for use with PC-relative addressing). Clang's
integrated assembler mis-assembles this code. See
https://crbug.com/124610#c54 for details.

Instead, use the ADR pseudo-instruction which has clear semantics and
should be supported by every assembler that handles the OpenSSL Thumb2
code. (In other files, the ADR vs SUB conditionals are based on
__thumb2__ already. For some reason, this one is based on __APPLE__, I'm
guessing to deal with an older version of clang assembler.)

It's unclear to me which of clang or binutils is "correct" or if this is
even a well-defined notion beyond "whatever binutils does". But I will
note that https://github.com/openssl/openssl/pull/4669 suggests binutils
has also changed behavior around this before.

See also https://github.com/openssl/openssl/pull/5431 in OpenSSL.

Bug: chromium:124610
Change-Id: I5e7a0c8c0f54a3f65cc324ad599a41883675f368
Reviewed-on: https://boringssl-review.googlesource.com/26164
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-22 22:28:15 +00:00
Daniel Hirche
36714fc8ee Remove redundant length-check in |ec_wNAF_mul|.
Right now, |g_wNAF| and |p_wNAF| are of same size.

This change makes GCC's "-Werror=logical-op" happy and adds a compile-time
assertion in case the initial size of either array ever changes.

Change-Id: I29e39a7a121a0a9d016c53da6b7c25675ddecbdc
Reviewed-on: https://boringssl-review.googlesource.com/26104
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-21 17:03:14 +00:00
Fred Gylys-Colwell
02d696f2a1 Delete |pthread_key_t| on dlclose.
When OPENSSL_DANGEROUS_RELEASE_PTHREAD_KEY is defined during the build,
this change adds a destructor function that is called when BoringSSL is
unloaded via |dlclose| or during process exit. Using |dlclose| with
BoringSSL is not supported and will leak memory, but this change allows
some code that is already doing it to survive longer.

Change-Id: Ifc6d6aae61ed0f15d61cd3dbb4ea9f8006e43dba
Reviewed-on: https://boringssl-review.googlesource.com/25784
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Fred Gylys-Colwell <fredgc@google.com>
2018-02-20 19:53:24 +00:00
David Benjamin
085955c567 Actually use the u64 cast.
The point was to remove the silly moduli.

Change-Id: I48c507c9dd1fc46e38e8991ed528b02b8da3dc1d
Reviewed-on: https://boringssl-review.googlesource.com/26044
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-16 20:02:56 +00:00
Steven Valdez
f16cd4278f Add AES_128_CCM AEAD.
Change-Id: I830be64209deada0f24c3b6d50dc86155085c377
Reviewed-on: https://boringssl-review.googlesource.com/25904
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-16 15:57:27 +00:00
David Benjamin
78a832d793 Document RSAZ slightly better.
Better commit such details to comments before I forget them.

Change-Id: Ie36332235c692f4369413b4340a742b5ad895ce1
Reviewed-on: https://boringssl-review.googlesource.com/25984
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-15 18:14:04 +00:00
Aaron Green
67968895b3 Remove unused strings.h #include from crypto/mem.c
crypto/mem.c #include's <strings.h>, but doesn't use call any functions
from it.

Change-Id: If60b31be7dd6b347bcb077a59825a557a2492081
Reviewed-on: https://boringssl-review.googlesource.com/25964
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-14 01:40:23 +00:00
David Benjamin
02cca1987b clang-format RSAZ C code.
Change-Id: I7fb9b06ec89ba11641454145708e157359b07cf0
Reviewed-on: https://boringssl-review.googlesource.com/25924
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Steven Valdez <svaldez@google.com>
2018-02-13 22:30:03 +00:00
David Benjamin
10443f5a6e Adjust comment on potential R^3 optimization.
It's doable, but a bit of effort due to the different radix.

Change-Id: Ibfa15c31bb37de930f155ee6d19551a2b6437073
Reviewed-on: https://boringssl-review.googlesource.com/25944
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Steven Valdez <svaldez@google.com>
2018-02-13 22:19:13 +00:00
Aaron Green
862e0d2e1b Add cpu-aarch64-fuchsia.c
Fuchsia/Zircon recently added support for exposing arm64 CPU features;
this CL uses the new system call to set OPENSSL_armcap_P.

Change-Id: I045dc0b58117afe6dae315a82bf9acfd8d99be1a
Reviewed-on: https://boringssl-review.googlesource.com/25865
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-13 20:12:47 +00:00
David Benjamin
638a408cd2 Add a tuned variable-time P-256 multiplication function.
This reuses wnaf.c's window scheduling, but has access to the tuned
field arithemetic and pre-computed base point table. Unlike wnaf.c, we
do not make the points affine as it's not worth it for a single table.
(We already precomputed the base point table.)

Annoyingly, 32-bit x86 gets slower by a bit, but the other platforms are
faster. My guess is that that the generic code gets to use the
bn_mul_mont assembly and the compiler, faced with the increased 32-bit
register pressure and the extremely register-poor x86, is making
bad decisions on the otherwise P-256-tuned C code. The three platforms
that see much larger gains are significantly more important than 32-bit
x86 at this point, so go with this change.

armv7a (Nexus 5X) before/after [+14.4%]:
Did 2703 ECDSA P-256 verify operations in 5034539us (536.9 ops/sec)
Did 3127 ECDSA P-256 verify operations in 5091379us (614.2 ops/sec)

aarch64 (Nexus 5X) before/after [+9.2%]:
Did 6783 ECDSA P-256 verify operations in 5031324us (1348.2 ops/sec)
Did 7410 ECDSA P-256 verify operations in 5033291us (1472.2 ops/sec)

x86 before/after [-2.7%]:
Did 8961 ECDSA P-256 verify operations in 10075901us (889.3 ops/sec)
Did 8568 ECDSA P-256 verify operations in 10003001us (856.5 ops/sec)

x86_64 before/after [+8.6%]:
Did 29808 ECDSA P-256 verify operations in 10008662us (2978.2 ops/sec)
Did 32528 ECDSA P-256 verify operations in 10057137us (3234.3 ops/sec)

Change-Id: I5fa643149f5bfbbda9533e3008baadfee9979b93
Reviewed-on: https://boringssl-review.googlesource.com/25684
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-12 22:00:48 +00:00
David Benjamin
6e4ff114fc Merge Intel copyright notice into standard
This was done by OpenSSL with the kind permission of Intel. This change
is imported from upstream's commit
dcf6e50f48e6bab92dcd2dacb27fc17c0de34199.

Change-Id: Ie8d3b700cd527a6e8cf66e0728051b2acd8cc6b9
Reviewed-on: https://boringssl-review.googlesource.com/25588
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-12 21:44:27 +00:00
David Benjamin
f6cf8bbc84 Sync up AES assembly.
This syncs up with OpenSSL master as of
50ea9d2b3521467a11559be41dcf05ee05feabd6. The non-license non-spelling
changes are CFI bits, which were added in upstream in
b84460ad3a3e4fcb22efaa0a8365b826f4264ecf.

Change-Id: I42280985f834d5b9133eacafc8ff9dbd2f0ea59a
Reviewed-on: https://boringssl-review.googlesource.com/25704
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 01:03:17 +00:00
David Benjamin
6dc994265e Sync up some perlasm license headers and easy fixes.
These files are otherwise up-to-date with OpenSSL master as of
50ea9d2b3521467a11559be41dcf05ee05feabd6, modulo a couple of spelling
fixes which I've imported.

I've also reverted the same-line label and instruction patch to
x86_64-mont*.pl. The new delocate parser handles that fine.

Change-Id: Ife35c671a8104c3cc2fb6c5a03127376fccc4402
Reviewed-on: https://boringssl-review.googlesource.com/25644
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 01:00:35 +00:00
David Benjamin
0f4f6c2e02 p256-x86_64.pl: add CFI directives.
(Imported from upstream's 86e112788e2ab9740c0cabf3ae4b1eb67b386bab.)

Change-Id: I1ba11e47f1ec9846ea00c738db737c35ce7aaab1
Reviewed-on: https://boringssl-review.googlesource.com/25587
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 00:53:41 +00:00
David Benjamin
02808ddcaa p256-x86_64-asm.pl: Win64 SEH face-lift.
This imports 384e6de4c7e35e37fb3d6fbeb32ddcb5eb0d3d3f and
79ca382d4762c58c4b92fceb4e202e90c71292ae from upstream.

Differences from upstream:

- We've removed a number of unused functions.

- We never imported 3ff08e1dde56747011a702a9a5aae06cfa8ae5fc, which was
  to give the assembly control over the memory layout in the tables. So
  our "gather" is "select" (which is implemented the same because the
  memory layout never did change) and our "scatter" is in C.

Change-Id: I90d4a17da9f5f693f4dc4706887dec15f010071b
Reviewed-on: https://boringssl-review.googlesource.com/25586
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 00:52:23 +00:00
David Benjamin
05640fd373 p256-x86_64-asm.pl: Add OpenSSL copyright
As of upstream's 6aa36e8e5a062e31543e7796f0351ff9628832ce, the
corresponding file in OpenSSL has both an Intel and OpenSSL copyright
blocks.  To properly sync up with OpenSSL, use the OpenSSL copyright
block and our version of the Intel copyright block.

Change-Id: I4dc072a11390a54d0ce38ec0b8893e48f52638de
Reviewed-on: https://boringssl-review.googlesource.com/25585
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 00:50:19 +00:00
David Benjamin
8ae929f1e9 p256-x86_64.pl: update commentary with before-after performance data.
(Imported from upstream's f0e6871df2e4641d0532e8f99d26c7a6454d03df.)

Change-Id: I2b799ff2a133839b0fe9d9093799d3a86045d709
Reviewed-on: https://boringssl-review.googlesource.com/25584
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 00:49:54 +00:00
Daniel Hirche
d25e62e772 Return NULL instead of zero in |bn_resized_from_ctx|.
Change-Id: I5fc029ceddfa60b2ccc97c138b94c1826f6d75fa
Reviewed-on: https://boringssl-review.googlesource.com/25844
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-10 23:10:54 +00:00
David Benjamin
38c20fe8d5 Fix threading issues with RSA freeze_private_key.
OpenSSL's RSA API is poorly designed and does not have a single place to
properly initialize the key. See
https://github.com/openssl/openssl/issues/5158.

To workaround this flaw, we must lazily instantiate pre-computed
Montgomery bits with locking. This is a ton of complexity. More
importantly, it makes it very difficult to implement RSA without side
channels. The correct in-memory representation of d, dmp1, and dmq1
depend on n, p, and q, respectively. (Those values have private
magnitudes and must be sized relative to the respective moduli.)

08805fe279 attempted to fix up the various
widths under lock, when we set up BN_MONT_CTX. However, this introduces
threading issues because other threads may access those exposed
components (RSA_get0_* also count as exposed for these purposes because
they are get0 functions), while a private key operation is in progress.

Instead, we do the following:

- There is no actual need to minimize n, p, and q, but we have minimized
  copies in the BN_MONT_CTXs, so use those.

- Store additional copies of d, dmp1, and dmq1, at the cost of more
  memory used. These copies have the correct width and are private,
  unlike d, dmp1, and dmq1 which are sadly exposed. Fix private key
  operations to use them.

- Move the frozen bit out of rsa->flags, as that too was historically
  accessible without locking.

(Serialization still uses the original BIGNUMs, but the RSAPrivateKey
serialization format already inherently leaks the magnitude, so this
doesn't matter.)

Change-Id: Ia3a9b0629f8efef23abb30bfed110d247d1db42f
Reviewed-on: https://boringssl-review.googlesource.com/25824
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-09 22:17:11 +00:00
Adam Langley
61dedd6815 Don't crash when failing to set affine coordinates when the generator is missing.
If a caller is in the process on constructing an arbitrary |EC_GROUP|,
and they try to create an |EC_POINT| to set as the generator which is
invalid, we would previously crash.

Change-Id: Ida91354257a02bd56ac29ba3104c9782b8d70f6b
Reviewed-on: https://boringssl-review.googlesource.com/25764
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-07 23:08:17 +00:00
David Benjamin
376f3f1727 Add BN_count_low_zero_bits.
This allows a BIGNUM consumer to avoid messing around with bn->d and
bn->top/width.

Bug: 232
Change-Id: I134cf412fef24eb404ff66c84831b4591d921a17
Reviewed-on: https://boringssl-review.googlesource.com/25484
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 03:10:54 +00:00
David Benjamin
d24cb22c55 Make BN_cmp constant-time.
This is a bit easier to read than BN_less_than_consttime when we must do
>= or <=, about as much work to compute, and lots of code calls BN_cmp
on secret data. This also, by extension, makes BN_cmp_word
constant-time.

BN_equal_consttime is probably a little more efficient and is perfectly
readable, so leave that one around.

Change-Id: Id2e07fe312f01cb6fd10a1306dcbf6397990cf13
Reviewed-on: https://boringssl-review.googlesource.com/25444
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 03:10:44 +00:00
David Benjamin
ac383701b7 Simplify bn_mul_part_recursive.
The loop and the outermost special-cases are basically the same.

Change-Id: I5e3ca60ad9a04efa66b479eebf8c3637a11cdceb
Reviewed-on: https://boringssl-review.googlesource.com/25406
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 03:04:04 +00:00
David Benjamin
6488f4e2ba Fix over-allocated bounds on bn_mul_part_recursive.
Same mistake as bn_mul_recursive.

Change-Id: I2374d37e5da61c82ccb1ad79da55597fa3f10640
Reviewed-on: https://boringssl-review.googlesource.com/25405
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:57:55 +00:00
David Benjamin
2bf82975ad Make bn_mul_part_recursive constant-time.
This follows similar lines as the previous cleanups and fixes the
documentation of the preconditions.

And with that, RSA private key operations, provided p and q have the
same bit length, should be constant time, as far as I know. (Though I'm
sure I've missed something.)

bn_cmp_part_words and bn_cmp_words are no longer used and deleted.

Bug: 234
Change-Id: Iceefa39f57e466c214794c69b335c4d2c81f5577
Reviewed-on: https://boringssl-review.googlesource.com/25404
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:51:54 +00:00
David Benjamin
6541308ff3 Don't allocate oversized arrays for bn_mul_recursive.
The power of two computations here were extremely confusing and one of
the comments mixed && and ||. Remove the cached k = j + j value.
Optimizing the j*8, j*8, j*2, and j*4 multiplications is the compiler's
job. If it doesn't manage it, it was only a couple shifts anyway.

With that fixed, it becomes easier to tell that rr was actaully
allocated twice as large as necessary. I suspect rr is also
incorrectly-allocated in the bn_mul_part_recursive case, but I'll wait
until I've checked that function over first. (The array size
documentation on the other bn_{mul,sqr}_recursive functions have had
mistakes before.)

Change-Id: I298400b988e3bd108d01d6a7c8a5b262ddf81feb
Reviewed-on: https://boringssl-review.googlesource.com/25364
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:51:44 +00:00
David Benjamin
34a2c5e476 Make bn_mul_recursive constant-time.
I left the input length as int because the calling convention passes
these messy deltas around. This micro-optimization is almost certainly
pointless, but bn_sub_part_words is written in assembly, so I've left it
alone for now. The documented preconditions were also all completely
wrong, so I've fixed them. We actually only call them for even tighter
bounds (one of dna or dnb is 0 and the other is 0 or -1), at least
outside bn_mul_part_recursive which I still need to read through.

This leaves bn_mul_part_recursive, which is reachable for RSA keys which
are not a power of two in bit width.

The first iteration of this had an uncaught bug, so I added a few more
aggressive tests generated with:

  A = 0x...
  B = 0x...

  # Chop off 0, 1 and > 1 word for both 32 and 64-bit.
  for i in (0, 1, 2, 4):
    for j in (0, 1, 2, 4):
      a = A >> (32*i)
      b = B >> (32*j)
      p = a * b
      print "Product = %x" % p
      print "A = %x" % a
      print "B = %x" % b
      print

Bug: 234
Change-Id: I72848d992637c0390cdd3c4f81cb919393b59eb8
Reviewed-on: https://boringssl-review.googlesource.com/25344
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:51:34 +00:00
David Benjamin
b01dd1c622 Make bn_sqr_recursive constant-time.
We still need BN_mul and, in particular, bn_mul_recursive will either
require bn_abs_sub_words be generalized or that we add a parallel
bn_abs_sub_part_words, but start with the easy one.

While I'm here, simplify the i and j mess in here. It's patterned after
the multiplication one, but can be much simpler.

Bug: 234
Change-Id: If936099d53304f2512262a1cbffb6c28ae30ccee
Reviewed-on: https://boringssl-review.googlesource.com/25325
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:47:34 +00:00
David Benjamin
3b3e12d81e Simplify BN_bn2bin_padded.
There is no more need for the "constant-time" reading beyond bn->top. We
can write the bytes out naively because RSA computations no longer call
bn_correct_top/bn_set_minimal_width.

Specifically, the final computation is a BN_mod_mul_montgomery to remove
the blinding, and that keeps the sizes correct.

Bug: 237
Change-Id: I6e90d81c323b644e179d899f411479ea16deab98
Reviewed-on: https://boringssl-review.googlesource.com/25324
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:41:38 +00:00
David Benjamin
be837402a9 Make the rest of RSA CRT constant-time.
Alas, the existence of RSA keys with q > p is obnoxious, but we can
canonicalize it away. To my knowledge, the remaining leaks in RSA are:

- Key generation. This is kind of hopelessly non-constant-time but
  perhaps deserves a more careful ponder. Though hopefully it does not
  come in at a measurable point for practical purposes.

- Private key serialization. RSAPrivateKey inherently leaks the
  magnitudes of d, dmp1, dmq1, and iqmp. This is unavoidable but
  hopefully does not come in at a measurable point for practical
  purposes.

- If p and q have different word widths, we currently fall back to the
  variable-time BN_mod rather than Montgomery reduction at the start of
  CRT. I can think of ways to apply Montgomery reduction, but it's
  probably better to deny CRT to such keys, if not reject them outright.

- bn_mul_fixed and bn_sqr_fixed which affect the Montgomery
  multiplication bn_mul_mont-less configurations, as well as the final
  CRT multiplication. We should fix this.

Bug: 233
Change-Id: I8c2ecf8f8ec104e9f26299b66ac8cbb0cad04616
Reviewed-on: https://boringssl-review.googlesource.com/25263
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:40:34 +00:00
David Benjamin
150ad30d28 Split BN_uadd into a bn_uadd_fixed.
This is to be used in constant-time RSA CRT.

Bug: 233
Change-Id: Ibade5792324dc6aba38cab6971d255d41fb5eb91
Reviewed-on: https://boringssl-review.googlesource.com/25286
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 02:39:45 +00:00
David Benjamin
5b10def1cf Compute mont->RR in constant-time.
Use the now constant-time modular arithmetic functions.

Bug: 236
Change-Id: I4567d67bfe62ca82ec295f2233d1a6c9b131e5d2
Reviewed-on: https://boringssl-review.googlesource.com/25285
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 01:40:24 +00:00
David Benjamin
6f564afbdd Make BN_mod_*_quick constant-time.
As the EC code will ultimately want to use these in "words" form by way
of EC_FELEM, and because it's much easier, I've implement these as
low-level words-based functions that require all inputs have the same
width. The BIGNUM versions which RSA and, for now, EC calls are
implemented on top of that.

Unfortunately, doing such things in constant-time and accounting for
undersized inputs requires some scratch space, and these functions don't
take BN_CTX. So I've added internal bn_mod_*_quick_ctx functions that
take a BN_CTX and the old functions now allocate a bit unnecessarily.
RSA only needs lshift (for BN_MONT_CTX) and sub (for CRT), but the
generic EC code wants add as well.

The generic EC code isn't even remotely constant-time, and I hope to
ultimately use stack-allocated EC_FELEMs, so I've made the actual
implementations here implemented in "words", which is much simpler
anyway due to not having to take care of widths.

I've also gone ahead and switched the EC code to these functions,
largely as a test of their performance (an earlier iteration made the EC
code noticeably slower). These operations are otherwise not
performance-critical in RSA.

The conversion from BIGNUM to BIGNUM+BN_CTX should be dropped by the
static linker already, and the unused BIGNUM+BN_CTX functions will fall
off when EC_FELEM happens.

Update-Note: BN_mod_*_quick bounce on malloc a bit now, but they're not
    really used externally. The one caller I found was wpa_supplicant
    which bounces on malloc already. They appear to be implementing
    compressed coordinates by hand? We may be able to convince them to
    call EC_POINT_set_compressed_coordinates_GFp.

Bug: 233, 236
Change-Id: I2bf361e9c089e0211b97d95523dbc06f1168e12b
Reviewed-on: https://boringssl-review.googlesource.com/25261
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 01:16:04 +00:00
David Benjamin
eaa80b7069 Remove DSA k+q kludge.
With fixed-width BIGNUMs, this is no longer a concern. With this CL, I
believe we now no longer call BN_num_bits on BIGNUMs with secret
magnitude.

Of course, DSA then turns around and calls the variable-time BN_mod
immediately afterwards anyway. But the DSA is deprecated and doomed to
be removed someday anyway.

Change-Id: Iac1dab22aa51c0e7f5ca0f7f44a026a242a4eaa2
Reviewed-on: https://boringssl-review.googlesource.com/25284
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-06 00:51:54 +00:00
David Benjamin
08805fe279 Normalize RSA private component widths.
d, dmp1, dmq1, and iqmp have private magnitudes. This is awkward because
the RSAPrivateKey serialization leaks the magnitudes. Do the best we can
and fix them up before any RSA operations.

This moves the piecemeal BN_MONT_CTX_set_locked into a common function
where we can do more complex canonicalization on the keys.  Ideally this
would be done on key import, but the exposed struct (and OpenSSL 1.1.0's
bad API design) mean there is no single point in time when key import is
finished.

Also document the constraints on RSA_set0_* functions. (These
constraints aren't new. They just were never documented before.)

Update-Note: If someone tried to use an invalid RSA key where d >= n,
   dmp1 >= p, dmq1 >= q, or iqmp >= p, this may break. Such keys would not
   have passed RSA_check_key, but it's possible to manually assemble
   keys that bypass it.
Bug: 232
Change-Id: I421f883128952f892ac0cde0d224873a625f37c5
Reviewed-on: https://boringssl-review.googlesource.com/25259
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:58:53 +00:00
David Benjamin
c7b6e0a664 Don't leak widths in bn_mod_mul_montgomery_fallback.
The fallback functions still themselves leak, but I've left TODOs there.

This only affects BN_mod_mul_montgomery on platforms where we don't use
the bn_mul_mont assembly, but BN_mul additionally affects the final
multiplication in RSA CRT.

Bug: 232
Change-Id: Ia1ae16162c38e10c056b76d6b2afbed67f1a5e16
Reviewed-on: https://boringssl-review.googlesource.com/25260
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:57:03 +00:00
David Benjamin
08d774a45f Remove some easy bn_set_minimal_width calls.
Functions that deserialize from bytes and Montgomery multiplication have
no reason to minimize their inputs.

Bug: 232
Change-Id: I121cc9b388033d684057b9df4ad0c08364849f58
Reviewed-on: https://boringssl-review.googlesource.com/25258
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:47:14 +00:00
David Benjamin
09633cc34e Rename bn->top to bn->width.
This has no behavior change, but it has a semantic one. This CL is an
assertion that all BIGNUM functions tolerate non-minimal BIGNUMs now.
Specifically:

- Functions that do not touch top/width are assumed to not care.

- Functions that do touch top/width will be changed by this CL. These
  should be checked in review that they tolerate non-minimal BIGNUMs.

Subsequent CLs will start adjusting the widths that BIGNUM functions
output, to fix timing leaks.

Bug: 232
Change-Id: I3a2b41b071f2174452f8d3801bce5c78947bb8f7
Reviewed-on: https://boringssl-review.googlesource.com/25257
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:44:24 +00:00
David Benjamin
23223ebbc1 Tidy BN_bn2hex and BN_print with non-minimal inputs.
These actually work as-is, but BN_bn2hex allocates more memory than
necessary, and we may as well skip the unnecessary words where we can.
Also add a test for this.

Bug: 232
Change-Id: Ie271fe9f3901d00dd5c3d7d63c1776de81a10ec7
Reviewed-on: https://boringssl-review.googlesource.com/25304
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:18:33 +00:00
David Benjamin
cb4e300f17 Store EC field and orders in minimal form.
The order (and later the field) are used to size stack-allocated fixed
width word arrays. They're also entirely public, so this is fine.

Bug: 232
Change-Id: Ie98869cdbbdfea92dcad64a300f7e0b47bef6bf2
Reviewed-on: https://boringssl-review.googlesource.com/25256
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:17:34 +00:00
David Benjamin
226b4b51b5 Make the rest of BIGNUM accept non-minimal values.
Test this by re-running bn_tests.txt tests a lot. For the most part,
this was done by scattering bn_minimal_width or bn_correct_top calls as
needed. We'll incrementally tease apart the functions that need to act
on non-minimal BIGNUMs in constant-time.

BN_sqr was switched to call bn_correct_top at the end, rather than
sample bn_minimal_width, in anticipation of later splitting it into
BN_sqr (for calculators) and BN_sqr_fixed (for BN_mod_mul_montgomery).

BN_div_word also uses bn_correct_top because it calls BN_lshift so
officially shouldn't rely on BN_lshift returning something
minimal-width, though I expect we'd want to split off a BN_lshift_fixed
than change that anyway?

The shifts sample bn_minimal_width rather than bn_correct_top because
they all seem to try to be very clever around the bit width. If we need
constant-time versions of them, we can adjust them later.

Bug: 232
Change-Id: Ie17b39034a713542dbe906cf8954c0c5483c7db7
Reviewed-on: https://boringssl-review.googlesource.com/25255
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:05:34 +00:00
Adam Langley
45210dd4e2 Tidy up |ec_GFp_simple_point2oct| and friend.
(Just happened to see these as I went by.)

Change-Id: I348b163e6986bfca8b58e56885c35a813efe28f6
Reviewed-on: https://boringssl-review.googlesource.com/25725
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-05 04:40:59 +00:00
Adam Langley
2044181e01 Set output point to the generator when not on the curve.
Processing off-curve points is sufficiently dangerous to worry about
code that doesn't check the return value of
|EC_POINT_set_affine_coordinates| and |EC_POINT_oct2point|. While we
have integrated on-curve checks into these functions, code that ignores
the return value will still be able to work with an invalid point
because it's already been installed in the output by the time the check
is done.

Instead, in the event of an off-curve point, set the output point to the
generator, which is certainly on the curve and hopefully safe.

Change-Id: Ibc73dceb2d8d21920e07c4f6def2c8249cb78ca0
Reviewed-on: https://boringssl-review.googlesource.com/25724
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-05 02:03:29 +00:00
Adam Langley
472ba2c2dd Require that Ed25519 |s| values be < order.
https://tools.ietf.org/html/rfc8032#section-5.1.7 adds this requirement
to prevent signature malleability.

Change-Id: Iac9a3649d97fc69e6efb4aea1ab1e002768fadc9
Reviewed-on: https://boringssl-review.googlesource.com/25564
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-02 20:45:08 +00:00
David Benjamin
f4b708cc1e Add a function which folds BN_MONT_CTX_{new,set} together.
These empty states aren't any use to either caller or implementor.

Change-Id: If0b748afeeb79e4a1386182e61c5b5ecf838de62
Reviewed-on: https://boringssl-review.googlesource.com/25254
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 20:23:25 +00:00
David Benjamin
feffb87168 Make BN_bn2bin_padded work with non-minimal BIGNUMs.
Checking the excess words for zero doesn't need to be in constant time,
but it's free. BN_bn2bin_padded is a little silly as read_word_padded
only exists to work around bn->top being minimal. Once non-minimal
BIGNUMs are turned on and the RSA code works right, we can simplify
BN_bn2bin_padded.

Bug: 232
Change-Id: Ib81e30ca1e5a8ea90ab3278bf4ded219bac481ac
Reviewed-on: https://boringssl-review.googlesource.com/25253
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 20:16:50 +00:00
David Benjamin
6c41465548 Remove redundant bn->top computation.
One less to worry about.

Bug: 232
Change-Id: Ib7d38e18fee02590088d76363e17f774cfefa59b
Reviewed-on: https://boringssl-review.googlesource.com/25252
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:54:09 +00:00