Commit Graph

5718 Commits

Author SHA1 Message Date
Watson Ladd
3390fd88d7 Correct outdated comments
Change-Id: Idc3a41d025fefa9017fce108bed63cb8af426c9b
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35244
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-03-07 21:55:09 +00:00
David Benjamin
f9c8d30897 Remove SSL_get_structure_sizes.
With all those structures made opaque, it's not really useful as a build
sanity-check anymore.

Update-Note: This function is removed, but I don't see any actual uses.
Change-Id: Ib5640e778466da980596e7085d97104d22aa9d33
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35184
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-03-05 17:58:10 +00:00
David Benjamin
b8d7b7498c Prefer vpaes over bsaes in AES-GCM-SIV and AES-CCM.
The AES-GCM-SIV code does not use ctr128_f at all so bsaes is simply
identical to aes_nohw. Also, while CCM encrypts with CTR mode, its MAC
is not parallelizable at all.

(Given the existence of non-parallelizable modes, we ought to make a
vpaes-armv7.pl to ensure constant-time AES on NEON. For now, pick the
right implementation for x86_64 at least.)

aes_ctr_set_key and friends probably aren't the right abstraction
(observe the large vs small inputs hint *almost* matches whether you
touch block128_f), but the right abstraction depends on a couple
questions:

- If you don't provide ctr128_f, is there a perf hit to implementing
  ctr128_f on top of your block128_f to unify calling code?

- It is almost certainly better to use bsaes with gcm.c by calling
  ctr128_f exclusively and paying some copies (a dedicated calling
  convention would be even better, but would be a headache) to integrate
  leading and trailing blocks into the CTR pass. Is this a win, loss, or
  no-op for hwaes, where block128_f is just fine? hwaes is the one mode
  we really should not regress.

Hopefully those will get answered as we continue to chip away at this.

Bug: 256
Change-Id: I8f0150b223b671e68f7da6faaff94a3bea398d4d
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35169
Reviewed-by: Adam Langley <agl@google.com>
2019-03-05 17:55:03 +00:00
David Benjamin
da8bb847fd Tell ASan about the OPENSSL_malloc prefix.
OpenSSL's BN_mul function had a single-word buffer underflow (see
576129cd72ae054d246221f111aabf42b9c6d76d). We already independently
fixed this but, if we hadn't, ASan wouldn't have noticed because of
OPENSSL_malloc.

ASan has runtime hooks we can call to make it more accurate.
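
As a rough illustration of the idea (a sketch, not the code in this change): an allocator that hides a length prefix in front of each buffer can poison that prefix, so that ASan reports an underflow into it. ASAN_POISON_MEMORY_REGION and ASAN_UNPOISON_MEMORY_REGION are the manual-poisoning macros from <sanitizer/asan_interface.h>. The names prefixed_malloc, prefixed_size, prefixed_free, and PREFIX_SIZE are hypothetical, and the 8-byte prefix is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Detect ASan under both Clang (__has_feature) and GCC (__SANITIZE_ADDRESS__). */
#if defined(__has_feature)
#if __has_feature(address_sanitizer)
#define SKETCH_ASAN 1
#endif
#endif
#if !defined(SKETCH_ASAN) && defined(__SANITIZE_ADDRESS__)
#define SKETCH_ASAN 1
#endif

#if defined(SKETCH_ASAN)
#include <sanitizer/asan_interface.h>
#else
/* Without ASan the hooks compile to nothing. */
#define ASAN_POISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#define ASAN_UNPOISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#endif

#define PREFIX_SIZE 8 /* hypothetical: room for the stored length */

/* Allocates size bytes preceded by a hidden, poisoned length prefix. */
static void *prefixed_malloc(size_t size) {
  if (size > SIZE_MAX - PREFIX_SIZE) {
    return NULL;
  }
  uint8_t *base = malloc(size + PREFIX_SIZE);
  if (base == NULL) {
    return NULL;
  }
  uint64_t stored = size;
  memcpy(base, &stored, sizeof(stored));
  /* Any access to the prefix from outside the allocator is now an error. */
  ASAN_POISON_MEMORY_REGION(base, PREFIX_SIZE);
  return base + PREFIX_SIZE;
}

/* Reads the stored length, unpoisoning the prefix only while it is read. */
static size_t prefixed_size(const void *ptr) {
  assert(ptr != NULL);
  const uint8_t *base = (const uint8_t *)ptr - PREFIX_SIZE;
  uint64_t stored;
  ASAN_UNPOISON_MEMORY_REGION(base, PREFIX_SIZE);
  memcpy(&stored, base, sizeof(stored));
  ASAN_POISON_MEMORY_REGION(base, PREFIX_SIZE);
  return (size_t)stored;
}

/* Unpoisons the prefix before handing the whole block back to free(). */
static void prefixed_free(void *ptr) {
  if (ptr == NULL) {
    return;
  }
  uint8_t *base = (uint8_t *)ptr - PREFIX_SIZE;
  ASAN_UNPOISON_MEMORY_REGION(base, PREFIX_SIZE);
  free(base);
}
```

With this in place, a write to ((uint8_t *)p)[-1] on a pointer returned by prefixed_malloc should be reported by an ASan build instead of silently corrupting the stored length.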

Change-Id: Ifc9c3837ece2bc456c5bdc960be707d7b1759904
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35165
Reviewed-by: Adam Langley <agl@google.com>
2019-03-05 17:53:16 +00:00
David Benjamin
8d685ec867 modes/asm/ghash-armv4.pl: address "infixes are deprecated" warnings.
This imports ce5eb5e8149d8d03660575f4b8504c993851988a and
1212818eb07add297fe562eba80ac46a9893781e from OpenSSL's 1.1.1 branch.

Change-Id: I121c0771371697191a163a28d972a7b3cee37762
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35164
Reviewed-by: Adam Langley <agl@google.com>
2019-03-05 17:52:28 +00:00
David Benjamin
55db667c62 Enable vpaes for aarch64, with CTR optimizations.
This patches vpaes-armv8.pl to add vpaes_ctr32_encrypt_blocks. CTR mode
is by far the most important mode these days. It should have access to
_vpaes_encrypt_2x, which gives a considerable speed boost. Also exclude
vpaes_ecb_* as they're not even used.

For iOS, this change is completely a no-op. iOS ARMv8 always has crypto
extensions, and we already statically drop all other AES
implementations.

Android ARMv8 is *not* required to have crypto extensions, but every
ARMv8 device I've seen has them. For those, it is a no-op
performance-wise and a win on size. vpaes appears to be about 5.6KiB
smaller than the tables. ARMv8 always makes SIMD (NEON) available, so we
can statically drop aes_nohw.

In theory, however, crypto-less Android ARMv8 is possible. Today such
chips get a variable-time AES. This CL fixes this, but the performance
story is complex.

The Raspberry Pi 3 is not Android but has a Cortex-A53 chip
without crypto extensions. (But the official images are 32-bit, so even
this is slightly artificial...) There, vpaes is a performance win.

Raspberry Pi 3, Model B+, Cortex-A53
Before:
Did 265000 AES-128-GCM (16 bytes) seal operations in 1003312us (264125.2 ops/sec): 4.2 MB/s
Did 44000 AES-128-GCM (256 bytes) seal operations in 1002141us (43906.0 ops/sec): 11.2 MB/s
Did 9394 AES-128-GCM (1350 bytes) seal operations in 1032104us (9101.8 ops/sec): 12.3 MB/s
Did 1562 AES-128-GCM (8192 bytes) seal operations in 1008982us (1548.1 ops/sec): 12.7 MB/s
After:
Did 277000 AES-128-GCM (16 bytes) seal operations in 1001884us (276479.1 ops/sec): 4.4 MB/s
Did 52000 AES-128-GCM (256 bytes) seal operations in 1001480us (51923.2 ops/sec): 13.3 MB/s
Did 11000 AES-128-GCM (1350 bytes) seal operations in 1007979us (10912.9 ops/sec): 14.7 MB/s
Did 2013 AES-128-GCM (8192 bytes) seal operations in 1085545us (1854.4 ops/sec): 15.2 MB/s

The Pixel 3 has a Cortex-A75 with crypto extensions, so it would never
run this code. However, artificially ignoring them gives another data
point. (ARM documentation[*] suggests the extensions are still optional
on a Cortex-A75.) Sadly, vpaes no longer wins on perf over aes_nohw.
But, it is constant-time:

Pixel 3, AES/PMULL extensions ignored, Cortex-A75:
Before:
Did 2102000 AES-128-GCM (16 bytes) seal operations in 1000378us (2101205.7 ops/sec): 33.6 MB/s
Did 358000 AES-128-GCM (256 bytes) seal operations in 1002658us (357051.0 ops/sec): 91.4 MB/s
Did 75000 AES-128-GCM (1350 bytes) seal operations in 1012830us (74049.9 ops/sec): 100.0 MB/s
Did 13000 AES-128-GCM (8192 bytes) seal operations in 1036524us (12541.9 ops/sec): 102.7 MB/s
After:
Did 1453000 AES-128-GCM (16 bytes) seal operations in 1000213us (1452690.6 ops/sec): 23.2 MB/s
Did 285000 AES-128-GCM (256 bytes) seal operations in 1002227us (284366.7 ops/sec): 72.8 MB/s
Did 60000 AES-128-GCM (1350 bytes) seal operations in 1016106us (59049.0 ops/sec): 79.7 MB/s
Did 11000 AES-128-GCM (8192 bytes) seal operations in 1094184us (10053.2 ops/sec): 82.4 MB/s

Note the numbers above run with PMULL off, so the slow GHASH is
dampening the regression. If we test aes_nohw and vpaes paired with
PMULL on, the 20% perf hit becomes a 31% hit. The PMULL-less variant is
more likely to represent a real chip.
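
The MB/s column and the roughly 20% figure can be rechecked from the raw counts in the speed output. A small sketch, assuming 1 MB = 10^6 bytes (which matches the printed values); the function names are illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in MB/s, with 1 MB = 10^6 bytes. Bytes per microsecond is
 * numerically the same as 10^6 bytes per second. */
static double throughput_mb_per_s(uint64_t ops, uint64_t bytes_per_op,
                                  uint64_t elapsed_us) {
  assert(elapsed_us > 0);
  return (double)ops * (double)bytes_per_op / (double)elapsed_us;
}

/* Fractional slowdown going from `before` to `after` (0.2 means 20%). */
static double relative_drop(double before, double after) {
  assert(before > 0.0);
  return (before - after) / before;
}
```

For the Pixel 3 8192-byte rows, throughput_mb_per_s(13000, 8192, 1036524) and throughput_mb_per_s(11000, 8192, 1094184) reproduce the 102.7 MB/s and 82.4 MB/s entries, and relative_drop of the two comes out just under 0.20.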

This is consistent with upstream's note in the comment, though it is
unclear if 20% is the right order of magnitude: "these results are worse
than scalar compiler-generated code, but it's constant-time and
therefore preferred".

[*] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100458_0301_00_en/lau1442495529696.html

Bug: 246
Change-Id: If1dc87f5131fce742052498295476fbae4628dbf
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35026
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-03-04 20:31:39 +00:00
David Benjamin
b1b4ff93ca Check in vpaes-armv8.pl from OpenSSL unused and unmodified.
This is done separately to make the diffs in the subsequent CL easier to
see. Imported from OpenSSL at revision
25ca718150cef41e1c1d9c2c8c58e2b1e2cad3fa.

Bug: 246
Change-Id: I9e7067ea177963fb9b77bf6fb39702ffe6e34ed4
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35025
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-03-04 20:23:09 +00:00
Jeremy Apthorp
1fa5abc0b4 silence unused variable warnings when using OPENSSL_clear_free
e.g. here: adbe3b837e/src/node_crypto.cc (L3439)

Change-Id: I2d43a3439d6a56c8eee3636b3c1f5ba615b233ba
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35144
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-03-04 19:55:29 +00:00
Jeremy Apthorp
19220dd6af Handle NULL public key in |EC_KEY_set_public_key|.
Node.js expects to be able to pass NULL to this function to clear the
current public key:
adbe3b837e/src/node_crypto.cc (L5316)

Change-Id: Id4e34d8e8b556c28000e4df12ff6f4432ad9220c
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35124
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-03-04 19:45:29 +00:00
David Benjamin
5ce12e6436 Add a 32-bit SSSE3 GHASH implementation.
The 64-bit version can be fairly straightforwardly translated.

Ironically, this makes 32-bit x86 the first architecture to meet the
goal of constant-time AES-GCM given SIMD assembly. (Though x86_64 could
join by simply giving up on bsaes...)

Bug: 263
Change-Id: Icb2cec936457fac7132bbb5dbb094433bc14b86e
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35024
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-03-04 19:02:52 +00:00
Robert Sloan
ae1e08709f Also include abi_test.cc in ssl_test_files.
Change-Id: I1225f1623d4438a2ccaf482eddbe4f460cfaf78c
Reviewed-on: https://boringssl-review.googlesource.com/c/35104
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-03-02 04:15:28 +00:00
David Benjamin
c3889634a1 Don't pull abi_test.cc into non-GTest targets.
The test_support is kind of a mess right now because it's sometimes used in
GTest targets and sometimes not. It really should be split into two libraries,
but do this for now to unbreak the Android build.

Change-Id: I7cd2b0f6ed9eda1a529ec3c69a92390e20da66f8
Reviewed-on: https://boringssl-review.googlesource.com/c/35084
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-03-01 20:24:27 +00:00
Alessandro Ghedini
a6124742d0 Update *_set_cert_cb documentation regarding resumption
Since 34202b93b6 cert_cb is always called
before resumption is checked.

Change-Id: I27ca5653144027a1f545a90ecb6b68e64783a66a
Reviewed-on: https://boringssl-review.googlesource.com/c/35004
Reviewed-by: David Benjamin <davidben@google.com>
2019-02-27 17:26:07 +00:00
David Benjamin
1e0262ad87 Add a reference for Linux ARM ABI.
The Android NDK docs link to an ARM GNU/Linux Application Binary Interface
Supplement document. Also fix a typo in trampoline-armv4.pl. The generic ARM
document is usually shortened to AAPCS, not APCS.

I couldn't find a corresponding link for aarch64.

Change-Id: I6e5543f5c9e26955cd3945e9e7a5dcff27c2bd78
Reviewed-on: https://boringssl-review.googlesource.com/c/35064
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-27 17:18:02 +00:00
David Benjamin
a57435e138 Remove __ARM_ARCH__ guard on gcm_*_v8.
OpenSSL's c1669e1c205dc8e695fb0c10a655f434e758b9f7 switched it to
__ARM_MAX_ARCH__, which we mirrored in assembly but not C. The C version
should be __ARM_MAX_ARCH__ to match. However, __ARM_MAX_ARCH__ is
hardcoded to 8, so just remove the check.

Change-Id: Ic873203db1478f49437b889b84ee7fb28eba1a6d
Reviewed-on: https://boringssl-review.googlesource.com/c/35045
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-27 02:26:21 +00:00
David Benjamin
f1f73f8966 Fix bsaes-armv7.pl getting disabled by accident.
https://boringssl-review.googlesource.com/c/34188 accidentally disabled
it (__ARM_MAX_ARCH__ wasn't defined), which, in turn, masked a bug in
https://boringssl-review.googlesource.com/c/34874.

Remove the __ARM_MAX_ARCH__ check as that's hardcoded to 8 anyway. Then
revert the problematic part of the bsaes-armv7.pl change. That brings
back the somewhat questionable post-dispatch to pre-dispatch call, but I
hope to patch the fallbacks out soon anyway.

Change-Id: I567e55fe35cb716d5ed56580113a302617f5ad71
Reviewed-on: https://boringssl-review.googlesource.com/c/35044
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-27 02:06:21 +00:00
David Benjamin
6443173d03 Add an option to configure bssl speed chunk size.
bsaes, in its current incarnation, hits various pathological behaviors
at different input sizes. Make it easy to experiment around them.

Bug: 256
Change-Id: Ib6c6ca7d06a570dbf7d4d2ea81c1db0d94d3d0c4
Reviewed-on: https://boringssl-review.googlesource.com/c/34876
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-25 20:25:58 +00:00
David Benjamin
98ad4d77e3 Appease GCC's uninitialized value warning.
GCC notices that one function believes < 0 is the error while the other
believes it's != 0. unw_get_reg never returns positive, but match them.

Change-Id: I40af614e6b1400bf3d398bd32beb6d3ec702bc11
Reviewed-on: https://boringssl-review.googlesource.com/c/34985
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-02-22 23:56:14 +00:00
Adam Langley
a367d9267f Set VPAES flags in x86-64 code.
The ImplDispatchTest was broken because the 64-bit VPAES code wasn't
setting the hit flags.

Change-Id: I30200db64337deba7ae9d70d8427decbdfceca58
Reviewed-on: https://boringssl-review.googlesource.com/c/34986
Reviewed-by: David Benjamin <davidben@google.com>
2019-02-22 23:41:50 +00:00
David Benjamin
65dc321492 Enable vpaes for AES_* functions.
This makes the AES_* functions meet our constant-time goals for
platforms where we have vpaes available. In particular, QUIC packet
number encryption needs single-block operations and those should have
vpaes available.

As a bonus, when vpaes is statically available, the aes_nohw_* functions
should be dropped by the linker. (Notably, NEON is guaranteed on
aarch64. Although vpaes-armv8.pl itself may take some more exploration.
https://crbug.com/boringssl/246#c4)

Bug: 263
Change-Id: Ie1c4727a166ec101a8453761757c87dadc188769
Reviewed-on: https://boringssl-review.googlesource.com/c/34875
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-22 23:09:19 +00:00
David Benjamin
3c19830f6f Avoid double-dispatch with AES_* vs aes_nohw_*.
In particular, consistently pair bsaes with aes_nohw.

Ideally the aes_nohw_* calls in bsaes-*.pl would be patched out and
bsaes grows its own constant-time key setup
(https://crbug.com/boringssl/256), but I'll sort that out separately. In
the meantime, avoid going through AES_* which now dispatch. This avoids
several nuisances:

1. If we were to add, say, a vpaes-armv7.pl the ABI tests would break.
   Fundamentally, we cannot assume that an AES_KEY has one and only one
   representation and must keep everything matching up.

2. AES_* functions should enable vpaes. This makes AES_* faster and
   constant-time for vector-capable CPUs
   (https://crbug.com/boringssl/263), relevant for QUIC packet number
   encryption, allowing us to add vpaes-armv8.pl
   (https://crbug.com/boringssl/246) without carrying a (likely) mostly
   unused AES implementation.

3. It's silly to double-dispatch when the EVP layer has already
   dispatched.

4. We should avoid asm calling into C. Otherwise, we need to test asm
   for ABI compliance as both caller and callee. Currently we only test
   it for callee compliance. When asm calls into asm, it *should* comply
   with the ABI as caller too, but mistakes don't matter as long as the
   called function triggers it. If the function is asm, this is fixed.
   If it is C, we must care about arbitrary C compiler output.

Bug: 263
Change-Id: Ic85af5c765fd57cbffeaf301c3872bad6c5bbf78
Reviewed-on: https://boringssl-review.googlesource.com/c/34874
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-22 22:51:51 +00:00
Kaustubha Govind
c18353d214 Add uint64_t support in CBS and CBB.
We need these APIs to parse some Certificate Transparency structures.
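
For readers unfamiliar with this kind of API, a self-contained sketch of fixed-width 64-bit parsing and serialization over a byte cursor follows. The type and function names (byte_reader, reader_get_u64, put_u64_be) are hypothetical and are not BoringSSL's CBS/CBB interface; big-endian byte order is assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over an input buffer. */
struct byte_reader {
  const uint8_t *data;
  size_t len;
};

/* Reads a big-endian uint64_t and advances the cursor.
 * Returns 1 on success and 0 if fewer than 8 bytes remain. */
static int reader_get_u64(struct byte_reader *r, uint64_t *out) {
  if (r->len < 8) {
    return 0;
  }
  uint64_t v = 0;
  for (size_t i = 0; i < 8; i++) {
    v = (v << 8) | r->data[i];
  }
  r->data += 8;
  r->len -= 8;
  *out = v;
  return 1;
}

/* Writes v into out[0..7], most significant byte first. */
static void put_u64_be(uint8_t out[8], uint64_t v) {
  for (int i = 7; i >= 0; i--) {
    out[i] = (uint8_t)(v & 0xff);
    v >>= 8;
  }
}
```

Building the value byte by byte keeps the code independent of host endianness and avoids any unaligned 64-bit load.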

Bug: chromium:634570
Change-Id: I4eb46058985a7369dc119ba6a1214913b237da39
Reviewed-on: https://boringssl-review.googlesource.com/c/34944
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-02-22 20:38:17 +00:00
David Benjamin
f109f20873 Clear out a bunch of -Wextra-semi warnings.
Unfortunately, it's not enough to be able to turn it on thanks to the
PURE_VIRTUAL macro. But it gets us most of the way there.

Change-Id: Ie6ad5119fcfd420115fa49d7312f3586890244f4
Reviewed-on: https://boringssl-review.googlesource.com/c/34949
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-02-21 19:12:39 +00:00
Steven Valdez
0326105aa9 Add compiled python files to .gitignore.
Change-Id: If5d88d88bd1ea8189cc715cc38e70bd3b11c4b67
Reviewed-on: https://boringssl-review.googlesource.com/c/34950
Commit-Queue: Steven Valdez <svaldez@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-02-21 17:41:59 +00:00
David Benjamin
24a18b8a40 Fix x86_64-xlate.pl comment regex.
This did not correctly capture lines like the following:

https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/chacha/asm/chacha-x86_64.pl#260
https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/fipsmodule/aes/asm/aes-x86_64.pl#992
https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/fipsmodule/aes/asm/aesni-x86_64.pl#641
https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/fipsmodule/aes/asm/bsaes-x86_64.pl#387
https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/fipsmodule/modes/asm/ghash-x86_64.pl#455
https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/fipsmodule/ec/asm/p256-x86_64-asm.pl#92

Reportedly that last one causes problems with some assemblers.

Change-Id: I82d6f0d81b902e48fad3c45947f84f02370eb1ab
Reviewed-on: https://boringssl-review.googlesource.com/c/34925
Reviewed-by: Adam Langley <agl@google.com>
2019-02-21 16:50:17 +00:00
David Benjamin
1908667015 Add go 1.11 to go.mod.
Go 1.12 really wants to record a version in go.mod if there is no
version in there. 1.12 is not yet released, so stick 1.11 in there for
now. We'll bump it to 1.12 and so on as we update our minimum versions.

Change-Id: I79ac85837149ab7cadd2f23acd8ab2d207a1a355
Reviewed-on: https://boringssl-review.googlesource.com/c/34924
Reviewed-by: Adam Langley <agl@google.com>
2019-02-21 16:42:44 +00:00
David Benjamin
104306f587 Remove STRICT_ALIGNMENT code from modes.
STRICT_ALIGNMENT is a remnant of OpenSSL code that would cast pointers to
size_t* and load more than one byte at a time. Not all architectures
support unaligned access, so it did an alignment check and only entered
this path if aligned or the underlying architecture didn't care.

This is UB. Unaligned casts in C are undefined on all architectures, so
we switched these to memcpy some time ago. Compilers can optimize memcpy
to the unaligned accesses we wanted. That left our modes logic as:

- If STRICT_ALIGNMENT is 1 and things are unaligned, work byte-by-byte.

- Otherwise, use the memcpy-based word-by-word code, which now works
  independent of STRICT_ALIGNMENT.

Remove the first check to simplify things. On x86, x86_64, and aarch64,
STRICT_ALIGNMENT is zero and this is a no-op. ARM is more complex. Per
[0], ARMv7 and up support unaligned access. ARMv5 does not. ARMv6 does,
but can run in a mode where it looks more like ARMv5.

For ARMv7 and up, STRICT_ALIGNMENT should have been zero, but was one.
Thus this change should be an improvement for ARMv7 (right now unaligned
inputs lose bsaes-armv7). The Android NDK does not even support the
pre-ARMv7 ABI anymore[1]. Nonetheless, Cronet still supports ARMv6 as a
library. It builds with -march=armv6 which GCC interprets as supporting
unaligned access, so it too did not want this code.

For completeness, should anyone still care about ARMv5 or be building
with an overly permissive -march flag, GCC does appear unable to inline
the memcpy calls. However, GCC also does not interpret
(uintptr_t)ptr % sizeof(size_t) as an alignment assertion, so such
consumers have already been paying for the memcpy here and throughout
the library.

In general, C's arcane pointer rules mean we must resort to memcpy
often, so, realistically, we must require that the compiler optimize
memcpy well.
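
A minimal sketch of the memcpy-based word-by-word pattern described above (illustrative only; load_word, store_word, and xor_bytes are hypothetical names, not the functions in the modes code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Loads a size_t from a possibly unaligned address without casting the
 * pointer. Compilers typically lower this to a single load where the
 * target allows unaligned access. */
static inline size_t load_word(const void *in) {
  size_t v;
  memcpy(&v, in, sizeof(v));
  return v;
}

/* Stores a size_t to a possibly unaligned address. */
static inline void store_word(void *out, size_t v) {
  memcpy(out, &v, sizeof(v));
}

/* XORs len bytes of a and b into out, a word at a time where possible.
 * No alignment check is needed, so there is no separate byte-only path
 * for unaligned inputs. */
static void xor_bytes(uint8_t *out, const uint8_t *a, const uint8_t *b,
                      size_t len) {
  size_t i = 0;
  for (; i + sizeof(size_t) <= len; i += sizeof(size_t)) {
    store_word(out + i, load_word(a + i) ^ load_word(b + i));
  }
  for (; i < len; i++) {
    out[i] = a[i] ^ b[i];
  }
}
```

The same function is valid C for aligned and unaligned buffers alike, which is what lets the STRICT_ALIGNMENT branch go away.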

[0] https://medium.com/@iLevex/the-curious-case-of-unaligned-access-on-arm-5dd0ebe24965
[1] https://developer.android.com/ndk/guides/abis#armeabi

Change-Id: I3c7dea562adaeb663032e395499e69530dd8e145
Reviewed-on: https://boringssl-review.googlesource.com/c/34873
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:39:36 +00:00
David Benjamin
d8598ce03f Remove non-STRICT_ALIGNMENT code from xts.c.
Independent of the underlying CPU architecture, casting unaligned
pointers to uint64_t* is undefined. Just use a memcpy. The compiler
should be able to optimize that itself.

Change-Id: I39210871fca3eaf1f4b1d205b2bb0c337116d9cc
Reviewed-on: https://boringssl-review.googlesource.com/c/34872
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:32:11 +00:00
David Benjamin
4d8e1ce5e9 Patch XTS out of ARMv7 bsaes too.
Bug: 256
Change-Id: I822274bf05901d82b41dc9c9c4e6d0b5d622f3ff
Reviewed-on: https://boringssl-review.googlesource.com/c/34871
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:31:37 +00:00
David Benjamin
fb35b147ca Remove stray prototype.
The function's since been renamed.

Change-Id: Id1a9788dfeb5c46b3463611b08318b3f253d03df
Reviewed-on: https://boringssl-review.googlesource.com/c/34870
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:31:14 +00:00
David Benjamin
eb2c2cdf17 Always define GHASH.
There is a C implementation of gcm_ghash_4bit to pair with
gcm_gmult_4bit. It's even slightly faster per the numbers below (x86_64
OPENSSL_NO_ASM build), but, more importantly, we trim down the
combinatorial explosion of GCM implementations and free up complexity
budget for potentially using bsaes better in the future.

Old:
Did 2557000 AES-128-GCM (16 bytes) seal operations in 1000057us (2556854.3 ops/sec): 40.9 MB/s
Did 94000 AES-128-GCM (1350 bytes) seal operations in 1009613us (93105.0 ops/sec): 125.7 MB/s
Did 17000 AES-128-GCM (8192 bytes) seal operations in 1024768us (16589.1 ops/sec): 135.9 MB/s
Did 2511000 AES-256-GCM (16 bytes) seal operations in 1000196us (2510507.9 ops/sec): 40.2 MB/s
Did 84000 AES-256-GCM (1350 bytes) seal operations in 1000412us (83965.4 ops/sec): 113.4 MB/s
Did 15000 AES-256-GCM (8192 bytes) seal operations in 1046963us (14327.2 ops/sec): 117.4 MB/s

New:
Did 2739000 AES-128-GCM (16 bytes) seal operations in 1000322us (2738118.3 ops/sec): 43.8 MB/s
Did 100000 AES-128-GCM (1350 bytes) seal operations in 1008190us (99187.7 ops/sec): 133.9 MB/s
Did 17000 AES-128-GCM (8192 bytes) seal operations in 1006360us (16892.6 ops/sec): 138.4 MB/s
Did 2546000 AES-256-GCM (16 bytes) seal operations in 1000150us (2545618.2 ops/sec): 40.7 MB/s
Did 86000 AES-256-GCM (1350 bytes) seal operations in 1000970us (85916.7 ops/sec): 116.0 MB/s
Did 14850 AES-256-GCM (8192 bytes) seal operations in 1023459us (14509.6 ops/sec): 118.9 MB/s

While I'm here, tighten up some of the functions and align the ctr32 and
non-ctr32 paths.

Bug: 256
Change-Id: Id4df699cefc8630dd5a350d44f927900340f5e60
Reviewed-on: https://boringssl-review.googlesource.com/c/34869
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:30:55 +00:00
Watson Ladd
2f213f643f Update delegated credentials to draft-03
Change-Id: I0c648340ac7bb134fcda42c56a83f4815bbaa557
Reviewed-on: https://boringssl-review.googlesource.com/c/34884
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-02-13 20:04:33 +00:00
David Benjamin
b22c9fea47 Use Windows symbol APIs in the unwind tester.
This should make things a bit easier to debug.

Update-Note: Test binaries on Windows now link to dbghelp.
Bug: 259
Change-Id: I9da1fc89d429080c5250238e4341445922b1dd8e
Reviewed-on: https://boringssl-review.googlesource.com/c/34868
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-02-12 20:42:47 +00:00
David Benjamin
2e819d8be4 Unwind RDRAND functions correctly on Windows.
But for the ABI conversion bits, these are just leaf functions and don't
even need unwind tables. Just renumber the registers on Windows to only
use volatile ones.

In doing so, this switches to writing rdrand explicitly. perlasm already
knows how to manually encode it and our minimum assembler versions
surely cover rdrand by now anyway. Also add the .size directive. I'm not
sure what it's used for, but the other files have it.

(This isn't a generally reusable technique. The more complex functions
will need actual unwind codes.)

Bug: 259
Change-Id: I1d5669bcf8b6e34939885d78aea6f60597be1528
Reviewed-on: https://boringssl-review.googlesource.com/c/34867
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-12 20:24:27 +00:00
David Benjamin
15ba2d11a9 Patch out unused aesni-x86_64 functions.
This shrinks the bssl binary by about 8k.

Change-Id: I571f258ccf7032ae34db3f20904ad9cc81cca839
Reviewed-on: https://boringssl-review.googlesource.com/c/34866
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-11 20:25:22 +00:00
David Benjamin
cc2b8e2552 Add ABI tests for aesni-gcm-x86_64.pl.
Change-Id: Ic23fc5fbec2c4f8df5d06f807c6bd2c5e1f0e99c
Reviewed-on: https://boringssl-review.googlesource.com/c/34865
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-11 20:08:38 +00:00
David Benjamin
7a3b94cd2c Add ABI tests for x86_64-mont5.pl.
Fix some missing CFI bits.

Change-Id: I42114527f0ef8e03079d37a9f466d64a63a313f5
Reviewed-on: https://boringssl-review.googlesource.com/c/34864
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-11 19:27:13 +00:00
Jeremy Apthorp
7ef4223fb3 sync EVP_get_cipherbyname with EVP_do_all_sorted
EVP_get_cipherbyname should work on everything that EVP_do_all_sorted
lists, and conversely, there should be nothing that
EVP_get_cipherbyname works on that EVP_do_all_sorted doesn't list.

Node.js uses these APIs to enumerate and instantiate ciphers.

Change-Id: I87fcedce62d06774f7c6ee7acc898326276be089
Reviewed-on: https://boringssl-review.googlesource.com/c/33984
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-02-11 17:20:23 +00:00
Katrin Leinweber
d2a0ffdfa7 Hyperlink DOI to preferred resolver
Change-Id: Ib9983a74d5d2f8be7c96cedde17be5a4e9223d5e
Reviewed-on: https://boringssl-review.googlesource.com/c/34844
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-02-08 19:20:05 +00:00
David Benjamin
a6c689e0da Remove stray semicolons.
Thanks to Nico Weber for pointing this out.

Change-Id: I763fd4a6f8fe467a027d5b249d9f76633ab4375a
Reviewed-on: https://boringssl-review.googlesource.com/c/34824
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
2019-02-07 17:36:54 +00:00
Adam Langley
2d38b83976 Remove separate default group list for servers.
It's the same as for clients, and we're probably not going to change
that any time soon.

Change-Id: Ic48cb640e98b0957d264267b97b5393f1977c6e6
Reviewed-on: https://boringssl-review.googlesource.com/c/34665
Reviewed-by: David Benjamin <davidben@google.com>
2019-02-06 00:33:29 +00:00
Adam Langley
fcc1ad78f9 Enable all curves (inc CECPQ2) during fuzzing.
Change-Id: I8083e841de135e9ec244609b1c20f0280ce20072
Reviewed-on: https://boringssl-review.googlesource.com/c/34664
Reviewed-by: David Benjamin <davidben@google.com>
2019-02-06 00:32:45 +00:00
David Benjamin
70fe610556 Implement ABI testing for aarch64.
This caught a bug in bn_mul_mont. Tested manually on iOS and Android.

Change-Id: I1819fcd9ad34dbe3ba92bba952507d86dd12185a
Reviewed-on: https://boringssl-review.googlesource.com/c/34805
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 21:44:04 +00:00
David Benjamin
55b9acda99 Fix ABI error in bn_mul_mont on aarch64.
This was caught by an aarch64 ABI tester. aarch64 has the same
considerations around small arguments as x86_64 does. The aarch64
version of bn_mul_mont does not mask off the upper words of the
argument.

The x86_64 version does, so size_t is, strictly speaking, wrong for
aarch64, but bn_mul_mont already has an implicit size limit to support
its internal alloca, so this doesn't really make things worse than
before.

Change-Id: I39bffc8fdb2287e45a2d1f0d1b4bd5532bbf3868
Reviewed-on: https://boringssl-review.googlesource.com/c/34804
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 21:17:54 +00:00
David Benjamin
0a87c4982c Implement ABI testing for ARM.
Update-Note: There's some chance this'll break iOS since I was unable to
test it there. The iPad I have to test on is too new to run 32-bit code
at all.

Change-Id: I6593f91b67a5e8a82828237d3b69ed948b07922d
Reviewed-on: https://boringssl-review.googlesource.com/c/34725
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 21:01:44 +00:00
David Benjamin
0a67eba62d Fix the order of Windows unwind codes.
The unwind tester suggests Windows doesn't care, but the documentation
says that unwind codes should be sorted in descending offset, which
means the last instruction should be first.

https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2017#struct-unwind_code
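
To make the ordering rule concrete, here is a sketch that sorts a list of unwind codes by descending prolog offset. The struct is a simplified stand-in for the documented UNWIND_CODE layout (the real one packs the operation and its info into 4-bit fields), and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for one unwind code entry. */
struct unwind_code_sketch {
  uint8_t code_offset; /* offset of the end of the prolog instruction */
  uint8_t unwind_op;   /* operation code */
  uint8_t op_info;     /* operation-specific info */
};

/* qsort comparator: larger offsets (later instructions) sort first. */
static int by_descending_offset(const void *a, const void *b) {
  const struct unwind_code_sketch *x = a;
  const struct unwind_code_sketch *y = b;
  return (int)y->code_offset - (int)x->code_offset;
}

static void sort_unwind_codes(struct unwind_code_sketch *codes, size_t n) {
  qsort(codes, n, sizeof(codes[0]), by_descending_offset);
}

/* Returns 1 if the array is already in the documented order. */
static int is_descending(const struct unwind_code_sketch *codes, size_t n) {
  for (size_t i = 1; i < n; i++) {
    if (codes[i - 1].code_offset < codes[i].code_offset) {
      return 0;
    }
  }
  return 1;
}
```

Codes written in prolog order (ascending offset) fail is_descending; after sort_unwind_codes the entry for the last prolog instruction comes first.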

Bug: 259
Change-Id: I21e54c362e18e0405f980005112cc3f7c417c70c
Reviewed-on: https://boringssl-review.googlesource.com/c/34785
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 19:38:23 +00:00
David Benjamin
28f035f48b Implement unwind testing for Windows.
Unfortunately, due to most OpenSSL assembly using custom exception
handlers to unwind, most of our assembly doesn't work with
non-destructive unwind. For now, CHECK_ABI behaves like
CHECK_ABI_NO_UNWIND on Windows, and CHECK_ABI_SEH will test unwinding on
both platforms.

The tests do, however, work with the unwind-code-based assembly we
recently added, as well as the clmul-based GHASH which is also
code-based. Remove the ad-hoc SEH tests which intentionally hit memory
access exceptions, now that we can test unwind directly.

Now that we can test it, the next step is to implement SEH directives in
perlasm so writing these unwind codes is less of a chore.

Bug: 259
Change-Id: I23a57a22c5dc9fa4513f575f18192335779678a5
Reviewed-on: https://boringssl-review.googlesource.com/c/34784
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 19:22:15 +00:00
David Benjamin
fc31677a1d Tolerate spaces when parsing .type directives.
The .type foo, @abi-omnipotent lines weren't being parsed correctly.
This doesn't change the generated files, but some internal state (used by
in-progress work on perlasm SEH directives) wasn't quite right.

Change-Id: Id6aec79281a59f45b2eb2aea9f1fb8806b4c483e
Reviewed-on: https://boringssl-review.googlesource.com/c/34786
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 15:47:26 +00:00
David Benjamin
20a9b409bb runner: Don't generate an RSA key on startup.
RSA keygen isn't the fastest. Just use the existing one in
rsaCertificate.

Change-Id: Icd151232928e67e0a7d5becabf9dc96b0e9bfa22
Reviewed-on: https://boringssl-review.googlesource.com/c/34764
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
2019-02-04 16:08:41 +00:00
David Benjamin
33f456b8b0 Don't use bsaes over vpaes for CTR-DRBG.
RAND_bytes rarely uses large enough inputs for bsaes to be worth it.
https://boringssl-review.googlesource.com/c/boringssl/+/33589 includes some
rough benchmarks of various bits here. Some observations:

- 8 blocks of bsaes costs roughly 6.5 blocks of vpaes. Note the comparison
  isn't quite accurate because I'm measuring bsaes_ctr32_encrypt_blocks against
  vpaes_encrypt and vpaes in CTR mode today must make do with a C loop. Even
  assuming a cutoff of 6 rather than 7 blocks, it's rare to ask for 96 bytes
  of entropy at a time.

- CTR-DRBG performs some stray block operations (ctr_drbg_update), which bsaes
  is bad at without extra work to fold them into the CTR loop (not really worth
  it).

- CTR-DRBG calculates a couple new key schedules every RAND_bytes call. We
  don't currently have a constant-time bsaes key schedule. Unfortunately, even
  plain vpaes loses to the current aes_nohw used by bsaes, but aes_nohw is not
  constant-time. Also taking CTR-DRBG out of the bsaes equation

- Machines without AES hardware (clients) are not going to be RNG-bound. It's
  mostly servers pushing way too many CBC IVs that care. This means bsaes's
  current side channel tradeoffs make even less sense here.

I'm not sure yet what we should do for the rest of the bsaes mess, but it seems
clear that we want to stick with vpaes for the RNG.
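
The cutoff reasoning in the first bullet can be sketched as a toy cost model. The assumptions are mine, not measurements from this change: bsaes always pays for a full batch of 8 blocks at the quoted cost of about 6.5 vpaes blocks, vpaes cost is linear in the block count, and key setup is ignored. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define AES_BLOCK_BYTES 16
#define BSAES_BATCH_BLOCKS 8

/* Costs are in tenths of one vpaes block so the model stays in integers. */
static size_t vpaes_cost_tenths(size_t blocks) {
  return blocks * 10;
}

/* Each batch of up to 8 blocks costs about 6.5 vpaes blocks. */
static size_t bsaes_cost_tenths(size_t blocks) {
  size_t batches = (blocks + BSAES_BATCH_BLOCKS - 1) / BSAES_BATCH_BLOCKS;
  return batches * 65;
}

/* Returns 1 if, under this model, bsaes is cheaper for a request of
 * `bytes` bytes of CTR output, and 0 otherwise. */
static int bsaes_is_cheaper(size_t bytes) {
  size_t blocks = (bytes + AES_BLOCK_BYTES - 1) / AES_BLOCK_BYTES;
  return blocks > 0 && bsaes_cost_tenths(blocks) < vpaes_cost_tenths(blocks);
}
```

Under these assumptions bsaes only wins from 7 blocks (112 bytes) upward within a single batch, which is why small RAND_bytes requests favor vpaes. The model treats a trailing partial batch as a full one, so it overstates the bsaes cost just past a batch boundary.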

Bug: 256
Change-Id: Iec8f13af232794afd007cb1065913e8117eeee24
Reviewed-on: https://boringssl-review.googlesource.com/c/34744
Reviewed-by: Adam Langley <agl@google.com>
2019-02-01 18:03:39 +00:00