
5674 Commits

Author SHA1 Message Date
David Benjamin
0a87c4982c Implement ABI testing for ARM.
Update-Note: There's some chance this'll break iOS since I was unable to
test it there. The iPad I have to test on is too new to run 32-bit code
at all.

Change-Id: I6593f91b67a5e8a82828237d3b69ed948b07922d
Reviewed-on: https://boringssl-review.googlesource.com/c/34725
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 21:01:44 +00:00
David Benjamin
0a67eba62d Fix the order of Windows unwind codes.
The unwind tester suggests Windows doesn't care, but the documentation
says that unwind codes should be sorted in descending offset, which
means the last instruction should be first.

https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2017#struct-unwind_code

Bug: 259
Change-Id: I21e54c362e18e0405f980005112cc3f7c417c70c
Reviewed-on: https://boringssl-review.googlesource.com/c/34785
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 19:38:23 +00:00
David Benjamin
28f035f48b Implement unwind testing for Windows.
Unfortunately, due to most OpenSSL assembly using custom exception
handlers to unwind, most of our assembly doesn't work with
non-destructive unwind. For now, CHECK_ABI behaves like
CHECK_ABI_NO_UNWIND on Windows, and CHECK_ABI_SEH will test unwinding on
both platforms.

The tests do, however, work with the unwind-code-based assembly we
recently added, as well as the clmul-based GHASH which is also
code-based. Remove the ad-hoc SEH tests which intentionally hit memory
access exceptions, now that we can test unwind directly.

Now that we can test it, the next step is to implement SEH directives in
perlasm so writing these unwind codes is less of a chore.

Bug: 259
Change-Id: I23a57a22c5dc9fa4513f575f18192335779678a5
Reviewed-on: https://boringssl-review.googlesource.com/c/34784
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 19:22:15 +00:00
David Benjamin
fc31677a1d Tolerate spaces when parsing .type directives.
The .type foo, @abi-omnipotent lines weren't being parsed correctly.
This doesn't change the generated files, but some internal state (used in
in-progress work on perlasm SEH directives) wasn't quite right.

Change-Id: Id6aec79281a59f45b2eb2aea9f1fb8806b4c483e
Reviewed-on: https://boringssl-review.googlesource.com/c/34786
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 15:47:26 +00:00
David Benjamin
20a9b409bb runner: Don't generate an RSA key on startup.
RSA keygen isn't the fastest. Just use the existing one in
rsaCertificate.

Change-Id: Icd151232928e67e0a7d5becabf9dc96b0e9bfa22
Reviewed-on: https://boringssl-review.googlesource.com/c/34764
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
2019-02-04 16:08:41 +00:00
David Benjamin
33f456b8b0 Don't use bsaes over vpaes for CTR-DRBG.
RAND_bytes rarely uses large enough inputs for bsaes to be worth it.
https://boringssl-review.googlesource.com/c/boringssl/+/33589 includes some
rough benchmarks of various bits here. Some observations:

- 8 blocks of bsaes costs roughly 6.5 blocks of vpaes. Note the comparison
  isn't quite accurate because I'm measuring bsaes_ctr32_encrypt_blocks against
  vpaes_encrypt and vpaes in CTR mode today must make do with a C loop. Even
  assuming a cutoff of 6 rather than 7 blocks, it's rare to ask for 96 bytes
  of entropy at a time.

- CTR-DRBG performs some stray block operations (ctr_drbg_update), which bsaes
  is bad at without extra work to fold them into the CTR loop (not really worth
  it).

- CTR-DRBG calculates a couple new key schedules every RAND_bytes call. We
  don't currently have a constant-time bsaes key schedule. Unfortunately, even
  plain vpaes loses to the current aes_nohw used by bsaes, but it's not
  constant-time. Also taking CTR-DRBG out of the bsaes equation

- Machines without AES hardware (clients) are not going to be RNG-bound. It's
  mostly servers pushing way too many CBC IVs that care. This means bsaes's
  current side channel tradeoffs make even less sense here.

I'm not sure yet what we should do for the rest of the bsaes mess, but it seems
clear that we want to stick with vpaes for the RNG.

Bug: 256
Change-Id: Iec8f13af232794afd007cb1065913e8117eeee24
Reviewed-on: https://boringssl-review.googlesource.com/c/34744
Reviewed-by: Adam Langley <agl@google.com>
2019-02-01 18:03:39 +00:00
David Benjamin
470bd56c9b perlasm/x86_64-xlate.pl: refine symbol recognition in .xdata.
Hexadecimals were erroneously recognized as symbols in .xdata.

(Imported from upstream's b068a9b914887af5cc99895754412582fbb0e10b)

Change-Id: I5d8e8e1969669a8961733802d9f034cf26c45552
Reviewed-on: https://boringssl-review.googlesource.com/c/34704
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-02-01 18:02:44 +00:00
David Benjamin
9978f0a865 Add instructions for debugging on Android with gdb.
Android's official documentation seems to assume you're using the NDK
build system or Android Studio. I extracted this from one of their
scripts a while back. May as well put it somewhere we can easily find
it.

Change-Id: I259abc54e6935ab537956a7cbf9f80e924a60b7a
Reviewed-on: https://boringssl-review.googlesource.com/c/34724
Reviewed-by: Adam Langley <agl@google.com>
2019-02-01 02:51:11 +00:00
Jesse Selover
d7266ecc9b Enforce key usage for RSA keys in TLS 1.2.
For now, this is off by default and controlled by SSL_set_enforce_rsa_key_usage.
This may be set as late as certificate verification so we may start by enforcing
it for known roots.

Generalizes ssl_cert_check_digital_signature_key_usage to check any part of the
key_usage, and adds a new error KEY_USAGE_BIT_INCORRECT for the generalized
method.

Bug: chromium:795089
Change-Id: Ifa504c321bec3263a4e74f2dc48513e3b895d3ee
Reviewed-on: https://boringssl-review.googlesource.com/c/34604
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-30 21:28:34 +00:00
David Benjamin
1a51a5b4a6 Remove infra/config folder in master branch.
As of https://boringssl-review.googlesource.com/c/34584, the LUCI config
has been consolidated on the infra/config branch.

Change-Id: Idd9f38b99197b9ff324d98c4aecb5d8fe94a2f9e
Reviewed-on: https://boringssl-review.googlesource.com/c/34684
Reviewed-by: Andrii Shyshkalov <tandrii@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-30 00:21:43 +00:00
Filippo Valsorda
73308b6606 Avoid SCT/OCSP extensions in SH on {Omit|Empty}Extensions
They were causing a "panic: ServerHello unexpectedly contained extensions"
if the client unconditionally signals support for OCSP or SCTs.

Change-Id: Ia60639431daf78679b269dfe337c1af171fd7d8b
Reviewed-on: https://boringssl-review.googlesource.com/c/34644
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-29 00:51:31 +00:00
David Benjamin
23e1a1f2d3 Test and fix an ABI issue with small parameters.
Calling conventions must specify how to handle arguments smaller than a
machine word. Should the caller pad them up to a machine word size with
predictable values (zero/sign-extended), or should the callee tolerate
an arbitrary bit pattern?

Annoyingly, I found no text in either SysV or Win64 ABI documentation
describing any of this and resorted to experiment. The short answer is
that callees must tolerate an arbitrary bit pattern on x86_64, which
means we must test this. See the comment in abi_test::internal::ToWord
for the long answer.

CHECK_ABI now, if the type of the parameter is smaller than
crypto_word_t, fills the remaining bytes with 0xaa. This is so the
number is out of bounds for code expecting either zero or sign
extension. (Not that crypto assembly has any business seeing negative
numbers.)

Doing so reveals a bug in ecp_nistz256_ord_sqr_mont. The rep parameter
is typed int, but the code expected uint64_t. In practice, the compiler
will always compile this correctly because:

- On both Win64 and SysV, rep is a register parameter.

- The rep parameter is always a constant, so the compiler has no reason
  to leave garbage in the upper half.

However, I was indeed able to get a bug out of GCC via:

  uint64_t foo = (1ull << 63) | 2;  // Some global the compiler can't
                                    // prove constant.
  ecp_nistz256_ord_sqr_mont(res, a, foo >> 1);

Were ecp_nistz256_ord_sqr_mont a true int-taking function, this would
act like ecp_nistz256_ord_sqr_mont(res, a, 1). Instead, it hung. Fix
this by having it take a full-width word.

This mess has several consequences:

- ABI testing now ideally needs a functional testing component to fully cover
  this case. A bad input might merely produce the wrong answer. Still,
  this is fairly effective as it will cause most code to either segfault
  or loop forever. (Not the enc parameter to AES however...)

- We cannot freely change the type of assembly function prototypes. If the
  prototype says int or unsigned, it must be ignoring the upper half and
  thus "fixing" it to size_t cannot have handled the full range. (Unless
  it was simply wrong or the parameter is already bounded.) If the
  prototype says size_t, switching to int or unsigned will hit this type
  of bug. The former is a safer failure mode though.

- The simplest path out of this mess: new assembly code should *only*
  ever take word-sized parameters. This is not a tall order as the bad
  parameters are usually ints that should have been size_t.

Calling conventions are hard.
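
The 0xaa padding described above can be sketched roughly as follows. This is only an illustration for a 32-bit argument in a 64-bit register; pad_arg32 and callee_view32 are hypothetical names, not the actual abi_test::internal::ToWord code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen a 32-bit argument to a 64-bit register image
 * whose upper half is 0xaa bytes, so the image matches neither the
 * zero-extended nor the sign-extended form of the value. */
static uint64_t pad_arg32(uint32_t value) {
  uint64_t padded = UINT64_C(0xaaaaaaaa00000000) | (uint64_t)value;
  assert((uint32_t)padded == value); /* the low half still carries the value */
  return padded;
}

/* A callee that honors an int-sized prototype reads only the low 32 bits. */
static uint32_t callee_view32(uint64_t reg) {
  return (uint32_t)reg;
}
```

A callee written this way sees the same value whatever the upper half holds, while one that reads the full register sees a huge number and, as in the GCC experiment above, may hang or crash.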

Change-Id: If8254aff8953844679fbce4bd3e345e5e2fa5213
Reviewed-on: https://boringssl-review.googlesource.com/c/34627
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-28 21:09:40 +00:00
David Benjamin
ab578adf44 Add RSAZ ABI tests.
As part of this, move the CPU checks to C.

Change-Id: I17b701e1196c1ca116bbd23e0e669cf603ad464d
Reviewed-on: https://boringssl-review.googlesource.com/c/34626
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-28 21:00:49 +00:00
David Benjamin
3859fc883d Better document RSAZ and tidy up types.
It's an assembly function, so types are a little meaningless, but
everything is passed through as BN_ULONG, so be consistent. Also
annotate all the RSAZ prototypes with sizes.

Change-Id: I32e59e896da39e79c30ce9db52652fd645a033b4
Reviewed-on: https://boringssl-review.googlesource.com/c/34625
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-28 20:54:27 +00:00
David Benjamin
e569c7e25d Add ABI testing for 32-bit x86.
This is much less interesting (stack-based parameters, Windows and SysV
match, no SEH concerns as far as I can tell) than x86_64, but it was
easy to do and I'm more familiar with x86 than ARM, so it made a better
second architecture to make sure all the architecture ifdefs worked out.

Also fix a bug in the x86_64 direction flag code. It was shifting in the
wrong direction, making it give 0 or 1<<20 rather than 0 or 1.

(Happily, x86_64 appears to be unique in having vastly different calling
conventions between OSs. x86 is the same between SysV and Windows, and
ARM had the good sense to specify a (mostly) common set of rules.)

Since a lot of the assembly functions use the same names and the tests
were written generically, merely dropping in a trampoline and
CallerState implementation gives us a bunch of ABI tests for free.
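
For illustration, the direction flag fix can be pictured as below. DF is bit 10 of EFLAGS/RFLAGS. The function names and the exact shape of the buggy expression are assumptions made for this sketch, not the code from the change.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_DF_BIT 10 /* DF is bit 10 of EFLAGS/RFLAGS */
static_assert(EFLAGS_DF_BIT < 64, "bit index must fit in a 64-bit word");

/* Returns 0 or 1 depending on whether the direction flag is set. */
static uint64_t direction_flag(uint64_t flags) {
  return (flags >> EFLAGS_DF_BIT) & 1;
}

/* Assumed shape of the bug: the masked bit is shifted left instead of
 * right, which yields 0 or 1<<20. */
static uint64_t direction_flag_buggy(uint64_t flags) {
  return (flags & (UINT64_C(1) << EFLAGS_DF_BIT)) << EFLAGS_DF_BIT;
}
```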

Change-Id: I15408c18d43e88cfa1c5c0634a8b268a150ed961
Reviewed-on: https://boringssl-review.googlesource.com/c/34624
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-28 20:40:06 +00:00
David Benjamin
8cbb5f8f20 Add a very roundabout EC keygen API.
OpenSSL's EVP-level EC API involves a separate "paramgen" operation,
which is ultimately just a roundabout way to go from a NID to an
EC_GROUP. But Node uses this, and it's the pattern used within OpenSSL
these days, so this appears to be the official upstream recommendation.

Also add a #define for OPENSSL_EC_EXPLICIT_CURVE, because Node uses it,
but fail attempts to use it. Explicit curve encodings are forbidden by
RFC 5480 and generally a bad idea. (Parsing such keys back into OpenSSL
will cause it to lose the optimized path.)

Change-Id: I5e97080e77cf90fc149f6cf6f2cc4900f573fc64
Reviewed-on: https://boringssl-review.googlesource.com/c/34565
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-25 23:08:12 +00:00
David Benjamin
23dcf88e18 Add some Node compatibility functions.
This doesn't cover all the functions used by Node, but it's the easy
bits. (EVP_PKEY_paramgen will be done separately as it's a non-trivial
bit of machinery.)

Change-Id: I6501e99f9239ffcdcc57b961ebe85d0ad3965549
Reviewed-on: https://boringssl-review.googlesource.com/c/34544
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-01-25 16:50:30 +00:00
Christopher Patton
6c1b376e1d Implement server support for delegated credentials.
This implements the server-side of delegated credentials, a proposed
extension for TLS:
https://tools.ietf.org/html/draft-ietf-tls-subcerts-02

Change-Id: I6a29cf1ead87b90aeca225335063aaf190a417ff
Reviewed-on: https://boringssl-review.googlesource.com/c/33666
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-01-24 20:06:58 +00:00
David Benjamin
4545503926 Add a constant-time pshufb-based GHASH implementation.
We currently require clmul instructions for constant-time GHASH
on x86_64. Otherwise, it falls back to a variable-time 4-bit table
implementation. However, a significant proportion of clients lack these
instructions.

Inspired by vpaes, we can use pshufb and a slightly different order of
incorporating the bits to make a constant-time GHASH. This requires
SSSE3, which is very common. Benchmarking old machines we had on hand,
it appears to be a no-op on Sandy Bridge and a small slowdown for
Penryn.

Sandy Bridge (Intel Pentium CPU 987 @ 1.50GHz):
(Note: these numbers are before 16-byte-aligning the table. That was an
improvement on Penryn, so it's possible Sandy Bridge is now better.)
Before:
Did 4244750 AES-128-GCM (16 bytes) seal operations in 4015000us (1057222.9 ops/sec): 16.9 MB/s
Did 442000 AES-128-GCM (1350 bytes) seal operations in 4016000us (110059.8 ops/sec): 148.6 MB/s
Did 84000 AES-128-GCM (8192 bytes) seal operations in 4015000us (20921.5 ops/sec): 171.4 MB/s
Did 3349250 AES-256-GCM (16 bytes) seal operations in 4016000us (833976.6 ops/sec): 13.3 MB/s
Did 343500 AES-256-GCM (1350 bytes) seal operations in 4016000us (85532.9 ops/sec): 115.5 MB/s
Did 65250 AES-256-GCM (8192 bytes) seal operations in 4015000us (16251.6 ops/sec): 133.1 MB/s
After:
Did 4229250 AES-128-GCM (16 bytes) seal operations in 4016000us (1053100.1 ops/sec): 16.8 MB/s [-0.4%]
Did 442250 AES-128-GCM (1350 bytes) seal operations in 4016000us (110122.0 ops/sec): 148.7 MB/s [+0.1%]
Did 83500 AES-128-GCM (8192 bytes) seal operations in 4015000us (20797.0 ops/sec): 170.4 MB/s [-0.6%]
Did 3286500 AES-256-GCM (16 bytes) seal operations in 4016000us (818351.6 ops/sec): 13.1 MB/s [-1.9%]
Did 342750 AES-256-GCM (1350 bytes) seal operations in 4015000us (85367.4 ops/sec): 115.2 MB/s [-0.2%]
Did 65250 AES-256-GCM (8192 bytes) seal operations in 4016000us (16247.5 ops/sec): 133.1 MB/s [-0.0%]

Penryn (Intel Core 2 Duo CPU P8600 @ 2.40GHz):
Before:
Did 1179000 AES-128-GCM (16 bytes) seal operations in 1000139us (1178836.1 ops/sec): 18.9 MB/s
Did 97000 AES-128-GCM (1350 bytes) seal operations in 1006347us (96388.2 ops/sec): 130.1 MB/s
Did 18000 AES-128-GCM (8192 bytes) seal operations in 1028943us (17493.7 ops/sec): 143.3 MB/s
Did 977000 AES-256-GCM (16 bytes) seal operations in 1000197us (976807.6 ops/sec): 15.6 MB/s
Did 82000 AES-256-GCM (1350 bytes) seal operations in 1012434us (80992.9 ops/sec): 109.3 MB/s
Did 15000 AES-256-GCM (8192 bytes) seal operations in 1006528us (14902.7 ops/sec): 122.1 MB/s
After:
Did 1306000 AES-128-GCM (16 bytes) seal operations in 1000153us (1305800.2 ops/sec): 20.9 MB/s [+10.8%]
Did 94000 AES-128-GCM (1350 bytes) seal operations in 1009852us (93082.9 ops/sec): 125.7 MB/s [-3.4%]
Did 17000 AES-128-GCM (8192 bytes) seal operations in 1012096us (16796.8 ops/sec): 137.6 MB/s [-4.0%]
Did 1070000 AES-256-GCM (16 bytes) seal operations in 1000929us (1069006.9 ops/sec): 17.1 MB/s [+9.4%]
Did 79000 AES-256-GCM (1350 bytes) seal operations in 1002209us (78825.9 ops/sec): 106.4 MB/s [-2.7%]
Did 15000 AES-256-GCM (8192 bytes) seal operations in 1061489us (14131.1 ops/sec): 115.8 MB/s [-5.2%]

Change-Id: I1c3760a77af7bee4aee3745d1c648d9e34594afb
Reviewed-on: https://boringssl-review.googlesource.com/c/34267
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-24 17:19:21 +00:00
Adam Langley
9801a07145 Tweak some slightly fragile tests.
These tests failed when CECPQ2 was enabled by default. Even if we're
not going to make CECPQ2 the default, it's worth fixing them to be more
robust.

Change-Id: Idef508bca9e17a4ef0e0a8a396755abd975f9908
Reviewed-on: https://boringssl-review.googlesource.com/c/34524
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-23 22:48:16 +00:00
Adam Langley
4bfab5d9d7 Make 256-bit ciphers a preference for CECPQ2, not a requirement.
If 256-bit ciphers are a requirement for CECPQ2 then that introduces a
link between supported ciphers and supported groups: offering CECPQ2
without a 256-bit cipher is invalid. But that's a little weird since
these things were otherwise independent.

So, rather than require a 256-bit cipher for CECPQ2, just prefer them.

Change-Id: I491749e41708cd9c5eeed5b4ae23c11e5c0b9725
Reviewed-on: https://boringssl-review.googlesource.com/c/34504
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-23 22:38:56 +00:00
David Benjamin
fa81cc65dd Update comments around JDK11 workaround.
11.0.2 has since been released, but we are now aware of several more
bugs, so the workaround is unlikely to be removable for the foreseeable
future.

Change-Id: I8e7edcba2f002d0558a21e607306ddf9a205bfb3
Reviewed-on: https://boringssl-review.googlesource.com/c/34484
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-23 20:00:38 +00:00
David Benjamin
c47f7936d0 Add a RelWithAsserts build configuration.
On our bots, debug unit tests take around 2.5x as long to complete as
release tests on Linux, 3x as long on macOS, and 6x as long on Windows.
Our tests are fast, so this does not particularly matter, but SDE
inflates a 13 second test run to 8 minutes. On Windows (MSVC), where we
don't but would like to test with SDE, the difference between optimized
and unoptimized is even larger, and test runs are slower in general.

This suggests running SDE tests in release mode. Release mode tests,
however, are less effective because they do not include asserts. Thus,
add a RelWithAsserts option.

(Chromium does something similar. I believe most of the test-running
configurations on the critical path run is_debug = false and
dcheck_always_on = true.)

Change-Id: I273dd86ab8ea039f34eca431483827c87dc5c461
Reviewed-on: https://boringssl-review.googlesource.com/c/34464
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-23 17:21:56 +00:00
Adam Langley
51011b4a26 Remove union from |SHA512_CTX|.
With 2fe0360a4e, we no longer use the
other member of this union so it can be removed.

Change-Id: Ideb7c47a72df0b420eb1e7d8c718e1cacb2129f5
Reviewed-on: https://boringssl-review.googlesource.com/c/34449
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-22 23:36:46 +00:00
David Benjamin
4f3f597d32 Avoid unwind tests on libc functions.
When built under UBSan, it gets confused inside a PLT stub.

Change-Id: Ib082ecc076ba2111337ff5921e465e4beb99aab5
Reviewed-on: https://boringssl-review.googlesource.com/c/34448
Reviewed-by: Adam Langley <agl@google.com>
2019-01-22 23:29:24 +00:00
David Benjamin
14c611cf91 Don't pass NULL,0 to qsort.
qsort shares the same C language bug as mem*. Two of our calls may see
zero-length lists. This trips UBSan.
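
One way to avoid the problem is to skip the library call for empty input. The sketch below assumes a plain int array; sort_ints and compare_int are hypothetical names and not the call sites touched by this change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int compare_int(const void *a, const void *b) {
  int x = *(const int *)a, y = *(const int *)b;
  return (x > y) - (x < y); /* avoids the overflow of x - y */
}

/* Sorts |len| ints. Empty input returns early, so a NULL pointer with a
 * zero length is never handed to qsort. */
static void sort_ints(int *values, size_t len) {
  if (len == 0) {
    return;
  }
  assert(values != NULL); /* non-empty input must have a real buffer */
  qsort(values, len, sizeof(int), compare_int);
}
```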

Change-Id: Id292dd277129881001eb57b1b2db78438cf4642e
Reviewed-on: https://boringssl-review.googlesource.com/c/34447
Reviewed-by: Adam Langley <agl@google.com>
2019-01-22 23:28:38 +00:00
David Benjamin
9847cdd785 Fix signed left-shifts in curve25519.c.
Due to a language flaw in C, left-shifts on signed integers are
undefined for negative numbers. This makes them all but useless. Cast to
the unsigned type, left-shift, and cast back (casts are defined to wrap)
to silence UBSan.
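
A minimal sketch of the cast, shift, cast-back pattern, using a hypothetical helper name rather than the code in curve25519.c. Note that ISO C makes the conversion back to the signed type implementation-defined for out-of-range values; this sketch assumes the two's complement wrapping that mainstream compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Left-shifts a possibly negative value without ever shifting a signed
 * operand. The shift happens on uint64_t, where wraparound is well
 * defined, and the result is converted back afterwards. */
static int64_t int64_lshift(int64_t a, unsigned n) {
  assert(n < 64); /* shifting by the full width or more is undefined */
  return (int64_t)((uint64_t)a << n);
}
```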

Change-Id: I8fbe739aee1c99cf553462b675863e6d68c2b302
Reviewed-on: https://boringssl-review.googlesource.com/c/34446
Reviewed-by: Adam Langley <agl@google.com>
2019-01-22 23:27:34 +00:00
David Benjamin
fc27a1919c Add an option to build with UBSan.
Change-Id: I31d5660fa4792bbb1ef8a721bad4bdbdb0e56863
Reviewed-on: https://boringssl-review.googlesource.com/c/34445
Reviewed-by: Adam Langley <agl@google.com>
2019-01-22 23:26:35 +00:00
David Benjamin
2fe0360a4e Fix undefined pointer casts in SHA-512 code.
Casting an unaligned pointer to uint64_t* is undefined, even on
platforms that support unaligned access. Additionally, dereferencing as
uint64_t violates strict aliasing rules. Instead, use memcpys which we
assume any sensible compiler can optimize. Also simplify the PULL64
business with the existing CRYPTO_bswap8.

This also removes the need for the
SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA logic. The generic C code now
handles unaligned data and the assembly already can as well. (The only
problematic platform with assembly is old ARM, but sha512-armv4.pl
already handles this via an __ARM_ARCH__ check.  See also OpenSSL's
version of this file which always defines
SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA if SHA512_ASM is defined.)

Add unaligned tests to digest_test.cc, so we retain coverage of
unaligned EVP_MD inputs.
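
The memcpy approach can be sketched as below. This is an illustration under assumed names (load_u64_be, bswap64), not the SHA-512 code itself, which uses the existing CRYPTO_bswap8.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverses the byte order of a 64-bit value. */
static uint64_t bswap64(uint64_t x) {
  x = ((x & UINT64_C(0x00ff00ff00ff00ff)) << 8) |
      ((x >> 8) & UINT64_C(0x00ff00ff00ff00ff));
  x = ((x & UINT64_C(0x0000ffff0000ffff)) << 16) |
      ((x >> 16) & UINT64_C(0x0000ffff0000ffff));
  return (x << 32) | (x >> 32);
}

/* Reads a big-endian 64-bit value from any address. memcpy has no
 * alignment or aliasing requirements, unlike dereferencing a pointer
 * that was cast to uint64_t*. */
static uint64_t load_u64_be(const uint8_t *in) {
  assert(in != NULL);
  uint64_t v;
  memcpy(&v, in, sizeof(v));
  const uint16_t probe = 1;
  uint8_t first_byte;
  memcpy(&first_byte, &probe, 1);
  /* On a little-endian host the first byte of |probe| is 1. */
  return first_byte ? bswap64(v) : v;
}
```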

Change-Id: Idfd8586c64bab2a77292af2fa8eebbd193e57c7d
Reviewed-on: https://boringssl-review.googlesource.com/c/34444
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-22 23:18:36 +00:00
Adam Langley
72f015562c HRSS: flatten sample distribution.
With HRSS-SXY, the sampling algorithm no longer has to be the same
between the two parties. Therefore we can change it at will (as long as
it remains reasonably uniform) and thus take the opportunity to make the
output distribution flatter.

Change-Id: I74c667fcf919fe11ddcf2f4fb8a540b5112268bf
Reviewed-on: https://boringssl-review.googlesource.com/c/34404
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-22 22:06:43 +00:00
Adam Langley
c1615719ce Add test of assembly code dispatch.
The first attempt involved using Linux's support for hardware
breakpoints to detect when assembly code was run. However, this doesn't
work with SDE, which is a problem.

This version has the assembly code update a global flags variable when
it's run, but only in non-FIPS and non-debug builds.

Update-Note: Assembly files now pay attention to the NDEBUG preprocessor
symbol. Ensure the build passes the symbol in. (If release builds fail
to link due to missing BORINGSSL_function_hit, this is the cause.)

Change-Id: I6b7ced442b7a77d0b4ae148b00c351f68af89a6e
Reviewed-on: https://boringssl-review.googlesource.com/c/33384
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-22 20:22:53 +00:00
Adam Langley
eadef4730e Simplify HRSS mod3 circuits.
The multiplication and subtraction circuits were found by djb using GNU
Superoptimizer, and the addition circuit is derived from the subtraction
one by hand. They depend on a different representation: -1 is now (1, 1)
rather than (1, 0), and the latter becomes undefined.

The following Python program checks that the circuits work:

values = [0, 1, -1]

def toBits(v):
    if v == 0:
        return 0, 0
    elif v == 1:
        return 0, 1
    elif v == -1:
        return 1, 1
    else:
        raise ValueError(v)

def mul(x, y):
    (s1, a1), (s2, a2) = x, y
    return ((s1 ^ s2) & a1 & a2, a1 & a2)

def add(x, y):
    (s1, a1), (s2, a2) = x, y
    t = s1 ^ a2
    return (t & (s2 ^ a1), (a1 ^ a2) | (t ^ s2))

def sub(x, y):
    (s1, a1), (s2, a2) = x, y
    t = a1 ^ a2
    return ((s1 ^ a2) & (t ^ s2), t | (s1 ^ s2))

def fromBits(bits):
    s, a = bits
    if s == 0 and a == 0:
        return 0
    if s == 0 and a == 1:
        return 1
    if s == 1 and a == 1:
        return -1
    else:
        raise ValueError((s, a))

def wrap(v):
    if v == 2:
        return -1
    elif v == -2:
        return 1
    else:
        return v

for v1 in values:
    for v2 in values:
        print(v1, v2)

        result = fromBits(mul(toBits(v1), toBits(v2)))
        if result != v1 * v2:
            raise ValueError((v1, v2, result))

        result = fromBits(add(toBits(v1), toBits(v2)))
        if result != wrap(v1 + v2):
            raise ValueError((v1, v2, result))

        result = fromBits(sub(toBits(v1), toBits(v2)))
        if result != wrap(v1 - v2):
            raise ValueError((v1, v2, result))
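
Because the circuits use only bitwise operations, the same formulas can be applied to many values at once. The sketch below packs 64 lanes into a pair of words; struct poly3_word and the function names are hypothetical and chosen for illustration, not taken from the HRSS code.

```c
#include <assert.h>
#include <stdint.h>

/* Bit i of |s| and bit i of |a| together encode lane i:
 * 0 = (0, 0), 1 = (0, 1), -1 = (1, 1). The pair (1, 0) is undefined. */
struct poly3_word {
  uint64_t s, a;
};

/* Returns 1 if no lane holds the undefined (1, 0) encoding. */
static int poly3_word_is_valid(struct poly3_word w) {
  return (w.s & ~w.a) == 0;
}

static struct poly3_word mod3_mul(struct poly3_word x, struct poly3_word y) {
  assert(poly3_word_is_valid(x) && poly3_word_is_valid(y));
  struct poly3_word r = {(x.s ^ y.s) & x.a & y.a, x.a & y.a};
  return r;
}

static struct poly3_word mod3_add(struct poly3_word x, struct poly3_word y) {
  assert(poly3_word_is_valid(x) && poly3_word_is_valid(y));
  uint64_t t = x.s ^ y.a;
  struct poly3_word r = {t & (y.s ^ x.a), (x.a ^ y.a) | (t ^ y.s)};
  return r;
}

static struct poly3_word mod3_sub(struct poly3_word x, struct poly3_word y) {
  assert(poly3_word_is_valid(x) && poly3_word_is_valid(y));
  uint64_t t = x.a ^ y.a;
  struct poly3_word r = {(x.s ^ y.a) & (t ^ y.s), t | (x.s ^ y.s)};
  return r;
}
```

For example, with x holding (1, -1, 1) and y holding (1, -1, -1) in lanes 0 to 2, mod3_add yields (-1, 1, 0) in those lanes, matching wrap(v1 + v2) from the Python check.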

Change-Id: Ie1a4ca5a82c2651057efc62330eca6fdd9878122
Reviewed-on: https://boringssl-review.googlesource.com/c/34344
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-21 21:32:35 +00:00
Adam Langley
20f4a043eb Add SSL_OP_NO_RENEGOTIATION
Since |ssl_renegotiate_never| is the default, this option is moot.
However, OpenSSL defines and supports it so this helps code that wishes
to support both.

Change-Id: I3a2f6e93a078d39526d10f9cd0a990953bd45825
Reviewed-on: https://boringssl-review.googlesource.com/c/34384
Reviewed-by: Adam Langley <alangley@gmail.com>
Commit-Queue: Adam Langley <alangley@gmail.com>
2019-01-21 18:08:55 +00:00
Adam Langley
899835fad4 Rename Fiat include files to end in .h
Otherwise generate_build_files.py thinks that they're top-level source
files.

Fixes grpc/grpc#17780

Change-Id: I9f14a816a5045c1101841a2ef7ef9868abcd5d12
Reviewed-on: https://boringssl-review.googlesource.com/c/34364
Reviewed-by: Adam Langley <agl@google.com>
2019-01-21 17:29:45 +00:00
David Benjamin
32e59d2d32 Switch to new fiat pipeline.
This new version makes it much easier to tell which code is handwritten
and which is verified. For some reason, it also is *dramatically* faster
for 32-bit x86 GCC. Clang x86_64, however, does take a small hit.
Benchmarks below.

x86, GCC 7.3.0, OPENSSL_SMALL
(For some reason, GCC used to be really bad at compiling the 32-bit curve25519
code. The new one fixes this. I'm not sure what changed.)
Before:
Did 17135 Ed25519 key generation operations in 10026402us (1709.0 ops/sec)
Did 17170 Ed25519 signing operations in 10074192us (1704.4 ops/sec)
Did 9180 Ed25519 verify operations in 10034025us (914.9 ops/sec)
Did 17271 Curve25519 base-point multiplication operations in 10050837us (1718.4 ops/sec)
Did 10605 Curve25519 arbitrary point multiplication operations in 10047714us (1055.5 ops/sec)
Did 7800 ECDH P-256 operations in 10018331us (778.6 ops/sec)
Did 24308 ECDSA P-256 signing operations in 10019241us (2426.1 ops/sec)
Did 9191 ECDSA P-256 verify operations in 10081639us (911.7 ops/sec)
After:
Did 99873 Ed25519 key generation operations in 10021810us (9965.6 ops/sec) [+483.1%]
Did 99960 Ed25519 signing operations in 10052236us (9944.1 ops/sec) [+483.4%]
Did 53676 Ed25519 verify operations in 10009078us (5362.7 ops/sec) [+486.2%]
Did 102000 Curve25519 base-point multiplication operations in 10039764us (10159.6 ops/sec) [+491.2%]
Did 60802 Curve25519 arbitrary point multiplication operations in 10056897us (6045.8 ops/sec) [+472.8%]
Did 7900 ECDH P-256 operations in 10054509us (785.7 ops/sec) [+0.9%]
Did 24926 ECDSA P-256 signing operations in 10050919us (2480.0 ops/sec) [+2.2%]
Did 9494 ECDSA P-256 verify operations in 10064659us (943.3 ops/sec) [+3.5%]

x86, Clang 8.0.0 trunk 349417, OPENSSL_SMALL
Before:
Did 82750 Ed25519 key generation operations in 10051177us (8232.9 ops/sec)
Did 82400 Ed25519 signing operations in 10035806us (8210.6 ops/sec)
Did 41511 Ed25519 verify operations in 10048919us (4130.9 ops/sec)
Did 83300 Curve25519 base-point multiplication operations in 10044283us (8293.3 ops/sec)
Did 49700 Curve25519 arbitrary point multiplication operations in 10007005us (4966.5 ops/sec)
Did 14039 ECDH P-256 operations in 10093929us (1390.8 ops/sec)
Did 40950 ECDSA P-256 signing operations in 10006757us (4092.2 ops/sec)
Did 16068 ECDSA P-256 verify operations in 10095996us (1591.5 ops/sec)
After:
Did 80476 Ed25519 key generation operations in 10048648us (8008.6 ops/sec) [-2.7%]
Did 79050 Ed25519 signing operations in 10049180us (7866.3 ops/sec) [-4.2%]
Did 40501 Ed25519 verify operations in 10048347us (4030.6 ops/sec) [-2.4%]
Did 81300 Curve25519 base-point multiplication operations in 10017480us (8115.8 ops/sec) [-2.1%]
Did 48278 Curve25519 arbitrary point multiplication operations in 10092500us (4783.6 ops/sec) [-3.7%]
Did 15402 ECDH P-256 operations in 10096705us (1525.4 ops/sec) [+9.7%]
Did 44200 ECDSA P-256 signing operations in 10037715us (4403.4 ops/sec) [+7.6%]
Did 17000 ECDSA P-256 verify operations in 10008813us (1698.5 ops/sec) [+6.7%]

x86_64, GCC 7.3.0
(Note these P-256 numbers are not affected by this change. Included to get a
sense of noise.)
Before:
Did 557000 Ed25519 key generation operations in 10011721us (55634.8 ops/sec)
Did 550000 Ed25519 signing operations in 10016449us (54909.7 ops/sec)
Did 190000 Ed25519 verify operations in 10014565us (18972.4 ops/sec)
Did 587000 Curve25519 base-point multiplication operations in 10015402us (58609.7 ops/sec)
Did 230000 Curve25519 arbitrary point multiplication operations in 10023827us (22945.3 ops/sec)
Did 179000 ECDH P-256 operations in 10016294us (17870.9 ops/sec)
Did 557000 ECDSA P-256 signing operations in 10014158us (55621.3 ops/sec)
Did 198000 ECDSA P-256 verify operations in 10036694us (19727.6 ops/sec)
After:
Did 569000 Ed25519 key generation operations in 10004965us (56871.8 ops/sec) [+2.2%]
Did 563000 Ed25519 signing operations in 10000064us (56299.6 ops/sec) [+2.5%]
Did 196000 Ed25519 verify operations in 10025650us (19549.9 ops/sec) [+3.0%]
Did 596000 Curve25519 base-point multiplication operations in 10008666us (59548.4 ops/sec) [+1.6%]
Did 229000 Curve25519 arbitrary point multiplication operations in 10028921us (22834.0 ops/sec) [-0.5%]
Did 182910 ECDH P-256 operations in 10014905us (18263.8 ops/sec) [+2.2%]
Did 562000 ECDSA P-256 signing operations in 10011944us (56133.0 ops/sec) [+0.9%]
Did 202000 ECDSA P-256 verify operations in 10046901us (20105.7 ops/sec) [+1.9%]

x86_64, GCC 7.3.0, OPENSSL_SMALL
Before:
Did 350000 Ed25519 key generation operations in 10002540us (34991.1 ops/sec)
Did 344000 Ed25519 signing operations in 10010420us (34364.2 ops/sec)
Did 197000 Ed25519 verify operations in 10030593us (19639.9 ops/sec)
Did 362000 Curve25519 base-point multiplication operations in 10004615us (36183.3 ops/sec)
Did 235000 Curve25519 arbitrary point multiplication operations in 10025951us (23439.2 ops/sec)
Did 32032 ECDH P-256 operations in 10056486us (3185.2 ops/sec)
Did 96354 ECDSA P-256 signing operations in 10007297us (9628.4 ops/sec)
Did 37774 ECDSA P-256 verify operations in 10044892us (3760.5 ops/sec)
After:
Did 343000 Ed25519 key generation operations in 10025108us (34214.1 ops/sec) [-2.2%]
Did 340000 Ed25519 signing operations in 10014870us (33949.5 ops/sec) [-1.2%]
Did 192000 Ed25519 verify operations in 10025082us (19152.0 ops/sec) [-2.5%]
Did 355000 Curve25519 base-point multiplication operations in 10013220us (35453.1 ops/sec) [-2.0%]
Did 231000 Curve25519 arbitrary point multiplication operations in 10010775us (23075.1 ops/sec) [-1.6%]
Did 31540 ECDH P-256 operations in 10009664us (3151.0 ops/sec) [-1.1%]
Did 99012 ECDSA P-256 signing operations in 10090296us (9812.6 ops/sec) [+1.9%]
Did 37695 ECDSA P-256 verify operations in 10092859us (3734.8 ops/sec) [-0.7%]

x86_64, Clang 8.0.0 trunk 349417
(Note that these P-256 numbers are not affected by this change. They are
included to give a sense of the noise.)
Before:
Did 600000 Ed25519 key generation operations in 10000278us (59998.3 ops/sec)
Did 595000 Ed25519 signing operations in 10010375us (59438.3 ops/sec)
Did 184000 Ed25519 verify operations in 10013984us (18374.3 ops/sec)
Did 636000 Curve25519 base-point multiplication operations in 10005250us (63566.6 ops/sec)
Did 229000 Curve25519 arbitrary point multiplication operations in 10006059us (22886.1 ops/sec)
Did 179250 ECDH P-256 operations in 10026354us (17877.9 ops/sec)
Did 547000 ECDSA P-256 signing operations in 10017585us (54604.0 ops/sec)
Did 197000 ECDSA P-256 verify operations in 10013020us (19674.4 ops/sec)
After:
Did 560000 Ed25519 key generation operations in 10009295us (55948.0 ops/sec) [-6.8%]
Did 548000 Ed25519 signing operations in 10007912us (54756.7 ops/sec) [-7.9%]
Did 170000 Ed25519 verify operations in 10056948us (16903.7 ops/sec) [-8.0%]
Did 592000 Curve25519 base-point multiplication operations in 10016818us (59100.6 ops/sec) [-7.0%]
Did 214000 Curve25519 arbitrary point multiplication operations in 10043918us (21306.4 ops/sec) [-6.9%]
Did 180000 ECDH P-256 operations in 10026019us (17953.3 ops/sec) [+0.4%]
Did 550000 ECDSA P-256 signing operations in 10004943us (54972.8 ops/sec) [+0.7%]
Did 198000 ECDSA P-256 verify operations in 10021714us (19757.1 ops/sec) [+0.4%]

x86_64, Clang 8.0.0 trunk 349417, OPENSSL_SMALL
Before:
Did 326000 Ed25519 key generation operations in 10003266us (32589.4 ops/sec)
Did 322000 Ed25519 signing operations in 10026783us (32114.0 ops/sec)
Did 181000 Ed25519 verify operations in 10015635us (18071.7 ops/sec)
Did 335000 Curve25519 base-point multiplication operations in 10000359us (33498.8 ops/sec)
Did 224000 Curve25519 arbitrary point multiplication operations in 10027245us (22339.1 ops/sec)
Did 68552 ECDH P-256 operations in 10018900us (6842.3 ops/sec)
Did 184000 ECDSA P-256 signing operations in 10014516us (18373.3 ops/sec)
Did 76020 ECDSA P-256 verify operations in 10016891us (7589.2 ops/sec)
After:
Did 310000 Ed25519 key generation operations in 10022086us (30931.7 ops/sec) [-5.1%]
Did 308000 Ed25519 signing operations in 10007543us (30776.8 ops/sec) [-4.2%]
Did 173000 Ed25519 verify operations in 10005829us (17289.9 ops/sec) [-4.3%]
Did 321000 Curve25519 base-point multiplication operations in 10027058us (32013.4 ops/sec) [-4.4%]
Did 212000 Curve25519 arbitrary point multiplication operations in 10015203us (21167.8 ops/sec) [-5.2%]
Did 64059 ECDH P-256 operations in 10042781us (6378.6 ops/sec) [-6.8%]
Did 170000 ECDSA P-256 signing operations in 10030896us (16947.6 ops/sec) [-7.8%]
Did 72176 ECDSA P-256 verify operations in 10075369us (7163.6 ops/sec) [-5.6%]
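
For readers checking the figures above, the ops/sec and bracketed percentage
columns can be reproduced with the arithmetic below. This is a minimal sketch,
not the benchmark tool's code; the function names are hypothetical, and it
assumes each percentage compares the "After" rate with the matching "Before"
rate.

```c
#include <stdint.h>

// Hypothetical helper: converts an operation count and an elapsed time in
// microseconds into operations per second.
static double example_ops_per_sec(uint64_t num_ops, uint64_t elapsed_us) {
  return (double)num_ops * 1e6 / (double)elapsed_us;
}

// Hypothetical helper: relative change from `before` to `after`, in percent.
// Positive means the "After" build is faster.
static double example_percent_change(double before, double after) {
  return (after - before) / before * 100.0;
}
```

For example, 550000 operations in 10016449us gives about 54909.7 ops/sec, and
going from 54909.7 to 56299.6 ops/sec is a change of roughly +2.5%.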

Bug: 254
Change-Id: Ib04c773f01b542bcb8611cceb582466bfa6f6d52
Reviewed-on: https://boringssl-review.googlesource.com/c/34306
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-18 00:24:03 +00:00
David Benjamin
f36c3ad3e4 Don't look for libunwind if cross-compiling.
pkg-config gets confused and doesn't know to look in, say,
/usr/lib/i386-linux-gnu when building for 32-bit. Fortunately, CMake
sets a CMAKE_CROSSCOMPILING variable whenever CMAKE_SYSTEM_NAME is set
manually (as done in util/32-bit-toolchain.cmake).

Change-Id: I638b4d54ea92ade4b2b5baa40a3c5e8c17914d46
Reviewed-on: https://boringssl-review.googlesource.com/c/34305
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-16 21:14:00 +00:00
David Benjamin
5590c715e2 Mark some unmarked array sizes in curve25519.c.
Change-Id: I92589f5d5e89c836cff3c26739b43eb65de67836
Reviewed-on: https://boringssl-review.googlesource.com/c/34304
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-16 20:49:29 +00:00
Adam Langley
823effe975 Revert "Fix protos_len size in SSL_set_alpn_protos and SSL_CTX_set_alpn_protos"
This reverts commit 35771ff8af. It breaks
tcnetty, which is tcnetty's fault, but we have a large backlog from
Christmas to break with at the moment.

Bug: chromium:879657
Change-Id: Iafe93b335d88722170ec2689a25e145969e19e73
Reviewed-on: https://boringssl-review.googlesource.com/c/34324
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-16 20:02:16 +00:00
David Benjamin
73b1f181b6 Add ABI tests for GCM.
Change-Id: If28096e677104c6109e31e31a636fee82ef4ba11
Reviewed-on: https://boringssl-review.googlesource.com/c/34266
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-15 22:49:37 +00:00
David Benjamin
8285ccd8fc Fix SSL_R_TOO_MUCH_READ_EARLY_DATA.
https://boringssl-review.googlesource.com/15164 allocated a new error code by
hand, rather than using the make_errors.go script, which caused it to clobber
the error space reserved for alerts.

Change-Id: Ife92c45da2c1d3c5506439bd5781ae91240d16d8
Reviewed-on: https://boringssl-review.googlesource.com/c/34307
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-15 21:53:52 +00:00
David Benjamin
b65ce68c8f Test CRYPTO_gcm128_tag in gcm_test.cc.
CRYPTO_gcm128_encrypt should be paired with CRYPTO_gcm128_tag, not
CRYPTO_gcm128_finish.

Change-Id: Ia3023a196fe5b613e9309b5bac19ea849dbc33b7
Reviewed-on: https://boringssl-review.googlesource.com/c/34265
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-15 18:19:57 +00:00
David Benjamin
f18bd55240 Remove pointer cast in P-256 table.
We expect the table to have a slightly nested structure, so just
generate it that way. Avoid risking strict aliasing problems. Thanks to
Brian Smith for pointing this out.
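
As an illustration of the idea only, the sketch below declares a small
precomputed table directly in its nested shape, so lookups index it without
casting a flat word array to a different type. The type name, dimensions, and
values are hypothetical and are not the real P-256 table.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical point type: two field elements of four 64-bit words each.
typedef struct {
  uint64_t X[4];
  uint64_t Y[4];
} example_point;

// The table is generated with its final nested type, rather than as a flat
// uint64_t array that is later reinterpreted through a pointer cast.
static const example_point example_table[2][2] = {
    {{{1, 0, 0, 0}, {2, 0, 0, 0}}, {{3, 0, 0, 0}, {4, 0, 0, 0}}},
    {{{5, 0, 0, 0}, {6, 0, 0, 0}}, {{7, 0, 0, 0}, {8, 0, 0, 0}}},
};

// Lookups use ordinary indexing, so no object is accessed through an
// incompatible type.
static const example_point *example_lookup(size_t row, size_t col) {
  return &example_table[row][col];
}
```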

Change-Id: Ie21610c4afab07a610d914265079135dba17b3b7
Reviewed-on: https://boringssl-review.googlesource.com/c/34264
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-15 00:16:17 +00:00
Adam Langley
3eac8b7708 Ignore new fields in forthcoming Wycheproof tests.
Change-Id: I95dd20bb71c18cecd4cae72bcdbd708ee5e92e77
Reviewed-on: https://boringssl-review.googlesource.com/c/34284
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-14 22:02:37 +00:00
David Benjamin
5349ddb747 Fix RSAZ's OPENSSL_cleanse.
https://boringssl-review.googlesource.com/28584 switched RSAZ's buffer
to being externally-allocated, which means the OPENSSL_cleanse needs to
be tweaked to match.
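
The commit does not spell out the exact mismatch. One common way this kind of
bug arises, shown here purely as an assumption, is that sizeof on a local array
yields the full buffer size, while sizeof on a pointer parameter yields only
the pointer size. The sketch below uses hypothetical names and a stand-in for
OPENSSL_cleanse.

```c
#include <stddef.h>
#include <stdint.h>

// Stand-in for OPENSSL_cleanse (hypothetical): zeroes len bytes through a
// volatile pointer so the writes are not optimized away.
static void example_cleanse(void *ptr, size_t len) {
  volatile unsigned char *p = (volatile unsigned char *)ptr;
  while (len--) {
    *p++ = 0;
  }
}

// Locally-allocated buffer: sizeof(storage) is the size of the whole array.
static void example_local_buffer(void) {
  uint64_t storage[40];
  storage[0] = 0xff;  // Pretend to compute with secret data.
  example_cleanse(storage, sizeof(storage));
}

// Externally-allocated buffer: storage is now a pointer, so sizeof(storage)
// would be the pointer size. The byte length is computed explicitly instead.
static void example_external_buffer(uint64_t *storage, size_t num_words) {
  storage[0] = 0xff;  // Pretend to compute with secret data.
  example_cleanse(storage, num_words * sizeof(uint64_t));
}
```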

Change-Id: I0a7307ac86aa10933d10d380ef652c355fed3ee9
Reviewed-on: https://boringssl-review.googlesource.com/c/34191
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-14 20:04:39 +00:00
Alessandro Ghedini
3cbb0299a2 Allow configuring QUIC method per-connection
This allows sharing SSL_CTX between TCP and QUIC connections, such that
common settings can be configured without having to duplicate the
context.

Change-Id: Ie920e7f2a772dd6c6c7b63fdac243914ac5b7b26
Reviewed-on: https://boringssl-review.googlesource.com/c/33904
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-14 19:54:59 +00:00
Tom Tan
de3c1f69cc Fix header file for _byteswap_ulong and _byteswap_uint64 from MSVC CRT
_byteswap_ulong and _byteswap_uint64 are documented (see the link below) as
coming from stdlib.h. On some build configurations stdlib.h is pulled in by
intrin.h, but that is not guaranteed. In particular, this assumption causes
build breaks when building Chromium for Windows ARM64 with clang-cl. This
change switches the #include to the documented header file, thus fixing
Windows ARM64 with clang-cl.

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/byteswap-uint64-byteswap-ulong-byteswap-ushort
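
A minimal sketch of the pattern, assuming hypothetical wrapper names rather
than BoringSSL's actual ones: on MSVC-compatible compilers the intrinsics are
reached through stdlib.h, and other compilers take a portable shift-based
fallback.

```c
#include <stdint.h>
#include <stdlib.h>  // Documented header for _byteswap_ulong/_byteswap_uint64.

// Hypothetical wrapper: reverses the byte order of a 32-bit value.
static inline uint32_t example_bswap4(uint32_t x) {
#if defined(_MSC_VER)
  return _byteswap_ulong(x);
#else
  return (x >> 24) | ((x >> 8) & 0xff00u) | ((x << 8) & 0xff0000u) | (x << 24);
#endif
}

// Hypothetical wrapper: reverses the byte order of a 64-bit value.
static inline uint64_t example_bswap8(uint64_t x) {
#if defined(_MSC_VER)
  return _byteswap_uint64(x);
#else
  // Swap the two 32-bit halves, byte-swapping each one.
  return ((uint64_t)example_bswap4((uint32_t)x) << 32) |
         example_bswap4((uint32_t)(x >> 32));
#endif
}
```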

Bug: chromium:893460
Change-Id: I738c7227a9e156c894c2be62b52228a5bbd88414
Reviewed-on: https://boringssl-review.googlesource.com/c/34244
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Bruce Dawson <brucedawson@chromium.org>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-14 19:49:39 +00:00
David Benjamin
2bee229103 Add ABI tests for HRSS assembly.
The last instruction did not unwind correctly. Also add .type and .size
annotations so that errors show up properly.

Change-Id: Id18e12b4ed51bdabb90bd5ac66631fd989649eec
Reviewed-on: https://boringssl-review.googlesource.com/c/34190
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-01-09 04:10:25 +00:00
David Benjamin
d99b549b8e Add AES ABI tests.
This involves fixing some bugs in aes_nohw_cbc_encrypt's annotations,
and working around a libunwind bug. In doing so, support .cfi_remember_state
and .cfi_restore_state in perlasm.

Change-Id: Iaedfe691356b0468327a6be0958d034dafa760e5
Reviewed-on: https://boringssl-review.googlesource.com/c/34189
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2019-01-09 03:54:55 +00:00
David Benjamin
c0f4dbe4e2 Move aes_nohw, bsaes, and vpaes prototypes to aes/internal.h.
This is in preparation for adding ABI tests to them.

In doing so, update delocate.go so that OPENSSL_ia32cap_get is consistently
callable outside the module. Right now it's callable both inside and outside
normally, but not in FIPS mode because the function is generated. This is
needed for tests and the module to share headers that touch OPENSSL_ia32cap_P.

Change-Id: Idbc7d694acfb974e0b04adac907dab621e87de62
Reviewed-on: https://boringssl-review.googlesource.com/c/34188
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-09 03:35:55 +00:00
David Benjamin
e592d595c4 Add direction flag checking to CHECK_ABI.
Linux and Windows ABIs both require that the direction flag be cleared
on function exit, so that functions can rely on it being cleared on
entry. (Some OpenSSL assembly preserves it, which is stronger, but we
only require what is specified by the ABI so CHECK_ABI works with C
compiler output.)

Change-Id: I1a320aed4371176b4b44fe672f1a90167b84160f
Reviewed-on: https://boringssl-review.googlesource.com/c/34187
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-09 03:22:15 +00:00