Commit Graph

49 Commits

Author SHA1 Message Date
David Benjamin
a57435e138 Remove __ARM_ARCH__ guard on gcm_*_v8.
OpenSSL's c1669e1c205dc8e695fb0c10a655f434e758b9f7 switched it to
__ARM_MAX_ARCH__, which we mirrored in assembly but not C. The C version
should be __ARM_MAX_ARCH__ to match. However, __ARM_MAX_ARCH__ is
hardcoded to 8, so just remove the check.

Change-Id: Ic873203db1478f49437b889b84ee7fb28eba1a6d
Reviewed-on: https://boringssl-review.googlesource.com/c/35045
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-27 02:26:21 +00:00
David Benjamin
104306f587 Remove STRICT_ALIGNMENT code from modes.
STRICT_ALIGNMENT is a remnant of OpenSSL code would cast pointers to
size_t* and load more than one byte at a time. Not all architectures
support unaligned access, so it did an alignment check and only enterred
this path if aligned or the underlying architecture didn't care.

This is UB. Unaligned casts in C are undefined on all architectures, so
we switch these to memcpy some time ago. Compilers can optimize memcpy
to the unaligned accesses we wanted. That left our modes logic as:

- If STRICT_ALIGNMENT is 1 and things are unaligned, work byte-by-byte.

- Otherwise, use the memcpy-based word-by-word code, which now works
  independent of STRICT_ALIGNMENT.

Remove the first check to simplify things. On x86, x86_64, and aarch64,
STRICT_ALIGNMENT is zero and this is a no-op. ARM is more complex. Per
[0], ARMv7 and up support unaligned access. ARMv5 do not. ARMv6 does,
but can run in a mode where it looks more like ARMv5.

For ARMv7 and up, STRICT_ALIGNMENT should have been zero, but was one.
Thus this change should be an improvement for ARMv7 (right now unaligned
inputs lose bsaes-armv7). The Android NDK does not even support the
pre-ARMv7 ABI anymore[1]. Nonetheless, Cronet still supports ARMv6 as a
library. It builds with -march=armv6 which GCC interprets as supporting
unaligned access, so it too did not want this code.

For completeness, should anyone still care about ARMv5 or be building
with an overly permissive -march flag, GCC does appear unable to inline
the memcpy calls. However, GCC also does not interpret
(uintptr_t)ptr % sizeof(size_t) as an alignment assertion, so such
consumers have already been paying for the memcpy here and throughout
the library.

In general, C's arcane pointer rules mean we must resort to memcpy
often, so, realistically, we must require that the compiler optimize
memcpy well.

[0] https://medium.com/@iLevex/the-curious-case-of-unaligned-access-on-arm-5dd0ebe24965
[1] https://developer.android.com/ndk/guides/abis#armeabi

Change-Id: I3c7dea562adaeb663032e395499e69530dd8e145
Reviewed-on: https://boringssl-review.googlesource.com/c/34873
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:39:36 +00:00
David Benjamin
fb35b147ca Remove stray prototype.
The function's since been renamed.

Change-Id: Id1a9788dfeb5c46b3463611b08318b3f253d03df
Reviewed-on: https://boringssl-review.googlesource.com/c/34870
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:31:14 +00:00
David Benjamin
eb2c2cdf17 Always define GHASH.
There is a C implementation of gcm_ghash_4bit to pair with
gcm_gmult_4bit. It's even slightly faster per the numbers below (x86_64
OPENSSL_NO_ASM build), but, more importantly, we trim down the
combinatorial explosion of GCM implementations and free up complexity
budget for potentially using bsaes better in the future.

Old:
Did 2557000 AES-128-GCM (16 bytes) seal operations in 1000057us (2556854.3 ops/sec): 40.9 MB/s
Did 94000 AES-128-GCM (1350 bytes) seal operations in 1009613us (93105.0 ops/sec): 125.7 MB/s
Did 17000 AES-128-GCM (8192 bytes) seal operations in 1024768us (16589.1 ops/sec): 135.9 MB/s
Did 2511000 AES-256-GCM (16 bytes) seal operations in 1000196us (2510507.9 ops/sec): 40.2 MB/s
Did 84000 AES-256-GCM (1350 bytes) seal operations in 1000412us (83965.4 ops/sec): 113.4 MB/s
Did 15000 AES-256-GCM (8192 bytes) seal operations in 1046963us (14327.2 ops/sec): 117.4 MB/s

New:
Did 2739000 AES-128-GCM (16 bytes) seal operations in 1000322us (2738118.3 ops/sec): 43.8 MB/s
Did 100000 AES-128-GCM (1350 bytes) seal operations in 1008190us (99187.7 ops/sec): 133.9 MB/s
Did 17000 AES-128-GCM (8192 bytes) seal operations in 1006360us (16892.6 ops/sec): 138.4 MB/s
Did 2546000 AES-256-GCM (16 bytes) seal operations in 1000150us (2545618.2 ops/sec): 40.7 MB/s
Did 86000 AES-256-GCM (1350 bytes) seal operations in 1000970us (85916.7 ops/sec): 116.0 MB/s
Did 14850 AES-256-GCM (8192 bytes) seal operations in 1023459us (14509.6 ops/sec): 118.9 MB/s

While I'm here, tighten up some of the functions and align the ctr32 and
non-ctr32 paths.

Bug: 256
Change-Id: Id4df699cefc8630dd5a350d44f927900340f5e60
Reviewed-on: https://boringssl-review.googlesource.com/c/34869
Reviewed-by: Adam Langley <agl@google.com>
2019-02-14 17:30:55 +00:00
David Benjamin
cc2b8e2552 Add ABI tests for aesni-gcm-x86_64.pl.
Change-Id: Ic23fc5fbec2c4f8df5d06f807c6bd2c5e1f0e99c
Reviewed-on: https://boringssl-review.googlesource.com/c/34865
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-11 20:08:38 +00:00
David Benjamin
0a87c4982c Implement ABI testing for ARM.
Update-Note: There's some chance this'll break iOS since I was unable to
test it there. The iPad I have to test on is too new to run 32-bit code
at all.

Change-Id: I6593f91b67a5e8a82828237d3b69ed948b07922d
Reviewed-on: https://boringssl-review.googlesource.com/c/34725
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 21:01:44 +00:00
David Benjamin
0a67eba62d Fix the order of Windows unwind codes.
The unwind tester suggests Windows doesn't care, but the documentation
says that unwind codes should be sorted in descending offset, which
means the last instruction should be first.

https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2017#struct-unwind_code

Bug: 259
Change-Id: I21e54c362e18e0405f980005112cc3f7c417c70c
Reviewed-on: https://boringssl-review.googlesource.com/c/34785
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 19:38:23 +00:00
David Benjamin
28f035f48b Implement unwind testing for Windows.
Unfortunately, due to most OpenSSL assembly using custom exception
handlers to unwind, most of our assembly doesn't work with
non-destructive unwind. For now, CHECK_ABI behaves like
CHECK_ABI_NO_UNWIND on Windows, and CHECK_ABI_SEH will test unwinding on
both platforms.

The tests do, however, work with the unwind-code-based assembly we
recently added, as well as the clmul-based GHASH which is also
code-based. Remove the ad-hoc SEH tests which intentionally hit memory
access exceptions, now that we can test unwind directly.

Now that we can test it, the next step is to implement SEH directives in
perlasm so writing these unwind codes is less of a chore.

Bug: 259
Change-Id: I23a57a22c5dc9fa4513f575f18192335779678a5
Reviewed-on: https://boringssl-review.googlesource.com/c/34784
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-02-05 19:22:15 +00:00
David Benjamin
e569c7e25d Add ABI testing for 32-bit x86.
This is much less interesting (stack-based parameters, Windows and SysV
match, no SEH concerns as far as I can tell) than x86_64, but it was
easy to do and I'm more familiar with x86 than ARM, so it made a better
second architecture to make sure all the architecture ifdefs worked out.

Also fix a bug in the x86_64 direction flag code. It was shifting in the
wrong direction, making give 0 or 1<<20 rather than 0 or 1.

(Happily, x86_64 appears to be unique in having vastly different calling
conventions between OSs. x86 is the same between SysV and Windows, and
ARM had the good sense to specify a (mostly) common set of rules.)

Since a lot of the assembly functions use the same names and the tests
were written generically, merely dropping in a trampoline and
CallerState implementation gives us a bunch of ABI tests for free.

Change-Id: I15408c18d43e88cfa1c5c0634a8b268a150ed961
Reviewed-on: https://boringssl-review.googlesource.com/c/34624
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2019-01-28 20:40:06 +00:00
David Benjamin
4545503926 Add a constant-time pshufb-based GHASH implementation.
We currently require clmul instructions for constant-time GHASH
on x86_64. Otherwise, it falls back to a variable-time 4-bit table
implementation. However, a significant proportion of clients lack these
instructions.

Inspired by vpaes, we can use pshufb and a slightly different order of
incorporating the bits to make a constant-time GHASH. This requires
SSSE3, which is very common. Benchmarking old machines we had on hand,
it appears to be a no-op on Sandy Bridge and a small slowdown for
Penryn.

Sandy Bridge (Intel Pentium CPU 987 @ 1.50GHz):
(Note: these numbers are before 16-byte-aligning the table. That was an
improvement on Penryn, so it's possible Sandy Bridge is now better.)
Before:
Did 4244750 AES-128-GCM (16 bytes) seal operations in 4015000us (1057222.9 ops/sec): 16.9 MB/s
Did 442000 AES-128-GCM (1350 bytes) seal operations in 4016000us (110059.8 ops/sec): 148.6 MB/s
Did 84000 AES-128-GCM (8192 bytes) seal operations in 4015000us (20921.5 ops/sec): 171.4 MB/s
Did 3349250 AES-256-GCM (16 bytes) seal operations in 4016000us (833976.6 ops/sec): 13.3 MB/s
Did 343500 AES-256-GCM (1350 bytes) seal operations in 4016000us (85532.9 ops/sec): 115.5 MB/s
Did 65250 AES-256-GCM (8192 bytes) seal operations in 4015000us (16251.6 ops/sec): 133.1 MB/s
After:
Did 4229250 AES-128-GCM (16 bytes) seal operations in 4016000us (1053100.1 ops/sec): 16.8 MB/s [-0.4%]
Did 442250 AES-128-GCM (1350 bytes) seal operations in 4016000us (110122.0 ops/sec): 148.7 MB/s [+0.1%]
Did 83500 AES-128-GCM (8192 bytes) seal operations in 4015000us (20797.0 ops/sec): 170.4 MB/s [-0.6%]
Did 3286500 AES-256-GCM (16 bytes) seal operations in 4016000us (818351.6 ops/sec): 13.1 MB/s [-1.9%]
Did 342750 AES-256-GCM (1350 bytes) seal operations in 4015000us (85367.4 ops/sec): 115.2 MB/s [-0.2%]
Did 65250 AES-256-GCM (8192 bytes) seal operations in 4016000us (16247.5 ops/sec): 133.1 MB/s [-0.0%]

Penryn (Intel Core 2 Duo CPU P8600 @ 2.40GHz):
Before:
Did 1179000 AES-128-GCM (16 bytes) seal operations in 1000139us (1178836.1 ops/sec): 18.9 MB/s
Did 97000 AES-128-GCM (1350 bytes) seal operations in 1006347us (96388.2 ops/sec): 130.1 MB/s
Did 18000 AES-128-GCM (8192 bytes) seal operations in 1028943us (17493.7 ops/sec): 143.3 MB/s
Did 977000 AES-256-GCM (16 bytes) seal operations in 1000197us (976807.6 ops/sec): 15.6 MB/s
Did 82000 AES-256-GCM (1350 bytes) seal operations in 1012434us (80992.9 ops/sec): 109.3 MB/s
Did 15000 AES-256-GCM (8192 bytes) seal operations in 1006528us (14902.7 ops/sec): 122.1 MB/s
After:
Did 1306000 AES-128-GCM (16 bytes) seal operations in 1000153us (1305800.2 ops/sec): 20.9 MB/s [+10.8%]
Did 94000 AES-128-GCM (1350 bytes) seal operations in 1009852us (93082.9 ops/sec): 125.7 MB/s [-3.4%]
Did 17000 AES-128-GCM (8192 bytes) seal operations in 1012096us (16796.8 ops/sec): 137.6 MB/s [-4.0%]
Did 1070000 AES-256-GCM (16 bytes) seal operations in 1000929us (1069006.9 ops/sec): 17.1 MB/s [+9.4%]
Did 79000 AES-256-GCM (1350 bytes) seal operations in 1002209us (78825.9 ops/sec): 106.4 MB/s [-2.7%]
Did 15000 AES-256-GCM (8192 bytes) seal operations in 1061489us (14131.1 ops/sec): 115.8 MB/s [-5.2%]

Change-Id: I1c3760a77af7bee4aee3745d1c648d9e34594afb
Reviewed-on: https://boringssl-review.googlesource.com/c/34267
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-24 17:19:21 +00:00
Adam Langley
c1615719ce Add test of assembly code dispatch.
The first attempt involved using Linux's support for hardware
breakpoints to detect when assembly code was run. However, this doesn't
work with SDE, which is a problem.

This version has the assembly code update a global flags variable when
it's run, but only in non-FIPS and non-debug builds.

Update-Note: Assembly files now pay attention to the NDEBUG preprocessor
symbol. Ensure the build passes the symbol in. (If release builds fail
to link due to missing BORINGSSL_function_hit, this is the cause.)

Change-Id: I6b7ced442b7a77d0b4ae148b00c351f68af89a6e
Reviewed-on: https://boringssl-review.googlesource.com/c/33384
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2019-01-22 20:22:53 +00:00
David Benjamin
73b1f181b6 Add ABI tests for GCM.
Change-Id: If28096e677104c6109e31e31a636fee82ef4ba11
Reviewed-on: https://boringssl-review.googlesource.com/c/34266
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-15 22:49:37 +00:00
David Benjamin
b65ce68c8f Test CRYPTO_gcm128_tag in gcm_test.cc.
CRYPTO_gcm128_encrypt should be paired with CRYPTO_gcm128_tag, not
CRYPTO_gcm128_finish.

Change-Id: Ia3023a196fe5b613e9309b5bac19ea849dbc33b7
Reviewed-on: https://boringssl-review.googlesource.com/c/34265
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-15 18:19:57 +00:00
David Benjamin
ce45588695 Speculatively remove __STDC_*_MACROS.
C99 added macros such as PRIu64 to inttypes.h, but it said to exclude them from
C++ unless __STDC_FORMAT_MACROS or __STDC_CONSTANT_MACROS was defined. This
text was never incorporated into any C++ standard and explicitly overruled in
C++11.

Some libc headers followed C99. Notably, glibc prior to 2.18
(https://sourceware.org/bugzilla/show_bug.cgi?id=15366) and old versions of the
Android NDK.

In the NDK, although it was fixed some time ago (API level 20), the NDK used to
use separate headers per API level. Only applications using minSdkVersion >= 20
would get the fix. Starting NDK r14, "unified" headers are available which,
among other things, make the fix available (opt-in) independent of
minSdkVersion. In r15, unified headers are opt-out, and in r16 they are
mandatory.

Try removing these and see if anyone notices. The former is past our five year
watermark. The latter is not and Android has hit
https://boringssl-review.googlesource.com/c/boringssl/+/32686 before, but
unless it is really widespread, it's probably simpler to ask consumers to
define __STDC_CONSTANT_MACROS and __STDC_FORMAT_MACROS globally.

Update-Note: If you see compile failures relating to PRIu64, UINT64_MAX, and
friends, update your glibc or NDK. As a short-term fix, add
__STDC_CONSTANT_MACROS and __STDC_FORMAT_MACROS to your build, but get in touch
so we have a sense of how widespread it is.

Bug: 198
Change-Id: I56cca5f9acdff803de1748254bc45096e4c959c2
Reviewed-on: https://boringssl-review.googlesource.com/c/33146
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-11-14 16:14:37 +00:00
David Benjamin
5ecfb10d54 Modernize OPENSSL_COMPILE_ASSERT, part 2.
The change seems to have stuck, so bring us closer to C/++11 static asserts.

(If we later find we need to support worse toolchains, we can always use
__LINE__ or __COUNTER__ to avoid duplicate typedef names and just punt on
embedding the message into the type name.)

Change-Id: I0e5bb1106405066f07740728e19ebe13cae3e0ee
Reviewed-on: https://boringssl-review.googlesource.com/c/33145
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-11-14 16:06:37 +00:00
Aaron Green
28babde159 Include aes.h in mode/internal.h
block128_f was recently changed to take an AES_KEY instead of a void*,
but AES_KEY is not defined in base.h.  internal.h should not depend on
other sources to include aes.h for it.

Change-Id: I81aab5124ce4397eb76a83ff09779bfaea66d3c1
Reviewed-on: https://boringssl-review.googlesource.com/32364
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-10-03 17:36:04 +00:00
David Benjamin
73535ab252 Fix undefined block128_f, etc., casts.
This one is a little thorny. All the various block cipher modes
functions and callbacks take a void *key. This allows them to be used
with multiple kinds of block ciphers.

However, the implementations of those callbacks are the normal typed
functions, like AES_encrypt. Those take AES_KEY *key. While, at the ABI
level, this is perfectly fine, C considers this undefined behavior.

If we wish to preserve this genericness, we could either instantiate
multiple versions of these mode functions or create wrappers of
AES_encrypt, etc., that take void *key.

The former means more code and is tedious without C++ templates (maybe
someday...). The latter would not be difficult for a compiler to
optimize out. C mistakenly allowed comparing function pointers for
equality, which means a compiler cannot replace pointers to wrapper
functions with the real thing. (That said, the performance-sensitive
bits already act in chunks, e.g. ctr128_f, so the function call overhead
shouldn't matter.)

But our only 128-bit block cipher is AES anyway, so I just switched
things to use AES_KEY throughout. AES is doing fine, and hopefully we
would have the sense not to pair a hypothetical future block cipher with
so many modes!

Change-Id: Ied3e843f0e3042a439f09e655b29847ade9d4c7d
Reviewed-on: https://boringssl-review.googlesource.com/32107
Reviewed-by: Adam Langley <agl@google.com>
2018-10-01 17:35:02 +00:00
David Benjamin
302ef5ee12 Keep the GCM bits in one place.
This avoids needing to duplicate the "This API differs [...]" comment.

Change-Id: If07c77bb66ecdae4e525fa01cc8c762dbacb52f1
Reviewed-on: https://boringssl-review.googlesource.com/32005
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-09-17 22:12:21 +00:00
David Benjamin
580be2b184 Trim 88 bytes from each AES-GCM EVP_AEAD.
EVP_AEAD reused portions of EVP_CIPHER's GCM128_CONTEXT which contains both the
key and intermediate state for each operation. (The legacy OpenSSL EVP_CIPHER
API has no way to store just a key.) Split out a GCM128_KEY and store that
instead.

Change-Id: Ibc550084fa82963d3860346ed26f9cf170dceda5
Reviewed-on: https://boringssl-review.googlesource.com/32004
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-09-17 22:05:51 +00:00
Adam Langley
6410e18e91 Update several assembly files from upstream.
This change syncs several assembly files from upstream. The only meanful
additions are more CFI directives.

Change-Id: I6aec50b6fddbea297b79bae22cfd68d5c115220f
Reviewed-on: https://boringssl-review.googlesource.com/30364
Reviewed-by: Adam Langley <agl@google.com>
2018-08-07 18:57:17 +00:00
Adam Langley
05750f23ae Revert "Revert "Revert "Revert "Make x86(-64) use the same aes_hw_* infrastructure as POWER and the ARMs.""""
This was reverted a second time because it ended up always setting the
final argument to CRYPTO_gcm128_init to zero, which disabled some
acceleration of GCM on ≥Haswell. With this update, that argument will be
set to 1 if |aes_hw_*| functions are being used.

Probably this will need to be reverted too for some reason. I'm hoping
to fill the entire git short description with “Revert”.

Change-Id: Ib4a06f937d35d95affdc0b63f29f01c4a8c47d03
Reviewed-on: https://boringssl-review.googlesource.com/28484
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-14 22:09:29 +00:00
Adam Langley
f64c373784 Fix build with GCC 4.9.2 and -Wtype-limits.
gRPC builds on Debian Jessie, which has GCC 4.9.2, and builds with
-Wtype-limits, which makes it warn about code intended for 64-bit
systems when building on 32-bit systems.

We have tried to avoid these issues with Clang previously by guarding
with “sizeof(size_t) > 4”, but this version of GCC isn't smart enough to
figure that out.

Change-Id: I800ceb3891436fa7c81474ede4b8656021568357
Reviewed-on: https://boringssl-review.googlesource.com/28247
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-05-08 22:21:45 +00:00
David Benjamin
06d467c58a ghashv8-armx.pl: add Qualcomm Kryo results.
(Imported from upstream's 753316232243ccbf86b96c1c51ffcb41651d9ad5.)

Just to sync up a bit further.

Change-Id: I805150d0f0c10d68648fae83603b0d46231ae4ec
Reviewed-on: https://boringssl-review.googlesource.com/27685
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-04-24 19:48:59 +00:00
David Benjamin
a7c8f2b7b0 ghashv8-armvx.pl: Fix various typos.
(Imported from upstream's 46f4e1bec51dc96fa275c168752aa34359d9ee51.)

Change-Id: Ie9c1e9cfc38a3962e3674a68bc0174d064272fc2
Reviewed-on: https://boringssl-review.googlesource.com/27684
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-04-24 19:48:49 +00:00
David Benjamin
085955c567 Actually use the u64 cast.
The point was to remove the silly moduli.

Change-Id: I48c507c9dd1fc46e38e8991ed528b02b8da3dc1d
Reviewed-on: https://boringssl-review.googlesource.com/26044
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-16 20:02:56 +00:00
Steven Valdez
f16cd4278f Add AES_128_CCM AEAD.
Change-Id: I830be64209deada0f24c3b6d50dc86155085c377
Reviewed-on: https://boringssl-review.googlesource.com/25904
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-02-16 15:57:27 +00:00
David Benjamin
6dc994265e Sync up some perlasm license headers and easy fixes.
These files are otherwise up-to-date with OpenSSL master as of
50ea9d2b3521467a11559be41dcf05ee05feabd6, modulo a couple of spelling
fixes which I've imported.

I've also reverted the same-line label and instruction patch to
x86_64-mont*.pl. The new delocate parser handles that fine.

Change-Id: Ife35c671a8104c3cc2fb6c5a03127376fccc4402
Reviewed-on: https://boringssl-review.googlesource.com/25644
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 01:00:35 +00:00
Adam Langley
f8d05579b4 Add ASN1_INTEGET_set_uint64.
Change-Id: I3298875a376c98cbb60deb8c99b9548c84b014df
Reviewed-on: https://boringssl-review.googlesource.com/24484
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-01-02 16:01:31 +00:00
David Benjamin
875095aa7c Silence ARMv8 deprecated IT instruction warnings.
ARMv8 kindly deprecated most of its IT instructions in Thumb mode.
These files are taken from upstream and are used on both ARMv7 and ARMv8
processors. Accordingly, silence the warnings by marking the file as
targetting ARMv7. In other files, they were accidentally silenced anyway
by way of the existing .arch lines.

This can be reproduced by building with the new NDK and passing
-DCMAKE_ASM_FLAGS=-march=armv8-a. Some of our downstream code ends up
passing that to the assembly.

Note this change does not attempt to arrange for ARMv8-A/T32 to get
code which honors the constraints. It only silences the warnings and
continues to give it the same ARMv7-A/Thumb-2 code that backwards
compatibility dictates it continue to run.

Bug: chromium:575886, b/63131949
Change-Id: I24ce0b695942eaac799347922b243353b43ad7df
Reviewed-on: https://boringssl-review.googlesource.com/24166
Reviewed-by: Adam Langley <agl@google.com>
2017-12-14 01:56:22 +00:00
David Benjamin
4358f104cf Remove clang assembler .arch workaround.
This makes it difficult to build against the NDK's toolchain file. The
problem is __clang__ just means Clang is the frontend and implies
nothing about which assembler. When using as, it is fine. When using
clang-as on Linux, one needs a clang-as from this year.

The only places where we case about clang's integrated assembler are iOS
(where perlasm strips out .arch anyway) and build environments like
Chromium which have a regularly-updated clang. Thus we can remove this
now.

Bug: 39
Update-Note: Holler if this breaks the build. If it doesn't break the
   build, you can probably remove any BORINGSSL_CLANG_SUPPORTS_DOT_ARCH
   or explicit -march armv8-a+crypto lines in your BoringSSL build.
Change-Id: I21ce54b14c659830520c2f1d51c7bd13e0980c68
Reviewed-on: https://boringssl-review.googlesource.com/24124
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-12-13 22:22:41 +00:00
David Benjamin
896332581e Appease UBSan on pointer alignment.
Even without strict-aliasing, C does not allow casting pointers to types
that don't match their alignment. After this change, UBSan is happy with
our code at default settings but for the negative left shift language
bug.

Note: architectures without unaligned loads do not generate the same
code for memcpy and pointer casts. But even ARMv6 can perform unaligned
loads and stores (ARMv5 couldn't), so we should be okay here.

Before:
Did 11086000 AES-128-GCM (16 bytes) seal operations in 5000391us (2217026.6 ops/sec): 35.5 MB/s
Did 370000 AES-128-GCM (1350 bytes) seal operations in 5005208us (73923.0 ops/sec): 99.8 MB/s
Did 63000 AES-128-GCM (8192 bytes) seal operations in 5029958us (12525.0 ops/sec): 102.6 MB/s
Did 9894000 AES-256-GCM (16 bytes) seal operations in 5000017us (1978793.3 ops/sec): 31.7 MB/s
Did 316000 AES-256-GCM (1350 bytes) seal operations in 5005564us (63129.7 ops/sec): 85.2 MB/s
Did 54000 AES-256-GCM (8192 bytes) seal operations in 5054156us (10684.3 ops/sec): 87.5 MB/s

After:
Did 11026000 AES-128-GCM (16 bytes) seal operations in 5000197us (2205113.1 ops/sec): 35.3 MB/s
Did 370000 AES-128-GCM (1350 bytes) seal operations in 5005781us (73914.5 ops/sec): 99.8 MB/s
Did 63000 AES-128-GCM (8192 bytes) seal operations in 5032695us (12518.1 ops/sec): 102.5 MB/s
Did 9831750 AES-256-GCM (16 bytes) seal operations in 5000010us (1966346.1 ops/sec): 31.5 MB/s
Did 316000 AES-256-GCM (1350 bytes) seal operations in 5005702us (63128.0 ops/sec): 85.2 MB/s
Did 54000 AES-256-GCM (8192 bytes) seal operations in 5053642us (10685.4 ops/sec): 87.5 MB/s

(Tested with the no-asm builds; most of this code isn't reachable
otherwise.)

Change-Id: I025c365d26491abed0116b0de3b7612159e52297
Reviewed-on: https://boringssl-review.googlesource.com/22804
Reviewed-by: Adam Langley <agl@google.com>
2017-11-10 21:07:03 +00:00
Adam Langley
0967853d68 Add CFI start/end for _aesni_ctr32[_ghash]_6x
These functions don't appear to do any stack manipulation thus all they
need are start/end directives in order for the correct CFI tables to be
emitted.

Change-Id: I4c94a9446030d363fa4bcb7c8975c689df3d21dc
Reviewed-on: https://boringssl-review.googlesource.com/22765
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-11-09 00:31:14 +00:00
Adam Langley
ee2c1f3e68 aesni-gcm-x86_64.pl: sync CFI directives from upstream.
Change-Id: Id70cfc78c8d103117d4c2195206b023a5d51edc3
Reviewed-on: https://boringssl-review.googlesource.com/22764
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-11-09 00:18:23 +00:00
David Benjamin
808f832917 Run the comment converter on libcrypto.
crypto/{asn1,x509,x509v3,pem} were skipped as they are still OpenSSL
style.

Change-Id: I3cd9a60e1cb483a981aca325041f3fbce294247c
Reviewed-on: https://boringssl-review.googlesource.com/19504
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-08-18 21:49:04 +00:00
David Benjamin
d4e37951b4 x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results.
The changes to the assembly files are synced from upstream's
64d92d74985ebb3d0be58a9718f9e080a14a8e7f. cpu-intel.c is translated to C
from that commit and d84df594404ebbd71d21fec5526178d935e4d88d.

Change-Id: I02c8f83aa4780df301c21f011ef2d8d8300e2f2a
Reviewed-on: https://boringssl-review.googlesource.com/18411
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2017-07-26 22:01:37 +00:00
David Benjamin
0a3663a64f ARMv4 assembly pack: harmonize Thumb-ification of iOS build.
Three modules were left behind in
I59df0b567e8e80befe5c399f817d6410ddafc577.

(Imported from upstream's c93f06c12f10c07cea935abd78a07a037e27f155.)

This actually meant functions defined in those two files were
non-functional. I'm guessing no one noticed upstream because, if you go
strictly by iOS compile-time capabilities, all this code is unreachable
on ios32, only ios64.

Change-Id: I55035edf2aebf96d14bdf66161afa2374643d4ec
Reviewed-on: https://boringssl-review.googlesource.com/17113
Reviewed-by: David Benjamin <davidben@google.com>
2017-06-13 17:49:16 +00:00
David Benjamin
f03cdc3a93 Sync ARM assembly up to 609b0852e4d50251857dbbac3141ba042e35a9ae.
This change was made by copying over the files as of that commit and
then discarding the parts of the diff which corresponding to our own
changes.

Change-Id: I28c5d711f7a8cec30749b8174687434129af5209
Reviewed-on: https://boringssl-review.googlesource.com/17111
Reviewed-by: Adam Langley <agl@google.com>
2017-06-13 17:47:20 +00:00
David Benjamin
8da59555c6 ARMv4 assembly pack: allow Thumb2 even in iOS build, and engage it in most modules.
(Imported from upstream's a285992763f3961f69a8d86bf7dfff020a08cef9.)

Change-Id: I59df0b567e8e80befe5c399f817d6410ddafc577
Reviewed-on: https://boringssl-review.googlesource.com/17110
Reviewed-by: Adam Langley <agl@google.com>
2017-06-13 17:47:10 +00:00
David Benjamin
ae96383af3 ARMv4 assembly pack: implement support for Thumb2.
As some of ARM processors, more specifically Cortex-Mx series, are
Thumb2-only, we need to support Thumb2-only builds even in assembly.

(Imported from upstream's 11208dcfb9105e8afa37233185decefd45e89e17.)

Change-Id: I7cb48ce6a842cf3cfdf553f6e6e6227d52d525c0
Reviewed-on: https://boringssl-review.googlesource.com/17108
Reviewed-by: Adam Langley <agl@google.com>
2017-06-13 17:46:35 +00:00
David Benjamin
e2ff2ca0dc Revert "Use unified ARM assembly."
This reverts commit 2cd63877b5. We've
since imported a change to upstream which adds some #defines that should
do the same thing on clang. (Though if gas accepts unified assembly too,
that does seem a better approach. Ah well. Diverging on these files is
expensive.)

This is to reduce the diff and make applying some subsequent changes
easier.

Change-Id: I3f5eae2a71919b291a8de9415b894d8f0c67e3cf
Reviewed-on: https://boringssl-review.googlesource.com/17107
Reviewed-by: Adam Langley <agl@google.com>
2017-06-13 17:45:51 +00:00
David Benjamin
9f579bfe6c Use unions rather than aliasing when possible.
This is less likely to make the compiler grumpy and generates the same
code. (Although this file has worse casts here which I'm still trying to
get the compiler to cooperate on.)

Change-Id: If7ac04c899d2cba2df34eac51d932a82d0c502d9
Reviewed-on: https://boringssl-review.googlesource.com/16986
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-06-08 00:21:18 +00:00
Adam Langley
c5e9ac1cac Move AES-GCM-SIV out from SMALL and handle unaligned keys.
In order to use AES-GCM-SIV in the open-source QUIC boxer, it needs to
be moved out from OPENSSL_SMALL. (Hopefully the linker can still discard
it in the vast majority of cases.)

Additionally, the input to the key schedule function comes from outside
and may not be aligned, thus we need to use unaligned instructions to
read it.

Change-Id: I02c261fe0663d13a96c428174943c7e5ac8415a7
Reviewed-on: https://boringssl-review.googlesource.com/16824
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-06-01 18:45:06 +00:00
David Benjamin
6757fbf8e3 Convert a number of tests to GTest.
BUG=129

Change-Id: Ifcdacb2f5f59fd03b757f88778ceb1e672208fd9
Reviewed-on: https://boringssl-review.googlesource.com/16744
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-06-01 17:02:13 +00:00
David Benjamin
583c12ea97 Remove filename argument to x86 asm_init.
43e5a26b53 removed the .file directive
from x86asm.pl. This removes the parameter from asm_init altogether. See
also upstream's e195c8a2562baef0fdcae330556ed60b1e922b0e.

Change-Id: I65761bc962d09f9210661a38ecf6df23eae8743d
Reviewed-on: https://boringssl-review.googlesource.com/16247
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-05-12 14:58:27 +00:00
David Benjamin
def85b403d Revise OPENSSL_ia32cap_P strategy to avoid TEXTRELs.
OPENSSL_ia32cap_addr avoids any relocations within the module, at the
cost of a runtime TEXTREL, which causes problems in some cases.
(Notably, if someone links us into a binary which uses the GCC "ifunc"
attribute, the loader crashes.)

We add a OPENSSL_ia32cap_addr_delta symbol (which is reachable
relocation-free from the module) stores the difference between
OPENSSL_ia32cap_P and its own address.  Next, reference
OPENSSL_ia32cap_P in code as usual, but always doing LEAQ (or the
equivalent GOTPCREL MOVQ) into a register first. This pattern we can
then transform into a LEAQ and ADDQ on OPENSSL_ia32cap_addr_delta.

ADDQ modifies the FLAGS register, so this is only a safe transformation
if we safe and restore flags first. That, in turn, is only a safe
transformation if code always uses %rsp as a stack pointer (specifically
everything below the stack must be fair game for scribbling over). Linux
delivers signals on %rsp, so this should already be an ABI requirement.
Further, we must clear the red zone (using LEAQ to avoid touching FLAGS)
which signal handlers may not scribble over.

This also fixes the GOTTPOFF logic to clear the red zone.

Change-Id: I4ca6133ab936d5a13d5c8ef265a12ab6bd0073c9
Reviewed-on: https://boringssl-review.googlesource.com/15545
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-04-27 21:07:33 +00:00
David Benjamin
91871018a4 Add an OPENSSL_ia32cap_get() function for C code.
OPENSSL_ia32cap_addr avoids any relocations within the module, at the
cost of a runtime TEXTREL, which causes problems in some cases.
(Notably, if someone links us into a binary which uses the GCC "ifunc"
attribute, the loader crashes.)

Fix C references of OPENSSL_ia32cap_addr with a function. This is
analogous to the BSS getters. A follow-up commit will fix perlasm with a
different scheme which avoids calling into a function (clobbering
registers and complicating unwind directives.)

Change-Id: I09d6cda4cec35b693e16b5387611167da8c7a6de
Reviewed-on: https://boringssl-review.googlesource.com/15525
Reviewed-by: Adam Langley <agl@google.com>
2017-04-27 20:34:23 +00:00
David Benjamin
1997ef22d7 Tidy up aesni_gcm_crypt logic.
CRYPTO_gcm128_init is currently assuming that it gets passed in
aesni_encrypt whenever it selects the AVX implementation. This is true,
but we can easily avoid this assumption by adding an extra boolean
input.

Change-Id: Ie7888323f0c93ff9df8f1cf3ba784fb35bb07076
Reviewed-on: https://boringssl-review.googlesource.com/15370
Reviewed-by: Adam Langley <agl@google.com>
2017-04-21 22:49:04 +00:00
Adam Langley
518ba0772b Switch constant-time functions to using |crypto_word_t|.
Using |size_t| was correct, except for NaCl, which is a 64-bit build
with 32-bit pointers. In that configuration, |size_t| is smaller than
the native word size.

This change adds |crypto_word_t|, an unsigned type with native size and
switches constant-time functions to using it.

Change-Id: Ib275127063d5edbb7c55d413132711b7c74206b0
Reviewed-on: https://boringssl-review.googlesource.com/15325
Reviewed-by: Adam Langley <agl@google.com>
2017-04-21 22:06:05 +00:00
Adam Langley
0648129566 Move modes/ into the FIPS module
The changes to delocate.go are needed because modes/ does things like
return the address of a module function. Both of these need to be
changed from referencing the GOT to using local symbols.

Rather than testing whether |ghash| is |gcm_ghash_avx|, we can just keep
that information in a flag.

The test for |aesni_ctr32_encrypt_blocks| is more problematic, but I
believe that it's superfluous and can be dropped: if you passed in a
stream function that was semantically different from
|aesni_ctr32_encrypt_blocks| you would already have a bug because
|CRYPTO_gcm128_[en|de]crypt_ctr32| will handle a block at the end
themselves, and assume a big-endian, 32-bit counter anyway.

Change-Id: I68a84ebdab6c6006e11e9467e3362d7585461385
Reviewed-on: https://boringssl-review.googlesource.com/15064
Reviewed-by: Adam Langley <agl@google.com>
2017-04-21 17:46:37 +00:00