This addresses
- request for improvement for faster key setup in RT#3576;
- clearing registers and stack in RT#3554 (this is more of a gesture to
see if there will be some traction from compiler side);
- more commentary around input parameters handling and stack layout
(desired when RT#3553 was reviewed);
- minor size and single block performance optimization (was lying around);
(Imported from upstream's 23f6eec71dbd472044db7dc854599f1de14a1f48)
This one is best reviewed by verifying that
23f6eec71dbd472044db7dc854599f1de14a1f48^ in upstream has the exact same
versions of these files (we had no local diffs), so we can just copy them
wholesale.
bssl speed reports a wash on my Mac. If I keep running it, different ones win
each time.
Change-Id: I729bd39cf0b3a30cc24de839e1c734dcaef972b8
Reviewed-on: https://boringssl-review.googlesource.com/4491
Reviewed-by: Adam Langley <agl@google.com>
XTS bug spotted and fix suggested by Adrian Kotelba.
(Imported from upstream's e620e5ae37bc3fc5e457ebf3edcdd01b20f8c5dd.)
Another patch we missed.
Change-Id: Ibea40eeec01a49b29064b14631706756795c9592
Reviewed-on: https://boringssl-review.googlesource.com/4489
Reviewed-by: Adam Langley <agl@google.com>
This facilitates "universal" builds, ones that target multiple
architectures, e.g. ARMv5 through ARMv7.
(Imported from upstream's c1669e1c205dc8e695fb0c10a655f434e758b9f7)
This is a change from a while ago which was a source of divergence between our
perlasm and upstream's. This change in upstream came with the following comment
in Configure:
Note that -march is not among compiler options in below linux-armv4
target line. Not specifying one is intentional to give you choice to:
a) rely on your compiler default by not specifying one;
b) specify your target platform explicitly for optimal performance,
e.g. -march=armv6 or -march=armv7-a;
c) build "universal" binary that targets *range* of platforms by
specifying minimum and maximum supported architecture;
As for c) option. It actually makes no sense to specify maximum to be
less than ARMv7, because it's the least requirement for run-time
switch between platform-specific code paths. And without run-time
switch performance would be equivalent to one for minimum. Secondly,
there are some natural limitations that you'd have to accept and
respect. Most notably you can *not* build "universal" binary for
big-endian platform. This is because ARMv7 processor always picks
instructions in little-endian order. Another similar limitation is
that -mthumb can't "cross" -march=armv6t2 boundary, because that's
where it became Thumb-2. Well, this limitation is a bit artificial,
because it's not really impossible, but it's deemed too tricky to
support. And of course you have to be sure that your binutils are
actually up to the task of handling maximum target platform.
Change-Id: Ie5f674d603393f0a1354a0d0973987484a4a650c
Reviewed-on: https://boringssl-review.googlesource.com/4488
Reviewed-by: Adam Langley <agl@google.com>
Pointer out and suggested by: Ard Biesheuvel.
(Imported from upstream's 5dcf70a1c57c2019bfad640fe14fd4a73212860a)
This is from a while ago, but it's one source of divergence between our copy of
these files and master's.
Change-Id: I6525a27f25eb86a92420c32996af47ecc42ee020
Reviewed-on: https://boringssl-review.googlesource.com/4487
Reviewed-by: Adam Langley <agl@google.com>
(Imported from upstream's 7eeeb49e1103533bc81c234eb19613353866e474)
Here are the performance numbers on a Nexus 9 (32-bit binary):
Before:
Did 4376000 AES-128-GCM (16 bytes) seal operations in 1000016us (4375930.0 ops/sec): 70.0 MB/s
Did 642000 AES-128-GCM (1350 bytes) seal operations in 1001090us (641301.0 ops/sec): 865.8 MB/s
Did 126000 AES-128-GCM (8192 bytes) seal operations in 1001460us (125816.3 ops/sec): 1030.7 MB/s
Did 4120000 AES-256-GCM (16 bytes) seal operations in 1000004us (4119983.5 ops/sec): 65.9 MB/s
Did 547000 AES-256-GCM (1350 bytes) seal operations in 1001165us (546363.5 ops/sec): 737.6 MB/s
Did 99000 AES-256-GCM (8192 bytes) seal operations in 1000027us (98997.3 ops/sec): 811.0 MB/s
After:
Did 4569000 AES-128-GCM (16 bytes) seal operations in 1000011us (4568949.7 ops/sec): 73.1 MB/s
Did 796000 AES-128-GCM (1350 bytes) seal operations in 1000161us (795871.9 ops/sec): 1074.4 MB/s
Did 162000 AES-128-GCM (8192 bytes) seal operations in 1003828us (161382.2 ops/sec): 1322.0 MB/s
Did 4398000 AES-256-GCM (16 bytes) seal operations in 1000001us (4397995.6 ops/sec): 70.4 MB/s
Did 634000 AES-256-GCM (1350 bytes) seal operations in 1001290us (633183.2 ops/sec): 854.8 MB/s
Did 122000 AES-256-GCM (8192 bytes) seal operations in 1005650us (121314.6 ops/sec): 993.8 MB/s
Change-Id: I2fef921069ad174f5651dfe59be262625fb3f7c9
Reviewed-on: https://boringssl-review.googlesource.com/4483
Reviewed-by: Adam Langley <agl@google.com>
ARM has optimized Cortex-A5x pipeline to favour pairs of complementary
AES instructions. While modified code improves performance of post-r0p0
Cortex-A53 performance by >40% (for CBC decrypt and CTR), it hurts
original r0p0. We favour later revisions, because one can't prevent
future from coming. Improvement on post-r0p0 Cortex-A57 exceeds 50%,
while new code is not slower on r0p0, or Apple A7 for that matter.
[Update even SHA results for latest Cortex-A53.]
(Imported from upstream's 94376cccb4ed5b376220bffe0739140ea9dad8c8)
Change-Id: I581c65b566116b1f4211fb1bd5a1a54479889d70
Reviewed-on: https://boringssl-review.googlesource.com/4481
Reviewed-by: Adam Langley <agl@google.com>
(Imported from upstream's 7b644df899d0c818488686affc0bfe2dfdd0d0c2)
Looking at update_gypi_and_asm.py with git diff -w, the only differences seem
to be that .asciz fixed a bug where a space after a ',' got swallowed (sigh).
BUG=338886
Change-Id: Ib52296f4a62bc6f892a0d4ee7367493a8c639a3b
Reviewed-on: https://boringssl-review.googlesource.com/4480
Reviewed-by: Adam Langley <agl@google.com>
MSVC seems to dislike the zero-array trick in C++, but not C. Turns out there
was no need for the include, so that's an easy fix.
Change-Id: I6def7b430a450c4ff7eeafa3611f0d40f5fc5945
Reviewed-on: https://boringssl-review.googlesource.com/4580
Reviewed-by: Adam Langley <agl@google.com>
RFC 5915 requires the use of the I2OSP primitive as defined in RFC 3447
for encoding ECPrivateKey. Fix this and add a test.
See also upstream's 30cd4ff294252c4b6a4b69cbef6a5b4117705d22, though it mixes
up degree and order.
Change-Id: I81ba14da3c8d69e3799422c669fab7f16956f322
Reviewed-on: https://boringssl-review.googlesource.com/4469
Reviewed-by: Adam Langley <agl@google.com>
Follow-up to sha256-armv4.pl in cooperation with Ard Biesheuvel
(Linaro) and Sami Tolvanen (Google).
(Imported from upstream's b1a5d1c652086257930a1f62ae51c9cdee654b2c.)
Change-Id: Ibc4f289cc8f499924ade8d6b8d494f53bc08bda7
Reviewed-on: https://boringssl-review.googlesource.com/4467
Reviewed-by: Adam Langley <agl@google.com>
(Imported from upstream's 51f8d095562f36cdaa6893597b5c609e943b0565.)
I don't see why we'd care, but just to minimize divergence.
Change-Id: I4b07e72c88fcb04654ad28d8fd371e13d59a61b5
Reviewed-on: https://boringssl-review.googlesource.com/4466
Reviewed-by: Adam Langley <agl@google.com>
This is a really dumb API wart. Now that we have a limited set of curves that
are all reasonable, the automatic logic should just always kick in. This makes
set_ecdh_auto a no-op and, instead of making it the first choice, uses it as
the fallback behavior should none of the older curve selection APIs be used.
Currently, by default, server sockets can only use the plain RSA key exchange.
BUG=481139
Change-Id: Iaabc82de766cd00968844a71aaac29bd59841cd4
Reviewed-on: https://boringssl-review.googlesource.com/4531
Reviewed-by: Adam Langley <agl@google.com>
See upstream's bd891f098bdfcaa285c073ce556d0f5e27ec3a10. It honestly seems
kinda dumb for a client to do this, but apparently the spec allows this.
Judging by code inspection, OpenSSL 1.0.1 also allowed this, so this avoids a
behavior change when switching from 1.0.1 to BoringSSL.
Add a test for this, which revealed that, unlike upstream's version, this
actually works with ecdh_auto since tls1_get_shared_curve also needs updating.
(To be mentioned in newsletter.)
Change-Id: Ie622700f17835965457034393b90f346740cfca8
Reviewed-on: https://boringssl-review.googlesource.com/4464
Reviewed-by: Adam Langley <agl@google.com>
In cooperation with Ard Biesheuvel (Linaro) and Sami Tolvanen (Google).
(Imported from upstream's 2ecd32a1f8f0643ae7b38f59bbaf9f0d6ef326fe)
Change-Id: Iac5853220654b6ef4cb3bb7f8d1efe0eb2ecf634
Reviewed-on: https://boringssl-review.googlesource.com/4463
Reviewed-by: Adam Langley <agl@google.com>
Because RFC 6066 is obnoxious like that and IIS servers actually do this
when OCSP-stapling is configured, but the OCSP server cannot be reached.
BUG=478947
Change-Id: I3d34c1497e0b6b02d706278dcea5ceb684ff60ae
Reviewed-on: https://boringssl-review.googlesource.com/4461
Reviewed-by: Adam Langley <agl@google.com>
054e682675 removed the compatibility include of
mem.h in crypto.h. mem.h doesn't exist in upstream which defines these
functions in crypto.h instead. The compatibility include should probably be
restored to avoid causing all kinds of grief when porting consumers over.
Change-Id: Idfe0f9b43ebee5df22bebfe0ed6dc85ec98b4de0
Reviewed-on: https://boringssl-review.googlesource.com/4530
Reviewed-by: Adam Langley <agl@google.com>
The 2 arg OpenSSL sk_find() returned -1 on error and >= 0 on
success. BoringSSL's 3 arg sk_find() returns -1 if the sk argument
is NULL, 0 if the item is not found, and 1 if found.
In practice, all callers of the sk_find() macros in BoringSSL only
check for zero/non-zero. If sk is ever NULL, it looks like most
callers are going to use uninitialized data as the index because
the return value check is insufficient.
Change-Id: I640089a0f4044aaa8d50178b2aecd9c3c1fe2f9c
Reviewed-on: https://boringssl-review.googlesource.com/4500
Reviewed-by: Adam Langley <agl@google.com>
See upstream's a0eed48d37a4b7beea0c966caf09ad46f4a92a44. Rather than import
that, we should just ensure neg + zero isn't a possible state.
Add some tests for asc2bn and dec2bn while we're here. Also fix a bug with
dec2bn where it doesn't actually ignore trailing data as it's supposed to.
Change-Id: I2385b67b740e57020c75a247bee254085ab7ce15
Reviewed-on: https://boringssl-review.googlesource.com/4484
Reviewed-by: Adam Langley <agl@google.com>
arm-xlate.pl conditions some things on the flavour matching /linux/. This
change will need to be mirrored in update_gypi_and_asm.py.
Change-Id: I60483aaf40fd13181173373f12f6d3651a2a8a0c
Reviewed-on: https://boringssl-review.googlesource.com/4460
Reviewed-by: Adam Langley <agl@google.com>
This is as partial import of upstream's
9b05cbc33e7895ed033b1119e300782d9e0cf23c. It includes the perlasm changes, but
not the CPU feature detection bits as we do those differently. This is largely
so we don't diverge from upstream, but it'll help with iOS assembly in the
future.
sha512-armv8.pl is modified slightly from upstream to switch from conditioning
on the output file to conditioning on an extra argument. This makes our
previous change from upstream (removing the 'open STDOUT' line) more explicit.
BUG=338886
Change-Id: Ic8ca1388ae20e94566f475bad3464ccc73f445df
Reviewed-on: https://boringssl-review.googlesource.com/4405
Reviewed-by: Adam Langley <agl@google.com>
See tools/clang/scripts/update.sh. This'll be used to run ASan on the bots.
BUG=469928
Change-Id: I6b5093c2db21ad4ed742852944e77a6b32e29e29
Reviewed-on: https://boringssl-review.googlesource.com/4402
Reviewed-by: Adam Langley <agl@google.com>
See upstream's 3ae91cfb327c9ed689b9aaf7bca01a3f5a0657cb.
I misread that code and thought it was allowing empty cipher suites when there
*is* a session ID, but it was allowing them when there isn't. Which doesn't
make much sense because it'll get rejected later anyway. (Verified by toying
with handshake_client.go.)
Change-Id: Ia870a1518bca36fce6f3018892254f53ab49f460
Reviewed-on: https://boringssl-review.googlesource.com/4401
Reviewed-by: Adam Langley <agl@google.com>
CRYPTO_MUTEX was the wrong size. Fortunately, Apple was kind enough to define
pthread_rwlock_t unconditionally, so we can be spared fighting with feature
macros. Some of the stdlib.h removals were wrong and clang is pick about
multiply-defined typedefs. Apparently that's a C11 thing?
BUG=478598
Change-Id: Ibdcb8de9e5d83ca28e4c55b2979177d1ef0f9721
Reviewed-on: https://boringssl-review.googlesource.com/4404
Reviewed-by: Adam Langley <agl@google.com>
Change-Id: I489d00bc4ee22a5ecad75dc1eb84776f044566e5
Reviewed-on: https://boringssl-review.googlesource.com/4391
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
This is taken from upstream, although it originally came from us. This
will only take effect on 64-bit systems (x86-64 and aarch64).
Before:
Did 1496 ECDH P-256 operations in 1038743us (1440.2 ops/sec)
Did 2783 ECDSA P-256 signing operations in 1081006us (2574.5 ops/sec)
Did 2400 ECDSA P-256 verify operations in 1059508us (2265.2 ops/sec)
After:
Did 4147 ECDH P-256 operations in 1061723us (3905.9 ops/sec)
Did 9372 ECDSA P-256 signing operations in 1040589us (9006.4 ops/sec)
Did 4114 ECDSA P-256 verify operations in 1063478us (3868.4 ops/sec)
Change-Id: I11fabb03239cc3a7c4a97325ed4e4c97421f91a9
We don't support the SSL BIO so this is a no-op change.
Change-Id: Iba9522b837ebb0eb6adc80d5df6dcac99abf2552
Reviewed-on: https://boringssl-review.googlesource.com/4360
Reviewed-by: Adam Langley <agl@google.com>
Instead, each module defines a static CRYPTO_EX_DATA_CLASS to hold the values.
This makes CRYPTO_cleanup_all_ex_data a no-op as spreading the
CRYPTO_EX_DATA_CLASSes across modules (and across crypto and ssl) makes cleanup
slightly trickier. We can make it do something if needbe, but it's probably not
worth the trouble.
Change-Id: Ib6f6fd39a51d8ba88649f0fa29c66db540610c76
Reviewed-on: https://boringssl-review.googlesource.com/4375
Reviewed-by: Adam Langley <agl@google.com>
No functions for using it were ever added.
Change-Id: Iaee6e5bc8254a740435ccdcdbd715b851d8a0dce
Reviewed-on: https://boringssl-review.googlesource.com/4374
Reviewed-by: Adam Langley <agl@google.com>
No wrappers were ever added and codesearch confirms no one ever added to it
manually. Probably anyone doing complex things with BIOs just made a custom
BIO_METHOD. We can put it back with proper functions if the need ever arises.
Change-Id: Icb5da7ceeb8f1da6d08f4a8854d53dfa75827d9c
Reviewed-on: https://boringssl-review.googlesource.com/4373
Reviewed-by: Adam Langley <agl@google.com>
Callers are required to use the wrappers now. They still need OPENSSL_EXPORT
since crypto and ssl get built separately in the standalone shared library
build.
Change-Id: I61186964e6099b9b589c4cd45b8314dcb2210c89
Reviewed-on: https://boringssl-review.googlesource.com/4372
Reviewed-by: Adam Langley <agl@google.com>
It's unused and requires ex_data support a class number per type.
Change-Id: Ie1fb55053631ef00c3318f3253f7c9501988f522
Reviewed-on: https://boringssl-review.googlesource.com/4371
Reviewed-by: Adam Langley <agl@google.com>
This is never used and we can make the built-in one performant.
Change-Id: I6fc7639ba852349933789e73762bc3fa1341b2ff
Reviewed-on: https://boringssl-review.googlesource.com/4370
Reviewed-by: Adam Langley <agl@google.com>