This imports a fix to x86gas.pl from upstream's
a98c648e40ea5158c8ba29b5a70ccc239d426a20. It's needed to get poly1305-x86.pl
working.
Confirmed that this is a no-op for our current assembly files.
Change-Id: I28de1dbf421b29a06147d1aea3ff3659372a78b3
Reviewed-on: https://boringssl-review.googlesource.com/7210
Reviewed-by: Adam Langley <agl@google.com>
We'd manually marked some of them hidden, but missed some. Do it in the perlasm
driver instead since we will never expose an asm symbol directly. This reduces
some of our divergence from upstream on these files (and indeed we'd
accidentally lose some .hiddens at one point).
BUG=586141
Change-Id: Ie1bfc6f38ba73d33f5c56a8a40c2bf1668562e7e
Reviewed-on: https://boringssl-review.googlesource.com/7140
Reviewed-by: Adam Langley <agl@google.com>
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment the run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change or if developers do or don't have CC variables set.
Previously, all compiler-version-gated features were turned on in
https://boringssl-review.googlesource.com/6260, but this broke the build. I
also wasn't thorough enough in gathering performance numbers. So, flip them all
to off instead. I'll enable them one-by-one as they're tested.
This should result in no change to generated assembly.
Change-Id: Ib4259b3f97adc4939cb0557c5580e8def120d5bc
Reviewed-on: https://boringssl-review.googlesource.com/6383
Reviewed-by: Adam Langley <agl@google.com>
This reverts commit b9c26014de.
The win64 bot seems unhappy. Will sniff at it tomorrow. In
the meantime, get the tree green again.
Change-Id: I058ddb3ec549beee7eabb2f3f72feb0a4a5143b2
Reviewed-on: https://boringssl-review.googlesource.com/6353
Reviewed-by: Adam Langley <alangley@gmail.com>
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment the run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change.
Enable all compiler-version-gated features as they should all be runtime-gated
anyway. This should align with what upstream's files would have produced on
modern toolschains. We should assume our assemblers can take whatever we'd like
to throw at them. (If it turns out some can't, we'd rather find out and
probably switch the problematic instructions to explicit byte sequences.)
This actually results in a fairly significant change to the assembly we
generate. I'm guessing upstream's buildsystem sets the CC environment variable,
while ours doesn't and so the version checks were all coming out conservative.
diffstat of generated files:
linux-x86/crypto/sha/sha1-586.S | 1176 ++++++++++++
linux-x86/crypto/sha/sha256-586.S | 2248 ++++++++++++++++++++++++
linux-x86_64/crypto/bn/rsaz-avx2.S | 1644 +++++++++++++++++
linux-x86_64/crypto/bn/rsaz-x86_64.S | 638 ++++++
linux-x86_64/crypto/bn/x86_64-mont.S | 332 +++
linux-x86_64/crypto/bn/x86_64-mont5.S | 1130 ++++++++++++
linux-x86_64/crypto/modes/aesni-gcm-x86_64.S | 754 ++++++++
linux-x86_64/crypto/modes/ghash-x86_64.S | 475 +++++
linux-x86_64/crypto/sha/sha1-x86_64.S | 1121 ++++++++++++
linux-x86_64/crypto/sha/sha256-x86_64.S | 1062 +++++++++++
linux-x86_64/crypto/sha/sha512-x86_64.S | 2241 ++++++++++++++++++++++++
mac-x86/crypto/sha/sha1-586.S | 1174 ++++++++++++
mac-x86/crypto/sha/sha256-586.S | 2248 ++++++++++++++++++++++++
mac-x86_64/crypto/bn/rsaz-avx2.S | 1637 +++++++++++++++++
mac-x86_64/crypto/bn/rsaz-x86_64.S | 638 ++++++
mac-x86_64/crypto/bn/x86_64-mont.S | 331 +++
mac-x86_64/crypto/bn/x86_64-mont5.S | 1130 ++++++++++++
mac-x86_64/crypto/modes/aesni-gcm-x86_64.S | 750 ++++++++
mac-x86_64/crypto/modes/ghash-x86_64.S | 475 +++++
mac-x86_64/crypto/sha/sha1-x86_64.S | 1121 ++++++++++++
mac-x86_64/crypto/sha/sha256-x86_64.S | 1062 +++++++++++
mac-x86_64/crypto/sha/sha512-x86_64.S | 2241 ++++++++++++++++++++++++
win-x86/crypto/sha/sha1-586.asm | 1173 ++++++++++++
win-x86/crypto/sha/sha256-586.asm | 2248 ++++++++++++++++++++++++
win-x86_64/crypto/bn/rsaz-avx2.asm | 1858 +++++++++++++++++++-
win-x86_64/crypto/bn/rsaz-x86_64.asm | 638 ++++++
win-x86_64/crypto/bn/x86_64-mont.asm | 352 +++
win-x86_64/crypto/bn/x86_64-mont5.asm | 1184 ++++++++++++
win-x86_64/crypto/modes/aesni-gcm-x86_64.asm | 933 ++++++++++
win-x86_64/crypto/modes/ghash-x86_64.asm | 515 +++++
win-x86_64/crypto/sha/sha1-x86_64.asm | 1152 ++++++++++++
win-x86_64/crypto/sha/sha256-x86_64.asm | 1088 +++++++++++
win-x86_64/crypto/sha/sha512-x86_64.asm | 2499 ++++++
SHA* gets faster. RSA and AES-GCM seem to be more of a wash and even slower
sometimes! This is a little concerning. Though when I repeated the latter two,
it's definitely noisy (RSA in particular), so we may wish to repeat in a more
controlled environment. We could also flip some of these toggles to something
other than the highest setting if it seems some of the variants aren't
desirable. We just shouldn't have them enabled or disabled on accident. This
aligns us closer to upstream though.
$ /tmp/bssl.old speed SHA-
Did 5028000 SHA-1 (16 bytes) operations in 1000048us (5027758.7 ops/sec): 80.4 MB/s
Did 1708000 SHA-1 (256 bytes) operations in 1000257us (1707561.2 ops/sec): 437.1 MB/s
Did 73000 SHA-1 (8192 bytes) operations in 1008406us (72391.5 ops/sec): 593.0 MB/s
Did 3041000 SHA-256 (16 bytes) operations in 1000311us (3040054.5 ops/sec): 48.6 MB/s
Did 779000 SHA-256 (256 bytes) operations in 1000820us (778361.7 ops/sec): 199.3 MB/s
Did 26000 SHA-256 (8192 bytes) operations in 1009875us (25745.8 ops/sec): 210.9 MB/s
Did 1837000 SHA-512 (16 bytes) operations in 1000251us (1836539.0 ops/sec): 29.4 MB/s
Did 803000 SHA-512 (256 bytes) operations in 1000969us (802222.6 ops/sec): 205.4 MB/s
Did 41000 SHA-512 (8192 bytes) operations in 1016768us (40323.8 ops/sec): 330.3 MB/s
$ /tmp/bssl.new speed SHA-
Did 5354000 SHA-1 (16 bytes) operations in 1000104us (5353443.2 ops/sec): 85.7 MB/s
Did 1779000 SHA-1 (256 bytes) operations in 1000121us (1778784.8 ops/sec): 455.4 MB/s
Did 87000 SHA-1 (8192 bytes) operations in 1012641us (85914.0 ops/sec): 703.8 MB/s
Did 3517000 SHA-256 (16 bytes) operations in 1000114us (3516599.1 ops/sec): 56.3 MB/s
Did 935000 SHA-256 (256 bytes) operations in 1000096us (934910.2 ops/sec): 239.3 MB/s
Did 38000 SHA-256 (8192 bytes) operations in 1004476us (37830.7 ops/sec): 309.9 MB/s
Did 2930000 SHA-512 (16 bytes) operations in 1000259us (2929241.3 ops/sec): 46.9 MB/s
Did 1008000 SHA-512 (256 bytes) operations in 1000509us (1007487.2 ops/sec): 257.9 MB/s
Did 45000 SHA-512 (8192 bytes) operations in 1000593us (44973.3 ops/sec): 368.4 MB/s
$ /tmp/bssl.old speed RSA
Did 820 RSA 2048 signing operations in 1017008us (806.3 ops/sec)
Did 27000 RSA 2048 verify operations in 1015400us (26590.5 ops/sec)
Did 1292 RSA 2048 (3 prime, e=3) signing operations in 1008185us (1281.5 ops/sec)
Did 65000 RSA 2048 (3 prime, e=3) verify operations in 1011388us (64268.1 ops/sec)
Did 120 RSA 4096 signing operations in 1061027us (113.1 ops/sec)
Did 8208 RSA 4096 verify operations in 1002717us (8185.8 ops/sec)
$ /tmp/bssl.new speed RSA
Did 760 RSA 2048 signing operations in 1003351us (757.5 ops/sec)
Did 25900 RSA 2048 verify operations in 1028931us (25171.8 ops/sec)
Did 1320 RSA 2048 (3 prime, e=3) signing operations in 1040806us (1268.2 ops/sec)
Did 63000 RSA 2048 (3 prime, e=3) verify operations in 1016042us (62005.3 ops/sec)
Did 104 RSA 4096 signing operations in 1008718us (103.1 ops/sec)
Did 6875 RSA 4096 verify operations in 1093441us (6287.5 ops/sec)
$ /tmp/bssl.old speed GCM
Did 5316000 AES-128-GCM (16 bytes) seal operations in 1000082us (5315564.1 ops/sec): 85.0 MB/s
Did 712000 AES-128-GCM (1350 bytes) seal operations in 1000252us (711820.6 ops/sec): 961.0 MB/s
Did 149000 AES-128-GCM (8192 bytes) seal operations in 1003182us (148527.4 ops/sec): 1216.7 MB/s
Did 5919750 AES-256-GCM (16 bytes) seal operations in 1000016us (5919655.3 ops/sec): 94.7 MB/s
Did 800000 AES-256-GCM (1350 bytes) seal operations in 1000951us (799239.9 ops/sec): 1079.0 MB/s
Did 152000 AES-256-GCM (8192 bytes) seal operations in 1000765us (151883.8 ops/sec): 1244.2 MB/s
$ /tmp/bssl.new speed GCM
Did 5315000 AES-128-GCM (16 bytes) seal operations in 1000125us (5314335.7 ops/sec): 85.0 MB/s
Did 755000 AES-128-GCM (1350 bytes) seal operations in 1000878us (754337.7 ops/sec): 1018.4 MB/s
Did 151000 AES-128-GCM (8192 bytes) seal operations in 1005655us (150150.9 ops/sec): 1230.0 MB/s
Did 5913500 AES-256-GCM (16 bytes) seal operations in 1000041us (5913257.6 ops/sec): 94.6 MB/s
Did 782000 AES-256-GCM (1350 bytes) seal operations in 1001484us (780841.2 ops/sec): 1054.1 MB/s
Did 121000 AES-256-GCM (8192 bytes) seal operations in 1006389us (120231.8 ops/sec): 984.9 MB/s
Change-Id: I0efb32f896c597abc7d7e55c31d038528a5c72a1
Reviewed-on: https://boringssl-review.googlesource.com/6260
Reviewed-by: Adam Langley <alangley@gmail.com>
This change causes each global arm or aarch64 asm function to be put
into its own section by default. This matches the behaviour of the
-ffunction-sections option to GCC and allows the --gc-sections option to
the linker to discard unused asm functions on a function-by-function
basis.
Sometimes several asm functions will share the same data an, in that
situation, the data is put into the section of one of the functions and
the section of the other function is merged with the added
“.global_with_section” directive.
Change-Id: I12c9b844d48d104d28beb816764358551eac4456
Reviewed-on: https://boringssl-review.googlesource.com/6003
Reviewed-by: Adam Langley <agl@google.com>
This change causes the generated assembly files for ARM and AArch64 to
have #if guards for __arm__ and __aarch64__, respectively. Since
building on ARM is only supported for Linux, we only have to worry about
GCC/Clang's predefines.
Change-Id: I7198eab6230bcfc26257f0fb6a0cc3166df0bb29
Reviewed-on: https://boringssl-review.googlesource.com/5173
Reviewed-by: Adam Langley <agl@google.com>
(Imported from upstream's 7b644df899d0c818488686affc0bfe2dfdd0d0c2)
Looking at update_gypi_and_asm.py with git diff -w, the only differences seem
to be that .asciz fixed a bug where a space after a ',' got swallowed (sigh).
BUG=338886
Change-Id: Ib52296f4a62bc6f892a0d4ee7367493a8c639a3b
Reviewed-on: https://boringssl-review.googlesource.com/4480
Reviewed-by: Adam Langley <agl@google.com>
This is as partial import of upstream's
9b05cbc33e7895ed033b1119e300782d9e0cf23c. It includes the perlasm changes, but
not the CPU feature detection bits as we do those differently. This is largely
so we don't diverge from upstream, but it'll help with iOS assembly in the
future.
sha512-armv8.pl is modified slightly from upstream to switch from conditioning
on the output file to conditioning on an extra argument. This makes our
previous change from upstream (removing the 'open STDOUT' line) more explicit.
BUG=338886
Change-Id: Ic8ca1388ae20e94566f475bad3464ccc73f445df
Reviewed-on: https://boringssl-review.googlesource.com/4405
Reviewed-by: Adam Langley <agl@google.com>
Change-Id: I582eaa2ff922bbf1baf298a5c6857543524a8d4e
Reviewed-on: https://boringssl-review.googlesource.com/3810
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
It's unclear why .extern was being suppressed, it's also a little
unclear how the Chromium build was working without this. None the less,
it's causing problems with Android and it's more obviously correct to
make these symbols as hidden.
Change-Id: Id13ec238b80b8bd08d8ae923ac659835450e77f8
Reviewed-on: https://boringssl-review.googlesource.com/3800
Reviewed-by: Adam Langley <agl@google.com>
Though this doesn't mean that masm becomes supported, the script is
still provided on don't-ask-in-case-of-doubt-use-nasm basis.
See RT#3650 for background.
(Imported from upstream's 2f8d82d6418c4de8330e2870c1ca6386dc9e1b34)
The data_word changes were already fixed with our
3e700bb3e8, but best to avoid diverging there.
Change-Id: Iab5455534e8bd632fb2b247ff792d411b105f17a
Reviewed-on: https://boringssl-review.googlesource.com/3581
Reviewed-by: Adam Langley <agl@google.com>
Although x86masm.pl exists, upstream's documentation suggest only x86nasm.pl is
supported. Yasm seems to handle it fine with a small change.
Change-Id: Ia77be57c6b743527225924b2b398f2f07a084a7f
Reviewed-on: https://boringssl-review.googlesource.com/2092
Reviewed-by: Adam Langley <agl@google.com>
We were building the NASM flavor with MASM which is why it didn't work. Get the
MASM output working: cpuid and cmove are not available in MASM unless the file
declares .686. Also work around MASM rejecting a very long line in SHA-256.
The follow-up change will get the NASM flavor working. We should probably use
that one as it's documented as supported upstream. But let's make this one
functional too.
Change-Id: Ica69cc042a7250c7bc9ba9325caab597cd4ce616
Reviewed-on: https://boringssl-review.googlesource.com/2091
Reviewed-by: Adam Langley <agl@google.com>
Windows doesn't have ssize_t, sadly. There's SSIZE_T, but defining an
OPENSSL_SSIZE_T seems worse than just using an int.
Change-Id: I09bb5aa03f96da78b619e551f92ed52ce24d9f3f
Reviewed-on: https://boringssl-review.googlesource.com/1352
Reviewed-by: Adam Langley <agl@google.com>
This change marks public symbols as dynamically exported. This means
that it becomes viable to build a shared library of libcrypto and libssl
with -fvisibility=hidden.
On Windows, one not only needs to mark functions for export in a
component, but also for import when using them from a different
component. Because of this we have to build with
|BORINGSSL_IMPLEMENTATION| defined when building the code. Other
components, when including our headers, won't have that defined and then
the |OPENSSL_EXPORT| tag becomes an import tag instead. See the #defines
in base.h
In the asm code, symbols are now hidden by default and those that need
to be exported are wrapped by a C function.
In order to support Chromium, a couple of libssl functions were moved to
ssl.h from ssl_locl.h: ssl_get_new_session and ssl_update_cache.
Change-Id: Ib4b76e2f1983ee066e7806c24721e8626d08a261
Reviewed-on: https://boringssl-review.googlesource.com/1350
Reviewed-by: Adam Langley <agl@google.com>
Appease the Chromium build on OS X.
Change-Id: Idb7466b4d3e4cc9161cd09066b2f79a6290838b1
Reviewed-on: https://boringssl-review.googlesource.com/1240
Reviewed-by: Adam Langley <agl@google.com>
Initial fork from f2d678e6e89b6508147086610e985d4e8416e867 (1.0.2 beta).
(This change contains substantial changes from the original and
effectively starts a new history.)