(Imports upstream's ace05265d2d599e350cf84ed60955b7f2b173bc9.)
Change-Id: I151a03d662f7effe87f22fd9db7e0265368798b8
Reviewed-on: https://boringssl-review.googlesource.com/13774
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
(Imports upstream's 6025001707fd65679d758c877200469d4e72ea88.)
Change-Id: I2f237d675b029cfc7ba3640aa9ce7248cc230013
Reviewed-on: https://boringssl-review.googlesource.com/13773
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
(Imports upstream's b7f5503fa6e1feebec2ac12b8ddcb5b5672452a6.)
Change-Id: Ia8d2a8f71c97265d77ef8f6fc3cdfb7cf411c5ce
Reviewed-on: https://boringssl-review.googlesource.com/13772
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Upstream did this in 609b0852e4d50251857dbbac3141ba042e35a9ae and it's
easier to apply patches if we do the same.
Change-Id: I5142693ed1e26640987ff16f5ea510e81bba200e
Reviewed-on: https://boringssl-review.googlesource.com/13771
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Most C standard library functions are undefined if passed NULL, even
when the corresponding length is zero. This gives them (and, in turn,
all functions which call them) surprising behavior on empty arrays.
Some compilers will miscompile code due to this rule. See also
https://www.imperialviolet.org/2016/06/26/nonnull.html
Add OPENSSL_memcpy, etc., wrappers which avoid this problem.
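A minimal sketch of the wrapper idea, assuming the helper only needs to skip
the libc call when the length is zero. The name sketch_memcpy is hypothetical
and the real OPENSSL_memcpy may differ in detail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration, not the actual OPENSSL_memcpy. memcpy(dst, src, 0)
 * is undefined when either pointer is NULL, so return before calling it. */
static inline void *sketch_memcpy(void *dst, const void *src, size_t n) {
  if (n == 0) {
    return dst;
  }
  return memcpy(dst, src, n);
}
```

With this shape, callers that pass an empty (possibly NULL) buffer never reach
the libc function at all.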
BUG=23
Change-Id: I95f42b23e92945af0e681264fffaf578e7f8465e
Reviewed-on: https://boringssl-review.googlesource.com/12928
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
This change imports sha256-armv4.pl from upstream at rev 8d1ebff4. This
includes changes to remove the use of adrl, which is not supported by
Clang.
Change-Id: I429e7051d63b59acad21601e40883fc3bd8dd2f5
Reviewed-on: https://boringssl-review.googlesource.com/12480
Commit-Queue: Adam Langley <alangley@gmail.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
This prevents a compiler warning from breaking the ppc64le build.
Change-Id: I6752109bd02c6d078e656f89327093f8fb13a125
Reviewed-on: https://boringssl-review.googlesource.com/12363
Reviewed-by: Adam Langley <agl@google.com>
This change contains a C implementation of SHA-1 for POWER using
AltiVec. It is almost as fast as the scalar-only assembly implementation
for POWER/POWERPC family in OpenSSL but it is easier to maintain and it
allows error checking with tools like ASAN.
This is tested only for ppc64le. It may or may not work on other
platforms in the POWER/POWERPC family.
Before:
SHA-1 @ 16 bytes: ~30 MB/s
SHA-1 @ 8K: ~140 MB/s
After:
SHA-1 @ 16 bytes: ~70 MB/s
SHA-1 @ 8K: ~480 MB/s
Change-Id: I790352e86d9c0cc4e1e57d11c5a0aa5b0780ca6b
Reviewed-on: https://boringssl-review.googlesource.com/12203
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Macros need a healthy dose of parentheses to avoid expression-level
misparses. Most of this comes from the clang-tidy CL here:
https://android-review.googlesource.com/c/235696/
Also switch most of the macros to use do { ... } while (0) to avoid all
the excessive comma operators and statement-level misparses.
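A small sketch of both fixes, using made-up macros rather than the ones touched
by this change. All SKETCH_ names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical macros for illustration only; not taken from the tree. */

/* Without parentheses, SKETCH_DOUBLE_BAD(1 + 2) expands to 1 + 2 * 2. */
#define SKETCH_DOUBLE_BAD(x) x * 2
/* Parenthesizing the argument and the whole body gives ((1 + 2) * 2). */
#define SKETCH_DOUBLE(x) ((x) * 2)

/* A multi-statement macro wrapped in do { ... } while (0) behaves as a single
 * statement, so it is safe in an unbraced if/else. */
#define SKETCH_STORE_U32_BE(p, v)  \
  do {                             \
    (p)[0] = (uint8_t)((v) >> 24); \
    (p)[1] = (uint8_t)((v) >> 16); \
    (p)[2] = (uint8_t)((v) >> 8);  \
    (p)[3] = (uint8_t)(v);         \
  } while (0)
```

The do/while form also avoids chaining the four stores with comma operators,
which is what the older macro style would have needed to stay a single
expression.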
Change-Id: I4c2ee51e347d2aa8c74a2d82de63838b03bbb0f9
Reviewed-on: https://boringssl-review.googlesource.com/11660
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
RT#4530
(Imported from upstream's 7123aa81e9fb19afb11fdf3850662c5f7ff1f19c.)
We've yet to enable this code, but this confirms that we do indeed need
to get our future all-variants stuff working on Windows as well as
Linux and find an AVX2-capable CI setup on each.
The crash here is caused by some win64-only code using %rax as a frame
pointer (perlasm injects a mov rax,rsp in the prologue of every win64
function).
Change-Id: Ifbe59ceb6ae29266d9cf8a461920344a32b6e555
Reviewed-on: https://boringssl-review.googlesource.com/10366
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
Change-Id: I6d552d26b3d72f6fffdc4d4d9fc3b5d82fb4e8bb
Reviewed-on: https://boringssl-review.googlesource.com/9010
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Depending on architecture, perlasm scripts differed on which of these
invocations they accepted (one or both):
perl foo.pl flavor output.S
perl foo.pl flavor > output.S
Upstream has now unified on the first form after making a number of
changes to their files (the second does not even work for their x86
files anymore). Sync those portions of our perlasm scripts with upstream
and update CMakeLists.txt and generate_build_files.py per the new
convention.
This imports various commits like this one:
184bc45f683c76531d7e065b6553ca9086564576 (this was done by taking a
diff, so I don't have the full list)
Confirmed that generate_build_files.py sees no change.
BUG=14
Change-Id: Id2fb5b8bc2a7369d077221b5df9a6947d41f50d2
Reviewed-on: https://boringssl-review.googlesource.com/8518
Reviewed-by: Adam Langley <agl@google.com>
This reverts commits:
- 9158637142
- a90aa64302
- c0d8b83b44
It turns out code outside of BoringSSL also mismatches Init and Update/Final
functions. Since this is largely cosmetic, it's probably not worth the cost to
do this.
Change-Id: I14e7b299172939f69ced2114be45ccba1dbbb704
Reviewed-on: https://boringssl-review.googlesource.com/7793
Reviewed-by: Adam Langley <agl@google.com>
As with SHA512_Final, use the different APIs rather than store md_len.
Change-Id: Ie1150de6fefa96f283d47aa03de0f18de38c93eb
Reviewed-on: https://boringssl-review.googlesource.com/7722
Reviewed-by: Adam Langley <agl@google.com>
This is in preparation for taking md_len out of SHA256_CTX by allowing us to do
something similar to SHA512_CTX. md32_common.h now emits a static "finish"
function which Final composes with the extraction step.
Change-Id: I314fb31e2482af642fd280500cc0e4716aef1ac6
Reviewed-on: https://boringssl-review.googlesource.com/7721
Reviewed-by: Adam Langley <agl@google.com>
Rather than store md_len, factor out the common parts of SHA384_Final and
SHA512_Final and then extract the right state. Also add a missing
SHA384_Transform and be consistent about "1" vs "one" in comments.
This also removes the NULL output special-case which no other hash function
had.
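A rough sketch of the "shared tail plus per-variant extraction" shape, assuming
a toy context that holds only the eight state words. The sketch_ names are
hypothetical, and the padding and final block transform that the real code
performs before extraction are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy context: only the eight 64-bit state words matter here. */
struct sketch_sha512_ctx {
  uint64_t h[8];
};

/* Shared by both Final functions in this sketch: serialize the first
 * |out_len| bytes of the state as big-endian 64-bit words. */
static void sketch_extract(const struct sketch_sha512_ctx *ctx, uint8_t *out,
                           size_t out_len) {
  for (size_t i = 0; i < out_len / 8; i++) {
    uint64_t w = ctx->h[i];
    for (size_t j = 0; j < 8; j++) {
      out[8 * i + j] = (uint8_t)(w >> (56 - 8 * j));
    }
  }
}

/* Each variant passes its own output length instead of reading md_len. */
static void sketch_final_384(const struct sketch_sha512_ctx *ctx,
                             uint8_t out[48]) {
  sketch_extract(ctx, out, 48);
}

static void sketch_final_512(const struct sketch_sha512_ctx *ctx,
                             uint8_t out[64]) {
  sketch_extract(ctx, out, 64);
}
```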
Change-Id: If60008bae7d7d5b123046a46d8fd64139156a7c5
Reviewed-on: https://boringssl-review.googlesource.com/7720
Reviewed-by: Adam Langley <agl@google.com>
Most of the OPENSSL_armcap_P accesses in assembly use named constants from
arm_arch.h, but some don't. Consistently use the constants. The dispatch really
should be in C, but in the meantime, make it easier to tell what's going on.
I'll send this patch upstream so we won't be carrying a diff here.
Change-Id: I63c68d2351ea5ce11005813314988e32b6459526
Reviewed-on: https://boringssl-review.googlesource.com/7203
Reviewed-by: Adam Langley <agl@google.com>
It's only used in one file. No sense in polluting the namespace here.
Change-Id: Iaf3870a4be2d2cad950f4d080e25fe7f0d3929c7
Reviewed-on: https://boringssl-review.googlesource.com/6660
Reviewed-by: Adam Langley <agl@google.com>
Nothing ever uses the return value. It'd be better off discarding it rather
than make callers stick (void) everywhere.
Change-Id: Ia28c970a1e5a27db441e4511249589d74408849b
Reviewed-on: https://boringssl-review.googlesource.com/6653
Reviewed-by: Adam Langley <agl@google.com>
I would hope any sensible compiler would recognize the rotation. (If
not, we should at least pull this into crypto/internal.h.) Confirmed
that clang at least produces the exact same instructions for
sha256_block_data_order for release + NO_ASM. This is also mostly moot
as SHA-1 and SHA-256 both have assembly versions on x86 that sidestep
most of this.
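For reference, a plain-C rotate of the kind a compiler is expected to
recognize might look like the sketch below. The helper name is hypothetical
and this is not necessarily how the tree spells it.

```c
#include <stdint.h>

/* Hypothetical helper: rotate left written as two shifts. Masking both shift
 * counts keeps n == 0 well defined (no shift by 32). Compilers such as clang
 * and gcc commonly turn this idiom into a single rotate instruction. */
static inline uint32_t sketch_rotl32(uint32_t x, unsigned n) {
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}
```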
For the digests, take it out of md32_common.h since it doesn't use the
macro. md32_common.h isn't sure whether it's a multiply-included header
or not. It should be, but it has an #include guard (doesn't quite do
what you'd want) and will get HOST_c2l, etc., confused if one tries to
include it twice.
Change-Id: I1632801de6473ffd2c6557f3412521ec5d6b305c
Reviewed-on: https://boringssl-review.googlesource.com/6650
Reviewed-by: Adam Langley <agl@google.com>
stdint.h already has macros for this. The spec says that, in C++,
__STDC_CONSTANT_MACROS is needed, so define it for bytestring_test.cc.
Chromium seems to use these macros without trouble, so I'm assuming we
can rely on them.
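As a small illustration of the stdint.h macros in question, with made-up
constant names:

```c
#include <stdint.h>

/* Hypothetical constants for illustration. UINT64_C and UINT32_C append
 * whatever suffix the platform needs for a constant of at least that width.
 * Per the message above, a C++ file would define __STDC_CONSTANT_MACROS
 * before first including <stdint.h> to get them. */
static const uint64_t kSketchWide = UINT64_C(0x0123456789abcdef);
static const uint32_t kSketchNarrow = UINT32_C(0x89abcdef);
```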
Change-Id: I56d178689b44d22c6379911bbb93d3b01dd832a3
Reviewed-on: https://boringssl-review.googlesource.com/6510
Reviewed-by: Adam Langley <agl@google.com>
The previous logic only defined
|SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA| when the assembly language
optimizations were enabled, but
|SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA| is also useful when the C
implementations are used.
If support for ARM processors that don't support unaligned access is
important, then it might be better to condition the enabling of
|SHA512_BLOCK_CAN_MANAGE_UNALIGNED_DATA| on ARM based on more specific
flags.
Change-Id: Ie8c37c73aba308c3ccf79371ce5831512e419989
Reviewed-on: https://boringssl-review.googlesource.com/6402
Reviewed-by: Adam Langley <agl@google.com>
The documentation in md32_common.h is now (more) correct with respect
to the most important details of the layout of |HASH_CTX|. The
documentation explaining why sha512.c doesn't use md32_common.h is now
more accurate as well.
Before, the C implementations of HASH_BLOCK_DATA_ORDER took a pointer
to the |HASH_CTX| and the assembly language implementations took a
pointer to the hash state |h| member of |HASH_CTX|. (This worked
because |h| is always the first member of |HASH_CTX|.) Now, the C
implementations take a pointer directly to |h| too.
The definitions of |MD4_CTX|, |MD5_CTX|, and |SHA1_CTX| were changed to
be consistent with |SHA256_CTX| and |SHA512_CTX| in storing the hash
state in an array. This will break source compatibility with any
external code that accesses the hash state directly, but will not
affect binary compatibility.
The second parameter of |HASH_BLOCK_DATA_ORDER| is now of type
|const uint8_t *|; previously it was |void *| and all implementations
had a |uint8_t *data| variable to access it as an array of bytes.
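A sketch of the new calling convention, assuming a toy context and a
placeholder "compression" step. Every sketch_ name is hypothetical; only the
parameter shape (state array first, then a const byte pointer and a block
count) is meant to mirror the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy context; |h| is the first member, as the message notes. */
struct sketch_ctx {
  uint32_t h[4];
  uint32_t num_blocks;
};

/* Hypothetical block function in the new style: it takes the state array
 * directly and reads input through a const byte pointer. Each 16-byte block
 * is XORed into the state as four big-endian words; this is a placeholder,
 * not a real hash. */
static void sketch_block_data_order(uint32_t *state, const uint8_t *data,
                                    size_t num) {
  while (num--) {
    for (size_t i = 0; i < 4; i++) {
      state[i] ^= ((uint32_t)data[0] << 24) | ((uint32_t)data[1] << 16) |
                  ((uint32_t)data[2] << 8) | (uint32_t)data[3];
      data += 4;
    }
  }
}
```

A caller passes ctx.h rather than &ctx, so C and assembly implementations can
share one prototype without relying on the position of |h| in the struct.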
This change paves the way for future refactorings such as automatically
generating the |*_Init| functions and/or sharing one I-U-F
implementation across all digest algorithms.
Change-Id: I6e9dd09ff057c67941021d324a4fa1d39f58b0db
Reviewed-on: https://boringssl-review.googlesource.com/6405
Reviewed-by: Adam Langley <agl@google.com>
The documentation in md32_common.h is now (more) correct with respect
to the most important details of the layout of |HASH_CTX|. The
documentation explaining why sha512.c doesn't use md32_common.h is now
more accurate as well.
Before, the C implementations of HASH_BLOCK_DATA_ORDER took a pointer
to the |HASH_CTX| and the assembly language implementations took a
pointer to the hash state |h| member of |HASH_CTX|. (This worked
because |h| is always the first member of |HASH_CTX|.) Now, the C
implementations take a pointer directly to |h| too.
The definitions of |MD4_CTX|, |MD5_CTX|, and |SHA1_CTX| were changed to
be consistent with |SHA256_CTX| and |SHA512_CTX| in storing the hash
state in an array. This will break source compatibility with any
external code that accesses the hash state directly, but will not
affect binary compatibility.
The second parameter of |HASH_BLOCK_DATA_ORDER| is now of type
|const uint8_t *|; previously it was |void *| and all implementations
had a |uint8_t *data| variable to access it as an array of bytes.
This change paves the way for future refactorings such as automatically
generating the |*_Init| functions and/or sharing one I-U-F
implementation across all digest algorithms.
Change-Id: I30513bb40b5f1d2c8932551d54073c35484b3f8b
Reviewed-on: https://boringssl-review.googlesource.com/6401
Reviewed-by: Adam Langley <agl@google.com>
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment they run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change or if developers do or don't have CC variables set.
Previously, all compiler-version-gated features were turned on in
https://boringssl-review.googlesource.com/6260, but this broke the build. I
also wasn't thorough enough in gathering performance numbers. So, flip them all
to off instead. I'll enable them one-by-one as they're tested.
This should result in no change to generated assembly.
Change-Id: Ib4259b3f97adc4939cb0557c5580e8def120d5bc
Reviewed-on: https://boringssl-review.googlesource.com/6383
Reviewed-by: Adam Langley <agl@google.com>
This reverts commit b9c26014de.
The win64 bot seems unhappy. Will sniff at it tomorrow. In
the meantime, get the tree green again.
Change-Id: I058ddb3ec549beee7eabb2f3f72feb0a4a5143b2
Reviewed-on: https://boringssl-review.googlesource.com/6353
Reviewed-by: Adam Langley <alangley@gmail.com>
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment they run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change.
Enable all compiler-version-gated features as they should all be runtime-gated
anyway. This should align with what upstream's files would have produced on
modern toolchains. We should assume our assemblers can take whatever we'd like
to throw at them. (If it turns out some can't, we'd rather find out and
probably switch the problematic instructions to explicit byte sequences.)
This actually results in a fairly significant change to the assembly we
generate. I'm guessing upstream's buildsystem sets the CC environment variable,
while ours doesn't and so the version checks were all coming out conservative.
diffstat of generated files:
linux-x86/crypto/sha/sha1-586.S | 1176 ++++++++++++
linux-x86/crypto/sha/sha256-586.S | 2248 ++++++++++++++++++++++++
linux-x86_64/crypto/bn/rsaz-avx2.S | 1644 +++++++++++++++++
linux-x86_64/crypto/bn/rsaz-x86_64.S | 638 ++++++
linux-x86_64/crypto/bn/x86_64-mont.S | 332 +++
linux-x86_64/crypto/bn/x86_64-mont5.S | 1130 ++++++++++++
linux-x86_64/crypto/modes/aesni-gcm-x86_64.S | 754 ++++++++
linux-x86_64/crypto/modes/ghash-x86_64.S | 475 +++++
linux-x86_64/crypto/sha/sha1-x86_64.S | 1121 ++++++++++++
linux-x86_64/crypto/sha/sha256-x86_64.S | 1062 +++++++++++
linux-x86_64/crypto/sha/sha512-x86_64.S | 2241 ++++++++++++++++++++++++
mac-x86/crypto/sha/sha1-586.S | 1174 ++++++++++++
mac-x86/crypto/sha/sha256-586.S | 2248 ++++++++++++++++++++++++
mac-x86_64/crypto/bn/rsaz-avx2.S | 1637 +++++++++++++++++
mac-x86_64/crypto/bn/rsaz-x86_64.S | 638 ++++++
mac-x86_64/crypto/bn/x86_64-mont.S | 331 +++
mac-x86_64/crypto/bn/x86_64-mont5.S | 1130 ++++++++++++
mac-x86_64/crypto/modes/aesni-gcm-x86_64.S | 750 ++++++++
mac-x86_64/crypto/modes/ghash-x86_64.S | 475 +++++
mac-x86_64/crypto/sha/sha1-x86_64.S | 1121 ++++++++++++
mac-x86_64/crypto/sha/sha256-x86_64.S | 1062 +++++++++++
mac-x86_64/crypto/sha/sha512-x86_64.S | 2241 ++++++++++++++++++++++++
win-x86/crypto/sha/sha1-586.asm | 1173 ++++++++++++
win-x86/crypto/sha/sha256-586.asm | 2248 ++++++++++++++++++++++++
win-x86_64/crypto/bn/rsaz-avx2.asm | 1858 +++++++++++++++++++-
win-x86_64/crypto/bn/rsaz-x86_64.asm | 638 ++++++
win-x86_64/crypto/bn/x86_64-mont.asm | 352 +++
win-x86_64/crypto/bn/x86_64-mont5.asm | 1184 ++++++++++++
win-x86_64/crypto/modes/aesni-gcm-x86_64.asm | 933 ++++++++++
win-x86_64/crypto/modes/ghash-x86_64.asm | 515 +++++
win-x86_64/crypto/sha/sha1-x86_64.asm | 1152 ++++++++++++
win-x86_64/crypto/sha/sha256-x86_64.asm | 1088 +++++++++++
win-x86_64/crypto/sha/sha512-x86_64.asm | 2499 ++++++
SHA* gets faster. RSA and AES-GCM seem to be more of a wash and even slower
sometimes! This is a little concerning. Though when I repeated the latter two,
it's definitely noisy (RSA in particular), so we may wish to repeat in a more
controlled environment. We could also flip some of these toggles to something
other than the highest setting if it seems some of the variants aren't
desirable. We just shouldn't have them enabled or disabled by accident. This
aligns us closer to upstream though.
$ /tmp/bssl.old speed SHA-
Did 5028000 SHA-1 (16 bytes) operations in 1000048us (5027758.7 ops/sec): 80.4 MB/s
Did 1708000 SHA-1 (256 bytes) operations in 1000257us (1707561.2 ops/sec): 437.1 MB/s
Did 73000 SHA-1 (8192 bytes) operations in 1008406us (72391.5 ops/sec): 593.0 MB/s
Did 3041000 SHA-256 (16 bytes) operations in 1000311us (3040054.5 ops/sec): 48.6 MB/s
Did 779000 SHA-256 (256 bytes) operations in 1000820us (778361.7 ops/sec): 199.3 MB/s
Did 26000 SHA-256 (8192 bytes) operations in 1009875us (25745.8 ops/sec): 210.9 MB/s
Did 1837000 SHA-512 (16 bytes) operations in 1000251us (1836539.0 ops/sec): 29.4 MB/s
Did 803000 SHA-512 (256 bytes) operations in 1000969us (802222.6 ops/sec): 205.4 MB/s
Did 41000 SHA-512 (8192 bytes) operations in 1016768us (40323.8 ops/sec): 330.3 MB/s
$ /tmp/bssl.new speed SHA-
Did 5354000 SHA-1 (16 bytes) operations in 1000104us (5353443.2 ops/sec): 85.7 MB/s
Did 1779000 SHA-1 (256 bytes) operations in 1000121us (1778784.8 ops/sec): 455.4 MB/s
Did 87000 SHA-1 (8192 bytes) operations in 1012641us (85914.0 ops/sec): 703.8 MB/s
Did 3517000 SHA-256 (16 bytes) operations in 1000114us (3516599.1 ops/sec): 56.3 MB/s
Did 935000 SHA-256 (256 bytes) operations in 1000096us (934910.2 ops/sec): 239.3 MB/s
Did 38000 SHA-256 (8192 bytes) operations in 1004476us (37830.7 ops/sec): 309.9 MB/s
Did 2930000 SHA-512 (16 bytes) operations in 1000259us (2929241.3 ops/sec): 46.9 MB/s
Did 1008000 SHA-512 (256 bytes) operations in 1000509us (1007487.2 ops/sec): 257.9 MB/s
Did 45000 SHA-512 (8192 bytes) operations in 1000593us (44973.3 ops/sec): 368.4 MB/s
$ /tmp/bssl.old speed RSA
Did 820 RSA 2048 signing operations in 1017008us (806.3 ops/sec)
Did 27000 RSA 2048 verify operations in 1015400us (26590.5 ops/sec)
Did 1292 RSA 2048 (3 prime, e=3) signing operations in 1008185us (1281.5 ops/sec)
Did 65000 RSA 2048 (3 prime, e=3) verify operations in 1011388us (64268.1 ops/sec)
Did 120 RSA 4096 signing operations in 1061027us (113.1 ops/sec)
Did 8208 RSA 4096 verify operations in 1002717us (8185.8 ops/sec)
$ /tmp/bssl.new speed RSA
Did 760 RSA 2048 signing operations in 1003351us (757.5 ops/sec)
Did 25900 RSA 2048 verify operations in 1028931us (25171.8 ops/sec)
Did 1320 RSA 2048 (3 prime, e=3) signing operations in 1040806us (1268.2 ops/sec)
Did 63000 RSA 2048 (3 prime, e=3) verify operations in 1016042us (62005.3 ops/sec)
Did 104 RSA 4096 signing operations in 1008718us (103.1 ops/sec)
Did 6875 RSA 4096 verify operations in 1093441us (6287.5 ops/sec)
$ /tmp/bssl.old speed GCM
Did 5316000 AES-128-GCM (16 bytes) seal operations in 1000082us (5315564.1 ops/sec): 85.0 MB/s
Did 712000 AES-128-GCM (1350 bytes) seal operations in 1000252us (711820.6 ops/sec): 961.0 MB/s
Did 149000 AES-128-GCM (8192 bytes) seal operations in 1003182us (148527.4 ops/sec): 1216.7 MB/s
Did 5919750 AES-256-GCM (16 bytes) seal operations in 1000016us (5919655.3 ops/sec): 94.7 MB/s
Did 800000 AES-256-GCM (1350 bytes) seal operations in 1000951us (799239.9 ops/sec): 1079.0 MB/s
Did 152000 AES-256-GCM (8192 bytes) seal operations in 1000765us (151883.8 ops/sec): 1244.2 MB/s
$ /tmp/bssl.new speed GCM
Did 5315000 AES-128-GCM (16 bytes) seal operations in 1000125us (5314335.7 ops/sec): 85.0 MB/s
Did 755000 AES-128-GCM (1350 bytes) seal operations in 1000878us (754337.7 ops/sec): 1018.4 MB/s
Did 151000 AES-128-GCM (8192 bytes) seal operations in 1005655us (150150.9 ops/sec): 1230.0 MB/s
Did 5913500 AES-256-GCM (16 bytes) seal operations in 1000041us (5913257.6 ops/sec): 94.6 MB/s
Did 782000 AES-256-GCM (1350 bytes) seal operations in 1001484us (780841.2 ops/sec): 1054.1 MB/s
Did 121000 AES-256-GCM (8192 bytes) seal operations in 1006389us (120231.8 ops/sec): 984.9 MB/s
Change-Id: I0efb32f896c597abc7d7e55c31d038528a5c72a1
Reviewed-on: https://boringssl-review.googlesource.com/6260
Reviewed-by: Adam Langley <alangley@gmail.com>
We haven't tested it yet, but it was only disabled on 64-bit. Disable it on
32-bit as well until we're ready to turn it on.
Change-Id: I50e74aef2c5c3ba539a868c2bb6fb90fdf28a5f0
Reviewed-on: https://boringssl-review.googlesource.com/6271
Reviewed-by: Adam Langley <alangley@gmail.com>
We missed 7eb9680ae1bf5dd9aeb61c401f2c3bd900ac9aeb. This is a no-op as we don't
set shaext right now anyway. This also includes some cosmetic changes to
minimize the diff with upstream. ("cosmetic". Upstream's perl doesn't like
spaces.)
Change-Id: I17fa663ddaa38c27854d4f59fb83960528d9ba78
Reviewed-on: https://boringssl-review.googlesource.com/6250
Reviewed-by: Adam Langley <alangley@gmail.com>
2ab24a2d40 added sections to ARM assembly
files. However, in cases where .align directives were not next to the
labels that they were intended to apply to, the section directives would
cause them to be ignored.
Change-Id: I32117f6747ff8545b80c70dd3b8effdc6e6f67e0
Reviewed-on: https://boringssl-review.googlesource.com/6050
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
This change causes each global arm or aarch64 asm function to be put
into its own section by default. This matches the behaviour of the
-ffunction-sections option to GCC and allows the --gc-sections option to
the linker to discard unused asm functions on a function-by-function
basis.
Sometimes several asm functions will share the same data and, in that
situation, the data is put into the section of one of the functions and
the section of the other function is merged with the added
“.global_with_section” directive.
Change-Id: I12c9b844d48d104d28beb816764358551eac4456
Reviewed-on: https://boringssl-review.googlesource.com/6003
Reviewed-by: Adam Langley <agl@google.com>
arm_arch.h is included from ARM asm files, but lives in crypto/, not
include/openssl/. Since the asm files are often built from a different
location than their position in the source tree, relative include paths
are unlikely to work so, rather than having crypto/ be a de-facto,
second global include path, this change moves arm_arch.h to
include/openssl/.
It also removes entries from many include paths because they should not be
needed, as relative includes are always based on the location of the
source file.
Change-Id: I638ff43d641ca043a4fc06c0d901b11c6ff73542
Reviewed-on: https://boringssl-review.googlesource.com/5746
Reviewed-by: Adam Langley <agl@google.com>
See upstream's 9f0b86c68bb96d49301bbd6473c8235ca05ca06b. Generated by
using upstream's script in 5a3ce86e21715a683ff0d32421ed5c6d5e84234d and
then manually throwing out the false positives. (We converted a bunch of
stuff already in 91157550061d5d794898fe47b95384a7ba5f7b9d.)
This may require some wrestling with depot_tools to land in Chromium due
to Rietveld's encoding bugs, but hopefully that will avoid future
problems; Rietveld breaks if either old or new file is Latin-1.
Change-Id: I26dcb20c7377f92a0c843ef5d74d440a82ea8ceb
Reviewed-on: https://boringssl-review.googlesource.com/5483
Reviewed-by: Adam Langley <agl@google.com>
The SHA-2 family has some exceptions, but they're all programmer errors
and should be documented as such. (Are the failure cases even
necessary?)
Change-Id: I00bd0a9450cff78d8caac479817fbd8d3de872b8
Reviewed-on: https://boringssl-review.googlesource.com/4953
Reviewed-by: Adam Langley <agl@google.com>
Use sized integer types rather than unsigned char/int/long. The latter
two are especially a mess as they're both used in lieu of uint32_t.
Sometimes the code just blindly uses unsigned long and sometimes it uses
unsigned int when an LP64 architecture would notice.
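A short sketch of why the choice of type matters, using hypothetical helpers
that are not part of the change:

```c
#include <stdint.h>

/* Hypothetical helper: with uint32_t the sum wraps modulo 2^32 on every
 * platform. */
static uint32_t sketch_add32(uint32_t a, uint32_t b) {
  return a + b;
}

/* The same code with unsigned long wraps at 32 bits only where long is 32
 * bits wide. On LP64 targets the result keeps the carry in bit 32. */
static unsigned long sketch_add_long(unsigned long a, unsigned long b) {
  return a + b;
}
```

Hash code that depends on 32-bit wraparound therefore behaves the same
everywhere only when written with the sized type.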
Change-Id: I4c5c6aaf82cfe9fe523435588d286726a7c43056
Reviewed-on: https://boringssl-review.googlesource.com/4952
Reviewed-by: Adam Langley <agl@google.com>
ARM has optimized the Cortex-A5x pipeline to favour pairs of complementary
AES instructions. While the modified code improves post-r0p0 Cortex-A53
performance by >40% (for CBC decrypt and CTR), it hurts the original r0p0.
We favour later revisions, because one can't prevent the future from
coming. The improvement on post-r0p0 Cortex-A57 exceeds 50%, while the new
code is not slower on r0p0, or Apple A7 for that matter.
[Update even SHA results for latest Cortex-A53.]
(Imported from upstream's 94376cccb4ed5b376220bffe0739140ea9dad8c8)
Change-Id: I581c65b566116b1f4211fb1bd5a1a54479889d70
Reviewed-on: https://boringssl-review.googlesource.com/4481
Reviewed-by: Adam Langley <agl@google.com>
Follow-up to sha256-armv4.pl in cooperation with Ard Biesheuvel
(Linaro) and Sami Tolvanen (Google).
(Imported from upstream's b1a5d1c652086257930a1f62ae51c9cdee654b2c.)
Change-Id: Ibc4f289cc8f499924ade8d6b8d494f53bc08bda7
Reviewed-on: https://boringssl-review.googlesource.com/4467
Reviewed-by: Adam Langley <agl@google.com>
(Imported from upstream's 51f8d095562f36cdaa6893597b5c609e943b0565.)
I don't see why we'd care, but just to minimize divergence.
Change-Id: I4b07e72c88fcb04654ad28d8fd371e13d59a61b5
Reviewed-on: https://boringssl-review.googlesource.com/4466
Reviewed-by: Adam Langley <agl@google.com>
In cooperation with Ard Biesheuvel (Linaro) and Sami Tolvanen (Google).
(Imported from upstream's 2ecd32a1f8f0643ae7b38f59bbaf9f0d6ef326fe)
Change-Id: Iac5853220654b6ef4cb3bb7f8d1efe0eb2ecf634
Reviewed-on: https://boringssl-review.googlesource.com/4463
Reviewed-by: Adam Langley <agl@google.com>
This is a partial import of upstream's
9b05cbc33e7895ed033b1119e300782d9e0cf23c. It includes the perlasm changes, but
not the CPU feature detection bits as we do those differently. This is largely
so we don't diverge from upstream, but it'll help with iOS assembly in the
future.
sha512-armv8.pl is modified slightly from upstream to switch from conditioning
on the output file to conditioning on an extra argument. This makes our
previous change from upstream (removing the 'open STDOUT' line) more explicit.
BUG=338886
Change-Id: Ic8ca1388ae20e94566f475bad3464ccc73f445df
Reviewed-on: https://boringssl-review.googlesource.com/4405
Reviewed-by: Adam Langley <agl@google.com>