Clang (3.6, at least) doesn't like .arch when its internal as is used.
Instead, one has to pass -march=armv8-a+crypto on the command line.
Change-Id: Ifc5b57fbebd0eb53658481b0a0c111e808c81d93
Reviewed-on: https://boringssl-review.googlesource.com/4411
Reviewed-by: Adam Langley <agl@google.com>
(Imported from upstream's 7eeeb49e1103533bc81c234eb19613353866e474)
Here are the performance numbers on a Nexus 9 (32-bit binary):
Before:
Did 4376000 AES-128-GCM (16 bytes) seal operations in 1000016us (4375930.0 ops/sec): 70.0 MB/s
Did 642000 AES-128-GCM (1350 bytes) seal operations in 1001090us (641301.0 ops/sec): 865.8 MB/s
Did 126000 AES-128-GCM (8192 bytes) seal operations in 1001460us (125816.3 ops/sec): 1030.7 MB/s
Did 4120000 AES-256-GCM (16 bytes) seal operations in 1000004us (4119983.5 ops/sec): 65.9 MB/s
Did 547000 AES-256-GCM (1350 bytes) seal operations in 1001165us (546363.5 ops/sec): 737.6 MB/s
Did 99000 AES-256-GCM (8192 bytes) seal operations in 1000027us (98997.3 ops/sec): 811.0 MB/s
After:
Did 4569000 AES-128-GCM (16 bytes) seal operations in 1000011us (4568949.7 ops/sec): 73.1 MB/s
Did 796000 AES-128-GCM (1350 bytes) seal operations in 1000161us (795871.9 ops/sec): 1074.4 MB/s
Did 162000 AES-128-GCM (8192 bytes) seal operations in 1003828us (161382.2 ops/sec): 1322.0 MB/s
Did 4398000 AES-256-GCM (16 bytes) seal operations in 1000001us (4397995.6 ops/sec): 70.4 MB/s
Did 634000 AES-256-GCM (1350 bytes) seal operations in 1001290us (633183.2 ops/sec): 854.8 MB/s
Did 122000 AES-256-GCM (8192 bytes) seal operations in 1005650us (121314.6 ops/sec): 993.8 MB/s
Change-Id: I2fef921069ad174f5651dfe59be262625fb3f7c9
Reviewed-on: https://boringssl-review.googlesource.com/4483
Reviewed-by: Adam Langley <agl@google.com>
This is as partial import of upstream's
9b05cbc33e7895ed033b1119e300782d9e0cf23c. It includes the perlasm changes, but
not the CPU feature detection bits as we do those differently. This is largely
so we don't diverge from upstream, but it'll help with iOS assembly in the
future.
sha512-armv8.pl is modified slightly from upstream to switch from conditioning
on the output file to conditioning on an extra argument. This makes our
previous change from upstream (removing the 'open STDOUT' line) more explicit.
BUG=338886
Change-Id: Ic8ca1388ae20e94566f475bad3464ccc73f445df
Reviewed-on: https://boringssl-review.googlesource.com/4405
Reviewed-by: Adam Langley <agl@google.com>
This is an initial cut at aarch64 support. I have only qemu to test it
however—hopefully hardware will be coming soon.
This also affects 32-bit ARM in that aarch64 chips can run 32-bit code
and we would like to be able to take advantage of the crypto operations
even in 32-bit mode. AES and GHASH should Just Work in this case: the
-armx.pl files can be built for either 32- or 64-bit mode based on the
flavour argument given to the Perl script.
SHA-1 and SHA-256 don't work like this however because they've never
support for multiple implementations, thus BoringSSL built for 32-bit
won't use the SHA instructions on an aarch64 chip.
No dedicated ChaCha20 or Poly1305 support yet.
Change-Id: Ib275bc4894a365c8ec7c42f4e91af6dba3bd686c
Reviewed-on: https://boringssl-review.googlesource.com/2801
Reviewed-by: Adam Langley <agl@google.com>