Measured on a SkyLake processor:
Before:
Did 11373750 AES-128-GCM (16 bytes) seal operations in 1016000us (11194635.8 ops/sec): 179.1 MB/s
Did 2253000 AES-128-GCM (1350 bytes) seal operations in 1016000us (2217519.7 ops/sec): 2993.7 MB/s
Did 453750 AES-128-GCM (8192 bytes) seal operations in 1015000us (447044.3 ops/sec): 3662.2 MB/s
Did 10753500 AES-256-GCM (16 bytes) seal operations in 1016000us (10584153.5 ops/sec): 169.3 MB/s
Did 1898750 AES-256-GCM (1350 bytes) seal operations in 1015000us (1870689.7 ops/sec): 2525.4 MB/s
Did 374000 AES-256-GCM (8192 bytes) seal operations in 1016000us (368110.2 ops/sec): 3015.6 MB/s
After:
Did 11074000 AES-128-GCM (16 bytes) seal operations in 1015000us (10910344.8 ops/sec): 174.6 MB/s
Did 3178250 AES-128-GCM (1350 bytes) seal operations in 1016000us (3128198.8 ops/sec): 4223.1 MB/s
Did 734500 AES-128-GCM (8192 bytes) seal operations in 1016000us (722933.1 ops/sec): 5922.3 MB/s
Did 10394750 AES-256-GCM (16 bytes) seal operations in 1015000us (10241133.0 ops/sec): 163.9 MB/s
Did 2502250 AES-256-GCM (1350 bytes) seal operations in 1016000us (2462844.5 ops/sec): 3324.8 MB/s
Did 544500 AES-256-GCM (8192 bytes) seal operations in 1015000us (536453.2 ops/sec): 4394.6 MB/s
Change-Id: If058935796441ed4e577b9a72d3aa43422edba58
Reviewed-on: https://boringssl-review.googlesource.com/7273
Reviewed-by: Adam Langley <alangley@gmail.com>
Change-Id: Ibba0271efa86e1b1af97f2a08b73677dfd236b7a
Reviewed-on: https://boringssl-review.googlesource.com/12986
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Some, very recent, versions of Clang now support `.arch`. Allow them to
see these directives with BORINGSSL_CLANG_SUPPORTS_DOT_ARCH.
BUG=39
Change-Id: I122ab4b3d5f14502ffe0c6e006950dc64abf0201
Reviewed-on: https://boringssl-review.googlesource.com/10600
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
The minimum version is purely based on what we've patched out of the perlasm
files. I'm assuming they're accurate.
Change-Id: I5ae176cf793512125fa78f203a1314396e8a14d7
Reviewed-on: https://boringssl-review.googlesource.com/8238
Reviewed-by: Adam Langley <agl@google.com>
We now have a copy of android-cmake. Also remove the mention of running cmake
twice. It seems to work fine once?
The API level also got specified twice somehow.
BUG=26
Change-Id: I1331b079a4d8531cd53f7de3605ac318c14b3e26
Reviewed-on: https://boringssl-review.googlesource.com/7985
Reviewed-by: Adam Langley <agl@google.com>
Track the Chromium requirements. This makes our bots build with 2015 instead of
2013.
BUG=43
Change-Id: Id5329900a5d1d5fae4b5b22299ed47bc1b947dd8
Reviewed-on: https://boringssl-review.googlesource.com/7820
Reviewed-by: Steven Valdez <svaldez@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
This removes chacha_vec_arm.S and chacha_vec.c in favor of unifying on
upstream's code. Upstream's is faster and this cuts down on the number of
distinct codepaths. Our old scheme also didn't give vectorized code on
Windows or aarch64.
BoringSSL-specific modifications made to the assembly:
- As usual, the shelling out to $CC is replaced with hardcoding $avx. I've
tested up to the AVX2 codepath, so enable it all.
- I've removed the AMD XOP code as I have not tested it.
- As usual, the ARM file need the arm_arch.h include tweaked.
Speed numbers follow. We can hope for further wins on these benchmarks after
importing the Poly1305 assembly.
x86
---
Old:
Did 1422000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000433us (1421384.5 ops/sec): 22.7 MB/s
Did 123000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1003803us (122534.0 ops/sec): 165.4 MB/s
Did 22000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1000282us (21993.8 ops/sec): 180.2 MB/s
Did 1428000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000214us (1427694.5 ops/sec): 22.8 MB/s
Did 124000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1006332us (123219.8 ops/sec): 166.3 MB/s
Did 22000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1020771us (21552.3 ops/sec): 176.6 MB/s
New:
Did 1520000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000567us (1519138.6 ops/sec): 24.3 MB/s
Did 152000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1004216us (151361.9 ops/sec): 204.3 MB/s
Did 31000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1009085us (30720.9 ops/sec): 251.7 MB/s
Did 1797000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000141us (1796746.7 ops/sec): 28.7 MB/s
Did 171000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1003204us (170453.9 ops/sec): 230.1 MB/s
Did 31000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1005349us (30835.1 ops/sec): 252.6 MB/s
x86_64, no AVX2
---
Old:
Did 1782000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000204us (1781636.5 ops/sec): 28.5 MB/s
Did 317000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001579us (316500.2 ops/sec): 427.3 MB/s
Did 62000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1012146us (61256.0 ops/sec): 501.8 MB/s
Did 1778000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000220us (1777608.9 ops/sec): 28.4 MB/s
Did 315000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1002886us (314093.5 ops/sec): 424.0 MB/s
Did 71000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1014606us (69977.9 ops/sec): 573.3 MB/s
New:
Did 1866000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000019us (1865964.5 ops/sec): 29.9 MB/s
Did 399000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001017us (398594.6 ops/sec): 538.1 MB/s
Did 84000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1005645us (83528.5 ops/sec): 684.3 MB/s
Did 1881000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000325us (1880388.9 ops/sec): 30.1 MB/s
Did 404000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000004us (403998.4 ops/sec): 545.4 MB/s
Did 85000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1010048us (84154.4 ops/sec): 689.4 MB/s
x86_64, AVX2
---
Old:
Did 2375000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000282us (2374330.4 ops/sec): 38.0 MB/s
Did 448000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001865us (447166.0 ops/sec): 603.7 MB/s
Did 88000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1005217us (87543.3 ops/sec): 717.2 MB/s
Did 2409000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000188us (2408547.2 ops/sec): 38.5 MB/s
Did 446000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001003us (445553.1 ops/sec): 601.5 MB/s
Did 90000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1006722us (89399.1 ops/sec): 732.4 MB/s
New:
Did 2622000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000266us (2621302.7 ops/sec): 41.9 MB/s
Did 794000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1000783us (793378.8 ops/sec): 1071.1 MB/s
Did 173000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1000176us (172969.6 ops/sec): 1417.0 MB/s
Did 2623000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000330us (2622134.7 ops/sec): 42.0 MB/s
Did 783000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000531us (782584.4 ops/sec): 1056.5 MB/s
Did 174000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1000840us (173854.0 ops/sec): 1424.2 MB/s
arm, Nexus 4
---
Old:
Did 388550 ChaCha20-Poly1305 (16 bytes) seal operations in 1000580us (388324.8 ops/sec): 6.2 MB/s
Did 90000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1003816us (89657.9 ops/sec): 121.0 MB/s
Did 19000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1045750us (18168.8 ops/sec): 148.8 MB/s
Did 398500 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000305us (398378.5 ops/sec): 6.4 MB/s
Did 90500 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000305us (90472.4 ops/sec): 122.1 MB/s
Did 19000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1043278us (18211.8 ops/sec): 149.2 MB/s
New:
Did 424788 ChaCha20-Poly1305 (16 bytes) seal operations in 1000641us (424515.9 ops/sec): 6.8 MB/s
Did 115000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001526us (114824.8 ops/sec): 155.0 MB/s
Did 27000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1033023us (26136.9 ops/sec): 214.1 MB/s
Did 447750 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000549us (447504.3 ops/sec): 7.2 MB/s
Did 117500 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001923us (117274.5 ops/sec): 158.3 MB/s
Did 27000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1025118us (26338.4 ops/sec): 215.8 MB/s
aarch64, Nexus 6p
(Note we didn't have aarch64 assembly before at all, and still don't have it
for Poly1305. Hopefully once that's added this will be faster than the arm
numbers...)
---
Old:
Did 145040 ChaCha20-Poly1305 (16 bytes) seal operations in 1003065us (144596.8 ops/sec): 2.3 MB/s
Did 14000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1042605us (13427.9 ops/sec): 18.1 MB/s
Did 2618 ChaCha20-Poly1305 (8192 bytes) seal operations in 1093241us (2394.7 ops/sec): 19.6 MB/s
Did 148000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000709us (147895.1 ops/sec): 2.4 MB/s
Did 14000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1047294us (13367.8 ops/sec): 18.0 MB/s
Did 2607 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1090745us (2390.1 ops/sec): 19.6 MB/s
New:
Did 358000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000769us (357724.9 ops/sec): 5.7 MB/s
Did 45000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1021267us (44062.9 ops/sec): 59.5 MB/s
Did 8591 ChaCha20-Poly1305 (8192 bytes) seal operations in 1047136us (8204.3 ops/sec): 67.2 MB/s
Did 343000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000489us (342832.4 ops/sec): 5.5 MB/s
Did 44000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1008326us (43636.7 ops/sec): 58.9 MB/s
Did 8866 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1083341us (8183.9 ops/sec): 67.0 MB/s
Change-Id: I629fe195d072f2c99e8f947578fad6d70823c4c8
Reviewed-on: https://boringssl-review.googlesource.com/7202
Reviewed-by: Adam Langley <agl@google.com>
Updating the Perl docs to describe behavior of Strawberry Perl and possible
interaction with CMake on Windows.
Also adding a few other links and instructions for using CMake/Ninja to build
release mode with position independent code, since this seems generally useful.
Change-Id: I616c0d267da749fe90673bc9e8bde9ec181fec25
Reviewed-on: https://boringssl-review.googlesource.com/7113
Reviewed-by: David Benjamin <davidben@google.com>
Change-Id: If78cba6abc36999488981db2a12b039024c757df
Reviewed-on: https://boringssl-review.googlesource.com/6391
Reviewed-by: Brian Smith <brian@briansmith.org>
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
Intel's P-256 code has very large tables and things like Chromium just
don't need that extra size. However, servers generally do so this change
adds an OPENSSL_SMALL define that currently just drops the 64-bit P-224
but will gate Intel's P-256 in the future too.
Change-Id: I2e55c6e06327fafabef9b96d875069d95c0eea81
Reviewed-on: https://boringssl-review.googlesource.com/6362
Reviewed-by: Adam Langley <alangley@gmail.com>
Also, organize the links in BUILDING.md sensibly.
Change-Id: Ie9c65750849fcdab7a6a6bf11d1c9cdafb53bc00
Reviewed-on: https://boringssl-review.googlesource.com/6140
Reviewed-by: Adam Langley <alangley@gmail.com>
It's very annoying having to remember the right incant every time I want
to switch around between my build, build-release, build-asan, etc.,
output directories.
Unfortunately, this target is pretty unfriendly without CMake 3.2+ (and
Ninja 1.5+). This combination gives a USES_TERMINAL flag to
add_custom_target which uses Ninja's "console" pool, otherwise the
output buffering gets in the way. Ubuntu LTS is still on an older CMake,
so do a version check in the meantime.
CMake also has its own test mechanism (CTest), but this doesn't use it.
It seems to prefer knowing what all the tests are and then tries to do
its own output management and parallelizing and such. We already have
our own runners. all_tests.go could actually be converted tidily, but
generate_build_files.py also needs to read it, and runner.go has very
specific needs.
Naming the target ninja -C build test would be nice, but CTest squats
that name and CMake grumps when you use a reserved name, so I've gone
with run_tests.
Change-Id: Ibd20ebd50febe1b4e91bb19921f3bbbd9fbcf66c
Reviewed-on: https://boringssl-review.googlesource.com/6270
Reviewed-by: Adam Langley <alangley@gmail.com>
Some ARM environments don't support |getauxval| or signals and need to
configure the capabilities of the chip at compile time. This change adds
defines that allow them to do so.
Change-Id: I4e6987f69dd13444029bc7ac7ed4dbf8fb1faa76
Reviewed-on: https://boringssl-review.googlesource.com/6280
Reviewed-by: Adam Langley <agl@google.com>
This change makes the runner tests (in ssl/test/runner) act like a
normal Go test rather than being a Go binary. This better aligns with
some internal tools.
Thus, from this point onwards, one has to run the runner tests with `go
test` rather than `go run` or `go build && ./runner`.
This will break the bots.
Change-Id: Idd72c31e8e0c2b7ed9939dacd3b801dbd31710dd
Reviewed-on: https://boringssl-review.googlesource.com/6009
Reviewed-by: Matt Braithwaite <mab@google.com>
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>