Commit Graph

34 Commits

Author SHA1 Message Date
David Benjamin
762e1d039c Import chacha-x86.pl fix.
Patch from https://mta.openssl.org/pipermail/openssl-dev/2016-March/005625.html.

Upstream has yet to make a decision on aliasing requirements for their
assembly. If they choose to go with the stricter aliasing requirement rather
than land this patch, we'll probably want to tweak EVP_AEAD's API guarantees
accordingly and then undiverge.

In the meantime, import this to avoid a regression on x86 from when we had
compiler-vectorized code on GCC platforms.  Per our assembly coverage tools and
pending multi-CPU-variant tests, we have good coverage here. Unlike Poly1305
(which is currently waiting on yet another upstream bugfix), where there is
risk of missed carries everywhere, it is much more difficult to accidentally
make a ChaCha20 implementation that fails based on the data passed into it.

This restores a sizeable speed improvement on x86.

Before:
Did 1131000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000205us (1130768.2 ops/sec): 18.1 MB/s
Did 161000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1006136us (160018.1 ops/sec): 216.0 MB/s
Did 28000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1023264us (27363.4 ops/sec): 224.2 MB/s
Did 1166000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000447us (1165479.0 ops/sec): 18.6 MB/s
Did 160000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1004818us (159232.8 ops/sec): 215.0 MB/s
Did 30000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1016977us (29499.2 ops/sec): 241.7 MB/s

After:
Did 2208000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000031us (2207931.6 ops/sec): 35.3 MB/s
Did 402000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001717us (401310.9 ops/sec): 541.8 MB/s
Did 97000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1005394us (96479.6 ops/sec): 790.4 MB/s
Did 2444000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000089us (2443782.5 ops/sec): 39.1 MB/s
Did 459000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000563us (458741.7 ops/sec): 619.3 MB/s
Did 97000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1007942us (96235.7 ops/sec): 788.4 MB/s

Change-Id: I976da606dae062a776e0cc01229ec03a074035d1
Reviewed-on: https://boringssl-review.googlesource.com/7561
Reviewed-by: Steven Valdez <svaldez@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-28 15:58:24 +00:00
David Benjamin
d7166d07ad Add a standalone ChaCha test.
The coverage tool revealed that we weren't testing all codepaths of the ChaCha
assembly. Add a standalone test as it's much easier to iterate over all lengths
when there isn't the entire AEAD in the way.

I wasn't able to find a really long test vector, so I generated a random one
with the Go implementation we have in runner.

This test gives us full coverage on the ChaCha20_ssse3 variant. (We'll see how
it fares on the other codepaths when the multi-variant test harnesses get in. I
certainly hope there isn't a more novel way to call ChaCha20 than this...)

Change-Id: I087e421c7351f46ea65dacdc7127e4fbf5f4c0aa
Reviewed-on: https://boringssl-review.googlesource.com/7299
Reviewed-by: Adam Langley <agl@google.com>
2016-03-04 19:11:03 +00:00
David Benjamin
886119b9f7 Disable ChaCha20 assembly for OPENSSL_X86.
They fail the newly-added in-place tests. Since we don't have bots for them
yet, verified manually that the arm and aarch64 code is fine.

Change-Id: Ic6f4060f63e713e09707af05e6b7736b7b65c5df
Reviewed-on: https://boringssl-review.googlesource.com/7252
Reviewed-by: Adam Langley <agl@google.com>
2016-02-29 22:30:46 +00:00
David Benjamin
35be688078 Enable upstream's ChaCha20 assembly for x86 and ARM (32- and 64-bit).
This removes chacha_vec_arm.S and chacha_vec.c in favor of unifying on
upstream's code. Upstream's is faster and this cuts down on the number of
distinct codepaths. Our old scheme also didn't give vectorized code on
Windows or aarch64.

BoringSSL-specific modifications made to the assembly:

- As usual, the shelling out to $CC is replaced with hardcoding $avx. I've
  tested up to the AVX2 codepath, so enable it all.

- I've removed the AMD XOP code as I have not tested it.

- As usual, the ARM file need the arm_arch.h include tweaked.

Speed numbers follow. We can hope for further wins on these benchmarks after
importing the Poly1305 assembly.

x86
---
Old:
Did 1422000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000433us (1421384.5 ops/sec): 22.7 MB/s
Did 123000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1003803us (122534.0 ops/sec): 165.4 MB/s
Did 22000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1000282us (21993.8 ops/sec): 180.2 MB/s
Did 1428000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000214us (1427694.5 ops/sec): 22.8 MB/s
Did 124000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1006332us (123219.8 ops/sec): 166.3 MB/s
Did 22000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1020771us (21552.3 ops/sec): 176.6 MB/s
New:
Did 1520000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000567us (1519138.6 ops/sec): 24.3 MB/s
Did 152000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1004216us (151361.9 ops/sec): 204.3 MB/s
Did 31000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1009085us (30720.9 ops/sec): 251.7 MB/s
Did 1797000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000141us (1796746.7 ops/sec): 28.7 MB/s
Did 171000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1003204us (170453.9 ops/sec): 230.1 MB/s
Did 31000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1005349us (30835.1 ops/sec): 252.6 MB/s

x86_64, no AVX2
---
Old:
Did 1782000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000204us (1781636.5 ops/sec): 28.5 MB/s
Did 317000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001579us (316500.2 ops/sec): 427.3 MB/s
Did 62000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1012146us (61256.0 ops/sec): 501.8 MB/s
Did 1778000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000220us (1777608.9 ops/sec): 28.4 MB/s
Did 315000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1002886us (314093.5 ops/sec): 424.0 MB/s
Did 71000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1014606us (69977.9 ops/sec): 573.3 MB/s
New:
Did 1866000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000019us (1865964.5 ops/sec): 29.9 MB/s
Did 399000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001017us (398594.6 ops/sec): 538.1 MB/s
Did 84000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1005645us (83528.5 ops/sec): 684.3 MB/s
Did 1881000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000325us (1880388.9 ops/sec): 30.1 MB/s
Did 404000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000004us (403998.4 ops/sec): 545.4 MB/s
Did 85000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1010048us (84154.4 ops/sec): 689.4 MB/s

x86_64, AVX2
---
Old:
Did 2375000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000282us (2374330.4 ops/sec): 38.0 MB/s
Did 448000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001865us (447166.0 ops/sec): 603.7 MB/s
Did 88000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1005217us (87543.3 ops/sec): 717.2 MB/s
Did 2409000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000188us (2408547.2 ops/sec): 38.5 MB/s
Did 446000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001003us (445553.1 ops/sec): 601.5 MB/s
Did 90000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1006722us (89399.1 ops/sec): 732.4 MB/s
New:
Did 2622000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000266us (2621302.7 ops/sec): 41.9 MB/s
Did 794000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1000783us (793378.8 ops/sec): 1071.1 MB/s
Did 173000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1000176us (172969.6 ops/sec): 1417.0 MB/s
Did 2623000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000330us (2622134.7 ops/sec): 42.0 MB/s
Did 783000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000531us (782584.4 ops/sec): 1056.5 MB/s
Did 174000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1000840us (173854.0 ops/sec): 1424.2 MB/s

arm, Nexus 4
---
Old:
Did 388550 ChaCha20-Poly1305 (16 bytes) seal operations in 1000580us (388324.8 ops/sec): 6.2 MB/s
Did 90000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1003816us (89657.9 ops/sec): 121.0 MB/s
Did 19000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1045750us (18168.8 ops/sec): 148.8 MB/s
Did 398500 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000305us (398378.5 ops/sec): 6.4 MB/s
Did 90500 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000305us (90472.4 ops/sec): 122.1 MB/s
Did 19000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1043278us (18211.8 ops/sec): 149.2 MB/s
New:
Did 424788 ChaCha20-Poly1305 (16 bytes) seal operations in 1000641us (424515.9 ops/sec): 6.8 MB/s
Did 115000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001526us (114824.8 ops/sec): 155.0 MB/s
Did 27000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1033023us (26136.9 ops/sec): 214.1 MB/s
Did 447750 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000549us (447504.3 ops/sec): 7.2 MB/s
Did 117500 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001923us (117274.5 ops/sec): 158.3 MB/s
Did 27000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1025118us (26338.4 ops/sec): 215.8 MB/s

aarch64, Nexus 6p
(Note we didn't have aarch64 assembly before at all, and still don't have it
for Poly1305. Hopefully once that's added this will be faster than the arm
numbers...)
---
Old:
Did 145040 ChaCha20-Poly1305 (16 bytes) seal operations in 1003065us (144596.8 ops/sec): 2.3 MB/s
Did 14000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1042605us (13427.9 ops/sec): 18.1 MB/s
Did 2618 ChaCha20-Poly1305 (8192 bytes) seal operations in 1093241us (2394.7 ops/sec): 19.6 MB/s
Did 148000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000709us (147895.1 ops/sec): 2.4 MB/s
Did 14000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1047294us (13367.8 ops/sec): 18.0 MB/s
Did 2607 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1090745us (2390.1 ops/sec): 19.6 MB/s
New:
Did 358000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000769us (357724.9 ops/sec): 5.7 MB/s
Did 45000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1021267us (44062.9 ops/sec): 59.5 MB/s
Did 8591 ChaCha20-Poly1305 (8192 bytes) seal operations in 1047136us (8204.3 ops/sec): 67.2 MB/s
Did 343000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000489us (342832.4 ops/sec): 5.5 MB/s
Did 44000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1008326us (43636.7 ops/sec): 58.9 MB/s
Did 8866 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1083341us (8183.9 ops/sec): 67.0 MB/s

Change-Id: I629fe195d072f2c99e8f947578fad6d70823c4c8
Reviewed-on: https://boringssl-review.googlesource.com/7202
Reviewed-by: Adam Langley <agl@google.com>
2016-02-23 17:19:45 +00:00
David Benjamin
0182ecd346 Consistently use named constants in ARM assembly files.
Most of the OPENSSL_armcap_P accesses in assembly use named constants from
arm_arch.h, but some don't. Consistently use the constants. The dispatch really
should be in C, but in the meantime, make it easier to tell what's going on.

I'll send this patch upstream so we won't be carrying a diff here.

Change-Id: I63c68d2351ea5ce11005813314988e32b6459526
Reviewed-on: https://boringssl-review.googlesource.com/7203
Reviewed-by: Adam Langley <agl@google.com>
2016-02-23 17:18:18 +00:00
David Benjamin
295960044b Fix chacha-armv4.pl.
Patch taken from https://rt.openssl.org/Ticket/Display.html?id=4323.

Change-Id: Icbaf8f9a0f92da48f213b251b0afa4b0d5aa627d
Reviewed-on: https://boringssl-review.googlesource.com/7201
Reviewed-by: Adam Langley <agl@google.com>
2016-02-23 01:07:48 +00:00
David Benjamin
ea4d6863c7 Check in pristine copies of OpenSSL's chacha-{arm*,x86}.pl.
They won't be used as-is. This is just to make the diffs easier to see. Taken
from upstream's 4f16039efe3589aa4d63a6f1fab799d0cd9338ca.

Change-Id: I34d8be409f9c8f15b8a6da4b2d98ba3e60aa2210
Reviewed-on: https://boringssl-review.googlesource.com/7200
Reviewed-by: Adam Langley <agl@google.com>
2016-02-23 01:06:43 +00:00
Brian Smith
f547007332 Use |alignas| more in crypto/chacha/chacha_vec.c.
Commit 75a64c08fc missed one case where
the GCC syntax should have been replaced with |alignas|.

Change-Id: Iebdaa9c9a2c0aff171f0b5d4daac607e351a4b7e
Reviewed-on: https://boringssl-review.googlesource.com/6974
Reviewed-by: David Benjamin <davidben@google.com>
2016-01-27 22:12:22 +00:00
Brian Smith
7cae9f5b6c Use |alignas| for alignment.
MSVC doesn't have stdalign.h and so doesn't support |alignas| in C
code. Define |alignas(x)| as a synonym for |__decltype(align(x))|
instead for it.

This also fixes -Wcast-qual warnings in rsaz_exp.c.

Change-Id: Ifce9031724cb93f5a4aa1f567e7af61b272df9d5
Reviewed-on: https://boringssl-review.googlesource.com/6924
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2016-01-25 23:05:04 +00:00
Brian Smith
d3a4e280db Fix trivial -Wcast-qual violations.
Fix casts from const to non-const where dropping the constness is
completely unnecessary. The changes to chacha_vec.c don't result in any
changes to chacha_vec_arm.S.

Change-Id: I2f10081fd0e73ff5db746347c5971f263a5221a6
Reviewed-on: https://boringssl-review.googlesource.com/6923
Reviewed-by: David Benjamin <davidben@google.com>
2016-01-21 21:06:02 +00:00
Brian Smith
e80a2ecd0d Change |CRYPTO_chacha_20| to use 96-bit nonces, 32-bit counters.
The new function |CRYPTO_chacha_96_bit_nonce_from_64_bit_nonce| can be
used to adapt code from that uses 64 bit nonces, in a way that is
compatible with the old semantics.

Change-Id: I83d5b2d482e006e82982f58c9f981e8078c3e1b0
Reviewed-on: https://boringssl-review.googlesource.com/6100
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 23:58:46 +00:00
William Hesse
6dc1851f30 Fix aarch64 (64-bit ARM) guard on chacha_vec_arm.S.
Change-Id: Ia3632639daa8655ea5e2f81ba2a5163949f522b2
Reviewed-on: https://boringssl-review.googlesource.com/6110
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 23:32:38 +00:00
Brian Smith
953cfc837f Document how to regenerate crypto/chacha/chacha_vec_arm.S.
Also, organize the links in BUILDING.md sensibly.

Change-Id: Ie9c65750849fcdab7a6a6bf11d1c9cdafb53bc00
Reviewed-on: https://boringssl-review.googlesource.com/6140
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 23:29:57 +00:00
Adam Langley
0dd93002dd Revert section changes for ASM.
This change reverts the following commits:
  72d9cba7cb
  5b61b9ebc5
  3f85e04f40
  2ab24a2d40

Change-Id: I669b83f2269cf96aa71a649a346147b9407a811e
Reviewed-on: https://boringssl-review.googlesource.com/6056
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-09-30 22:09:52 +00:00
Adam Langley
5b61b9ebc5 Update ChaCha20 ARM asm with sections.
The ChaCha20 ARM asm is generated from GCC. This change updates the GCC
command line to include -ffunction-sections, which causes GCC to put
each function in its own section so that the linker with --gc-sections
can trim unused functions.

Since the file only has a single function, this is a bit useless, but
it'll now be consistent with the other ARM asm.

Change-Id: If12c675700310ea55af817b5433844eeffc9d029
Reviewed-on: https://boringssl-review.googlesource.com/6006
Reviewed-by: Adam Langley <agl@google.com>
2015-09-29 18:07:54 +00:00
Adam Langley
73415b6aa0 Move arm_arch.h and fix up lots of include paths.
arm_arch.h is included from ARM asm files, but lives in crypto/, not
openssl/include/. Since the asm files are often built from a different
location than their position in the source tree, relative include paths
are unlikely to work so, rather than having crypto/ be a de-facto,
second global include path, this change moves arm_arch.h to
include/openssl/.

It also removes entries from many include paths because they should be
needed as relative includes are always based on the locations of the
source file.

Change-Id: I638ff43d641ca043a4fc06c0d901b11c6ff73542
Reviewed-on: https://boringssl-review.googlesource.com/5746
Reviewed-by: Adam Langley <agl@google.com>
2015-08-26 01:57:59 +00:00
Adam Langley
9eaf07d460 Emit #if guards for ARM assembly files.
This change causes the generated assembly files for ARM and AArch64 to
have #if guards for __arm__ and __aarch64__, respectively. Since
building on ARM is only supported for Linux, we only have to worry about
GCC/Clang's predefines.

Change-Id: I7198eab6230bcfc26257f0fb6a0cc3166df0bb29
Reviewed-on: https://boringssl-review.googlesource.com/5173
Reviewed-by: Adam Langley <agl@google.com>
2015-06-23 21:00:32 +00:00
Adam Langley
26c2b929ba Switch nonce type in chacha_vec.c to uint32_t.
This was suggested in https://boringssl-review.googlesource.com/#/c/3460
but I forgot to upload the change before submitting in Gerrit.

Change-Id: I3a333fe2e8880603a9027638dd013f21d8270638
2015-02-13 13:16:59 -08:00
Adam Langley
d306f165a4 Don't require the ChaCha nonce to be aligned on ARM.
Change-Id: I34ee66fcc53d3371591beee3373c46598c31b5c5
Reviewed-on: https://boringssl-review.googlesource.com/3460
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-02-13 20:35:36 +00:00
Adam Langley
868c7ef1f4 Don't assume alignment of ChaCha key on ARM.
When addressing [1], I checked the AEAD code but brain-farted: a key is
aligned in that code, but it's the Poly1305 key, which doesn't matter
here.

It would be nice to align the ChaCha key too, but Android doesn't have
|posix_memalign| in the versions that we care about. It does have
|memalign|, but that's documented as "obsolete" and we don't have a
concept of an Android OS yet and I don't want to add one just for this.

So this change uses the buffer for loading the key again.

(Note that we never used to check for alignment of the |key| before
calling this. We must have gotten it for free somehow when checking the
alignment of |in| and |out|. But there are clearly some paths that don't
have an aligned key:
https://code.google.com/p/chromium/issues/detail?id=454308.)

At least the generation script started paying off immediately ☺.

[1] https://boringssl-review.googlesource.com/#/c/3132/1/crypto/chacha/chacha_vec.c@185

Change-Id: I4f893ba0733440fddd453f9636cc2aeaf05076ed
Reviewed-on: https://boringssl-review.googlesource.com/3270
Reviewed-by: Adam Langley <agl@google.com>
2015-02-03 00:34:17 +00:00
Adam Langley
2b2d66d409 Remove string.h from base.h.
Including string.h in base.h causes any file that includes a BoringSSL
header to include string.h. Generally this wouldn't be a problem,
although string.h might slow down the compile if it wasn't otherwise
needed. However, it also causes problems for ipsec-tools in Android
because OpenSSL didn't have this behaviour.

This change removes string.h from base.h and, instead, adds it to each
.c file that requires it.

Change-Id: I5968e50b0e230fd3adf9b72dd2836e6f52d6fb37
Reviewed-on: https://boringssl-review.googlesource.com/3200
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-02-02 19:14:15 +00:00
Brian Smith
efed2210e8 Enable more warnings & treat warnings as errors on Windows.
Change-Id: I2bf0144aaa8b670ff00b8e8dfe36bd4d237b9a8a
Reviewed-on: https://boringssl-review.googlesource.com/3140
Reviewed-by: Adam Langley <agl@google.com>
2015-01-31 00:18:55 +00:00
Douglas Katzman
0bb81fcd66 Fix misleading comment.
|num_rounds| is neither a parameter nor manifest constant.

Change-Id: I6c1d3a3819731f53fdd01eef6bb4de8a45176a1d
Reviewed-on: https://boringssl-review.googlesource.com/3180
Reviewed-by: Adam Langley <agl@google.com>
2015-01-30 22:31:00 +00:00
Adam Langley
ab21891e34 Add a script to generate the ChaCha ARM asm.
Obviously I shouldn't be doing this by hand each time.

Change-Id: I64e3f5ede5c47eddbff3b18172a95becc681b486
Reviewed-on: https://boringssl-review.googlesource.com/3170
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-01-30 21:47:22 +00:00
Adam Langley
be629e0e92 Another fix for the regenerated chacha_vec_arm.S.
I put the header back, but missed the #endif at the end of the file.
Regenerating this is clearly too error prone – I'll write a script to do
it for the future.

Change-Id: I06968c9f7a4673f5942725e727c67cb4e01d361a
2015-01-30 10:28:37 -08:00
Adam Langley
d5c0980e33 Update the command line for generating the ChaCha asm.
This command line matches what was used to generate the current file.

Change-Id: I2f730768f4efc1b23860cc7b27d5005311c554ea
2015-01-29 12:20:07 -08:00
Adam Langley
8de7597af3 Don't require alignment in ChaCha20 on ARM.
By copying the input and output data via an aligned buffer, the
alignment requirements for the NEON ChaCha implementation on ARM can be
eliminted. This does, however, reduce the speed when aligned buffers are
used. However, updating the GCC version used to generate the ASM more
than makes up for that.

On a SnapDragon 801 (OnePlus One) the aligned speed was 214.6 MB/s and
the unaligned speed was 112.1 MB/s. Now both are 218.4 MB/s. A Nexus 7
also shows a slight speed up.

Change-Id: I68321ba56767fa5354b31a1491a539b299236e9a
Reviewed-on: https://boringssl-review.googlesource.com/3132
Reviewed-by: Adam Langley <agl@google.com>
2015-01-29 20:06:41 +00:00
Adam Langley
4a0f0c4910 Change CMakeLists.txt to two-space indent.
find -name CMakeLists.txt -type f | xargs sed -e 's/\t/  /g' -i

Change-Id: I01636b1849c00ba918f48828252492d99b0403ac
2015-01-28 16:37:10 -08:00
Adam Langley
60d612fdcf Fix ARM build with OPENSSL_NO_ASM.
Change-Id: Id77fb7c904cbfe8172466dff20b6a715d90b806c
Reviewed-on: https://boringssl-review.googlesource.com/1710
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2014-09-03 19:23:25 +00:00
Adam Langley
eb7d2ed1fe Add visibility rules.
This change marks public symbols as dynamically exported. This means
that it becomes viable to build a shared library of libcrypto and libssl
with -fvisibility=hidden.

On Windows, one not only needs to mark functions for export in a
component, but also for import when using them from a different
component. Because of this we have to build with
|BORINGSSL_IMPLEMENTATION| defined when building the code. Other
components, when including our headers, won't have that defined and then
the |OPENSSL_EXPORT| tag becomes an import tag instead. See the #defines
in base.h

In the asm code, symbols are now hidden by default and those that need
to be exported are wrapped by a C function.

In order to support Chromium, a couple of libssl functions were moved to
ssl.h from ssl_locl.h: ssl_get_new_session and ssl_update_cache.

Change-Id: Ib4b76e2f1983ee066e7806c24721e8626d08a261
Reviewed-on: https://boringssl-review.googlesource.com/1350
Reviewed-by: Adam Langley <agl@google.com>
2014-07-31 22:03:11 +00:00
Adam Langley
4c921e1bbc Move public headers to include/openssl/
Previously, public headers lived next to the respective code and there
were symlinks from include/openssl to them.

This doesn't work on Windows.

This change moves the headers to live in include/openssl. In cases where
some symlinks pointed to the same header, I've added a file that just
includes the intended target. These cases are all for backwards-compat.

Change-Id: I6e285b74caf621c644b5168a4877db226b07fd92
Reviewed-on: https://boringssl-review.googlesource.com/1180
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2014-07-14 22:42:18 +00:00
Adam Langley
0113a4fb60 Support building with PNaCl.
PNaCl needs OPENSSL_NO_ASM to work and a couple of cases were missing
because it hasn't previously been tested.

Additionally, it defined _BSD_SOURCE and others on the command line,
causing duplicate definition errors when defined in source code.

It's missing readdir_r.

It uses newlib, which appears to use u_short in socket.h without ever
defining it.

Change-Id: Ieccfc7365723d0521f6327eebe9f44a2afc57406
Reviewed-on: https://boringssl-review.googlesource.com/1140
Reviewed-by: Adam Langley <agl@google.com>
2014-07-11 19:04:04 +00:00
Adam Langley
d031f11596 Remove |num_rounds| argument from chacha_core.
The function was hard-coded to 20 rounds already so the argument was
already useless. Thanks to Huzaifa Sidhpurwala for noticing.

Change-Id: I5f9d6ca6d46c6ab769b19820f8f918349544846d
2014-06-23 13:14:13 -07:00
Adam Langley
de0b202684 ChaCha20-Poly1305 support. 2014-06-20 13:17:35 -07:00