boringssl

Author	SHA1	Message	Date
Brian Smith	644539191b	chacha20_poly1305_x86_64.pl: Suppress Yasm non-local label warnings. Before, attempting to build the code using Yasm as the assembler would result in warnings like this: warning : no non-local label before `.chacha20_consts' Precede the local labels with a non-local label to suppress these warnings. It isn't clear why these labels are defined as local labels instead of regular labels. Making them non-local may be a better idea. For reference, Yasm's interpretation of local labels is described succinctly at https://www.tortall.net/projects/yasm/manual/html/nasm-local-label.html. Change-Id: Ifc92de7fd7379859fe33f1137ab20b6ec282cd0b Reviewed-on: https://boringssl-review.googlesource.com/13384 Reviewed-by: Adam Langley <agl@google.com>	2017-02-09 18:05:41 +00:00
David Benjamin	5c9d411e14	Fix some compact unwind errors. The Mac ld gets unhappy about "weird" unwind directives: In chacha20_poly1305_x86_64.pl, $keyp is being pushed on the stack (according to the comment) because it gets clobbered in the computation somewhere. $keyp is %r9 which is not callee-saved (it's an argument register), so we don't need to tag it with .cfi_offset. In x25519-asm-x86_64.S, x25519_x86_64_mul saves %rdi on the stack. However it too is not callee-saved (it's an argument register) and should not have a .cfi_offset. %rdi also does not appear to be written to anywhere in the function, so there's no need to save it at all. (This does not resolve the "r15 is saved too far from return address" errors. Just the non-standard register ones.) BUG=176 Change-Id: I53f3f7db3d1745384fb47cb52cd6536aabb5065e Reviewed-on: https://boringssl-review.googlesource.com/13560 Commit-Queue: David Benjamin <davidben@google.com> Reviewed-by: Adam Langley <agl@google.com>	2017-02-02 22:05:06 +00:00
Brian Smith	360a4c2616	chacha20_poly1305_x86_64.pl: Use NASM-compatible syntax for |ldea|. Cargo-cult the way other Perlasm scripts do it. Change-Id: I86aaf725e41b601f24595518a8a6bc481fa0c7fc Reviewed-on: https://boringssl-review.googlesource.com/13382 Reviewed-by: Adam Langley <agl@google.com>	2017-01-27 23:17:13 +00:00
Brian Smith	357a9f23fe	chacha20_poly1305_x86_64.pl: Use |imulq| instead of |imul|. Perlasm requires the size suffix when targeting NASM and Yasm; without it, the resulting .asm file has |imu| instead of |imul|. Change-Id: Icb95b8c0b68cf4f93becdc1930dc217398f56bec Reviewed-on: https://boringssl-review.googlesource.com/13381 Reviewed-by: Adam Langley <agl@google.com>	2017-01-27 23:16:52 +00:00
Brian Smith	3416d28a57	chacha20_poly1305_x86_64.pl: Escape command line args like other PerlAsm scripts. Use the same quoting used in other files so that this file can be built the same way as other files on platforms that require the other kind of quoting. Change-Id: I808769bf014fbfe526fedcdc1e1f617b3490d03b Reviewed-on: https://boringssl-review.googlesource.com/13380 Reviewed-by: Adam Langley <agl@google.com>	2017-01-27 23:16:27 +00:00
Adam Langley	1da9c67a99	Use a Perlasm variable rather than an #if to exclude the ChaCha20-Poly1305 asm on Windows. The Windows assembler doesn't appear to do preprocessor macros but nor can it cope with this style of label. Change-Id: I0b8ca7372bb9ea0f20101ed138681d379944658e Reviewed-on: https://boringssl-review.googlesource.com/13207 Reviewed-by: David Benjamin <davidben@google.com>	2017-01-23 22:05:06 +00:00
vkrasnov	8d56558031	Optimized Seal/Open routines for ChaCha20-Poly1305 for x86-64. This is basically the same implementation I wrote for Go. The Go implementation: https://github.com/golang/crypto/blob/master/chacha20poly1305/chacha20poly1305_amd64.s The Cloudflare patch for OpenSSL: https://github.com/cloudflare/sslconfig/blob/master/patches/openssl__chacha20_poly1305_draft_and_rfc_ossl102j.patch The Seal/Open is only available for the new version; the old one uses the bundled Poly1305 and the existing ChaCha20 implementations. The benefits of this code, compared to the optimized code currently disabled in BoringSSL: * Passes test vectors * Faster performance: the AVX2 code (on Haswell) is 55% faster for 16B, 15% for 1350 and 6% for 8192 byte buffers * Even faster on pre-AVX2 CPUs. Feel free to put whatever license, etc. is appropriate, under the existing CLA. Benchmarks are for 16/1350/8192 chunk sizes and given in MB/s: Before (Ivy Bridge): 34.2 589.5 739.4; After: 68.4 692.1 799.4. Before (Skylake): 50 1233 1649; After: 119.4 1736 2196; After (Andy's): 63.6 1608 2261. Change-Id: I9186f721812655011fc17698b67ddbe8a1c7203b Reviewed-on: https://boringssl-review.googlesource.com/13142 Commit-Queue: Adam Langley <agl@google.com> Reviewed-by: Adam Langley <agl@google.com>	2017-01-23 21:12:44 +00:00

7 Commits