Commit Graph

71 Commits

Author SHA1 Message Date
David Benjamin
17d553d299 Add a CFI tester to CHECK_ABI.
This uses the x86 trap flag and libunwind to test CFI works at each
instruction. For now, it just uses the system one out of pkg-config and
disables unwind tests if unavailable. We'll probably want to stick a
copy into //third_party and perhaps try the LLVM one later.

This tester caught two bugs in P-256 CFI annotations already:
I47b5f9798b3bcee1748e537b21c173d312a14b42 and
I9f576d868850312d6c14d1386f8fbfa85021b347

An earlier design used PTRACE_SINGLESTEP with libunwind's remote
unwinding features. ptrace is a mess around stop signals (see group-stop
discussion in ptrace(2)) and this is 10x faster, so I went with it. The
question of which is more future-proof is complex:

- There are two libunwinds with the same API,
  https://www.nongnu.org/libunwind/ and LLVM's. This currently uses the
  system nongnu.org for convenience. In future, LLVM's should be easier
  to bundle (less complex build) and appears to even support Windows,
  but I haven't tested this.  Moreover, setting the trap flag keeps the
  test single-process, which is less complex on Windows. That suggests
  the trap flag design and switching to LLVM later. However...

- Not all architectures have a trap flag settable by userspace. As far
  as I can tell, ARMv8's PSTATE.SS can only be set from the kernel. If
  we stick with nongnu.org libunwind, we can use PTRACE_SINGLESTEP and
  remote unwinding. Or we implement it for LLVM. Another thought is for
  the ptracer to bounce SIGTRAP back into the process, to share the
  local unwinding code.

- ARMv7 has no trap flag at all and PTRACE_SINGLESTEP fails. Debuggers
  single-step by injecting breakpoints instead. However, ARMv8's trap
  flag seems to work in both AArch32 and AArch64 modes, so we may be
  able to condition it on a 64-bit kernel.

Sadly, neither strategy works with Intel SDE. Adding flags to cpucap
vectors as we do with ARM would help, but it would not emulate CPUs
newer than the host CPU. For now, I've just had SDE tests disable these.

Annoyingly, CMake does not allow object libraries to have dependencies,
so make test_support a proper static library. Rename the target to
test_support_lib to avoid
https://gitlab.kitware.com/cmake/cmake/issues/17785

Update-Note: This adds a new optional test dependency, but it's disabled
by default (define BORINGSSL_HAVE_LIBUNWIND), so consumers do not need
to do anything. We'll probably want to adjust this in the future.

Bug: 181
Change-Id: I817263d7907aff0904a9cee83f8b26747262cc0c
Reviewed-on: https://boringssl-review.googlesource.com/c/33966
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-01-03 22:01:55 +00:00
David Benjamin
5edf8957b5 Translate .L directives inside .byte too.
Win64 unwind tables place distances from the start of a function in
byte-wide values.

Change-Id: Ie2aad7f6f5b702a60933bd52d872a83cba4e73a9
Reviewed-on: https://boringssl-review.googlesource.com/c/33784
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <alangley@gmail.com>
2018-12-21 16:35:32 +00:00
David Benjamin
e157dc9208 Make symbol-prefixing work on 32-bit x86.
On Linux, this introduces yet another symbol to blacklist.

Change-Id: Ieafe45a25f3b41da6c6934dd9488f4ee400bcab9
Reviewed-on: https://boringssl-review.googlesource.com/c/33350
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2018-11-27 22:35:17 +00:00
David Benjamin
bbc429148f Add a note that generated files are generated.
Folks keep assuming checked-in assembly files are the source. Between
the preprocessor, delocate, NASM not using the C preprocessor, and GAS's
arch-specific comment syntax, comment markers are kind of a disaster.
This set appears to work for now.

Change-Id: I48e26dafb444dfa310df80dcce87ac291fde8037
Reviewed-on: https://boringssl-review.googlesource.com/c/33304
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2018-11-21 20:05:05 +00:00
David Benjamin
293d9ee4e8 Support execute-only memory for AArch64 assembly.
Put data in .rodata and, rather than adr, use the combination of adrp :pg_hi21:
and add :lo12:. Unfortunately, iOS uses different syntax, so we must add more
transforms to arm-xlate.pl.

Tested manually by:

1. Use Android NDK r19-beta1

2. Follow usual instructions to configure CMake for aarch64, but pass
   -DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=lld -Wl,-execute-only".

3. Build. Confirm with readelf -l tool/bssl that .text is not marked
   readable.

4. Push the test binaries onto a Pixel 3. Test normally and with
   --cpu={none,neon,crypto}. I had to pass --gtest_filter=-*Thread* to
   crypto_test. There appears to be an issue with some runtime function
   that's unrelated to our assembly.

No measurable performance difference.

Going forward, to support this, we will need to apply similar changes to
all other AArch64 assembly. This is relatively straightforward, but may
be a little finicky for dual-AArch32/AArch64 files (aesv8-armx.pl).

Update-Note: Assembly syntax is a mess. There's a decent chance some
assembler will get offend.

Change-Id: Ib59b921d4cce76584320fefd23e6bb7ebd4847eb
Reviewed-on: https://boringssl-review.googlesource.com/c/33245
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2018-11-19 19:58:15 +00:00
Adam Langley
ff997452fc Don't include quotes in heredocs.
Unsurprisingly it doesn't work.

Change-Id: Ida2b9879184f2dfcce217559f8773553ecf0c33d
Reviewed-on: https://boringssl-review.googlesource.com/31947
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-09-14 16:51:00 +00:00
Adam Langley
695e589b0c Include newlines at the end of generated asm.
Perl's print doesn't automatically include a newline and the delocate
script doesn't like files that don't end with one.

Change-Id: Ib1bce2b3bb6fbe1a122bd88b58198b497c599adb
Reviewed-on: https://boringssl-review.googlesource.com/31804
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-09-10 16:47:13 +00:00
Adam Langley
e77c27d734 Automatically disable assembly with MSAN.
MSAN is incompatible with hand-written assembly code. Previously we
required that OPENSSL_NO_ASM be set when building with MSAN, and the
CMake build would take care of this. However, with other build systems
it wasn't always so easy.

This change automatically disables assembly when the compiler is
configured for MSAN.

Change-Id: I6c219120f62d16b99bafc2efb02948ecbecaf87f
Reviewed-on: https://boringssl-review.googlesource.com/31724
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-09-07 21:12:37 +00:00
David Benjamin
19ac2666b9 Make symbol-prefixing work on ARM.
The assembly files need some includes. Also evp.h has some conflicting
macros. Finally, md5.c's pattern of checking if a function name is
defined needs to switch to checking MD5_ASM.

Change-Id: Ib1987ba6f279144f0505f6951dead53968e05f20
Reviewed-on: https://boringssl-review.googlesource.com/31704
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-09-07 17:43:05 +00:00
Joshua Liebow-Feeser
8c7c6356e6 Support symbol prefixes
- In base.h, if BORINGSSL_PREFIX is defined, include
  boringssl_prefix_symbols.h
- In all .S files, if BORINGSSL_PREFIX is defined, include
  boringssl_prefix_symbols_asm.h
- In base.h, BSSL_NAMESPACE_BEGIN and BSSL_NAMESPACE_END are
  defined with appropriate values depending on whether
  BORINGSSL_PREFIX is defined; these macros are used in place
  of 'namespace bssl {' and '}'
- Add util/make_prefix_headers.go, which takes a list of symbols
  and auto-generates the header files mentioned above
- In CMakeLists.txt, if BORINGSSL_PREFIX and BORINGSSL_PREFIX_SYMBOLS
  are defined, run util/make_prefix_headers.go to generate header
  files
- In various CMakeLists.txt files, add "global_target" that all
  targets depend on to give us a place to hook logic that must run
  before all other targets (in particular, the header file generation
  logic)
- Document this in BUILDING.md, including the fact that it is
  the caller's responsibility to provide the symbol list and keep it
  up to date
- Note that this scheme has not been tested on Windows, and likely
  does not work on it; Windows support will need to be added in a
  future commit

Change-Id: If66a7157f46b5b66230ef91e15826b910cf979a2
Reviewed-on: https://boringssl-review.googlesource.com/31364
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-09-06 20:07:52 +00:00
David Benjamin
6dc994265e Sync up some perlasm license headers and easy fixes.
These files are otherwise up-to-date with OpenSSL master as of
50ea9d2b3521467a11559be41dcf05ee05feabd6, modulo a couple of spelling
fixes which I've imported.

I've also reverted the same-line label and instruction patch to
x86_64-mont*.pl. The new delocate parser handles that fine.

Change-Id: Ife35c671a8104c3cc2fb6c5a03127376fccc4402
Reviewed-on: https://boringssl-review.googlesource.com/25644
Reviewed-by: Adam Langley <agl@google.com>
2018-02-11 01:00:35 +00:00
David Benjamin
4281bcd5d2 Revert assembly changes in "Hide CPU capability symbols in C."
This partially reverts commit 38636aba74.
Some build on Android seems to break now. I'm not really sure what the
situation is, but if the weird common symbols are still there (can we
remove them?), they probably ought to have the right flags.

Change-Id: Ief589d763d16b995ac6be536505acf7596a87b30
Reviewed-on: https://boringssl-review.googlesource.com/22404
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-10-30 20:39:57 +00:00
David Benjamin
38636aba74 Hide CPU capability symbols in C.
Our assembly does not use the GOT to reference symbols, which means
references to visible symbols will often require a TEXTREL. This is
undesirable, so all assembly-referenced symbols should be hidden. CPU
capabilities are the only such symbols defined in C.

These symbols may be hidden by doing at least one of:

1. Build with -fvisibility=hidden
2. __attribute__((visibility("hidden"))) in C.
3. .extern + .hidden in some assembly file referencing the symbol.

We have lots of consumers and can't always rely on (1) happening. We
were doing (3) by way of d216b71f90 and
16e38b2b8f, but missed 32-bit x86 because
it doesn't cause a linker error.

Those two patches are not in upstream. Upstream instead does (3) by way
of x86cpuid.pl and friends, but we have none of these files.

Standardize on doing (2). This avoids accidentally getting TEXTRELs on
some 32-bit x86 build configurations.  This also undoes
d216b71f90 and
16e38b2b8f. They are no now longer needed
and reduce the upstream diff.

Change-Id: Ib51c43fce6a7d8292533635e5d85d3c197a93644
Reviewed-on: https://boringssl-review.googlesource.com/22064
Commit-Queue: Matt Braithwaite <mab@google.com>
Reviewed-by: Matt Braithwaite <mab@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-10-23 18:36:49 +00:00
David Benjamin
f03cdc3a93 Sync ARM assembly up to 609b0852e4d50251857dbbac3141ba042e35a9ae.
This change was made by copying over the files as of that commit and
then discarding the parts of the diff which corresponding to our own
changes.

Change-Id: I28c5d711f7a8cec30749b8174687434129af5209
Reviewed-on: https://boringssl-review.googlesource.com/17111
Reviewed-by: Adam Langley <agl@google.com>
2017-06-13 17:47:20 +00:00
David Benjamin
8da59555c6 ARMv4 assembly pack: allow Thumb2 even in iOS build, and engage it in most modules.
(Imported from upstream's a285992763f3961f69a8d86bf7dfff020a08cef9.)

Change-Id: I59df0b567e8e80befe5c399f817d6410ddafc577
Reviewed-on: https://boringssl-review.googlesource.com/17110
Reviewed-by: Adam Langley <agl@google.com>
2017-06-13 17:47:10 +00:00
David Benjamin
583c12ea97 Remove filename argument to x86 asm_init.
43e5a26b53 removed the .file directive
from x86asm.pl. This removes the parameter from asm_init altogether. See
also upstream's e195c8a2562baef0fdcae330556ed60b1e922b0e.

Change-Id: I65761bc962d09f9210661a38ecf6df23eae8743d
Reviewed-on: https://boringssl-review.googlesource.com/16247
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-05-12 14:58:27 +00:00
David Benjamin
ad50a0d7cd Fix diff_asm.go and revert another local MASM perlasm change.
We're not using the MASM output, so don't bother maintaining a diff on
it.

Change-Id: I7321e58c8b267be91d58849927139b74cc96eddc
Reviewed-on: https://boringssl-review.googlesource.com/16246
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-05-11 23:30:01 +00:00
Steven Valdez
43e5a26b53 Fixing assembly coverage reporting.
Due to issues with CMake enable_language, we have to delay setting
CMAKE_ASM_FLAGS until after enable_language(ASM) has been called.

We also need to remove the '.file' macro from x86gas.pl to prevent the
filenames from being overridden from those provided by the build
system.

Change-Id: I436f57ec45e4751714af49e1211a0d7810e4e56a
Reviewed-on: https://boringssl-review.googlesource.com/16127
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-05-11 16:55:29 +00:00
David Benjamin
c862c31f4c perlasm/x86_64-xlate.pl: work around problem with hex constants in masm.
Perl, multiple versions, for some reason occasionally takes issue with
letter b[?] in ox([0-9a-f]+) regex. As result some constants, such as
0xb1 came out wrong when generating code for MASM. Fixes upstream
GH#3241.

(Imported from upstream's c47aea8af1e28e46e1ad5e2e7468b49fec3f4f29.)

This does not affect of the configurations we generate and is imported
to avoid a diff against upstream.

Change-Id: Iacde0ca5220c3607681fad081fbe72d8d613518f
Reviewed-on: https://boringssl-review.googlesource.com/15985
Reviewed-by: Adam Langley <agl@google.com>
2017-05-05 23:10:56 +00:00
David Benjamin
e34eaa6409 Remove old masm workaround.
This dates to ded93581f1, but we have
since switched to building with nasm, to match upstream's supported
assemblers. Since this doesn't affect anything we generate, remove the
workaround to reduce the diff against upstream.

Change-Id: I549ae97ad6d6f28836f6c9d54dcf51c518de7521
Reviewed-on: https://boringssl-review.googlesource.com/15986
Reviewed-by: Adam Langley <agl@google.com>
2017-05-05 23:07:47 +00:00
Adam Langley
107d4388cb Gate assembly sources on !OPENSSL_NO_ASM.
Change-Id: I32b37306265e89afca568f20bfba2e04559c4f0b
Reviewed-on: https://boringssl-review.googlesource.com/14527
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-03-30 19:34:21 +00:00
David Benjamin
98f5dc30ba perlasm/x86_64-xlate.pl: recognize even offset(%reg) in cfa_expression.
This is handy when "offset(%reg)" is a perl variable.

(Imported from upstream's 1cb35b47db8462f5653803501ed68d33b10c249f.)

Change-Id: I2f03907a7741371a71045f98318e0ab9396a8fc7
Reviewed-on: https://boringssl-review.googlesource.com/13906
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-02-16 22:21:25 +00:00
David Benjamin
f3cc7a3366 perlasm/x86_64-xlate.pl: fix pair of typo-bugs in the new cfi_directive.
.cfi_{start|end}proc and .cfi_def_cfa were not tracked.

(Imported from upstream's 88be429f2ed04f0acc71f7fd5456174c274f2f76.)

Change-Id: I6abd480255218890349d139b62f62144b34c700d
Reviewed-on: https://boringssl-review.googlesource.com/13905
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-02-16 22:21:18 +00:00
David Benjamin
4c4053191a perlasm/x86_64-xlate.pl: typo fix in comment.
(Imported from upstream's fa3f83552f53447deced45579865cec9f55a947e.)

Change-Id: I659422a604b9d1d61334e09dff0c1de3aedb2d04
Reviewed-on: https://boringssl-review.googlesource.com/13904
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-02-16 22:20:33 +00:00
David Benjamin
0f28691d3d Fix a few typos.
(Imported from upstream's 7e12cdb52e3f4beff050caeecf3634870bb9a7c4.)

Change-Id: I9a6bba72c039e45ae5c0302a8a3dff7148cf1897
Reviewed-on: https://boringssl-review.googlesource.com/13869
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-02-16 18:50:51 +00:00
Adam Langley
c948d46569 Remove trailing whitespace from Perl files.
Upstream did this in 609b0852e4d50251857dbbac3141ba042e35a9ae and it's
easier to apply patches if we do also.

Change-Id: I5142693ed1e26640987ff16f5ea510e81bba200e
Reviewed-on: https://boringssl-review.googlesource.com/13771
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2017-02-14 00:13:55 +00:00
Adam Langley
689eb3d03a x86_64-xlate.pl: import fix(?) from upstream.
This imports the changes to x86_64-xlate from upstream's
9c940446f614d1294fa197ffd4128206296b04da. It looks like it's a fix,
although it doesn't alter our generated asm at all. Either way, no point
in diverging from upstream on this point.

Change-Id: Iaedf2cdb9580cfccf6380dbc3df36b0e9c148d1c
Reviewed-on: https://boringssl-review.googlesource.com/13767
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2017-02-13 21:52:39 +00:00
Adam Langley
9ad43cbf64 x86_64-xlate.pl: drop some whitespace.
This aligns us better with upstream's version of this file.

Change-Id: I771b6a6c57f2e11e30c95c7a5499c39575b16253
Reviewed-on: https://boringssl-review.googlesource.com/13766
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2017-02-13 21:51:38 +00:00
Adam Langley
9be3238e18 perlasm/x86_64-xlate.pl: recognize DWARF CFI directives.
(Imports upstream's a3b5684fc1d4f3aabdf68dcf6c577f6dd24d2b2d.)

CFI directives annotate instructions that are significant for stack
unwinding procedure. In addition to directives recognized by GNU
assembler this module implements three synthetic ones:

- .cfi_push annotates push instructions in prologue and translates to
  .cfi_adjust_cfa_offset (if needed) and .cfi_offset;
- .cfi_pop annotates pop instructions in epilogue and translates to
  .cfi_adjust_cfs_offset (if needed) and .cfi_restore;
- .cfi_cfa_expression encodes DW_CFA_def_cfa_expression and passes it
  to .cfi_escape as byte vector;

CFA expression syntax is made up mix of DWARF operator suffixes [subset
of] and references to registers with optional bias. Following example
describes offloaded original stack pointer at specific offset from
current stack pointer:

        .cfi_cfa_expression     %rsp+40,deref,+8

Final +8 has everything to do with the fact that CFA, Canonical Frame
Address, is reference to top of caller's stack, and on x86_64 call to
subroutine pushes 8-byte return address.

Change-Id: Ic675bf52b5405000be34e9da31c9cf1660f4b491
Reviewed-on: https://boringssl-review.googlesource.com/13765
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2017-02-13 21:48:43 +00:00
Adam Langley
949628a2ab perlasm/x86_64-xlate.pl: remove obsolete .picmeup synthetic directive.
(Imports upstream's 9d301cfea7181766b79ba31ed257d30fb84b1b0f.)

Change-Id: Ibc384f5ae4879561e2b26b3c9c2a51af5d91a996
Reviewed-on: https://boringssl-review.googlesource.com/13764
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2017-02-11 00:00:58 +00:00
Adam Langley
25126633dc perlasm/x86_64-xlate.pl: minor readability updates.
(Imports upstream's e09b6216a5423555271509acf5112da5484ec15d.)

Change-Id: Ie9d785e415271bede1d35d014ac015e6984e3a52
Reviewed-on: https://boringssl-review.googlesource.com/13763
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-02-10 23:58:41 +00:00
Adam Langley
314997902e perlasm/x86_64-xlate.pl: clarify SEH coding guidelines.
(Imported from upstream's e1dbf7f431b996010844e220d3200cbf2122dbb3)

Change-Id: I71933922f597358790e8a4222e9d69c4b121bc19
Reviewed-on: https://boringssl-review.googlesource.com/13762
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-02-10 23:57:09 +00:00
Adam Langley
4229d26b7e perlasm/x86_64-xlate.pl: add support for AVX512 OPMASK-ing.
(Imported from upstream's 526ab896459a58748af198f6703108b79c917f08.)

Change-Id: I975c1a3ffe76e3c3f99ed8286b448b97fd4a8b70
Reviewed-on: https://boringssl-review.googlesource.com/13761
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-02-10 23:56:06 +00:00
David Benjamin
fa99197b9d perlasm/x86_64-xlate.pl: refine sign extension in ea package.
$1<<32>>32 worked fine with either 32- or 64-bit perl for a good while,
relying on quirk that [pure] 32-bit perl performed it as $1<<0>>0.  But
this apparently changed in some version past minimally required 5.10,
and operation result became 0. Yet, it went unnoticed for another while,
because most perl package providers configure their packages with
-Duse64bitint option.

(Imported from upstream's 82e089308bd9a7794a45f0fa3973d7659420fbd8.)

Change-Id: Ie9708bb521c8d7d01afd2e064576f46be2a811a5
Reviewed-on: https://boringssl-review.googlesource.com/12821
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Steven Valdez <svaldez@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2016-12-14 17:36:29 +00:00
Adam Langley
4467e59bc8 Add PPC64LE assembly for AES-GCM.
This change adds AES and GHASH assembly from upstream, with the aim of
speeding up AES-GCM.

The PPC64LE assembly matches the interface of the ARMv8 assembly so I've
changed the prefix of both sets of asm functions to be the same
("aes_hw_").

Otherwise, the new assmebly files and Perlasm match exactly those from
upstream's c536b6be1a (from their master branch).

Before:
Did 1879000 AES-128-GCM (16 bytes) seal operations in 1000428us (1878196.1 ops/sec): 30.1 MB/s
Did 61000 AES-128-GCM (1350 bytes) seal operations in 1006660us (60596.4 ops/sec): 81.8 MB/s
Did 11000 AES-128-GCM (8192 bytes) seal operations in 1072649us (10255.0 ops/sec): 84.0 MB/s
Did 1665000 AES-256-GCM (16 bytes) seal operations in 1000591us (1664016.6 ops/sec): 26.6 MB/s
Did 52000 AES-256-GCM (1350 bytes) seal operations in 1006971us (51640.0 ops/sec): 69.7 MB/s
Did 8840 AES-256-GCM (8192 bytes) seal operations in 1013294us (8724.0 ops/sec): 71.5 MB/s

After:
Did 4994000 AES-128-GCM (16 bytes) seal operations in 1000017us (4993915.1 ops/sec): 79.9 MB/s
Did 1389000 AES-128-GCM (1350 bytes) seal operations in 1000073us (1388898.6 ops/sec): 1875.0 MB/s
Did 319000 AES-128-GCM (8192 bytes) seal operations in 1000101us (318967.8 ops/sec): 2613.0 MB/s
Did 4668000 AES-256-GCM (16 bytes) seal operations in 1000149us (4667304.6 ops/sec): 74.7 MB/s
Did 1202000 AES-256-GCM (1350 bytes) seal operations in 1000646us (1201224.0 ops/sec): 1621.7 MB/s
Did 269000 AES-256-GCM (8192 bytes) seal operations in 1002804us (268247.8 ops/sec): 2197.5 MB/s

Change-Id: Id848562bd4e1aa79a4683012501dfa5e6c08cfcc
Reviewed-on: https://boringssl-review.googlesource.com/11262
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2016-09-27 18:43:20 +00:00
David Benjamin
8e726eca12 Remove unused crypto/perlasm/cbc.pl.
In OpenSSL, they're used in the 32-bit x86 Blowfish, CAST, DES, and RC5
assembly bits. We don't have any of those.

Change-Id: I36f22ca873842a200323cd3f398d2446f7bbabca
Reviewed-on: https://boringssl-review.googlesource.com/10780
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2016-09-12 19:03:57 +00:00
David Benjamin
d1fa9f970e Sync x86 perlasm drivers with upstream master.
Upstream added new instructions in
f4d456408d9d7bca31f34765d1a05fbd9fa55826 and
4e3d2866b6e8e7a700ea22e05840a093bfd7a4b1.

Change-Id: I835650426a0dffca2d8686d64aef99097a4bd186
Reviewed-on: https://boringssl-review.googlesource.com/8520
Reviewed-by: Adam Langley <agl@google.com>
2016-06-27 22:00:51 +00:00
David Benjamin
66194feedd perlasm/x86_64-xlate.pl: address errors and warnings in elderly perls.
(Imported from upstream's 67b8bf4d849a7c40d0226de4ebe2590c4cc7c1f7.)

Verified a no-op in generate_build_files.py.

Change-Id: I09648893ab5c795f3934da0b2ecbc5fd7eb068d5
Reviewed-on: https://boringssl-review.googlesource.com/8519
Reviewed-by: Adam Langley <agl@google.com>
2016-06-27 22:00:26 +00:00
David Benjamin
ac81d92968 Revert local change to x86masm.pl.
We're not using the masm output (and upstream does not even support it).
Reduce unnecessary diff from upstream.

Change-Id: Ic0b0f804bd7ec1429b3b1f40746297b57dcfcef6
Reviewed-on: https://boringssl-review.googlesource.com/8517
Reviewed-by: Adam Langley <agl@google.com>
2016-06-27 21:49:07 +00:00
David Benjamin
ff594ca8c8 Make arm-xlate.pl set use strict.
It was already nearly clean. Just one undeclared variable.

(Imported from upstream's abeae4d3251181f1cedd15e4433e79406b766155.)

Change-Id: I3b8f20034f914fc44faabf165d1553d4084c87cc
Reviewed-on: https://boringssl-review.googlesource.com/8393
Reviewed-by: Adam Langley <agl@google.com>
2016-06-22 23:11:27 +00:00
David Benjamin
b111f7a0e4 Rebase x86_64-xlate.pl atop master.
This functionally pulls in a number of changes from upstream, including:
4e3d2866b6e8e7a700ea22e05840a093bfd7a4b1
1eb12c437bbeb2c748291bcd23733d4a59d5d1ca
6a4ea0022c475bbc2c7ad98a6f05f6e2e850575b
c25278db8e4c21772a0cd81f7873e767cbc6d219
e0a651945cb5a70a2abd9902c0fd3e9759d35867
d405aa2ff265965c71ce7331cf0e49d634a06924
ce3d25d3e5a7e82fd59fd30dff7acc39baed8b5e
9ba96fbb2523cb12747c559c704c58bd8f9e7982

Notably, c25278db8e4c21772a0cd81f7873e767cbc6d219 makes it enable 'use strict'.

To avoid having to deal with complex conflicts, this was done by taking a diff
of our copy of the file with the point just before
c25278db8e4c21772a0cd81f7873e767cbc6d219, and reapplying the non-reverting
parts of our diff on top of upstream's current version.

Confirmed with generate_build_files.py that this makes no changes *except*
d405aa2ff265965c71ce7331cf0e49d634a06924 causes this sort of change throughout
chacha-x86_64.pl's nasm output:

@@ -1179,7 +1179,7 @@ $L$oop8x:
        vpslld  ymm14,ymm0,12
        vpsrld  ymm0,ymm0,20
        vpor    ymm0,ymm14,ymm0
-       vbroadcasti128  ymm14,YMMWORD[r11]
+       vbroadcasti128  ymm14,XMMWORD[r11]
        vpaddd  ymm13,ymm13,ymm5
        vpxor   ymm1,ymm13,ymm1
        vpslld  ymm15,ymm1,12

This appears to be correct. vbroadcasti128 takes a 128-bit-wide second
argument, so it wants XMMWORD, not YMMWORD. I suppose nasm just didn't care.

(Looking at a diff-diff may be a more useful way to review this CL.)

Change-Id: I61be0d225ddf13b5f05d1369ddda84b2f322ef9d
Reviewed-on: https://boringssl-review.googlesource.com/8392
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2016-06-22 19:54:14 +00:00
David Benjamin
8d5717b019 perlasm/x86_64-xlate.pl: handle binary constants early.
Not all assemblers of "gas" flavour handle binary constants, e.g.
seasoned MacOS Xcode doesn't, so give them a hand.

(Imported from upstream's ba26fa14556ba49466d51e4d9e6be32afee9c465.)

Change-Id: I35096dc8035e06d2fbef2363b869128da206ff9d
Reviewed-on: https://boringssl-review.googlesource.com/7459
Reviewed-by: Steven Valdez <svaldez@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-17 18:23:40 +00:00
Steven Valdez
aeb69a02b8 Pass pure constants verbatim in perlasm/x86_64-xlate.pl
(Imported from upstream's 10c639a8a56c90bec9e332c7ca76ef552b3952ac)

Change-Id: Ia8203eeae9d274249595a6e352ec2f77a97ca5d5
Reviewed-on: https://boringssl-review.googlesource.com/7227
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-01 17:52:20 +00:00
David Benjamin
030d08513e ymm registers are not suffixed with w.
This imports a fix to x86gas.pl from upstream's
a98c648e40ea5158c8ba29b5a70ccc239d426a20. It's needed to get poly1305-x86.pl
working.

Confirmed that this is a no-op for our current assembly files.

Change-Id: I28de1dbf421b29a06147d1aea3ff3659372a78b3
Reviewed-on: https://boringssl-review.googlesource.com/7210
Reviewed-by: Adam Langley <agl@google.com>
2016-02-23 23:18:53 +00:00
David Benjamin
3ab3e3db6e Mark ARM assembly globals hidden uniformly in arm-xlate.pl.
We'd manually marked some of them hidden, but missed some. Do it in the perlasm
driver instead since we will never expose an asm symbol directly. This reduces
some of our divergence from upstream on these files (and indeed we'd
accidentally lose some .hiddens at one point).

BUG=586141

Change-Id: Ie1bfc6f38ba73d33f5c56a8a40c2bf1668562e7e
Reviewed-on: https://boringssl-review.googlesource.com/7140
Reviewed-by: Adam Langley <agl@google.com>
2016-02-11 17:28:03 +00:00
David Benjamin
b8ba65a73a Fix arm perlasm trailing newline.
Change-Id: I1119fd8d5a0e03832644cb4b8128eed8ed9acb9e
Reviewed-on: https://boringssl-review.googlesource.com/6890
Reviewed-by: Adam Langley <agl@google.com>
2016-01-19 16:35:20 +00:00
David Benjamin
278d34234f Get rid of all compiler version checks in perlasm files.
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment the run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change or if developers do or don't have CC variables set.

Previously, all compiler-version-gated features were turned on in
https://boringssl-review.googlesource.com/6260, but this broke the build. I
also wasn't thorough enough in gathering performance numbers. So, flip them all
to off instead. I'll enable them one-by-one as they're tested.

This should result in no change to generated assembly.

Change-Id: Ib4259b3f97adc4939cb0557c5580e8def120d5bc
Reviewed-on: https://boringssl-review.googlesource.com/6383
Reviewed-by: Adam Langley <agl@google.com>
2015-10-28 19:33:04 +00:00
David Benjamin
75885e29c4 Revert "Get rid of all compiler version checks in perlasm files."
This reverts commit b9c26014de.

The win64 bot seems unhappy. Will sniff at it tomorrow. In
the meantime, get the tree green again.

Change-Id: I058ddb3ec549beee7eabb2f3f72feb0a4a5143b2
Reviewed-on: https://boringssl-review.googlesource.com/6353
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 23:12:39 +00:00
David Benjamin
b9c26014de Get rid of all compiler version checks in perlasm files.
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment the run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change.

Enable all compiler-version-gated features as they should all be runtime-gated
anyway. This should align with what upstream's files would have produced on
modern toolschains. We should assume our assemblers can take whatever we'd like
to throw at them. (If it turns out some can't, we'd rather find out and
probably switch the problematic instructions to explicit byte sequences.)

This actually results in a fairly significant change to the assembly we
generate. I'm guessing upstream's buildsystem sets the CC environment variable,
while ours doesn't and so the version checks were all coming out conservative.

diffstat of generated files:

 linux-x86/crypto/sha/sha1-586.S              | 1176 ++++++++++++
 linux-x86/crypto/sha/sha256-586.S            | 2248 ++++++++++++++++++++++++
 linux-x86_64/crypto/bn/rsaz-avx2.S           | 1644 +++++++++++++++++
 linux-x86_64/crypto/bn/rsaz-x86_64.S         |  638 ++++++
 linux-x86_64/crypto/bn/x86_64-mont.S         |  332 +++
 linux-x86_64/crypto/bn/x86_64-mont5.S        | 1130 ++++++++++++
 linux-x86_64/crypto/modes/aesni-gcm-x86_64.S |  754 ++++++++
 linux-x86_64/crypto/modes/ghash-x86_64.S     |  475 +++++
 linux-x86_64/crypto/sha/sha1-x86_64.S        | 1121 ++++++++++++
 linux-x86_64/crypto/sha/sha256-x86_64.S      | 1062 +++++++++++
 linux-x86_64/crypto/sha/sha512-x86_64.S      | 2241 ++++++++++++++++++++++++
 mac-x86/crypto/sha/sha1-586.S                | 1174 ++++++++++++
 mac-x86/crypto/sha/sha256-586.S              | 2248 ++++++++++++++++++++++++
 mac-x86_64/crypto/bn/rsaz-avx2.S             | 1637 +++++++++++++++++
 mac-x86_64/crypto/bn/rsaz-x86_64.S           |  638 ++++++
 mac-x86_64/crypto/bn/x86_64-mont.S           |  331 +++
 mac-x86_64/crypto/bn/x86_64-mont5.S          | 1130 ++++++++++++
 mac-x86_64/crypto/modes/aesni-gcm-x86_64.S   |  750 ++++++++
 mac-x86_64/crypto/modes/ghash-x86_64.S       |  475 +++++
 mac-x86_64/crypto/sha/sha1-x86_64.S          | 1121 ++++++++++++
 mac-x86_64/crypto/sha/sha256-x86_64.S        | 1062 +++++++++++
 mac-x86_64/crypto/sha/sha512-x86_64.S        | 2241 ++++++++++++++++++++++++
 win-x86/crypto/sha/sha1-586.asm              | 1173 ++++++++++++
 win-x86/crypto/sha/sha256-586.asm            | 2248 ++++++++++++++++++++++++
 win-x86_64/crypto/bn/rsaz-avx2.asm           | 1858 +++++++++++++++++++-
 win-x86_64/crypto/bn/rsaz-x86_64.asm         |  638 ++++++
 win-x86_64/crypto/bn/x86_64-mont.asm         |  352 +++
 win-x86_64/crypto/bn/x86_64-mont5.asm        | 1184 ++++++++++++
 win-x86_64/crypto/modes/aesni-gcm-x86_64.asm |  933 ++++++++++
 win-x86_64/crypto/modes/ghash-x86_64.asm     |  515 +++++
 win-x86_64/crypto/sha/sha1-x86_64.asm        | 1152 ++++++++++++
 win-x86_64/crypto/sha/sha256-x86_64.asm      | 1088 +++++++++++
 win-x86_64/crypto/sha/sha512-x86_64.asm      | 2499 ++++++

SHA* gets faster. RSA and AES-GCM seem to be more of a wash and even slower
sometimes!  This is a little concerning. Though when I repeated the latter two,
it's definitely noisy (RSA in particular), so we may wish to repeat in a more
controlled environment. We could also flip some of these toggles to something
other than the highest setting if it seems some of the variants aren't
desirable. We just shouldn't have them enabled or disabled on accident. This
aligns us closer to upstream though.

$ /tmp/bssl.old speed SHA-
Did 5028000 SHA-1 (16 bytes) operations in 1000048us (5027758.7 ops/sec): 80.4 MB/s
Did 1708000 SHA-1 (256 bytes) operations in 1000257us (1707561.2 ops/sec): 437.1 MB/s
Did 73000 SHA-1 (8192 bytes) operations in 1008406us (72391.5 ops/sec): 593.0 MB/s
Did 3041000 SHA-256 (16 bytes) operations in 1000311us (3040054.5 ops/sec): 48.6 MB/s
Did 779000 SHA-256 (256 bytes) operations in 1000820us (778361.7 ops/sec): 199.3 MB/s
Did 26000 SHA-256 (8192 bytes) operations in 1009875us (25745.8 ops/sec): 210.9 MB/s
Did 1837000 SHA-512 (16 bytes) operations in 1000251us (1836539.0 ops/sec): 29.4 MB/s
Did 803000 SHA-512 (256 bytes) operations in 1000969us (802222.6 ops/sec): 205.4 MB/s
Did 41000 SHA-512 (8192 bytes) operations in 1016768us (40323.8 ops/sec): 330.3 MB/s
$ /tmp/bssl.new speed SHA-
Did 5354000 SHA-1 (16 bytes) operations in 1000104us (5353443.2 ops/sec): 85.7 MB/s
Did 1779000 SHA-1 (256 bytes) operations in 1000121us (1778784.8 ops/sec): 455.4 MB/s
Did 87000 SHA-1 (8192 bytes) operations in 1012641us (85914.0 ops/sec): 703.8 MB/s
Did 3517000 SHA-256 (16 bytes) operations in 1000114us (3516599.1 ops/sec): 56.3 MB/s
Did 935000 SHA-256 (256 bytes) operations in 1000096us (934910.2 ops/sec): 239.3 MB/s
Did 38000 SHA-256 (8192 bytes) operations in 1004476us (37830.7 ops/sec): 309.9 MB/s
Did 2930000 SHA-512 (16 bytes) operations in 1000259us (2929241.3 ops/sec): 46.9 MB/s
Did 1008000 SHA-512 (256 bytes) operations in 1000509us (1007487.2 ops/sec): 257.9 MB/s
Did 45000 SHA-512 (8192 bytes) operations in 1000593us (44973.3 ops/sec): 368.4 MB/s

$ /tmp/bssl.old speed RSA
Did 820 RSA 2048 signing operations in 1017008us (806.3 ops/sec)
Did 27000 RSA 2048 verify operations in 1015400us (26590.5 ops/sec)
Did 1292 RSA 2048 (3 prime, e=3) signing operations in 1008185us (1281.5 ops/sec)
Did 65000 RSA 2048 (3 prime, e=3) verify operations in 1011388us (64268.1 ops/sec)
Did 120 RSA 4096 signing operations in 1061027us (113.1 ops/sec)
Did 8208 RSA 4096 verify operations in 1002717us (8185.8 ops/sec)
$ /tmp/bssl.new speed RSA
Did 760 RSA 2048 signing operations in 1003351us (757.5 ops/sec)
Did 25900 RSA 2048 verify operations in 1028931us (25171.8 ops/sec)
Did 1320 RSA 2048 (3 prime, e=3) signing operations in 1040806us (1268.2 ops/sec)
Did 63000 RSA 2048 (3 prime, e=3) verify operations in 1016042us (62005.3 ops/sec)
Did 104 RSA 4096 signing operations in 1008718us (103.1 ops/sec)
Did 6875 RSA 4096 verify operations in 1093441us (6287.5 ops/sec)

$ /tmp/bssl.old speed GCM
Did 5316000 AES-128-GCM (16 bytes) seal operations in 1000082us (5315564.1 ops/sec): 85.0 MB/s
Did 712000 AES-128-GCM (1350 bytes) seal operations in 1000252us (711820.6 ops/sec): 961.0 MB/s
Did 149000 AES-128-GCM (8192 bytes) seal operations in 1003182us (148527.4 ops/sec): 1216.7 MB/s
Did 5919750 AES-256-GCM (16 bytes) seal operations in 1000016us (5919655.3 ops/sec): 94.7 MB/s
Did 800000 AES-256-GCM (1350 bytes) seal operations in 1000951us (799239.9 ops/sec): 1079.0 MB/s
Did 152000 AES-256-GCM (8192 bytes) seal operations in 1000765us (151883.8 ops/sec): 1244.2 MB/s
$ /tmp/bssl.new speed GCM
Did 5315000 AES-128-GCM (16 bytes) seal operations in 1000125us (5314335.7 ops/sec): 85.0 MB/s
Did 755000 AES-128-GCM (1350 bytes) seal operations in 1000878us (754337.7 ops/sec): 1018.4 MB/s
Did 151000 AES-128-GCM (8192 bytes) seal operations in 1005655us (150150.9 ops/sec): 1230.0 MB/s
Did 5913500 AES-256-GCM (16 bytes) seal operations in 1000041us (5913257.6 ops/sec): 94.6 MB/s
Did 782000 AES-256-GCM (1350 bytes) seal operations in 1001484us (780841.2 ops/sec): 1054.1 MB/s
Did 121000 AES-256-GCM (8192 bytes) seal operations in 1006389us (120231.8 ops/sec): 984.9 MB/s

Change-Id: I0efb32f896c597abc7d7e55c31d038528a5c72a1
Reviewed-on: https://boringssl-review.googlesource.com/6260
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 20:31:30 +00:00
Adam Langley
0dd93002dd Revert section changes for ASM.
This change reverts the following commits:
  72d9cba7cb
  5b61b9ebc5
  3f85e04f40
  2ab24a2d40

Change-Id: I669b83f2269cf96aa71a649a346147b9407a811e
Reviewed-on: https://boringssl-review.googlesource.com/6056
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-09-30 22:09:52 +00:00