Commit Graph

24 Commits

Author SHA1 Message Date
Jesse Selover
b1e6a85443 Change OPENSSL_cpuid_setup to reserve more extended feature space.
Copy of openssl change https://git.openssl.org/gitweb/?p=openssl.git;h=d6ee8f3dc4414cd97bd63b801f8644f0ff8a1f17

OPENSSL_ia32cap: reserve for new extensions.
Change-Id: I96b43c82ba6568bae848449972d3ad9d20f6d063
Reviewed-on: https://boringssl-review.googlesource.com/27564
Reviewed-by: David Benjamin <davidben@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-04-19 20:48:58 +00:00
David Benjamin
9a127b43b8 Add CRYPTO_needs_hwcap2_workaround.
Bug: 203
Change-Id: I50384cce14509ab1ca36e6f0e9f192f9e458b313
Reviewed-on: https://boringssl-review.googlesource.com/20404
Commit-Queue: Steven Valdez <svaldez@google.com>
Reviewed-by: Steven Valdez <svaldez@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-09-18 14:05:46 +00:00
David Benjamin
4512b792ba Run comment conversion script on include/
ssl is all that's left. Will do that once that's at a quiet point.

Change-Id: Ia183aed5671e3b2de333def138d7f2c9296fb517
Reviewed-on: https://boringssl-review.googlesource.com/19564
Commit-Queue: David Benjamin <davidben@google.com>
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-08-18 23:38:51 +00:00
David Benjamin
3b33f3eb2d Set static armcaps based on __ARM_FEATURE_CRYPTO.
Originally we had some confusion around whether the features could be
toggled individually or not. Per the ARM C Language Extensions doc[1],
__ARM_FEATURE_CRYPTO implies the "crypto extension" which encompasses
all of them. The runtime CPUID equivalent can report the features
individually, but it seems no one separates them in practice, for now.
(If they ever do, probably there'll be a new set of #defines.)

[1] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf

Change-Id: I12915dfc308f58fb005286db75e50d8328eeb3ea
Reviewed-on: https://boringssl-review.googlesource.com/16991
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-06-09 00:29:10 +00:00
David Benjamin
91871018a4 Add an OPENSSL_ia32cap_get() function for C code.
OPENSSL_ia32cap_addr avoids any relocations within the module, at the
cost of a runtime TEXTREL, which causes problems in some cases.
(Notably, if someone links us into a binary which uses the GCC "ifunc"
attribute, the loader crashes.)

Fix C references of OPENSSL_ia32cap_addr with a function. This is
analogous to the BSS getters. A follow-up commit will fix perlasm with a
different scheme which avoids calling into a function (clobbering
registers and complicating unwind directives.)

Change-Id: I09d6cda4cec35b693e16b5387611167da8c7a6de
Reviewed-on: https://boringssl-review.googlesource.com/15525
Reviewed-by: Adam Langley <agl@google.com>
2017-04-27 20:34:23 +00:00
Adam Langley
b18cb6a5d0 Make the POWER hardware capability value a global in crypto.c.
(Thanks to Sam Panzer for the patch.)

At least some linkers will drop constructor functions if no symbols from
that translation unit are used elsewhere in the program. On POWER, since
the cached capability value isn't a global in crypto.o (like other
platforms), the constructor function is getting discarded.

The C++11 spec says (3.6.2, paragraph 4):

    It is implementation-defined whether the dynamic initialization of a
    non-local variable with static storage duration is done before the
    first statement of main. If the initialization is deferred to some
    point in time after the first statement of main, it shall occur
    before the first odr-use (3.2) of any function or variable defined
    in the same translation unit as the variable to be initialized.

Compilers appear to interpret that to mean they are allowed to drop
(i.e. indefinitely defer) constructors that occur in translation units
that are never used, so they can avoid initializing some part of a
library if it's dropped on the floor.

This change makes the hardware capability value for POWER a global in
crypto.c, which should prevent the constructor function from being
ignored.

Change-Id: I43ebe492d0ac1491f6f6c2097971a277f923dd3e
Reviewed-on: https://boringssl-review.googlesource.com/14664
Commit-Queue: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-04-04 18:19:19 +00:00
Adam Langley
4467e59bc8 Add PPC64LE assembly for AES-GCM.
This change adds AES and GHASH assembly from upstream, with the aim of
speeding up AES-GCM.

The PPC64LE assembly matches the interface of the ARMv8 assembly so I've
changed the prefix of both sets of asm functions to be the same
("aes_hw_").

Otherwise, the new assmebly files and Perlasm match exactly those from
upstream's c536b6be1a (from their master branch).

Before:
Did 1879000 AES-128-GCM (16 bytes) seal operations in 1000428us (1878196.1 ops/sec): 30.1 MB/s
Did 61000 AES-128-GCM (1350 bytes) seal operations in 1006660us (60596.4 ops/sec): 81.8 MB/s
Did 11000 AES-128-GCM (8192 bytes) seal operations in 1072649us (10255.0 ops/sec): 84.0 MB/s
Did 1665000 AES-256-GCM (16 bytes) seal operations in 1000591us (1664016.6 ops/sec): 26.6 MB/s
Did 52000 AES-256-GCM (1350 bytes) seal operations in 1006971us (51640.0 ops/sec): 69.7 MB/s
Did 8840 AES-256-GCM (8192 bytes) seal operations in 1013294us (8724.0 ops/sec): 71.5 MB/s

After:
Did 4994000 AES-128-GCM (16 bytes) seal operations in 1000017us (4993915.1 ops/sec): 79.9 MB/s
Did 1389000 AES-128-GCM (1350 bytes) seal operations in 1000073us (1388898.6 ops/sec): 1875.0 MB/s
Did 319000 AES-128-GCM (8192 bytes) seal operations in 1000101us (318967.8 ops/sec): 2613.0 MB/s
Did 4668000 AES-256-GCM (16 bytes) seal operations in 1000149us (4667304.6 ops/sec): 74.7 MB/s
Did 1202000 AES-256-GCM (1350 bytes) seal operations in 1000646us (1201224.0 ops/sec): 1621.7 MB/s
Did 269000 AES-256-GCM (8192 bytes) seal operations in 1002804us (268247.8 ops/sec): 2197.5 MB/s

Change-Id: Id848562bd4e1aa79a4683012501dfa5e6c08cfcc
Reviewed-on: https://boringssl-review.googlesource.com/11262
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2016-09-27 18:43:20 +00:00
David Benjamin
0a63b96535 Make CRYPTO_is_NEON_capable aware of the buggy CPU.
If we're to allow the buggy CPU workaround to fire when __ARM_NEON__ is set,
CRYPTO_is_NEON_capable also needs to be aware of it. Also add an API to export
this value out of BoringSSL, so we can get some metrics on how prevalent this
chip is.

BUG=chromium:606629

Change-Id: I97d65a47a6130689098b32ce45a8c57c468aa405
Reviewed-on: https://boringssl-review.googlesource.com/7796
Reviewed-by: Adam Langley <agl@google.com>
2016-04-28 16:42:21 +00:00
David Benjamin
054e151b16 Rewrite ARM feature detection.
This removes the thread-unsafe SIGILL-based detection and the
multi-consumer-hostile CRYPTO_set_NEON_capable API. (Changing
OPENSSL_armcap_P after initialization is likely to cause problems.)

The right way to detect ARM features on Linux is getauxval. On aarch64,
we should be able to rely on this, so use it straight. Split this out
into its own file. The #ifdefs in the old cpu-arm.c meant it shared all
but no code with its arm counterpart anyway.

Unfortunately, various versions of Android have different missing APIs, so, on
arm, we need a series of workarounds. Previously, we used a SIGILL fallback
based on OpenSSL's logic, but this is inherently not thread-safe. (SIGILL also
does not tell us if the OS knows how to save and restore NEON state.) Instead,
base the behavior on Android NDK's cpu-features library, what Chromium
currently uses with CRYPTO_set_NEON_capable:

- Android before API level 20 does not provide getauxval. Where missing,
  we can read from /proc/self/auxv.

- On some versions of Android, /proc/self/auxv is also not readable, so
  use /proc/cpuinfo's Features line.

- Linux only advertises optional features in /proc/cpuinfo. ARMv8 makes NEON
  mandatory, so /proc/cpuinfo can't be used without additional effort.

Finally, we must blacklist a particular chip because the NEON unit is broken
(https://crbug.com/341598).

Unfortunately, this means CRYPTO_library_init now depends on /proc being
available, which will require some care with Chromium's sandbox. The
simplest solution is to just call CRYPTO_library_init before entering
the sandbox.

It's worth noting that Chromium's current EnsureOpenSSLInit function already
depends on /proc/cpuinfo to detect the broken CPU, by way of base::CPU.
android_getCpuFeatures also interally depends on it. We were already relying on
both of those being stateful and primed prior to entering the sandbox.

BUG=chromium:589200

Change-Id: Ic5d1c341aab5a614eb129d8aa5ada2809edd6af8
Reviewed-on: https://boringssl-review.googlesource.com/7506
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-26 04:54:44 +00:00
David Benjamin
85003903fc Remove CRYPTO_set_NEON_functional.
This depends on https://codereview.chromium.org/1730823002/. The bit was only
ever targetted to one (rather old) CPU. Disable NEON on it uniformly, so we
don't have to worry about whether any new NEON code breaks it.

BUG=589200

Change-Id: Icc7d17d634735aca5425fe0a765ec2fba3329326
Reviewed-on: https://boringssl-review.googlesource.com/7211
Reviewed-by: Adam Langley <agl@google.com>
2016-02-23 23:19:46 +00:00
Adam Langley
9e65d487b8 Allow |CRYPTO_is_NEON_capable| to be known at compile time, if possible.
If -mfpu=neon is passed then we don't need to worry about checking for
NEON support at run time. This change allows |CRYPTO_is_NEON_capable| to
statically return 1 in this case. This then allows the compiler to
discard generic code in several cases.

Change-Id: I3b229740ea3d5cb0a304f365c400a0996d0c66ef
Reviewed-on: https://boringssl-review.googlesource.com/6523
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-11-19 00:15:11 +00:00
David Benjamin
f8e9dcaeea iOS builds use the static ARM CPU configuration.
The other codepath is Linux-specific. This should get tidied up a bit but, in
the meantime, fix the Chromium iOS (NO_ASM) build. Even when the assembly gets
working, it seems iOS prefers you make fat binaries rather than detect features
at runtime, so this is what we want anyway.

BUG=548539

Change-Id: If19b2e380a96918b07bacc300a3a27b885697b99
Reviewed-on: https://boringssl-review.googlesource.com/6380
Reviewed-by: Adam Langley <agl@google.com>
2015-10-28 17:25:25 +00:00
Adam Langley
6a7cfbe06a Allow ARM capabilities to be set at compile time.
Some ARM environments don't support |getauxval| or signals and need to
configure the capabilities of the chip at compile time. This change adds
defines that allow them to do so.

Change-Id: I4e6987f69dd13444029bc7ac7ed4dbf8fb1faa76
Reviewed-on: https://boringssl-review.googlesource.com/6280
Reviewed-by: Adam Langley <agl@google.com>
2015-10-20 22:40:15 +00:00
David Benjamin
7315251d4e Replace cpuid assembly with C code.
Rather, take a leaf out of Chromium's book and use MSVC's __cpuid and
_xgetbv built-in, with an inline assembly emulated version for other
compilers.

This preserves the behavior of the original assembly with the following
differences:

- CPUs without cpuid aren't support. Chromium's base/cpu.cc doesn't
  check, and SSE2 support is part of our baseline; the perlasm code
  is always built with OPENSSL_IA32_SSE2.

- The clear_xmm block in cpu-x86-asm.pl is removed. This was used to
  clear some XMM-using features if OSXSAVE was set but XCR0 reports the
  OS doesn't use XSAVE to store SSE state. This wasn't present in the
  x86_64 and seems wrong. Section 13.5.2 of the Intel manual, volume 1,
  explicitly says SSE may still be used in this case; the OS may save
  that state in FXSAVE instead. A side discussion on upstream's RT#2633
  agrees.

- The old code ran some AMD CPUs through the "intel" codepath and some
  went straight to "generic" after duplicating some, but not all, logic.
  The AMD copy didn't clear some reserved bits and didn't query CPUID 7
  for AVX2 support. This is moot since AMD CPUs today don't support
  AVX2, but it seems they're expected to in the future?

- Setting bit 10 is dropped. This doesn't appear to be queried anywhere,
  was 32-bit only, and seems a remnant of upstream's
  14e21f863a3e3278bb8660ea9844e92e52e1f2f7.

Change-Id: I0548877c97e997f7beb25e15f3fea71c68a951d2
Reviewed-on: https://boringssl-review.googlesource.com/5434
Reviewed-by: Adam Langley <agl@google.com>
2015-07-20 18:59:44 +00:00
David Benjamin
c3717f4a00 Extra documentation.
Some other reserved bits are repurposed. Also explicitly mention that
bit 20 is zero (formerly RC4_CHAR), so it's not accidentally repurposed
later.

Change-Id: Idc4b32efe089ae7b7295472c4488f75258b7f962
Reviewed-on: https://boringssl-review.googlesource.com/5432
Reviewed-by: Adam Langley <agl@google.com>
2015-07-16 21:05:24 +00:00
Adam Langley
04c36b5062 Never set RC4_CHAR.
RC4_CHAR is a bit in the x86(-64) CPUID information that switches the
RC4 asm code from using an array of 256 uint32_t's to 256 uint8_t's. It
was originally written for the P4, where the uint8_t style was faster.

(On modern chips, setting RC4_CHAR took RC4-MD5 from 458 to 304 MB/s.
Although I wonder whether, on a server with many connections, using less
cache wouldn't be better.)

However, I'm not too worried about a slowdown of RC4 on P4 systems these
days (the last new P4 chip was released nine years ago) and I want the
code to be simplier.

Also, RC4_CHAR was set when the CPUID family was 15, but Intel actually
lists 15 as a special code meaning "also check the extended family
bits", which the asm didn't do.

The RC4_CHAR support remains in the RC4 asm code to avoid drift with
upstream.

Change-Id: If3febc925a83a76f453b9e9f8de5ee43759927c6
Reviewed-on: https://boringssl-review.googlesource.com/3550
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-02-20 23:59:59 +00:00
Adam Langley
3e6526575a aarch64 support.
This is an initial cut at aarch64 support. I have only qemu to test it
however—hopefully hardware will be coming soon.

This also affects 32-bit ARM in that aarch64 chips can run 32-bit code
and we would like to be able to take advantage of the crypto operations
even in 32-bit mode. AES and GHASH should Just Work in this case: the
-armx.pl files can be built for either 32- or 64-bit mode based on the
flavour argument given to the Perl script.

SHA-1 and SHA-256 don't work like this however because they've never
support for multiple implementations, thus BoringSSL built for 32-bit
won't use the SHA instructions on an aarch64 chip.

No dedicated ChaCha20 or Poly1305 support yet.

Change-Id: Ib275bc4894a365c8ec7c42f4e91af6dba3bd686c
Reviewed-on: https://boringssl-review.googlesource.com/2801
Reviewed-by: Adam Langley <agl@google.com>
2015-01-14 23:38:11 +00:00
David Benjamin
5b082e880d Various documentation fixes.
Add some missing headers and ensure each header has a short description. doc.go
gets confused at declarations that break before the first (, so avoid doing
that. Also skip a/an/deprecated: in markupFirstWord and process pipe words in
the table of contents.

Change-Id: Ia08ec5ae8e496dd617e377e154eeea74f4abf435
Reviewed-on: https://boringssl-review.googlesource.com/2839
Reviewed-by: Adam Langley <agl@google.com>
2015-01-14 21:50:50 +00:00
David Benjamin
c44d2f4cb8 Convert all zero-argument functions to '(void)'
Otherwise, in C, it becomes a K&R function declaration which doesn't actually
type-check the number of arguments.

Change-Id: I0731a9fefca46fb1c266bfb1c33d464cf451a22e
Reviewed-on: https://boringssl-review.googlesource.com/1582
Reviewed-by: Adam Langley <agl@google.com>
2014-08-21 01:06:07 +00:00
David Benjamin
5213df4e9e Prefer AES-GCM when hardware support is available.
BUG=396787

Change-Id: I72ddb0ec3c71dbc70054403163930cbbde4b6009
Reviewed-on: https://boringssl-review.googlesource.com/1581
Reviewed-by: Adam Langley <agl@google.com>
2014-08-20 20:53:31 +00:00
Adam Langley
31ebde9e5e Add a control to disable the Poly1305 NEON code.
Some phones have a buggy NEON unit and the Poly1305 NEON code fails on
them, even though other NEON code appears to work fine.

This change:

1) Fixes a bug where NEON was assumed even when the code wasn't compiled
   in NEON mode.

2) Adds a second NEON control bit that can be disabled in order to run
   NEON code, but not the Poly1305 NEON code.

https://code.google.com/p/chromium/issues/detail?id=341598

Change-Id: Icb121bf8dba47c7a46c7667f676ff7a4bc973625
Reviewed-on: https://boringssl-review.googlesource.com/1351
Reviewed-by: Adam Langley <agl@google.com>
2014-07-31 22:42:15 +00:00
Adam Langley
eb7d2ed1fe Add visibility rules.
This change marks public symbols as dynamically exported. This means
that it becomes viable to build a shared library of libcrypto and libssl
with -fvisibility=hidden.

On Windows, one not only needs to mark functions for export in a
component, but also for import when using them from a different
component. Because of this we have to build with
|BORINGSSL_IMPLEMENTATION| defined when building the code. Other
components, when including our headers, won't have that defined and then
the |OPENSSL_EXPORT| tag becomes an import tag instead. See the #defines
in base.h

In the asm code, symbols are now hidden by default and those that need
to be exported are wrapped by a C function.

In order to support Chromium, a couple of libssl functions were moved to
ssl.h from ssl_locl.h: ssl_get_new_session and ssl_update_cache.

Change-Id: Ib4b76e2f1983ee066e7806c24721e8626d08a261
Reviewed-on: https://boringssl-review.googlesource.com/1350
Reviewed-by: Adam Langley <agl@google.com>
2014-07-31 22:03:11 +00:00
Adam Langley
4c921e1bbc Move public headers to include/openssl/
Previously, public headers lived next to the respective code and there
were symlinks from include/openssl to them.

This doesn't work on Windows.

This change moves the headers to live in include/openssl. In cases where
some symlinks pointed to the same header, I've added a file that just
includes the intended target. These cases are all for backwards-compat.

Change-Id: I6e285b74caf621c644b5168a4877db226b07fd92
Reviewed-on: https://boringssl-review.googlesource.com/1180
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2014-07-14 22:42:18 +00:00
Adam Langley
95c29f3cd1 Inital import.
Initial fork from f2d678e6e89b6508147086610e985d4e8416e867 (1.0.2 beta).

(This change contains substantial changes from the original and
effectively starts a new history.)
2014-06-20 13:17:32 -07:00