Commit Graph

153 Commits

Author SHA1 Message Date
Brian Smith
df1201e6ee Remove unnecessary |BN_CTX_start|/|BN_CTX_end| in |BN_mod_exp_mont_consttime|.
Since the function doesn't call |BN_CTX_get|, it doesn't need to call
|BN_CTX_start|/|BN_CTX_end|.

Change-Id: I6cb954d3fee2959bdbc81b9b97abc52bb6f7704c
Reviewed-on: https://boringssl-review.googlesource.com/7469
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-20 17:16:27 +00:00
Piotr Sikora
9bb8ba6ba1 Make local functions static.
Partially fixes build with -Wmissing-prototypes -Wmissing-declarations.

Change-Id: I6048f5b7ef31560399b25ed9880156bc7d8abac2
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Reviewed-on: https://boringssl-review.googlesource.com/7511
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-20 16:37:58 +00:00
Brian Smith
3f1904bee1 Set |bn->neg| to zero in |bn_set_words|.
If the values of any of the coordinates in the output point |r| were
negative during nistz256 multiplication, then the calls to
|bn_set_word| would result in the wrong coordinates being returned
(the negatives of the correct coordinates would be returned instead).
Fix that.

Change-Id: I6048e62f76dca18f625650d11ef5a051c9e672a4
Reviewed-on: https://boringssl-review.googlesource.com/7442
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-11 19:21:11 +00:00
Brian Smith
d279a21d8c Avoid potential uninitialized memory read in crypto/ec/p256-x86_64.c.
If the function returns early due to an error, then the coordinates of the
result will have their |top| value set to a value beyond what has actually
been been written. Fix that, and make it easier to avoid such issues in the
future by refactoring the code.

As a bonus, avoid a false positive MSVC 64-bit opt build "potentially
uninitialized value used" warning.

Change-Id: I8c48deb63163a27f739c8797962414f8ca2588cd
Reviewed-on: https://boringssl-review.googlesource.com/6579
Reviewed-by: David Benjamin <davidben@google.com>
2016-03-09 19:04:36 +00:00
Adam Langley
df2a5562f3 bn/asm/x86_64-mont5.pl: unify gather procedure in hardly used path and reorganize/harmonize post-conditions.
(Imported from upstream's 515f3be47a0b58eec808cf365bc5e8ef6917266b)

Additional hardening following on from CVE-2016-0702.

Change-Id: I19a6739b401887a42eb335fe5838379dc8d04100
Reviewed-on: https://boringssl-review.googlesource.com/7245
Reviewed-by: Adam Langley <agl@google.com>
2016-03-01 18:04:20 +00:00
Adam Langley
b360eaf001 crypto/bn/x86_64-mont5.pl: constant-time gather procedure.
(Imported from upstream's 25d14c6c29b53907bf614b9964d43cd98401a7fc.)

At the same time remove miniscule bias in final subtraction. Performance
penalty varies from platform to platform, and even with key length. For
rsa2048 sign it was observed to be 4% for Sandy Bridge and 7% on
Broadwell.

(This is part of the fix for CVE-2016-0702.)

Change-Id: I43a13d592c4a589d04c17c33c0ca40c2d7375522
Reviewed-on: https://boringssl-review.googlesource.com/7244
Reviewed-by: Adam Langley <agl@google.com>
2016-03-01 18:04:15 +00:00
Adam Langley
1168fc72fc bn/asm/rsaz-avx2.pl: constant-time gather procedure.
(Imported from upstream's 08ea966c01a39e38ef89e8920d53085e4807a43a)

Performance penalty is 2%.

(This is part of the fix for CVE-2016-0702.)

Change-Id: Id3b6262c5d3201dd64b93bdd34601a51794a9275
Reviewed-on: https://boringssl-review.googlesource.com/7243
Reviewed-by: Adam Langley <agl@google.com>
2016-03-01 18:04:09 +00:00
Adam Langley
842a06c2b9 bn/asm/rsax-x86_64.pl: constant-time gather procedure.
(Imported from upstream's ef98503eeef5c108018081ace902d28e609f7772.)

Performance penalty is 2% on Linux and 5% on Windows.

(This is part of the fix for CVE-2016-0702.)

Change-Id: If82f95131c93168282a46ac5a35e2b007cc2bd67
Reviewed-on: https://boringssl-review.googlesource.com/7242
Reviewed-by: Adam Langley <agl@google.com>
2016-03-01 18:03:16 +00:00
Adam Langley
82bdaa89f0 Make copy_from_prebuf constant time.
(Imported from upstream's 708dc2f1291e104fe4eef810bb8ffc1fae5b19c1.)

Performance penalty varies from platform to platform, and even key
length. For rsa2048 sign it was observed to reach almost 10%.

This is part of the fix for CVE-2016-0702.

Change-Id: Ie0860bf3e531196f03102db1bc48eeaf30ab1d58
Reviewed-on: https://boringssl-review.googlesource.com/7241
Reviewed-by: Adam Langley <agl@google.com>
2016-03-01 18:03:09 +00:00
Steven Valdez
d7305d50e4 Add missing initialization in bn/exponentiation
(Imported from upstream's 04f2a0b50d219aafcef2fa718d91462b587aa23d)

Change-Id: Ie840edeb1fc9d5a4273f137467e3ef16528c9668
Reviewed-on: https://boringssl-review.googlesource.com/7234
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-29 21:54:15 +00:00
Brian Smith
cd8d1761df Move |bn_div_words| to crypto/bn/div.c and make it static.
It is only used by |bn_div_rem_words|.

Change-Id: I57627091d8db5890d7fea34d8560897717008646
Reviewed-on: https://boringssl-review.googlesource.com/7128
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-25 16:16:14 +00:00
Brian Smith
d1425f69df Simplify division-with-remainder calculations in crypto/bn/div.c.
Create a |bn_div_rem_words| that is used for double-word/single-word
divisions and division-with-remainder. Remove all implementations of
|bn_div_words| except for the implementation needed for 64-bit MSVC.
This allows more code to be shared across platforms and also removes
an instance of the dangerous pattern wherein the |div_asm| macro
modified a variable that wasn't passed as a parameter.

Also, document the limitations of the compiler-generated code for the
non-asm code paths more fully. Compilers indeed have not improved in
this respect.

Change-Id: I5a36a2edd7465de406d47d72dcd6bf3e63e5c232
Reviewed-on: https://boringssl-review.googlesource.com/7127
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-25 16:13:22 +00:00
David Benjamin
0182ecd346 Consistently use named constants in ARM assembly files.
Most of the OPENSSL_armcap_P accesses in assembly use named constants from
arm_arch.h, but some don't. Consistently use the constants. The dispatch really
should be in C, but in the meantime, make it easier to tell what's going on.

I'll send this patch upstream so we won't be carrying a diff here.

Change-Id: I63c68d2351ea5ce11005813314988e32b6459526
Reviewed-on: https://boringssl-review.googlesource.com/7203
Reviewed-by: Adam Langley <agl@google.com>
2016-02-23 17:18:18 +00:00
Brian Smith
5ba06897be Don't cast |OPENSSL_malloc|/|OPENSSL_realloc| result.
C has implicit conversion of |void *| to other pointer types so these
casts are unnecessary. Clean them up to make the code easier to read
and to make it easier to find dangerous casts.

Change-Id: I26988a672e8ed4d69c75cfbb284413999b475464
Reviewed-on: https://boringssl-review.googlesource.com/7102
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-11 22:07:56 +00:00
David Benjamin
3ab3e3db6e Mark ARM assembly globals hidden uniformly in arm-xlate.pl.
We'd manually marked some of them hidden, but missed some. Do it in the perlasm
driver instead since we will never expose an asm symbol directly. This reduces
some of our divergence from upstream on these files (and indeed we'd
accidentally lose some .hiddens at one point).

BUG=586141

Change-Id: Ie1bfc6f38ba73d33f5c56a8a40c2bf1668562e7e
Reviewed-on: https://boringssl-review.googlesource.com/7140
Reviewed-by: Adam Langley <agl@google.com>
2016-02-11 17:28:03 +00:00
Brian Smith
a051bdd6cd Remove dead non-|BN_ULLONG|, non-64-bit-MSVC code in crypto/bn.
It is always the case that either |BN_ULLONG| is defined or
|BN_UMULT_LOHI| is defined because |BN_ULLONG| is defined everywhere
except 64-bit MSVC, and BN_UMULT_LOHI is defined for 64-bit MSVC.

Change-Id: I85e5d621458562501af1af65d587c0b8d937ba3b
Reviewed-on: https://boringssl-review.googlesource.com/7044
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-09 16:21:41 +00:00
Brian Smith
767e1210e0 Remove unused Simics code in crypto/bn/asm/x86_64-gcc.c.
Change-Id: If9c5031855c0acfafb73caba169e146f0e16f706
Reviewed-on: https://boringssl-review.googlesource.com/7093
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-08 23:41:47 +00:00
Brian Smith
aadf1ee77f Minimize the scope of the |BN_*_SIZE_*| constants.
mul.c is the only file that uses these values.

Change-Id: I50a685cbff0f26357229e742f42e014434e9cebe
Reviewed-on: https://boringssl-review.googlesource.com/7061
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-08 18:28:31 +00:00
Brian Smith
8c5ea1338a Remove unused |bn_mul_low_normal| and related #defines.
Change-Id: I2e3745f5dd5132a48dcbf472bca3638324dfc7a3
Reviewed-on: https://boringssl-review.googlesource.com/7060
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-08 18:25:23 +00:00
Brian Smith
f98be21fad Remove dead platform-specific code in |BN_div|.
It is always the case that |BN_ULLONG| is defined or we're building for
64-bit MSVC. Lots of code is trying to handle impossible cases where
neither of those is true.

Change-Id: Ie337adda1dfb453843c6e0999807dfa1afb1ed89
Reviewed-on: https://boringssl-review.googlesource.com/7043
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-05 23:12:11 +00:00
Brian Smith
926f2194df Enable MSVC 128-bit multiplication regardless of OPENSSL_NO_ASM.
This allows much code to be subsequently simplified and removed.

Change-Id: I0ac256957c6eae9f35a70508bd454cb44f3f8653
Reviewed-on: https://boringssl-review.googlesource.com/7042
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-05 00:30:34 +00:00
Adam Langley
dd31c4eba2 Update some comments in bn_test.c in light of acb24518.
Change acb24518 renamed some functions, but there were some dangling
references in bn_test.c. Thanks to Brian Smith for noticing.

This change has no semantic effect.

Change-Id: Id149505090566583834be3abce2cee28b8c248e2
Reviewed-on: https://boringssl-review.googlesource.com/7040
Reviewed-by: David Benjamin <davidben@google.com>
2016-02-02 18:22:19 +00:00
David Benjamin
acb2451807 Rename the BIGNUM ASN.1 functions.
There's many ways to serialize a BIGNUM, so not including asn1 in the name is
confusing (and collides with BN_bn2cbb_padded). Since BN_asn12bn looks
ridiculous, match the parse/marshal naming scheme of other modules instead.

Change-Id: I53d22ae0537a98e223ed943e943c48cb0743cf51
Reviewed-on: https://boringssl-review.googlesource.com/6822
Reviewed-by: Adam Langley <alangley@gmail.com>
2016-01-27 22:37:44 +00:00
Brian Smith
24e428899b Define int128_t and uint128_t in one place.
Change-Id: Ia93130aadf319eaba1b6f2ec2896a4c50d9e8ede
Reviewed-on: https://boringssl-review.googlesource.com/6975
Reviewed-by: David Benjamin <davidben@google.com>
2016-01-27 22:15:04 +00:00
Brian Smith
7cae9f5b6c Use |alignas| for alignment.
MSVC doesn't have stdalign.h and so doesn't support |alignas| in C
code. Define |alignas(x)| as a synonym for |__decltype(align(x))|
instead for it.

This also fixes -Wcast-qual warnings in rsaz_exp.c.

Change-Id: Ifce9031724cb93f5a4aa1f567e7af61b272df9d5
Reviewed-on: https://boringssl-review.googlesource.com/6924
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
2016-01-25 23:05:04 +00:00
Brian Smith
d3a4e280db Fix trivial -Wcast-qual violations.
Fix casts from const to non-const where dropping the constness is
completely unnecessary. The changes to chacha_vec.c don't result in any
changes to chacha_vec_arm.S.

Change-Id: I2f10081fd0e73ff5db746347c5971f263a5221a6
Reviewed-on: https://boringssl-review.googlesource.com/6923
Reviewed-by: David Benjamin <davidben@google.com>
2016-01-21 21:06:02 +00:00
David Benjamin
6544426d82 Fix a ** 0 mod 1 = 0 for real this time.
Commit 2b0180c37fa6ffc48ee40caa831ca398b828e680 attempted to do this but
only hit one of many BN_mod_exp codepaths. Fix remaining variants and
add a test for each method.

Thanks to Hanno Boeck for reporting this issue.

(Imported from upstream's 44e4f5b04b43054571e278381662cebd3f3555e6.)

Change-Id: Ic691b354101c3e9c3565300836fb6d55c6f253ba
Reviewed-on: https://boringssl-review.googlesource.com/6820
Reviewed-by: Adam Langley <agl@google.com>
2015-12-22 23:30:22 +00:00
David Benjamin
ae0eaaa397 Convert ssl3_send_client_key_exchange to CBB.
This relieves some complexity budget for adding Curve25519 to this
code.

This also adds a BN_bn2cbb_padded helper function since this seems to be a
fairly common need.

Change-Id: Ied0066fdaec9d02659abd6eb1a13f33502c9e198
Reviewed-on: https://boringssl-review.googlesource.com/6767
Reviewed-by: Adam Langley <agl@google.com>
2015-12-22 17:00:56 +00:00
David Benjamin
502a843dee Switch unrolled loop in BN_usub with memcpy.
See also upstream's 06cf881a3a10d5af3c1255c08cfd0c6ddb5f1cc3,
9f040d6decca7930e978784c917f731e5c45e8f0, and
9f6795e7d2d1e35668ad70ba0afc480062be4e2e.

Change-Id: I27d90e382867a5fe988d152b31f8494e001a6a9f
Reviewed-on: https://boringssl-review.googlesource.com/6628
Reviewed-by: Adam Langley <agl@google.com>
2015-12-16 17:38:48 +00:00
David Benjamin
6d9e5a7448 Re-apply 75b833cc81
I messed up and missed that we were carrying a diff on x86_64-mont5.pl. This
was accidentally dropped in https://boringssl-review.googlesource.com/6616.

To confirm the merge is good now, check out at this revision and run:

  git diff e701f16bd69b6f251ed537e40364c281e85a63b2^ crypto/bn/asm/x86_64-mont5.pl > /tmp/A

Then in OpenSSL's repository:

  git diff d73cc256c8e256c32ed959456101b73ba9842f72^ d73cc256c8e256c32ed959456101b73ba9842f72 crypto/bn/asm/x86_64-mont5.pl  > /tmp/B

And confirm the diffs vary in only metadata:

  diff -u /tmp/A /tmp/B

--- /tmp/A	2015-12-03 11:53:23.127034998 -0500
+++ /tmp/B	2015-12-03 11:53:53.099314287 -0500
@@ -1,8 +1,8 @@
 diff --git a/crypto/bn/asm/x86_64-mont5.pl b/crypto/bn/asm/x86_64-mont5.pl
-index 38def07..3c5a8fc 100644
+index 388e3c6..64e668f 100755
 --- a/crypto/bn/asm/x86_64-mont5.pl
 +++ b/crypto/bn/asm/x86_64-mont5.pl
-@@ -1770,6 +1770,15 @@ sqr8x_reduction:
+@@ -1784,6 +1784,15 @@ sqr8x_reduction:
  .align	32
  .L8x_tail_done:
  	add	(%rdx),%r8		# can this overflow?
@@ -18,7 +18,7 @@
  	xor	%rax,%rax

  	neg	$carry
-@@ -3116,6 +3125,15 @@ sqrx8x_reduction:
+@@ -3130,6 +3139,15 @@ sqrx8x_reduction:
  .align	32
  .Lsqrx8x_tail_done:
  	add	24+8(%rsp),%r8		# can this overflow?
@@ -34,7 +34,7 @@
  	mov	$carry,%rax		# xor	%rax,%rax

  	sub	16+8(%rsp),$carry	# mov 16(%rsp),%cf
-@@ -3159,13 +3177,11 @@ my ($rptr,$nptr)=("%rdx","%rbp");
+@@ -3173,13 +3191,11 @@ my ($rptr,$nptr)=("%rdx","%rbp");
  my @ri=map("%r$_",(10..13));
  my @ni=map("%r$_",(14..15));
  $code.=<<___;

Change-Id: I3fb5253783ed82e4831f5bffde75273bd9609c23
Reviewed-on: https://boringssl-review.googlesource.com/6618
Reviewed-by: Adam Langley <agl@google.com>
2015-12-03 17:25:12 +00:00
David Benjamin
e701f16bd6 bn/asm/x86_64-mont5.pl: fix carry propagating bug (CVE-2015-3193).
(Imported from upstream's d73cc256c8e256c32ed959456101b73ba9842f72.)

Change-Id: I673301fee57f0ab5bef24553caf8b2aac67fb3a9
Reviewed-on: https://boringssl-review.googlesource.com/6616
Reviewed-by: Adam Langley <agl@google.com>
2015-12-03 16:44:35 +00:00
David Benjamin
81edc9beb6 Do away with BN_LLONG in favor of BN_ULLONG.
BN_LLONG is only ever used in #ifdefs. The actual type is BN_ULLONG. Switch the
ifdefs to check on BN_ULLONG and remove BN_LLONG. Also fix signedness of all
the constants (potentially avoiding undefined behavior in some operations).

Change-Id: I3e7739bbe14c50ea7db04fc507a034a8cb315a5f
Reviewed-on: https://boringssl-review.googlesource.com/6518
Reviewed-by: Adam Langley <agl@google.com>
2015-11-20 19:59:07 +00:00
Brian Smith
bf762186c6 Remove the |ri| field of |BN_MONT_CTX|.
The |ri| field was only used in |BN_MONT_CTX_set|, so make it a local
variable of that function.

Change-Id: Id8c3d44ac2e30e3961311a7b1a6731fe2c33a0eb
Reviewed-on: https://boringssl-review.googlesource.com/6526
Reviewed-by: Adam Langley <agl@google.com>
2015-11-19 01:40:13 +00:00
Brian Smith
596ab10b0f s/BN_BITS/BN_BITS2/ in |BN_mod_inverse_ex|; remove |BN_BITS| & |BN_MASK|.
The comment in |BN_mod_inverse_ex| makes it clear that |BN_BITS2| was
intended. Besides fixing the code to match the comment, remove
the now-unused |BN_BITS| and the already-unused |BN_MASK| to prevent
future confusion of this sort.

On MSVC builds there seems to be very little difference in performance
between the two code paths according to |bssl speed|.

Change-Id: I765b7b3d464e2057b1d7952af25b6deb2724976a
Reviewed-on: https://boringssl-review.googlesource.com/6525
Reviewed-by: Adam Langley <agl@google.com>
2015-11-19 01:39:32 +00:00
Brian Smith
7af36e1e38 Share common definitions of |TOBN| and |BIGNUM_STATIC|.
Previously, both crypto/dh and crypto/ec defined |TOBN| macros that did
the same thing, but which took their arguments in the opposite order.
This change makes the code consistently use the same macro. It also
makes |STATIC_BIGNUM| available for internal use outside of crypto/bn.

Change-Id: Ide57f6a5b74ea95b3585724c7e1a630c82a864d9
Reviewed-on: https://boringssl-review.googlesource.com/6528
Reviewed-by: Adam Langley <agl@google.com>
2015-11-19 01:38:52 +00:00
Piotr Sikora
9361243065 Don't include <alloca.h>, it's no longer needed.
Relevant code was removed in 5d5e39f5d2.

Change-Id: I198844064030c04f88e5541f2bbaa29ae13d14bb
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Reviewed-on: https://boringssl-review.googlesource.com/6521
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-11-17 19:21:40 +00:00
Brian Smith
8bde5d2e51 Remove the unused |Ni| member of |BN_MONT_CTX|.
Change-Id: I0a542c48c7adae28f05778d6c34c9b6836fc3449
Reviewed-on: https://boringssl-review.googlesource.com/6480
Reviewed-by: Adam Langley <agl@google.com>
2015-11-12 20:04:43 +00:00
David Benjamin
ef14b2d86e Remove stl_compat.h.
Chromium's toolchains may now assume C++11 library support, so we may freely
use C++11 features. (Chromium's still in the process of deciding what to allow,
but we use Google's style guide directly, toolchain limitations aside.)

Change-Id: I1c7feb92b7f5f51d9091a4c686649fb574ac138d
Reviewed-on: https://boringssl-review.googlesource.com/6465
Reviewed-by: Adam Langley <agl@google.com>
2015-11-11 22:19:36 +00:00
Adam Langley
4ab254017c Add AArch64 Montgomery assembly.
The file armv8-mont.pl is taken from upstream. The speed ups are fairly
modest (~30%) but seem worthwhile.

Before:

Did 231 RSA 2048 signing operations in 1008671us (229.0 ops/sec)
Did 11208 RSA 2048 verify operations in 1036997us (10808.1 ops/sec)
Did 342 RSA 2048 (3 prime, e=3) signing operations in 1021545us (334.8 ops/sec)
Did 32000 RSA 2048 (3 prime, e=3) verify operations in 1016162us (31491.0 ops/sec)
Did 45 RSA 4096 signing operations in 1039805us (43.3 ops/sec)
Did 3608 RSA 4096 verify operations in 1060283us (3402.9 ops/sec)

After:

Did 300 RSA 2048 signing operations in 1009772us (297.1 ops/sec)
Did 12740 RSA 2048 verify operations in 1075413us (11846.6 ops/sec)
Did 408 RSA 2048 (3 prime, e=3) signing operations in 1016139us (401.5 ops/sec)
Did 33000 RSA 2048 (3 prime, e=3) verify operations in 1017510us (32432.1 ops/sec)
Did 52 RSA 4096 signing operations in 1067678us (48.7 ops/sec)
Did 3408 RSA 4096 verify operations in 1062863us (3206.4 ops/sec)

Change-Id: Ife74fac784067fce3668b5c87f51d481732ff855
Reviewed-on: https://boringssl-review.googlesource.com/6444
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-11-10 19:13:46 +00:00
Adam Langley
ad38dc7452 Enable Montgomery optimisations on ARM.
These were accidently disabled for ARM.

Before:

Did 38 RSA 2048 signing operations in 1051209us (36.1 ops/sec)
Did 1500 RSA 2048 verify operations in 1069611us (1402.4 ops/sec)
Did 65 RSA 2048 (3 prime, e=3) signing operations in 1055664us (61.6 ops/sec)
Did 4719 RSA 2048 (3 prime, e=3) verify operations in 1029144us (4585.4 ops/sec)
Did 5 RSA 4096 signing operations in 1092346us (4.6 ops/sec)
Did 418 RSA 4096 verify operations in 1069977us (390.7 ops/sec)

After:

Did 156 RSA 2048 signing operations in 1000672us (155.9 ops/sec)
Did 6071 RSA 2048 verify operations in 1068512us (5681.7 ops/sec)
Did 84 RSA 2048 (3 prime, e=3) signing operations in 1068847us (78.6 ops/sec)
Did 11000 RSA 2048 (3 prime, e=3) verify operations in 1023620us (10746.2 ops/sec)
Did 26 RSA 4096 signing operations in 1028320us (25.3 ops/sec)
Did 1788 RSA 4096 verify operations in 1072479us (1667.2 ops/sec)

Change-Id: I448698f7d8e5b481a06f98d54d608f0278827cd1
Reviewed-on: https://boringssl-review.googlesource.com/6443
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2015-11-09 23:06:58 +00:00
Brian Smith
5d5e39f5d2 Remove non-ASM version of |bn_mul_mont| in bn/generic.c.
When building in OPENSSL_NO_ASM mode, MSVC complains about unreachable
code. The redundant initialization of |i| is the main problem. The
skipping of the first test of the condition |i < num| with |goto| was
also confusing.

It turns out that |bn_mul_mont| is only called when assembly language
optimizations are available, but in that case the assmebly language
versions will always be used instead. Although this code will be
compiled in |OPENSSL_NO_ASM| builds, it is never called in
|OPENSSL_NO_ASM| builds. Thus, it can just be removed.

Change-Id: Id551899b2602824978edc1a1cb0703b76516808d
Reviewed-on: https://boringssl-review.googlesource.com/5550
Reviewed-by: Adam Langley <agl@google.com>
2015-11-06 22:28:58 +00:00
David Benjamin
e82e6f6696 Constify more BN_MONT_CTX parameters.
Most functions can take this in as const. Note this changes an
RSA_METHOD hook, though one I would not expect anyone to override.

Change-Id: Ib70ae65e5876b01169bdc594e465e3e3c4319a8b
Reviewed-on: https://boringssl-review.googlesource.com/6419
Reviewed-by: Adam Langley <agl@google.com>
2015-11-06 20:04:36 +00:00
Adam Langley
efb42fbb60 Make BN_mod_exp_mont_consttime take a const context.
BN_mod_exp_mont_consttime does not modify its |BN_MONT_CTX| so that
value should be const.

Change-Id: Ie74e48eec8061899fd056fbd99dcca2a86b02cad
Reviewed-on: https://boringssl-review.googlesource.com/6403
Reviewed-by: Adam Langley <agl@google.com>
2015-11-03 01:58:12 +00:00
David Benjamin
278d34234f Get rid of all compiler version checks in perlasm files.
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment the run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change or if developers do or don't have CC variables set.

Previously, all compiler-version-gated features were turned on in
https://boringssl-review.googlesource.com/6260, but this broke the build. I
also wasn't thorough enough in gathering performance numbers. So, flip them all
to off instead. I'll enable them one-by-one as they're tested.

This should result in no change to generated assembly.

Change-Id: Ib4259b3f97adc4939cb0557c5580e8def120d5bc
Reviewed-on: https://boringssl-review.googlesource.com/6383
Reviewed-by: Adam Langley <agl@google.com>
2015-10-28 19:33:04 +00:00
David Benjamin
75885e29c4 Revert "Get rid of all compiler version checks in perlasm files."
This reverts commit b9c26014de.

The win64 bot seems unhappy. Will sniff at it tomorrow. In
the meantime, get the tree green again.

Change-Id: I058ddb3ec549beee7eabb2f3f72feb0a4a5143b2
Reviewed-on: https://boringssl-review.googlesource.com/6353
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 23:12:39 +00:00
Brian Smith
9383eab5e9 Avoid signed/unsigned comparison in crypto/bn's |probable_prime|.
Change-Id: I768a348e1e34207bca55c7d093c1ba8975e304ab
Reviewed-on: https://boringssl-review.googlesource.com/6213
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 21:27:12 +00:00
David Benjamin
301afaf223 Add a run_tests target to run all tests.
It's very annoying having to remember the right incant every time I want
to switch around between my build, build-release, build-asan, etc.,
output directories.

Unfortunately, this target is pretty unfriendly without CMake 3.2+ (and
Ninja 1.5+). This combination gives a USES_TERMINAL flag to
add_custom_target which uses Ninja's "console" pool, otherwise the
output buffering gets in the way. Ubuntu LTS is still on an older CMake,
so do a version check in the meantime.

CMake also has its own test mechanism (CTest), but this doesn't use it.
It seems to prefer knowing what all the tests are and then tries to do
its own output management and parallelizing and such. We already have
our own runners. all_tests.go could actually be converted tidily, but
generate_build_files.py also needs to read it, and runner.go has very
specific needs.

Naming the target ninja -C build test would be nice, but CTest squats
that name and CMake grumps when you use a reserved name, so I've gone
with run_tests.

Change-Id: Ibd20ebd50febe1b4e91bb19921f3bbbd9fbcf66c
Reviewed-on: https://boringssl-review.googlesource.com/6270
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 20:33:44 +00:00
David Benjamin
b9c26014de Get rid of all compiler version checks in perlasm files.
Since we pre-generate our perlasm, having the output of these files be
sensitive to the environment the run in is unhelpful. It would be bad to
suddenly change what features we do or don't compile in whenever workstations'
toolchains change.

Enable all compiler-version-gated features as they should all be runtime-gated
anyway. This should align with what upstream's files would have produced on
modern toolschains. We should assume our assemblers can take whatever we'd like
to throw at them. (If it turns out some can't, we'd rather find out and
probably switch the problematic instructions to explicit byte sequences.)

This actually results in a fairly significant change to the assembly we
generate. I'm guessing upstream's buildsystem sets the CC environment variable,
while ours doesn't and so the version checks were all coming out conservative.

diffstat of generated files:

 linux-x86/crypto/sha/sha1-586.S              | 1176 ++++++++++++
 linux-x86/crypto/sha/sha256-586.S            | 2248 ++++++++++++++++++++++++
 linux-x86_64/crypto/bn/rsaz-avx2.S           | 1644 +++++++++++++++++
 linux-x86_64/crypto/bn/rsaz-x86_64.S         |  638 ++++++
 linux-x86_64/crypto/bn/x86_64-mont.S         |  332 +++
 linux-x86_64/crypto/bn/x86_64-mont5.S        | 1130 ++++++++++++
 linux-x86_64/crypto/modes/aesni-gcm-x86_64.S |  754 ++++++++
 linux-x86_64/crypto/modes/ghash-x86_64.S     |  475 +++++
 linux-x86_64/crypto/sha/sha1-x86_64.S        | 1121 ++++++++++++
 linux-x86_64/crypto/sha/sha256-x86_64.S      | 1062 +++++++++++
 linux-x86_64/crypto/sha/sha512-x86_64.S      | 2241 ++++++++++++++++++++++++
 mac-x86/crypto/sha/sha1-586.S                | 1174 ++++++++++++
 mac-x86/crypto/sha/sha256-586.S              | 2248 ++++++++++++++++++++++++
 mac-x86_64/crypto/bn/rsaz-avx2.S             | 1637 +++++++++++++++++
 mac-x86_64/crypto/bn/rsaz-x86_64.S           |  638 ++++++
 mac-x86_64/crypto/bn/x86_64-mont.S           |  331 +++
 mac-x86_64/crypto/bn/x86_64-mont5.S          | 1130 ++++++++++++
 mac-x86_64/crypto/modes/aesni-gcm-x86_64.S   |  750 ++++++++
 mac-x86_64/crypto/modes/ghash-x86_64.S       |  475 +++++
 mac-x86_64/crypto/sha/sha1-x86_64.S          | 1121 ++++++++++++
 mac-x86_64/crypto/sha/sha256-x86_64.S        | 1062 +++++++++++
 mac-x86_64/crypto/sha/sha512-x86_64.S        | 2241 ++++++++++++++++++++++++
 win-x86/crypto/sha/sha1-586.asm              | 1173 ++++++++++++
 win-x86/crypto/sha/sha256-586.asm            | 2248 ++++++++++++++++++++++++
 win-x86_64/crypto/bn/rsaz-avx2.asm           | 1858 +++++++++++++++++++-
 win-x86_64/crypto/bn/rsaz-x86_64.asm         |  638 ++++++
 win-x86_64/crypto/bn/x86_64-mont.asm         |  352 +++
 win-x86_64/crypto/bn/x86_64-mont5.asm        | 1184 ++++++++++++
 win-x86_64/crypto/modes/aesni-gcm-x86_64.asm |  933 ++++++++++
 win-x86_64/crypto/modes/ghash-x86_64.asm     |  515 +++++
 win-x86_64/crypto/sha/sha1-x86_64.asm        | 1152 ++++++++++++
 win-x86_64/crypto/sha/sha256-x86_64.asm      | 1088 +++++++++++
 win-x86_64/crypto/sha/sha512-x86_64.asm      | 2499 ++++++

SHA* gets faster. RSA and AES-GCM seem to be more of a wash and even slower
sometimes!  This is a little concerning. Though when I repeated the latter two,
it's definitely noisy (RSA in particular), so we may wish to repeat in a more
controlled environment. We could also flip some of these toggles to something
other than the highest setting if it seems some of the variants aren't
desirable. We just shouldn't have them enabled or disabled on accident. This
aligns us closer to upstream though.

$ /tmp/bssl.old speed SHA-
Did 5028000 SHA-1 (16 bytes) operations in 1000048us (5027758.7 ops/sec): 80.4 MB/s
Did 1708000 SHA-1 (256 bytes) operations in 1000257us (1707561.2 ops/sec): 437.1 MB/s
Did 73000 SHA-1 (8192 bytes) operations in 1008406us (72391.5 ops/sec): 593.0 MB/s
Did 3041000 SHA-256 (16 bytes) operations in 1000311us (3040054.5 ops/sec): 48.6 MB/s
Did 779000 SHA-256 (256 bytes) operations in 1000820us (778361.7 ops/sec): 199.3 MB/s
Did 26000 SHA-256 (8192 bytes) operations in 1009875us (25745.8 ops/sec): 210.9 MB/s
Did 1837000 SHA-512 (16 bytes) operations in 1000251us (1836539.0 ops/sec): 29.4 MB/s
Did 803000 SHA-512 (256 bytes) operations in 1000969us (802222.6 ops/sec): 205.4 MB/s
Did 41000 SHA-512 (8192 bytes) operations in 1016768us (40323.8 ops/sec): 330.3 MB/s
$ /tmp/bssl.new speed SHA-
Did 5354000 SHA-1 (16 bytes) operations in 1000104us (5353443.2 ops/sec): 85.7 MB/s
Did 1779000 SHA-1 (256 bytes) operations in 1000121us (1778784.8 ops/sec): 455.4 MB/s
Did 87000 SHA-1 (8192 bytes) operations in 1012641us (85914.0 ops/sec): 703.8 MB/s
Did 3517000 SHA-256 (16 bytes) operations in 1000114us (3516599.1 ops/sec): 56.3 MB/s
Did 935000 SHA-256 (256 bytes) operations in 1000096us (934910.2 ops/sec): 239.3 MB/s
Did 38000 SHA-256 (8192 bytes) operations in 1004476us (37830.7 ops/sec): 309.9 MB/s
Did 2930000 SHA-512 (16 bytes) operations in 1000259us (2929241.3 ops/sec): 46.9 MB/s
Did 1008000 SHA-512 (256 bytes) operations in 1000509us (1007487.2 ops/sec): 257.9 MB/s
Did 45000 SHA-512 (8192 bytes) operations in 1000593us (44973.3 ops/sec): 368.4 MB/s

$ /tmp/bssl.old speed RSA
Did 820 RSA 2048 signing operations in 1017008us (806.3 ops/sec)
Did 27000 RSA 2048 verify operations in 1015400us (26590.5 ops/sec)
Did 1292 RSA 2048 (3 prime, e=3) signing operations in 1008185us (1281.5 ops/sec)
Did 65000 RSA 2048 (3 prime, e=3) verify operations in 1011388us (64268.1 ops/sec)
Did 120 RSA 4096 signing operations in 1061027us (113.1 ops/sec)
Did 8208 RSA 4096 verify operations in 1002717us (8185.8 ops/sec)
$ /tmp/bssl.new speed RSA
Did 760 RSA 2048 signing operations in 1003351us (757.5 ops/sec)
Did 25900 RSA 2048 verify operations in 1028931us (25171.8 ops/sec)
Did 1320 RSA 2048 (3 prime, e=3) signing operations in 1040806us (1268.2 ops/sec)
Did 63000 RSA 2048 (3 prime, e=3) verify operations in 1016042us (62005.3 ops/sec)
Did 104 RSA 4096 signing operations in 1008718us (103.1 ops/sec)
Did 6875 RSA 4096 verify operations in 1093441us (6287.5 ops/sec)

$ /tmp/bssl.old speed GCM
Did 5316000 AES-128-GCM (16 bytes) seal operations in 1000082us (5315564.1 ops/sec): 85.0 MB/s
Did 712000 AES-128-GCM (1350 bytes) seal operations in 1000252us (711820.6 ops/sec): 961.0 MB/s
Did 149000 AES-128-GCM (8192 bytes) seal operations in 1003182us (148527.4 ops/sec): 1216.7 MB/s
Did 5919750 AES-256-GCM (16 bytes) seal operations in 1000016us (5919655.3 ops/sec): 94.7 MB/s
Did 800000 AES-256-GCM (1350 bytes) seal operations in 1000951us (799239.9 ops/sec): 1079.0 MB/s
Did 152000 AES-256-GCM (8192 bytes) seal operations in 1000765us (151883.8 ops/sec): 1244.2 MB/s
$ /tmp/bssl.new speed GCM
Did 5315000 AES-128-GCM (16 bytes) seal operations in 1000125us (5314335.7 ops/sec): 85.0 MB/s
Did 755000 AES-128-GCM (1350 bytes) seal operations in 1000878us (754337.7 ops/sec): 1018.4 MB/s
Did 151000 AES-128-GCM (8192 bytes) seal operations in 1005655us (150150.9 ops/sec): 1230.0 MB/s
Did 5913500 AES-256-GCM (16 bytes) seal operations in 1000041us (5913257.6 ops/sec): 94.6 MB/s
Did 782000 AES-256-GCM (1350 bytes) seal operations in 1001484us (780841.2 ops/sec): 1054.1 MB/s
Did 121000 AES-256-GCM (8192 bytes) seal operations in 1006389us (120231.8 ops/sec): 984.9 MB/s

Change-Id: I0efb32f896c597abc7d7e55c31d038528a5c72a1
Reviewed-on: https://boringssl-review.googlesource.com/6260
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 20:31:30 +00:00
David Benjamin
12f7737d32 Remove BN_MONT_CTX_init.
One less exported function. Nothing ever stack-allocates them, within BoringSSL
or in consumers. This avoids the slightly odd mechanism where BN_MONT_CTX_free
might or might not free the BN_MONT_CTX itself based on a flag.

(This is also consistent with OpenSSL 1.1.x which does away with the _init
variants of both this and BIGNUM so it shouldn't be a compatibility concern
long-term either.)

Change-Id: Id885ae35a26f75686cc68a8aa971e2ea6767ba88
Reviewed-on: https://boringssl-review.googlesource.com/6350
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 19:47:26 +00:00
David Benjamin
036152e6a5 Fix incorrect error-handling in BN_div_recp.
See upstream's e90f1d9b74275c11e3492e521e46f4b1afa6f883.

Change-Id: I68470acb97dac59e586b1c72aad50de6bd0156cb
Reviewed-on: https://boringssl-review.googlesource.com/6342
Reviewed-by: Adam Langley <alangley@gmail.com>
2015-10-26 17:48:10 +00:00