Commit Graph

22 Commits

Author SHA1 Message Date
David Benjamin
378cca8016 Handle a modulus of -1 correctly.
Historically, OpenSSL's modular exponentiation functions tolerated negative
moduli by ignoring the sign bit. The special case for a modulus of 1 should do
the same. That said, this is ridiculous and the only reason I'm importing this
is BN_abs_is_word(1) is marginally more efficient than BN_is_one() and we
haven't gotten around to enforcing positive moduli yet.

Thanks to Guido Vranken and OSSFuzz for finding this issue and reporting to
OpenSSL.

(Imported from upstream's 235119f015e46a74040b78b10fd6e954f7f07774.)

Change-Id: I526889dfbe2356753aa1e6ecfd3aa3dc3a8cd2b8
Reviewed-on: https://boringssl-review.googlesource.com/31085
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2018-08-16 15:57:10 +00:00
Brian Smith
fee8709f69 Replace |alloca| in |BN_mod_exp_mont_consttime|.
|alloca| is dangerous and poorly specified, according to any
description of |alloca|. It's also hard for some analysis tools to
reason about.

The code here assumed |alloca| is a macro, which isn't a valid
assumption. Depending on what which headers are included and what
toolchain is being used, |alloca| may or may not be defined as a macro,
and this might change over time if/when toolchains are updated. Or, we
might be doing static analysis and/or dynamic analysis with a different
configuration w.r.t. the availability of |alloca| than production
builds use.

Regardless, the |alloca| code path only kicked in when the inputs are
840 bits or smaller. Since the multi-prime RSA support was removed, for
interesting RSA key sizes the input will be at least 1024 bits and this
code path won't be triggered since powerbufLen will be larger than 3072
bytes in those cases. ECC inversion via Fermat's Little Theorem has its
own constant-time exponentiation so there are no cases where smaller
inputs need to be fast.

The RSAZ code avoids the |OPENSSL_malloc| for 2048-bit RSA keys.
Increasingly the RSAZ code won't be used though, since it will be
skipped over on Broadwell+ CPUs. Generalize the RSAZ stack allocation
to work for non-RSAZ code paths. In order to ensure this doesn't cause
too much stack usage on platforms where RSAZ wasn't already being used,
only do so on x86-64, which already has this large stack size
requirement due to RSAZ.

This change will make it easier to refactor |BN_mod_exp_mont_consttime|
to do that more safely and in a way that's more compatible with various
analysis tools.

This is also a step towards eliminating the |uintptr_t|-based alignment
hack.

Since this change increases the number of times |OPENSSL_free| is
skipped, I've added an explicit |OPENSSL_cleanse| to ensure the
zeroization is done. This should be done regardless of the other changes
here.

Change-Id: I8a161ce2720a26127e85fff7513f394883e50b2e
Reviewed-on: https://boringssl-review.googlesource.com/28584
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: David Benjamin <davidben@google.com>
2018-05-21 19:43:05 +00:00
David Benjamin
a63d0ad40d Require BN_mod_exp_mont* inputs be reduced.
If the caller asked for the base to be treated as secret, we should
provide that. Allowing unbounded inputs is not compatible with being
constant-time.

Additionally, this aligns with the guidance here:
https://github.com/HACS-workshop/spectre-mitigations/blob/master/crypto_guidelines.md#1-do-not-conditionally-choose-between-constant-and-non-constant-time

Update-Note: BN_mod_exp_mont_consttime and BN_mod_exp_mont now require
inputs be fully reduced. I believe current callers tolerate this.

Additionally, due to a quirk of how certain operations were ordered,
using (publicly) zero exponent tolerated a NULL BN_CTX while other
exponents required non-NULL BN_CTX. Non-NULL BN_CTX is now required
uniformly. This is unlikely to cause problems. Any call site where the
exponent is always zero should just be replaced with BN_value_one().

Change-Id: I7c941953ea05f36dc2754facb9f4cf83a6789c61
Reviewed-on: https://boringssl-review.googlesource.com/27665
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Steven Valdez <svaldez@google.com>
2018-04-24 18:29:29 +00:00
David Benjamin
9291be5b27 Remove return values from bn_*_small.
No sense in adding impossible error cases we need to handle.
Additionally, tighten them a bit and require strong bounds. (I wasn't
sure what we'd need at first and made them unnecessarily general.)

Change-Id: I21a0afde90a55be2e9a0b8d7288f595252844f5f
Reviewed-on: https://boringssl-review.googlesource.com/27586
Reviewed-by: Adam Langley <alangley@gmail.com>
2018-04-24 15:34:32 +00:00
David Benjamin
9af9b946d2 Restore the BN_mod codepath for public Montgomery moduli.
https://boringssl-review.googlesource.com/10520 and then later
https://boringssl-review.googlesource.com/25285 made BN_MONT_CTX_set
constant-time, which is necessary for RSA's mont_p and mont_q. However,
due to a typo in the benchmark, they did not correctly measure.

Split BN_MONT_CTX creation into a constant-time and variable-time one.
The constant-time one uses our current algorithm and the latter restores
the original BN_mod codepath.

Should we wish to avoid BN_mod, I have an alternate version lying
around:

First, BN_set_bit + bn_mod_lshift1_consttime as now to count up to 2*R.
Next, observe that 2*R = BN_to_montgomery(2) and R*R =
BN_to_montgomery(R) = BN_to_montgomery(2^r_bits) Also observe that
BN_mod_mul_montgomery only needs n0, not RR. Split the core of
BN_mod_exp_mont into its own function so the caller handles conversion.
Raise 2*R to the r_bits power to get 2^r_bits*R = R*R.

The advantage of that algorithm is that it is still constant-time, so we
only need one BN_MONT_CTX_new. Additionally, it avoids BN_mod which is
otherwise (almost, but the remaining links should be easy to cut) out of
the critical path for correctness. One less operation to worry about.

The disadvantage is that it is gives a 25% (RSA-2048) or 32% (RSA-4096)
slower RSA verification speed. I went with the BN_mod one for the time
being.

Before:
Did 9204 RSA 2048 signing operations in 10052053us (915.6 ops/sec)
Did 326000 RSA 2048 verify (same key) operations in 10028823us (32506.3 ops/sec)
Did 50830 RSA 2048 verify (fresh key) operations in 10033794us (5065.9 ops/sec)
Did 1269 RSA 4096 signing operations in 10019204us (126.7 ops/sec)
Did 88435 RSA 4096 verify (same key) operations in 10031129us (8816.1 ops/sec)
Did 14552 RSA 4096 verify (fresh key) operations in 10053411us (1447.5 ops/sec)

After:
Did 9150 RSA 2048 signing operations in 10022831us (912.9 ops/sec)
Did 322000 RSA 2048 verify (same key) operations in 10028604us (32108.2 ops/sec)
Did 289000 RSA 2048 verify (fresh key) operations in 10017205us (28850.4 ops/sec)
Did 1270 RSA 4096 signing operations in 10072950us (126.1 ops/sec)
Did 87480 RSA 4096 verify (same key) operations in 10036328us (8716.3 ops/sec)
Did 80730 RSA 4096 verify (fresh key) operations in 10073614us (8014.0 ops/sec)

Change-Id: Ie8916d1634ccf8513ceda458fa302f09f3e93c07
Reviewed-on: https://boringssl-review.googlesource.com/27287
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-04-20 20:50:15 +00:00
David Benjamin
7e2a8a34ba Speed up variable windowed exponentation a bit.
The first non-zero window (which we can condition on for public
exponents) always multiplies by one. This means we can cut out one
Montgomery multiplication. It also means we never actually need to
initialize r to one, saving another Montgomery multiplication for P-521.

This, in turn, means we don't need the bn_one_to_montgomery optimization
for the public-exponent exponentations, so we can delete
bn_one_to_montgomery_small. (The function does currently promise to
handle p = 0, but this is not actually reachable, so it can just do a
reduction on RR.)

For RSA, where we're not doing many multiplications to begin with,
saving one is noticeable.

Before:
Did 92000 RSA 2048 verify (same key) operations in 3002557us (30640.6 ops/sec)
Did 25165 RSA 4096 verify (same key) operations in 3045046us (8264.2 ops/sec)

After:
Did 100000 RSA 2048 verify (same key) operations in 3002483us (33305.8 ops/sec)
Did 26603 RSA 4096 verify (same key) operations in 3010942us (8835.4 ops/sec)

(Not looking at the fresh key number yet as that still needs to be
fixed.)

Change-Id: I81a025a68d9b0f8eb0f9c6c04ec4eedf0995a345
Reviewed-on: https://boringssl-review.googlesource.com/27286
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2018-04-20 20:37:45 +00:00
David Benjamin
56ea9e2769 Fix bn_mod_exp_mont_small when exponentiating to zero.
It's defined to return one in Montgomery form, not a normal one.

(Not that this matters. This function is only used to Fermat's Little
Theorem. Probably it should have been less general, though we'd need to
make new test vectors first.)

Change-Id: Ia8d7588e6a413b25f01280af9aacef0192283771
Reviewed-on: https://boringssl-review.googlesource.com/27285
Reviewed-by: Adam Langley <agl@google.com>
2018-04-18 22:13:16 +00:00
David Benjamin
e0ae249f03 Remove a = 0 special-case in BN_mod_exp_mont.
BN_mod_exp_mont is intended to protect the base, but not the exponent.
Accordingly, it shouldn't treat a base of zero as special.

Change-Id: Ib053e8ce65ab1741973a9f9bfeff8c353567439c
Reviewed-on: https://boringssl-review.googlesource.com/27284
Reviewed-by: Adam Langley <agl@google.com>
2018-04-18 22:03:16 +00:00
David Benjamin
08d774a45f Remove some easy bn_set_minimal_width calls.
Functions that deserialize from bytes and Montgomery multiplication have
no reason to minimize their inputs.

Bug: 232
Change-Id: I121cc9b388033d684057b9df4ad0c08364849f58
Reviewed-on: https://boringssl-review.googlesource.com/25258
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:47:14 +00:00
David Benjamin
09633cc34e Rename bn->top to bn->width.
This has no behavior change, but it has a semantic one. This CL is an
assertion that all BIGNUM functions tolerate non-minimal BIGNUMs now.
Specifically:

- Functions that do not touch top/width are assumed to not care.

- Functions that do touch top/width will be changed by this CL. These
  should be checked in review that they tolerate non-minimal BIGNUMs.

Subsequent CLs will start adjusting the widths that BIGNUM functions
output, to fix timing leaks.

Bug: 232
Change-Id: I3a2b41b071f2174452f8d3801bce5c78947bb8f7
Reviewed-on: https://boringssl-review.googlesource.com/25257
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:44:24 +00:00
David Benjamin
226b4b51b5 Make the rest of BIGNUM accept non-minimal values.
Test this by re-running bn_tests.txt tests a lot. For the most part,
this was done by scattering bn_minimal_width or bn_correct_top calls as
needed. We'll incrementally tease apart the functions that need to act
on non-minimal BIGNUMs in constant-time.

BN_sqr was switched to call bn_correct_top at the end, rather than
sample bn_minimal_width, in anticipation of later splitting it into
BN_sqr (for calculators) and BN_sqr_fixed (for BN_mod_mul_montgomery).

BN_div_word also uses bn_correct_top because it calls BN_lshift so
officially shouldn't rely on BN_lshift returning something
minimal-width, though I expect we'd want to split off a BN_lshift_fixed
than change that anyway?

The shifts sample bn_minimal_width rather than bn_correct_top because
they all seem to try to be very clever around the bit width. If we need
constant-time versions of them, we can adjust them later.

Bug: 232
Change-Id: Ie17b39034a713542dbe906cf8954c0c5483c7db7
Reviewed-on: https://boringssl-review.googlesource.com/25255
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-02-05 23:05:34 +00:00
David Benjamin
f4b708cc1e Add a function which folds BN_MONT_CTX_{new,set} together.
These empty states aren't any use to either caller or implementor.

Change-Id: If0b748afeeb79e4a1386182e61c5b5ecf838de62
Reviewed-on: https://boringssl-review.googlesource.com/25254
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 20:23:25 +00:00
David Benjamin
2ccdf584aa Factor out BN_to_montgomery(1) optimization.
This cuts down on a duplicated place where we mess with bn->top. It also
also better abstracts away what determines the value of R.

(I ordered this wrong and rebasing will be annoying. Specifically, the
question is what happens if the modulus is non-minimal. In
https://boringssl-review.googlesource.com/c/boringssl/+/25250/, R will
be determined by the stored width of mont->N, so we want to use mont's
copy of the modulus. Though, one way or another, the important part is
that it's inside the Montgomery abstraction.)

Bug: 232
Change-Id: I74212e094c8a47f396b87982039e49048a130916
Reviewed-on: https://boringssl-review.googlesource.com/25247
Reviewed-by: Adam Langley <agl@google.com>
2018-02-02 18:42:39 +00:00
David Benjamin
32b5940267 Don't leak the exponent bit width in BN_mod_exp_mont_consttime.
(See also https://github.com/openssl/openssl/pull/5154.)

The exponent here is one of d, dmp1, or dmq1 for RSA. This value and its
bit length are both secret. The only public upper bound is the bit width
of the corresponding modulus (RSA n, p, and q, respectively).

Although BN_num_bits is constant-time (sort of; see bn_correct_top notes
in preceding patch), this does not fix the root problem, which is that
the windows are based on the minimal bit width, not the upper bound. We
could use BN_num_bits(m), but BN_mod_exp_mont_consttime is public API
and may be called with larger exponents. Instead, use all top*BN_BITS2
bits in the BIGNUM. This is still sensitive to the long-standing
bn_correct_top leak, but we need to fix that regardless.

This may cause us to do a handful of extra multiplications for RSA keys
which are just above a whole number of words, but that is not a standard
RSA key size.

Change-Id: I5e2f12b70c303b27c597a7e513b7bf7288f7b0e3
Reviewed-on: https://boringssl-review.googlesource.com/25185
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
2018-01-23 22:27:37 +00:00
David Benjamin
a08bba51a5 Add bn_mod_exp_mont_small and bn_mod_inverse_prime_mont_small.
These can be used to invert values in ECDSA. Unlike their BIGNUM
counterparts, the caller is responsible for taking values in and out of
Montgomery domain. This will save some work later on in the ECDSA
computation.

Change-Id: Ib7292900a0fdeedce6cb3e9a9123c94863659043
Reviewed-on: https://boringssl-review.googlesource.com/23071
Reviewed-by: Adam Langley <agl@google.com>
2017-11-20 16:23:48 +00:00
David Benjamin
d66bbf3413 Tidy up BN_mod_exp_mont.
This was primarily for my own understanding, but this should hopefully
also be clearer and more amenable to using unsigned indices later.

Change-Id: I09cc3d55de0f7d9284d3b3168d8b0446274b2ab7
Reviewed-on: https://boringssl-review.googlesource.com/22889
Reviewed-by: Adam Langley <agl@google.com>
2017-11-10 22:43:54 +00:00
Daniel Hirche
2eb2889702 bn/exp: don't check |copy_to_prebuf|'s retval in |BN_mod_exp_mont_consttime|.
It always returns one, so just void it.

Change-Id: I8733cc3d6b20185e782cf0291e9c0dc57712bb63
Reviewed-on: https://boringssl-review.googlesource.com/22564
Reviewed-by: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-11-03 15:43:52 +00:00
David Benjamin
fed560ff2a Clear no-op BN_MASK2 masks.
This is an OpenSSL thing to support platforms where BN_ULONG is not
actually the size it claims to be. We define BN_ULONG to uint32_t and
uint64_t which are guaranteed by C to implement arithemetic modulo 2^32
and 2^64, respectively. Thus there is no need for any of this.

Change-Id: I098cd4cc050a136b9f2c091dfbc28dd83e01f531
Reviewed-on: https://boringssl-review.googlesource.com/21784
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-10-27 02:38:45 +00:00
Martin Kreichgauer
6dc892fcdf Remove redundant calls to |OPENSSL_cleanse| and |OPENSSL_realloc_clean|.
Change-Id: I5c85c4d072ec157b37ed95b284a26ab32c0c42d9
Reviewed-on: https://boringssl-review.googlesource.com/19824
Reviewed-by: Martin Kreichgauer <martinkr@google.com>
Commit-Queue: Martin Kreichgauer <martinkr@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-09-18 19:16:51 +00:00
David Benjamin
808f832917 Run the comment converter on libcrypto.
crypto/{asn1,x509,x509v3,pem} were skipped as they are still OpenSSL
style.

Change-Id: I3cd9a60e1cb483a981aca325041f3fbce294247c
Reviewed-on: https://boringssl-review.googlesource.com/19504
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-08-18 21:49:04 +00:00
David Benjamin
9d4e06e6bc Switch some pointer casts to memcpy.
This isn't all of our pointer games by far, but for any code which
doesn't run on armv6, memcpy and pointer cast compile to the same code.
For code with does care about armv6 (do we care?), it'll need a bit more
work. armv6 makes memcpy into a function call.

Ironically, the one platform where C needs its alignment rules is the
one platform that makes it hard to honor C's alignment rules.

Change-Id: Ib9775aa4d9df9381995df8698bd11eb260aac58c
Reviewed-on: https://boringssl-review.googlesource.com/17707
Reviewed-by: David Benjamin <davidben@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-07-11 02:02:41 +00:00
Adam Langley
5c38c05b26 Move bn/ into crypto/fipsmodule/
Change-Id: I68aa4a740ee1c7f2a308a6536f408929f15b694c
Reviewed-on: https://boringssl-review.googlesource.com/15647
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-05-01 22:51:25 +00:00