boringssl

Author	SHA1	Message	Date
David Benjamin	2257e8f3bf	Use bn_rshift_words for the ECDSA bit-shift. May as well use it. Also avoid an overflow with digest_len if someone asks to sign a truly enormous digest. Change-Id: Ia0a53007a496f9c7cadd44b1020ec2774b310936 Reviewed-on: https://boringssl-review.googlesource.com/26966 Reviewed-by: Adam Langley <agl@google.com>	2018-04-02 18:17:39 +00:00
David Benjamin	cbe77925f4	Extract the single-subtraction reduction into a helper function. We do this in four different places, with the same long comment, and I'm about to add yet another one. Change-Id: If28e3f87ea71020d9b07b92e8947f3848473d99d Reviewed-on: https://boringssl-review.googlesource.com/26964 Reviewed-by: Adam Langley <agl@google.com>	2018-04-02 18:13:45 +00:00
David Benjamin	a44dae7fd3	Add a constant-time generic modular inverse function. This uses the full binary GCD algorithm, where all four of A, B, C, and D must be retained. (BN_mod_inverse_odd implements the odd number version which only needs A and C.) It is patterned after the version in the Handbook of Applied Cryptography, but tweaked so the coefficients are non-negative and bounded. Median of 29 RSA keygens: 0m0.225s -> 0m0.220s (Accuracy beyond 0.1s is questionable.) Bug: 238 Change-Id: I6dc13524ea7c8ac1072592857880ddf141d87526 Reviewed-on: https://boringssl-review.googlesource.com/26370 Reviewed-by: Adam Langley <alangley@gmail.com>	2018-03-30 19:53:44 +00:00
David Benjamin	1044553d6d	Add new GCD and related primitives. RSA key generation requires computing a GCD (p-1 and q-1 are relatively prime with e) and an LCM (the Carmichael totient). I haven't made BN_gcd itself constant-time here to save having to implement bn_lshift_secret_shift, since the two necessary operations can be served by bn_rshift_secret_shift, already added for Rabin-Miller. However, the guts of BN_gcd are replaced. Otherwise, the new functions are only connected to tests for now, they'll be used in subsequent CLs. To support LCM, there is also now a constant-time division function. This does not replace BN_div because bn_div_consttime is some 40x slower than BN_div. That penalty is fine for RSA keygen because that operation is not bottlenecked on division, so we prefer simplicity over performance. Median of 29 RSA keygens: 0m0.212s -> 0m0.225s (Accuracy beyond 0.1s is questionable.) Bug: 238 Change-Id: Idbfbfa6e7f5a3b8782ce227fa130417b3702cf97 Reviewed-on: https://boringssl-review.googlesource.com/26369 Reviewed-by: Adam Langley <alangley@gmail.com>	2018-03-30 19:53:36 +00:00
David Benjamin	23af438ccd	Compute p - q in constant time. Expose the constant-time abs_sub functions from the fixed Karatsuba code in BIGNUM form for RSA to call into. RSA key generation involves checking if \|p - q\| is above some lower bound. BN_sub internally branches on which of p or q is bigger. For any given iteration, this is not secret---one of p or q is necessarily the larger, and whether we happened to pick the larger or smaller first is irrelevant. Accordingly, there is no need to perform the p/q swap at the end in constant-time. However, this stage of the algorithm picks p first, sticks with it, and then computes \|p - q\| for various q candidates. The distribution of comparisons leaks information about p. The leak is unlikely to be problematic, but plug it anyway. Median of 29 RSA keygens: 0m0.210s -> 0m0.212s (Accuracy beyond 0.1s is questionable.) Bug: 238 Change-Id: I024b4e51b364f5ca2bcb419a0393e7be13249aec Reviewed-on: https://boringssl-review.googlesource.com/26368 Reviewed-by: Adam Langley <alangley@gmail.com>	2018-03-30 19:53:28 +00:00
David Benjamin	97ac45e2f7	Change the order of GCD and trial division. RSA key generation currently does the GCD check before the primality test, in hopes of discarding things invalid by other means before running the expensive primality check. However, GCD is about to get a bit more expensive to clear the timing leak, and the trial division part of primality testing is quite fast. Thus, split that portion out via a new bn_is_obviously_composite and call it before GCD. Median of 29 RSA keygens: 0m0.252s -> 0m0.207s (Accuracy beyond 0.1s is questionable.) Bug: 238 Change-Id: I3999771fb73cca16797cab9332d14c4ebeb02046 Reviewed-on: https://boringssl-review.googlesource.com/26366 Reviewed-by: Adam Langley <alangley@gmail.com>	2018-03-30 19:53:06 +00:00
David Benjamin	56f5eb9ffd	Name constant-time functions more consistently. I'm not sure why I separated "fixed" and "quick_ctx" names. That's annoying and doesn't generalize well to, say, adding a bn_div_consttime function for RSA keygen. Change-Id: I751d52b30e079de2f0d37a952de380fbf2c1e6b7 Reviewed-on: https://boringssl-review.googlesource.com/26364 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-03-29 23:30:55 +00:00
David Benjamin	e6f46e2563	Blind the range check for finding a Rabin-Miller witness. Rabin-Miller requires selecting a random number from 2 to \|w\|-1. This is done by picking an N-bit number and discarding out-of-range values. This leaks information about \|w\|, so apply blinding. Rather than discard bad values, adjust them to be in range. Though not uniformly selected, these adjusted values are still usable as Rabin-Miller checks. Rabin-Miller is already probabilistic, so we could reach the desired confidence levels by just suitably increasing the iteration count. However, to align with FIPS 186-4, we use a more pessimal analysis: we do not count the non-uniform values towards the iteration count. As a result, this function is more complex and has more timing risk than necessary. We count both total iterations and uniform ones and iterate until we've reached at least \|BN_PRIME_CHECKS_BLINDED\| and \|iterations\|, respectively. If the latter is large enough, it will be the limiting factor with high probability and we won't leak information. Note this blinding does not impact most calls when picking primes because composites are rejected early. Only the two secret primes see extra work. So while this does make the BNTest.PrimeChecking test take about 2x longer to run on debug mode, RSA key generation time is fine. Another, perhaps simpler, option here would have been to run bn_rand_range_words to the full 100 count, select an arbitrary successful try, and declare failure of the entire keygen process (as we do already) if all tries failed. I went with the option in this CL because I happened to come up with it first, and because the failure probability decreases much faster. Additionally, the option in this CL does not affect composite numbers, while the alternate would. This gives a smaller multiplier on our entropy draw. We also continue to use the "wasted" work for stronger assurance on primality. FIPS' numbers are remarkably low, considering the increase has negligible cost. Thanks to Nathan Benjamin for helping me explore the failure rate as the target count and blinding count change. Now we're down to the rest of RSA keygen, which will require all the operations we've traditionally just avoided in constant-time code! Median of 29 RSA keygens: 0m0.169s -> 0m0.298s (Accuracy beyond 0.1s is questionable. The runs at subsequent test- and rename-only CLs were 0m0.217s, 0m0.245s, 0m0.244s, 0m0.247s.) Bug: 238 Change-Id: Id6406c3020f2585b86946eb17df64ac42f30ebab Reviewed-on: https://boringssl-review.googlesource.com/25890 Commit-Queue: Adam Langley <agl@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-03-29 22:02:24 +00:00
David Benjamin	8eadca50a2	Don't leak \|a\| in the primality test. (This is actually slightly silly as \|a\|'s probability distribution falls off exponentially, but it's easy enough to do right.) Instead, we run the loop to the end. This is still performant because we can, as before, return early on composite numbers. Only two calls actually run to the end. Moreover, running to the end has comparable cost to BN_mod_exp_mont_consttime. Median time goes from 0.140s to 0.231s. That cost some, but we're still faster than the original implementation. We're down to one more leak, which is that the BN_rand_range_ex call does not hide \|w1\|. That one may only be solved probabilistically... Median of 29 RSA keygens: 0m0.123s -> 0m0.145s (Accuracy beyond 0.1s is questionable.) Bug: 238 Change-Id: I4847cb0053118c572d2dd5f855388b5199fa6ce2 Reviewed-on: https://boringssl-review.googlesource.com/25888 Reviewed-by: Adam Langley <agl@google.com>	2018-03-28 01:44:31 +00:00
David Benjamin	9362ed9e14	Use a Barrett reduction variant for trial division. Compilers use a variant of Barrett reduction to divide by constants, which conveniently also avoids problematic operations on the secret numerator. Implement the variant as described here: http://ridiculousfish.com/blog/posts/labor-of-division-episode-i.html Repurpose this to implement a constant-time BN_mod_word replacement. It's even much faster! I've gone ahead and replaced the other BN_mod_word calls on the primes table. That should give plenty of budget for the other changes. (I am assuming that a regression is okay, as RSA keygen is not performance-sensitive, but that I should avoid anything too dramatic.) Proof of correctness: https://github.com/davidben/fiat-crypto/blob/barrett/src/Arithmetic/BarrettReduction/RidiculousFish.v Median of 29 RSA keygens: 0m0.621s -> 0m0.123s (Accuracy beyond 0.1s is questionable, though this particular improvement is quite solid.) Bug: 238 Change-Id: I67fa36ffe522365b13feb503c687b20d91e72932 Reviewed-on: https://boringssl-review.googlesource.com/25887 Reviewed-by: Adam Langley <agl@google.com>	2018-03-28 01:42:18 +00:00
David Benjamin	ad066861dd	Add bn_usub_fixed. There are a number of random subtractions in RSA key generation. Add a fixed-width version. Median of 29 RSA keygens: 0m0.859s -> 0m0.811s (Accuracy beyond 0.1s is questionable.) Bug: 238 Change-Id: I9fa0771b95a438fd7d2635fd77a332146ccc96d9 Reviewed-on: https://boringssl-review.googlesource.com/25884 Commit-Queue: Adam Langley <agl@google.com> Reviewed-by: Adam Langley <agl@google.com>	2018-03-26 18:53:43 +00:00
David Benjamin	2bf82975ad	Make bn_mul_part_recursive constant-time. This follows similar lines as the previous cleanups and fixes the documentation of the preconditions. And with that, RSA private key operations, provided p and q have the same bit length, should be constant time, as far as I know. (Though I'm sure I've missed something.) bn_cmp_part_words and bn_cmp_words are no longer used and deleted. Bug: 234 Change-Id: Iceefa39f57e466c214794c69b335c4d2c81f5577 Reviewed-on: https://boringssl-review.googlesource.com/25404 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-06 02:51:54 +00:00
David Benjamin	b01dd1c622	Make bn_sqr_recursive constant-time. We still need BN_mul and, in particular, bn_mul_recursive will either require bn_abs_sub_words be generalized or that we add a parallel bn_abs_sub_part_words, but start with the easy one. While I'm here, simplify the i and j mess in here. It's patterned after the multiplication one, but can be much simpler. Bug: 234 Change-Id: If936099d53304f2512262a1cbffb6c28ae30ccee Reviewed-on: https://boringssl-review.googlesource.com/25325 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-06 02:47:34 +00:00
David Benjamin	150ad30d28	Split BN_uadd into a bn_uadd_fixed. This is to be used in constant-time RSA CRT. Bug: 233 Change-Id: Ibade5792324dc6aba38cab6971d255d41fb5eb91 Reviewed-on: https://boringssl-review.googlesource.com/25286 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-06 02:39:45 +00:00
David Benjamin	5b10def1cf	Compute mont->RR in constant-time. Use the now constant-time modular arithmetic functions. Bug: 236 Change-Id: I4567d67bfe62ca82ec295f2233d1a6c9b131e5d2 Reviewed-on: https://boringssl-review.googlesource.com/25285 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-06 01:40:24 +00:00
David Benjamin	6f564afbdd	Make BN_mod_*_quick constant-time. As the EC code will ultimately want to use these in "words" form by way of EC_FELEM, and because it's much easier, I've implemented these as low-level words-based functions that require all inputs have the same width. The BIGNUM versions which RSA and, for now, EC calls are implemented on top of that. Unfortunately, doing such things in constant-time and accounting for undersized inputs requires some scratch space, and these functions don't take BN_CTX. So I've added internal bn_mod_*_quick_ctx functions that take a BN_CTX and the old functions now allocate a bit unnecessarily. RSA only needs lshift (for BN_MONT_CTX) and sub (for CRT), but the generic EC code wants add as well. The generic EC code isn't even remotely constant-time, and I hope to ultimately use stack-allocated EC_FELEMs, so I've made the actual implementations here implemented in "words", which is much simpler anyway due to not having to take care of widths. I've also gone ahead and switched the EC code to these functions, largely as a test of their performance (an earlier iteration made the EC code noticeably slower). These operations are otherwise not performance-critical in RSA. The conversion from BIGNUM to BIGNUM+BN_CTX should be dropped by the static linker already, and the unused BIGNUM+BN_CTX functions will fall off when EC_FELEM happens. Update-Note: BN_mod_*_quick bounce on malloc a bit now, but they're not really used externally. The one caller I found was wpa_supplicant which bounces on malloc already. They appear to be implementing compressed coordinates by hand? We may be able to convince them to call EC_POINT_set_compressed_coordinates_GFp. Bug: 233, 236 Change-Id: I2bf361e9c089e0211b97d95523dbc06f1168e12b Reviewed-on: https://boringssl-review.googlesource.com/25261 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-06 01:16:04 +00:00
David Benjamin	c7b6e0a664	Don't leak widths in bn_mod_mul_montgomery_fallback. The fallback functions still themselves leak, but I've left TODOs there. This only affects BN_mod_mul_montgomery on platforms where we don't use the bn_mul_mont assembly, but BN_mul additionally affects the final multiplication in RSA CRT. Bug: 232 Change-Id: Ia1ae16162c38e10c056b76d6b2afbed67f1a5e16 Reviewed-on: https://boringssl-review.googlesource.com/25260 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-05 23:57:03 +00:00
David Benjamin	09633cc34e	Rename bn->top to bn->width. This has no behavior change, but it has a semantic one. This CL is an assertion that all BIGNUM functions tolerate non-minimal BIGNUMs now. Specifically: - Functions that do not touch top/width are assumed to not care. - Functions that do touch top/width will be changed by this CL. These should be checked in review that they tolerate non-minimal BIGNUMs. Subsequent CLs will start adjusting the widths that BIGNUM functions output, to fix timing leaks. Bug: 232 Change-Id: I3a2b41b071f2174452f8d3801bce5c78947bb8f7 Reviewed-on: https://boringssl-review.googlesource.com/25257 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-05 23:44:24 +00:00
David Benjamin	226b4b51b5	Make the rest of BIGNUM accept non-minimal values. Test this by re-running bn_tests.txt tests a lot. For the most part, this was done by scattering bn_minimal_width or bn_correct_top calls as needed. We'll incrementally tease apart the functions that need to act on non-minimal BIGNUMs in constant-time. BN_sqr was switched to call bn_correct_top at the end, rather than sample bn_minimal_width, in anticipation of later splitting it into BN_sqr (for calculators) and BN_sqr_fixed (for BN_mod_mul_montgomery). BN_div_word also uses bn_correct_top because it calls BN_lshift so officially shouldn't rely on BN_lshift returning something minimal-width, though I expect we'd want to split off a BN_lshift_fixed rather than change that anyway? The shifts sample bn_minimal_width rather than bn_correct_top because they all seem to try to be very clever around the bit width. If we need constant-time versions of them, we can adjust them later. Bug: 232 Change-Id: Ie17b39034a713542dbe906cf8954c0c5483c7db7 Reviewed-on: https://boringssl-review.googlesource.com/25255 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-02-05 23:05:34 +00:00
David Benjamin	76ce04bec8	Fix up BN_MONT_CTX_set with non-minimal values. Given a non-minimal modulus, there are two possible values of R we might pick: 2^(BN_BITS2 * width) or 2^(BN_BITS2 * bn_minimal_width). Potentially secret moduli would make the former attractive and things might even work, but our only secret moduli (RSA) have public bit widths. It's more cases to test and the usual BIGNUM invariant is that widths do not affect numerical output. Thus, settle on minimizing mont->N for now. With the top explicitly made minimal, computing \|lgBigR\| is also a little simpler. This CL also abstracts out the < R check in the RSA code, and implements it in a width-agnostic way. Bug: 232 Change-Id: I354643df30530db7866bb7820e34241d7614f3c2 Reviewed-on: https://boringssl-review.googlesource.com/25250 Reviewed-by: Adam Langley <agl@google.com>	2018-02-02 18:52:15 +00:00
David Benjamin	2ccdf584aa	Factor out BN_to_montgomery(1) optimization. This cuts down on a duplicated place where we mess with bn->top. It also better abstracts away what determines the value of R. (I ordered this wrong and rebasing will be annoying. Specifically, the question is what happens if the modulus is non-minimal. In https://boringssl-review.googlesource.com/c/boringssl/+/25250/, R will be determined by the stored width of mont->N, so we want to use mont's copy of the modulus. Though, one way or another, the important part is that it's inside the Montgomery abstraction.) Bug: 232 Change-Id: I74212e094c8a47f396b87982039e49048a130916 Reviewed-on: https://boringssl-review.googlesource.com/25247 Reviewed-by: Adam Langley <agl@google.com>	2018-02-02 18:42:39 +00:00
David Benjamin	43cf27e7d7	Add bn_copy_words. This makes it easier going to and from non-minimal BIGNUMs and words without worrying about the widths which are ultimately to become less friendly. Bug: 232 Change-Id: Ia57cb29164c560b600573c27b112ad9375a86aad Reviewed-on: https://boringssl-review.googlesource.com/25245 Reviewed-by: Adam Langley <agl@google.com>	2018-02-02 18:24:39 +00:00
David Benjamin	ad5cfdf541	Add initial support for non-minimal BIGNUMs. Thanks to Andres Erbsen for extremely helpful suggestions on how to finally plug this long-standing hole! OpenSSL BIGNUMs are currently minimal-width, which means they cannot be constant-time. We'll need to either excise BIGNUM from RSA and EC or somehow fix BIGNUM. EC_SCALAR and later EC_FELEM work will excise it from EC, but RSA's BIGNUMs are more transparent. Teaching BIGNUM to handle non-minimal word widths is probably simpler. The main constraint is BIGNUM's large "calculator" API surface. One could, in theory, do arbitrary math on RSA components, which means all public functions must tolerate non-minimal inputs. This is also useful for EC; https://boringssl-review.googlesource.com/c/boringssl/+/24445 is silly. As a first step, fix comparison-type functions that were assuming minimal BIGNUMs. I've also added bn_resize_words, but it is testing-only until the rest of the library is fixed. bn->top is now a loose upper bound we carry around. It does not affect numerical results, only performance and secrecy. This is a departure from the original meaning, and compiler help in auditing everything is nice, so the final change in this series will rename bn->top to bn->width. Thus these new functions are named per "width", not "top". Looking further ahead, how are output BIGNUM widths determined? There are three notions of correctness here: 1. Do I compute the right answer for all widths? 2. Do I handle secret data in constant time? 3. Does my memory usage not balloon absurdly? For (1), a BIGNUM function must give the same answer for all input widths. BN_mod_add_quick may assume \|a\| < \|m\|, but \|a\| may still be wider than \|m\| by way of leading zeros. The simplest approach is to write code in a width-agnostic way and rely on functions to accept all widths. Where functions need to look at bn->d, we'll add a few helper functions to smooth over funny widths. For (2), (1) is a little cumbersome. Consider constant-time modular addition. A sane type system would guarantee input widths match. But C is weak here, and bifurcating the internals is a lot of work. Thus, at least for now, I do not propose we move RSA's internal computation out of BIGNUM. (EC_SCALAR/EC_FELEM are valuable for EC because we get to stack-allocate, curves were already specialized, and EC only has two types with many operations on those types. None of these apply to RSA. We've got numbers mod n, mod p, mod q, and their corresponding exponents, each of which is used for basically one operation.) Instead, constant-time BIGNUM functions will output non-minimal widths. This is trivial for BN_bin2bn or modular arithmetic. But for BN_mul, constant-time[*] would dictate r->top = a->top + b->top. A calculator repeatedly multiplying by one would then run out of memory. Those we'll split into a private BN_mul_fixed for crypto, leaving BN_mul for calculators. BN_mul is just BN_mul_fixed followed by bn_correct_top. [*] BN_mul is not constant-time for other reasons, but that will be fixed separately. Bug: 232 Change-Id: Ide2258ae8c09a9a41bb71d6777908d1c27917069 Reviewed-on: https://boringssl-review.googlesource.com/25244 Reviewed-by: Adam Langley <agl@google.com>	2018-02-02 18:03:46 +00:00
David Benjamin	6fe960d174	Enable __asm__ and uint128_t code in clang-cl. It actually works fine. I just forgot one of the typedefs last time. This gives a roughly 2x improvement on P-256 in clang-cl + OPENSSL_SMALL, the configuration used by Chrome. Before: Did 1302 ECDH P-256 operations in 1015000us (1282.8 ops/sec) Did 4250 ECDSA P-256 signing operations in 1047000us (4059.2 ops/sec) Did 1750 ECDSA P-256 verify operations in 1094000us (1599.6 ops/sec) After: Did 3250 ECDH P-256 operations in 1078000us (3014.8 ops/sec) Did 8250 ECDSA P-256 signing operations in 1016000us (8120.1 ops/sec) Did 3250 ECDSA P-256 verify operations in 1063000us (3057.4 ops/sec) (These were taken on a VM, so the measurements are extremely noisy, but this sort of improvement is visible regardless.) Alas, we do need a little extra bit of fiddling because division does not work (crbug.com/787617). Bug: chromium:787617 Update-Note: This removes the MSan uint128_t workaround which does not appear to be necessary anymore. Change-Id: I8361314608521e5bdaf0e7eeae7a02c33f55c69f Reviewed-on: https://boringssl-review.googlesource.com/23984 Reviewed-by: Adam Langley <agl@google.com> Commit-Queue: Adam Langley <agl@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-12-11 22:46:26 +00:00
David Benjamin	a838f9dc7e	Make ECDSA signing 10% faster and plug some timing leaks. None of the asymmetric crypto we inherited from OpenSSL is constant-time because of BIGNUM. BIGNUM chops leading zeros off the front of everything, so we end up leaking information about the first word, in theory. BIGNUM functions additionally tend to take the full range of inputs and then call into BN_nnmod at various points. All our secret values should be acted on in constant-time, but k in ECDSA is a particularly sensitive value. So, ecdsa_sign_setup, in an attempt to mitigate the BIGNUM leaks, would add a couple copies of the order. This does not work at all. k is used to compute two values: k^-1 and kG. The first operation when computing k^-1 is to call BN_nnmod if k is out of range. The entry point to our tuned constant-time curve implementations is to call BN_nnmod if the scalar has too many bits, which this causes. The result is both corrections are immediately undone but cause us to do more variable-time work in the meantime. Replace all these computations around k with the word-based functions added in the various preceding CLs. In doing so, replace the BN_mod_mul calls (which internally call BN_nnmod) with Montgomery reduction. We can avoid taking k^-1 out of Montgomery form, which combines nicely with Brian Smith's trick in `3426d10119`. Along the way, we avoid some unnecessary mallocs. BIGNUM still affects the private key itself, as well as the EC_POINTs. But this should hopefully be much better now. Also it's 10% faster: Before: Did 15000 ECDSA P-224 signing operations in 1069117us (14030.3 ops/sec) Did 18000 ECDSA P-256 signing operations in 1053908us (17079.3 ops/sec) Did 1078 ECDSA P-384 signing operations in 1087853us (990.9 ops/sec) Did 473 ECDSA P-521 signing operations in 1069835us (442.1 ops/sec) After: Did 16000 ECDSA P-224 signing operations in 1064799us (15026.3 ops/sec) Did 19000 ECDSA P-256 signing operations in 1007839us (18852.2 ops/sec) Did 1078 ECDSA P-384 signing operations in 1079413us (998.7 ops/sec) Did 484 ECDSA P-521 signing operations in 1083616us (446.7 ops/sec) Change-Id: I2a25e90fc99dac13c0616d0ea45e125a4bd8cca1 Reviewed-on: https://boringssl-review.googlesource.com/23075 Reviewed-by: Adam Langley <agl@google.com>	2017-11-22 22:51:40 +00:00
David Benjamin	a08bba51a5	Add bn_mod_exp_mont_small and bn_mod_inverse_prime_mont_small. These can be used to invert values in ECDSA. Unlike their BIGNUM counterparts, the caller is responsible for taking values in and out of Montgomery domain. This will save some work later on in the ECDSA computation. Change-Id: Ib7292900a0fdeedce6cb3e9a9123c94863659043 Reviewed-on: https://boringssl-review.googlesource.com/23071 Reviewed-by: Adam Langley <agl@google.com>	2017-11-20 16:23:48 +00:00
David Benjamin	40e4ecb793	Add "small" variants of Montgomery logic. These use the square and multiply functions added earlier. Change-Id: I723834f9a227a9983b752504a2d7ce0223c43d24 Reviewed-on: https://boringssl-review.googlesource.com/23070 Reviewed-by: Adam Langley <agl@google.com>	2017-11-20 16:23:01 +00:00
David Benjamin	6bc18a3bd4	Add bn_mul_small and bn_sqr_small. As part of excising BIGNUM from EC scalars, we will need a "words" version of BN_mod_mul_montgomery. That, in turn, requires BN_sqr and BN_mul for cases where we don't have bn_mul_mont. BN_sqr and BN_mul have a lot of logic in there, with the most complex cases being not even remotely constant time. Fortunately, those only apply to RSA-sized numbers, not EC-sized numbers. (With the exception, I believe, of 32-bit P-521 which just barely exceeds the cutoff.) Imposing a limit also makes it easier to stack-allocate temporaries (BN_CTX serves a similar purpose in BIGNUM). Extract bn_mul_small and bn_sqr_small and test them as part of bn_tests.txt. Later changes will build on these. If we end up reusing these functions for RSA in the future (though that would require tending to the egregiously non-constant-time code in the no-asm build), we probably want to extract a version where there is an explicit tmp parameter as in bn_sqr_normal rather than the stack bits. Change-Id: If414981eefe12d6664ab2f5e991a359534aa7532 Reviewed-on: https://boringssl-review.googlesource.com/23068 Reviewed-by: Adam Langley <agl@google.com>	2017-11-20 16:22:30 +00:00
David Benjamin	64619deaa3	Const-correct some of the low-level BIGNUM functions. Change-Id: I8c6257e336f54a3a1786df9c4103fcf29177030a Reviewed-on: https://boringssl-review.googlesource.com/23067 Reviewed-by: Adam Langley <agl@google.com>	2017-11-20 16:20:40 +00:00
David Benjamin	bd275702d2	size_t a bunch of bn words bits. Also replace a pointless call to bn_mul_words with a memset. Change-Id: Ief30ddab0e84864561b73fe2776bd0477931cf7f Reviewed-on: https://boringssl-review.googlesource.com/23066 Reviewed-by: Adam Langley <agl@google.com>	2017-11-20 16:20:28 +00:00
David Benjamin	73df153be8	Make BN_generate_dsa_nonce internally constant-time. This rewrites the internals with a "words" variant that can avoid bn_correct_top. It still ultimately calls bn_correct_top as the calling convention is sadly still BIGNUM, but we can lift that calling convention out incrementally. Performance seems to be comparable, if not faster. Before: Did 85000 ECDSA P-256 signing operations in 5030401us (16897.3 ops/sec) Did 34278 ECDSA P-256 verify operations in 5048029us (6790.4 ops/sec) After: Did 85000 ECDSA P-256 signing operations in 5021057us (16928.7 ops/sec) Did 34086 ECDSA P-256 verify operations in 5010416us (6803.0 ops/sec) Change-Id: I1159746dfcc00726dc3f28396076a354556e6e7d Reviewed-on: https://boringssl-review.googlesource.com/23065 Reviewed-by: Adam Langley <agl@google.com>	2017-11-20 16:18:30 +00:00
David Benjamin	607f9807e5	Remove BN_TBIT. Normal shifts do the trick just fine and are less likely to tempt the compiler into inserting a jump. Change-Id: Iaa1da1b6f986fd447694fcde8f3525efb9eeaf11 Reviewed-on: https://boringssl-review.googlesource.com/22888 Reviewed-by: Adam Langley <agl@google.com>	2017-11-10 22:43:37 +00:00
David Benjamin	bf3f6caaf3	Document some BIGNUM internals. Change-Id: I8f044febf16afe04da8b176c638111a9574c4d02 Reviewed-on: https://boringssl-review.googlesource.com/22887 Reviewed-by: Adam Langley <agl@google.com>	2017-11-10 22:43:13 +00:00
David Benjamin	fed560ff2a	Clear no-op BN_MASK2 masks. This is an OpenSSL thing to support platforms where BN_ULONG is not actually the size it claims to be. We define BN_ULONG to uint32_t and uint64_t which are guaranteed by C to implement arithmetic modulo 2^32 and 2^64, respectively. Thus there is no need for any of this. Change-Id: I098cd4cc050a136b9f2c091dfbc28dd83e01f531 Reviewed-on: https://boringssl-review.googlesource.com/21784 Commit-Queue: Adam Langley <agl@google.com> Reviewed-by: Adam Langley <agl@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-10-27 02:38:45 +00:00
David Benjamin	cba7987978	Revert "Use uint128_t and __asm__ in clang-cl." This reverts commit `f6942f0d22`. Reason for revert: This doesn't actually work in clang-cl. I forgot we didn't have the clang-cl try bots enabled! :-( I believe __asm__ is still okay, but I'll try it by hand tomorrow. Original change's description: > Use uint128_t and __asm__ in clang-cl. > > clang-cl does not define __GNUC__ but is still a functioning clang. We > should be able to use our uint128_t and __asm__ code in it on Windows. > > Change-Id: I67310ee68baa0c0c947b2441c265b019ef12af7e > Reviewed-on: https://boringssl-review.googlesource.com/22184 > Commit-Queue: Adam Langley <agl@google.com> > Reviewed-by: Adam Langley <agl@google.com> > CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> TBR=agl@google.com,davidben@google.com Change-Id: I5c7e0391cd9c2e8cc0dfde37e174edaf5d17db22 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://boringssl-review.googlesource.com/22224 Reviewed-by: David Benjamin <davidben@google.com> Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-10-27 00:22:06 +00:00
David Benjamin	f6942f0d22	Use uint128_t and __asm__ in clang-cl. clang-cl does not define __GNUC__ but is still a functioning clang. We should be able to use our uint128_t and __asm__ code in it on Windows. Change-Id: I67310ee68baa0c0c947b2441c265b019ef12af7e Reviewed-on: https://boringssl-review.googlesource.com/22184 Commit-Queue: Adam Langley <agl@google.com> Reviewed-by: Adam Langley <agl@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-10-27 00:07:29 +00:00
David Benjamin	808f832917	Run the comment converter on libcrypto. crypto/{asn1,x509,x509v3,pem} were skipped as they are still OpenSSL style. Change-Id: I3cd9a60e1cb483a981aca325041f3fbce294247c Reviewed-on: https://boringssl-review.googlesource.com/19504 Reviewed-by: Adam Langley <agl@google.com> Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-08-18 21:49:04 +00:00
Adam Langley	5c38c05b26	Move bn/ into crypto/fipsmodule/ Change-Id: I68aa4a740ee1c7f2a308a6536f408929f15b694c Reviewed-on: https://boringssl-review.googlesource.com/15647 Reviewed-by: Adam Langley <agl@google.com> Commit-Queue: Adam Langley <agl@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-05-01 22:51:25 +00:00

38 Commits
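
Note on cbe77925f4: the "single-subtraction reduction" named in that commit can be illustrated with a minimal sketch. This is not BoringSSL's code. The name `reduce_once_sketch`, the fixed 64-bit word type, and the requirement that `r` not alias `a` are assumptions made here for illustration. Given a value `carry:a` already known to be less than `2*m`, the sketch subtracts `m` unconditionally and then picks between the difference and the original with a mask, so no branch or memory access depends on the data.

```c
#include <stddef.h>
#include <stdint.h>

// Illustrative only: hypothetical helper, not a BoringSSL function.
// Computes r = (carry:a) mod m, assuming (carry:a) < 2*m, carry is 0 or 1,
// words are stored least-significant first, and r does not alias a.
// Returns an all-ones mask if the input was already below m, else zero.
static uint64_t reduce_once_sketch(uint64_t *r, const uint64_t *a,
                                   uint64_t carry, const uint64_t *m,
                                   size_t num) {
  // First pass: r = a - m, tracking the borrow with bit operations only.
  uint64_t borrow = 0;
  for (size_t i = 0; i < num; i++) {
    uint64_t d = a[i] - m[i] - borrow;
    // Borrow-out of a[i] - m[i] - borrow, taken from the top bit.
    borrow = ((~a[i] & m[i]) | ((~a[i] | m[i]) & d)) >> 63;
    r[i] = d;
  }

  // carry - borrow is 0 when the subtraction should be kept and wraps to
  // all-ones when the input was already below m. (carry = 1 with borrow = 0
  // would mean the input was at least 2*m, which the precondition excludes.)
  uint64_t keep = carry - borrow;

  // Second pass: select the original or the difference without branching.
  for (size_t i = 0; i < num; i++) {
    r[i] = (keep & a[i]) | (~keep & r[i]);
  }
  return keep;
}
```

For example, with a one-word modulus of 7, an input of 10 yields 3 and an input of 5 is returned unchanged. Both cases execute the same sequence of instructions, which is the property the surrounding commits are working toward.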