(This appears to be the case with upstream too, it's not that BoringSSL
is missing optimisations from what I can see.)
Change-Id: I0e54762ef0d09e60994ec82c5cca1ff0b3b23ea4
Reviewed-on: https://boringssl-review.googlesource.com/1080
Reviewed-by: David Benjamin <davidben@chromium.org>
Reviewed-by: Adam Langley <agl@google.com>
Fixes one comment that mentioned the wrong function name. Also causes
two BN random functions to fail when the output is NULL. Previously they
would silently do nothing.
Change-Id: I89796ab855ea32787765c301a478352287e61190
Apart from the obvious little issues, this also works around a
(seeming) libtool/linker:
a.c defines a symbol:
int kFoo;
b.c uses it:
extern int kFoo;
int f() {
return kFoo;
}
compile them:
$ gcc -c a.c
$ gcc -c b.c
and create a dummy main in order to run it, main.c:
int f();
int main() {
return f();
}
this works as expected:
$ gcc main.c a.o b.o
but, if we make an archive:
$ ar q lib.a a.o b.o
and use that:
$ gcc main.c lib.a
Undefined symbols for architecture x86_64
"_kFoo", referenced from:
_f in lib.a(b.o)
(It doesn't matter what order the .o files are put into the .a)
Linux and Windows don't seem to have this problem.
nm on a.o shows that the symbol is of type "C", which is a "common symbol"[1].
Basically the linker will merge multiple common symbol definitions together.
If ones makes a.c read:
int kFoo = 0;
Then one gets a type "D" symbol - a "data section symbol" and everything works
just fine.
This might actually be a libtool bug instead of an ld bug: Looking at `xxd
lib.a | less`, the __.SYMDEF SORTED index at the beginning of the archive
doesn't contain an entry for kFoo unless initialised.
Change-Id: I4cdad9ba46e9919221c3cbd79637508959359427
Previously we generated a number that was 8 bytes too large and used a
modular reduction, which has a (tiny, tiny) bias towards zero.
Out of an excess of caution, instead truncate the generated nonce and
try again if it's out of range.
Change-Id: Ia9a7a57dd6d3e5f13d0b881b3e9b2e986d46e4ca
The lazy-initialisation of BN_MONT_CTX was serialising all threads, as noted by
Daniel Sands and co at Sandia. This was to handle the case that 2 or more
threads race to lazy-init the same context, but stunted all scalability in the
case where 2 or more threads are doing unrelated things! We favour the latter
case by punishing the former. The init work gets done by each thread that finds
the context to be uninitialised, and we then lock the "set" logic after that
work is done - the winning thread's work gets used, the losing threads throw
away what they've done.
(Imported from upstream's bf43446835bfd3f9abf1898a99ae20f2285320f3)
It's not clear whether this inconsistency could lead to an actual
computation error, but it involved a BIGNUM being passed around the
montgomery logic in an inconsistent state. This was found using flags
-DBN_DEBUG -DBN_DEBUG_RAND, and working backwards from this assertion
in 'ectest';
ectest: bn_mul.c:960: BN_mul: Assertion `(_bnum2->top == 0) ||
(_bnum2->d[_bnum2->top - 1] != 0)' failed
(Imported from upstream's 3cc546a3bbcbf26cd14fc45fb133d36820ed0a75)
This change adds a new function, BN_bn2bin_padded, that attempts, as
much as possible, to serialise a BIGNUM in constant time.
This is used to avoid some timing leaks in RSA decryption.
Some RSA private keys are specified with only n, e and d. Although we
can use these keys directly, it's nice to have a uniform representation
that includes the precomputed CRT values. This change adds a function
that can recover the primes from a minimal private key of that form.
Ensure that, when generating small primes, the result is actually of the
requested size. Fixes OpenSSL #2701.
This change does not address the cases of generating safe primes, or
where the |add| parameter is non-NULL.
(The issue was reported by Shay Gueron.)
The final reduction in Montgomery multiplication computes if (X >= m) then X =
X - m else X = X
In OpenSSL, this was done by computing T = X - m, doing a constant-time
selection of the *addresses* of X and T, and loading from the resulting
address. But this is not cache-neutral.
This patch changes the behaviour by loading both X and T into registers, and
doing a constant-time selection of the *values*.
TODO(fork): only some of the fixes from the original patch still apply to
the 1.0.2 code.
Initial fork from f2d678e6e89b6508147086610e985d4e8416e867 (1.0.2 beta).
(This change contains substantial changes from the original and
effectively starts a new history.)