The first step of RSA with the CRT optimization is to reduce our input
modulo p and q. We can do this in constant-time[*] with Montgomery
reduction. When p and q are the same size, Montgomery reduction's bounds
hold. We need two rounds of it because the first round gives us an
unwanted R^-1.
This does not appear to have a measurable impact on performance. Also
add a long TODO describing how to make the rest of the function
constant-time[*] which hopefully we'll get to later. RSA blinding should
protect us from it all, but make this constant-time anyway.
Since this and the follow-up work will special-case weird keys, add a
test that we don't break those unintentionally. (Though I am not above
breaking them intentionally someday...)
Thanks to Andres Erbsen for discussions on how to do this bit properly.
[*] Ignoring the pervasive bn_correct_top problem for the moment.
Change-Id: Ide099a9db8249cb6549be99c5f8791a39692ea81
Reviewed-on: https://boringssl-review.googlesource.com/24204
Reviewed-by: Adam Langley <agl@google.com>