bn_get_bits5 always reads two bytes, even when it doesn't need to. For some
sizes of |p|, this can result in reading just past the edge of the array.
Unroll the first iteration of the loop and avoid reading out of bounds.
Replace bn_get_bits5 altogether in C as it's not doing anything interesting.
Change-Id: Ibcc8cea7d9c644a2639445396455da47fe869a5c
Reviewed-on: https://boringssl-review.googlesource.com/1393
Reviewed-by: Adam Langley <agl@google.com>
(The issue was reported by Shay Gueron.)
The final reduction in Montgomery multiplication computes if (X >= m) then X =
X - m else X = X
In OpenSSL, this was done by computing T = X - m, doing a constant-time
selection of the *addresses* of X and T, and loading from the resulting
address. But this is not cache-neutral.
This patch changes the behaviour by loading both X and T into registers, and
doing a constant-time selection of the *values*.
TODO(fork): only some of the fixes from the original patch still apply to
the 1.0.2 code.
Initial fork from f2d678e6e89b6508147086610e985d4e8416e867 (1.0.2 beta).
(This change contains substantial changes from the original and
effectively starts a new history.)