This addresses
- request for improvement for faster key setup in RT#3576;
- clearing registers and stack in RT#3554 (this is more of a gesture to
see if there will be some traction from compiler side);
- more commentary around input parameters handling and stack layout
(desired when RT#3553 was reviewed);
- minor size and single block performance optimization (was lying around);
(Imported from upstream's 23f6eec71dbd472044db7dc854599f1de14a1f48)
This one is best reviewed by verifying that
23f6eec71dbd472044db7dc854599f1de14a1f48^ in upstream has the exact same
versions of these files (we had no local diffs), so we can just copy them
wholesale.
bssl speed reports a wash on my Mac. If I keep running it, different ones win
each time.
Change-Id: I729bd39cf0b3a30cc24de839e1c734dcaef972b8
Reviewed-on: https://boringssl-review.googlesource.com/4491
Reviewed-by: Adam Langley <agl@google.com>
Improve CBC decrypt and CTR by ~13/16%, which adds up to ~25/33%
improvement over "pre-Silvermont" version. [Add performance table to
aesni-x86.pl].
(Imported from upstream's b347341c75656cf8bc039bd0ea5e3571c9299687)
Initial fork from f2d678e6e89b6508147086610e985d4e8416e867 (1.0.2 beta).
(This change contains substantial changes from the original and
effectively starts a new history.)