boringssl/crypto/fipsmodule/aes/aes_test.cc
David Benjamin 55db667c62 Enable vpaes for aarch64, with CTR optimizations.
This patches vpaes-armv8.pl to add vpaes_ctr32_encrypt_blocks. CTR mode
is by far the most important mode these days. It should have access to
_vpaes_encrypt_2x, which gives a considerable speed boost. Also exclude
vpaes_ecb_* as they're not even used.
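
As a rough illustration of what a ctr32-style block function is expected to do, the sketch below encrypts whole blocks in CTR mode and increments only the last four bytes of the counter block, as a big-endian value that wraps modulo 2^32. The names `BlockFn` and `Ctr32EncryptBlocks` are hypothetical and the block cipher is passed in as a callback; this is not the vpaes assembly, only a model of the assumed counter semantics.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

constexpr size_t kBlockSize = 16;

// Hypothetical single-block primitive: writes E_k(src) to dst (16 bytes each).
using BlockFn = std::function<void(const uint8_t *src, uint8_t *dst)>;

// Encrypts |blocks| whole blocks from |in| to |out| in CTR mode. Only bytes
// 12..15 of |ivec| act as a big-endian counter, which wraps modulo 2^32. The
// caller's |ivec| is left unmodified (an assumption of this sketch).
void Ctr32EncryptBlocks(const uint8_t *in, uint8_t *out, size_t blocks,
                        const BlockFn &encrypt_block, const uint8_t *ivec) {
  uint8_t counter_block[kBlockSize];
  std::memcpy(counter_block, ivec, kBlockSize);
  uint32_t ctr = (static_cast<uint32_t>(ivec[12]) << 24) |
                 (static_cast<uint32_t>(ivec[13]) << 16) |
                 (static_cast<uint32_t>(ivec[14]) << 8) |
                 static_cast<uint32_t>(ivec[15]);

  for (size_t i = 0; i < blocks; i++) {
    // Serialize the current counter value into the last four bytes.
    counter_block[12] = static_cast<uint8_t>(ctr >> 24);
    counter_block[13] = static_cast<uint8_t>(ctr >> 16);
    counter_block[14] = static_cast<uint8_t>(ctr >> 8);
    counter_block[15] = static_cast<uint8_t>(ctr);

    uint8_t keystream[kBlockSize];
    encrypt_block(counter_block, keystream);
    for (size_t j = 0; j < kBlockSize; j++) {
      out[i * kBlockSize + j] = in[i * kBlockSize + j] ^ keystream[j];
    }
    ctr++;  // Unsigned overflow wraps to zero; bytes 0..11 never change.
  }
}
```

Because each keystream block depends only on the counter, an implementation is free to compute two of them at once, which is presumably where a 2x routine such as _vpaes_encrypt_2x helps.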

For iOS, this change is completely a no-op. iOS ARMv8 always has crypto
extensions, and we already statically drop all other AES
implementations.

Android ARMv8 is *not* required to have crypto extensions, but every
ARMv8 device I've seen has them. For those, it is a no-op
performance-wise and a win on size. vpaes appears to be about 5.6KiB
smaller than the tables. ARMv8 always makes SIMD (NEON) available, so we
can statically drop aes_nohw.

In theory, however, crypto-less Android ARMv8 is possible. Today such
chips get a variable-time AES. This CL fixes this, but the performance
story is complex.
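
The selection order described above can be pictured with a small sketch. The types `CpuFeatures` and `AesImpl` and the function `SelectAesImpl` are hypothetical names for illustration only; the real dispatch is done by the library's capability checks and build-time configuration, and this simplification ignores 32-bit ARM and bsaes.

```cpp
// Hypothetical labels for the three AES back ends discussed here.
enum class AesImpl { kHardware, kVpaes, kNoHw };

// Hypothetical feature flags; on ARMv8, NEON is always present.
struct CpuFeatures {
  bool has_aes_extensions;
  bool has_neon;
};

// Prefer the crypto extensions, then the constant-time vpaes code, and fall
// back to the table-based aes_nohw only when neither is available.
inline AesImpl SelectAesImpl(const CpuFeatures &cpu) {
  if (cpu.has_aes_extensions) {
    return AesImpl::kHardware;
  }
  if (cpu.has_neon) {
    return AesImpl::kVpaes;
  }
  return AesImpl::kNoHw;
}
```

Under this model an ARMv8 chip can never reach the last branch, which is why aes_nohw can be dropped statically on that platform.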

The Raspberry Pi 3 is not Android but has a Cortex-A53 chip
without crypto extensions. (But the official images are 32-bit, so even
this is slightly artificial...) There, vpaes is a performance win.

Raspberry Pi 3, Model B+, Cortex-A53
Before:
Did 265000 AES-128-GCM (16 bytes) seal operations in 1003312us (264125.2 ops/sec): 4.2 MB/s
Did 44000 AES-128-GCM (256 bytes) seal operations in 1002141us (43906.0 ops/sec): 11.2 MB/s
Did 9394 AES-128-GCM (1350 bytes) seal operations in 1032104us (9101.8 ops/sec): 12.3 MB/s
Did 1562 AES-128-GCM (8192 bytes) seal operations in 1008982us (1548.1 ops/sec): 12.7 MB/s
After:
Did 277000 AES-128-GCM (16 bytes) seal operations in 1001884us (276479.1 ops/sec): 4.4 MB/s
Did 52000 AES-128-GCM (256 bytes) seal operations in 1001480us (51923.2 ops/sec): 13.3 MB/s
Did 11000 AES-128-GCM (1350 bytes) seal operations in 1007979us (10912.9 ops/sec): 14.7 MB/s
Did 2013 AES-128-GCM (8192 bytes) seal operations in 1085545us (1854.4 ops/sec): 15.2 MB/s

The Pixel 3 has a Cortex-A75 with crypto extensions, so it would never
run this code. However, artificially ignoring them gives another data
point (ARM documentation[*] suggests the extensions are still optional
on a Cortex-A75). Sadly, vpaes no longer wins on perf over aes_nohw.
But, it is constant-time:

Pixel 3, AES/PMULL extensions ignored, Cortex-A75:
Before:
Did 2102000 AES-128-GCM (16 bytes) seal operations in 1000378us (2101205.7 ops/sec): 33.6 MB/s
Did 358000 AES-128-GCM (256 bytes) seal operations in 1002658us (357051.0 ops/sec): 91.4 MB/s
Did 75000 AES-128-GCM (1350 bytes) seal operations in 1012830us (74049.9 ops/sec): 100.0 MB/s
Did 13000 AES-128-GCM (8192 bytes) seal operations in 1036524us (12541.9 ops/sec): 102.7 MB/s
After:
Did 1453000 AES-128-GCM (16 bytes) seal operations in 1000213us (1452690.6 ops/sec): 23.2 MB/s
Did 285000 AES-128-GCM (256 bytes) seal operations in 1002227us (284366.7 ops/sec): 72.8 MB/s
Did 60000 AES-128-GCM (1350 bytes) seal operations in 1016106us (59049.0 ops/sec): 79.7 MB/s
Did 11000 AES-128-GCM (8192 bytes) seal operations in 1094184us (10053.2 ops/sec): 82.4 MB/s

Note the numbers above run with PMULL off, so the slow GHASH is
dampening the regression. If we test aes_nohw and vpaes paired with
PMULL on, the 20% perf hit becomes a 31% hit. The PMULL-less variant is
more likely to represent a real chip.
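
For reference, the percentages quoted here follow from the throughput columns above. A minimal helper (the name `RelativeSlowdown` is made up for this note) makes the arithmetic explicit:

```cpp
// Fraction of throughput lost when going from |before_mbps| to |after_mbps|.
// A negative result means the change was a speedup.
double RelativeSlowdown(double before_mbps, double after_mbps) {
  return (before_mbps - after_mbps) / before_mbps;
}
```

With the 8192-byte Pixel 3 figures, RelativeSlowdown(102.7, 82.4) is about 0.198, i.e. the roughly 20% hit mentioned above. With the Raspberry Pi figures, RelativeSlowdown(12.7, 15.2) is negative, reflecting the speedup there.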

This is consistent with upstream's note in the comment, though it is
unclear if 20% is the right order of magnitude: "these results are worse
than scalar compiler-generated code, but it's constant-time and
therefore preferred".

[*] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100458_0301_00_en/lau1442495529696.html

Bug: 246
Change-Id: If1dc87f5131fce742052498295476fbae4628dbf
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35026
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
2019-03-04 20:31:39 +00:00

/* Copyright (c) 2015, Google Inc.
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
 * SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION
 * OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
 * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <memory>
#include <string>
#include <vector>

#include <gtest/gtest.h>

#include <openssl/aes.h>

#include "internal.h"
#include "../../internal.h"
#include "../../test/abi_test.h"
#include "../../test/file_test.h"
#include "../../test/test_util.h"
#include "../../test/wycheproof_util.h"

static void TestRaw(FileTest *t) {
  std::vector<uint8_t> key, plaintext, ciphertext;
  ASSERT_TRUE(t->GetBytes(&key, "Key"));
  ASSERT_TRUE(t->GetBytes(&plaintext, "Plaintext"));
  ASSERT_TRUE(t->GetBytes(&ciphertext, "Ciphertext"));

  ASSERT_EQ(static_cast<unsigned>(AES_BLOCK_SIZE), plaintext.size());
  ASSERT_EQ(static_cast<unsigned>(AES_BLOCK_SIZE), ciphertext.size());

  AES_KEY aes_key;
  ASSERT_EQ(0, AES_set_encrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test encryption.
  uint8_t block[AES_BLOCK_SIZE];
  AES_encrypt(plaintext.data(), block, &aes_key);
  EXPECT_EQ(Bytes(ciphertext), Bytes(block));

  // Test in-place encryption.
  OPENSSL_memcpy(block, plaintext.data(), AES_BLOCK_SIZE);
  AES_encrypt(block, block, &aes_key);
  EXPECT_EQ(Bytes(ciphertext), Bytes(block));

  ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test decryption.
  AES_decrypt(ciphertext.data(), block, &aes_key);
  EXPECT_EQ(Bytes(plaintext), Bytes(block));

  // Test in-place decryption.
  OPENSSL_memcpy(block, ciphertext.data(), AES_BLOCK_SIZE);
  AES_decrypt(block, block, &aes_key);
  EXPECT_EQ(Bytes(plaintext), Bytes(block));
}

static void TestKeyWrap(FileTest *t) {
  // All test vectors use the default IV, so test both with implicit and
  // explicit IV.
  //
  // TODO(davidben): Find test vectors that use a different IV.
  static const uint8_t kDefaultIV[] = {
      0xa6, 0xa6, 0xa6, 0xa6, 0xa6, 0xa6, 0xa6, 0xa6,
  };

  std::vector<uint8_t> key, plaintext, ciphertext;
  ASSERT_TRUE(t->GetBytes(&key, "Key"));
  ASSERT_TRUE(t->GetBytes(&plaintext, "Plaintext"));
  ASSERT_TRUE(t->GetBytes(&ciphertext, "Ciphertext"));

  ASSERT_EQ(plaintext.size() + 8, ciphertext.size())
      << "Invalid Plaintext and Ciphertext lengths.";

  // Test encryption.
  AES_KEY aes_key;
  ASSERT_EQ(0, AES_set_encrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test with implicit IV.
  std::unique_ptr<uint8_t[]> buf(new uint8_t[ciphertext.size()]);
  int len = AES_wrap_key(&aes_key, nullptr /* iv */, buf.get(),
                         plaintext.data(), plaintext.size());
  ASSERT_GE(len, 0);
  EXPECT_EQ(Bytes(ciphertext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test with explicit IV.
  OPENSSL_memset(buf.get(), 0, ciphertext.size());
  len = AES_wrap_key(&aes_key, kDefaultIV, buf.get(), plaintext.data(),
                     plaintext.size());
  ASSERT_GE(len, 0);
  EXPECT_EQ(Bytes(ciphertext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test decryption.
  ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test with implicit IV.
  buf.reset(new uint8_t[plaintext.size()]);
  len = AES_unwrap_key(&aes_key, nullptr /* iv */, buf.get(), ciphertext.data(),
                       ciphertext.size());
  ASSERT_GE(len, 0);
  EXPECT_EQ(Bytes(plaintext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test with explicit IV.
  OPENSSL_memset(buf.get(), 0, plaintext.size());
  len = AES_unwrap_key(&aes_key, kDefaultIV, buf.get(), ciphertext.data(),
                       ciphertext.size());
  ASSERT_GE(len, 0);
  // Check the unwrapped output too, matching the implicit-IV case.
  EXPECT_EQ(Bytes(plaintext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test corrupted ciphertext.
  ciphertext[0] ^= 1;
  EXPECT_EQ(-1, AES_unwrap_key(&aes_key, nullptr /* iv */, buf.get(),
                               ciphertext.data(), ciphertext.size()));
}

TEST(AESTest, TestVectors) {
  FileTestGTest("crypto/fipsmodule/aes/aes_tests.txt", [](FileTest *t) {
    if (t->GetParameter() == "Raw") {
      TestRaw(t);
    } else if (t->GetParameter() == "KeyWrap") {
      TestKeyWrap(t);
    } else {
      ADD_FAILURE() << "Unknown mode " << t->GetParameter();
    }
  });
}

TEST(AESTest, WycheproofKeyWrap) {
  FileTestGTest("third_party/wycheproof_testvectors/kw_test.txt",
                [](FileTest *t) {
    std::string key_size;
    ASSERT_TRUE(t->GetInstruction(&key_size, "keySize"));
    std::vector<uint8_t> ct, key, msg;
    ASSERT_TRUE(t->GetBytes(&ct, "ct"));
    ASSERT_TRUE(t->GetBytes(&key, "key"));
    ASSERT_TRUE(t->GetBytes(&msg, "msg"));
    ASSERT_EQ(static_cast<unsigned>(atoi(key_size.c_str())), key.size() * 8);
    WycheproofResult result;
    ASSERT_TRUE(GetWycheproofResult(t, &result));

    if (result != WycheproofResult::kInvalid) {
      ASSERT_GE(ct.size(), 8u);

      AES_KEY aes;
      ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes));
      std::vector<uint8_t> out(ct.size() - 8);
      int len = AES_unwrap_key(&aes, nullptr, out.data(), ct.data(), ct.size());
      ASSERT_EQ(static_cast<int>(out.size()), len);
      EXPECT_EQ(Bytes(msg), Bytes(out));

      out.resize(msg.size() + 8);
      ASSERT_EQ(0, AES_set_encrypt_key(key.data(), 8 * key.size(), &aes));
      len = AES_wrap_key(&aes, nullptr, out.data(), msg.data(), msg.size());
      ASSERT_EQ(static_cast<int>(out.size()), len);
      EXPECT_EQ(Bytes(ct), Bytes(out));
    } else {
      AES_KEY aes;
      ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes));
      std::vector<uint8_t> out(ct.size() < 8 ? 0 : ct.size() - 8);
      int len = AES_unwrap_key(&aes, nullptr, out.data(), ct.data(), ct.size());
      EXPECT_EQ(-1, len);
    }
  });
}

TEST(AESTest, WrapBadLengths) {
  uint8_t key[128/8] = {0};
  AES_KEY aes;
  ASSERT_EQ(0, AES_set_encrypt_key(key, 128, &aes));

  // Input lengths to |AES_wrap_key| must be a multiple of 8 and at least 16.
  static const size_t kLengths[] = {0, 1,  2,  3,  4,  5,  6,  7, 8,
                                    9, 10, 11, 12, 13, 14, 15, 20};
  for (size_t len : kLengths) {
    SCOPED_TRACE(len);
    std::vector<uint8_t> in(len);
    std::vector<uint8_t> out(len + 8);
    EXPECT_EQ(-1,
              AES_wrap_key(&aes, nullptr, out.data(), in.data(), in.size()));
  }
}

#if defined(SUPPORTS_ABI_TEST)
TEST(AESTest, ABI) {
  for (int bits : {128, 192, 256}) {
    SCOPED_TRACE(bits);
    const uint8_t kKey[256/8] = {0};
    AES_KEY key;
    uint8_t block[AES_BLOCK_SIZE];
    uint8_t buf[AES_BLOCK_SIZE * 64] = {0};
    std::vector<int> block_counts;
    if (bits == 128) {
      block_counts = {0, 1, 2, 3, 4, 8, 16, 31};
    } else {
      // Unwind tests are very slow. Assume that the various input sizes do not
      // differ significantly by round count for ABI purposes.
      block_counts = {0, 1, 8};
    }

    CHECK_ABI(aes_nohw_set_encrypt_key, kKey, bits, &key);
    CHECK_ABI(aes_nohw_encrypt, block, block, &key);
#if defined(AES_NOHW_CBC)
    for (size_t blocks : block_counts) {
      SCOPED_TRACE(blocks);
      CHECK_ABI(aes_nohw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                block, AES_ENCRYPT);
    }
#endif

    CHECK_ABI(aes_nohw_set_decrypt_key, kKey, bits, &key);
    CHECK_ABI(aes_nohw_decrypt, block, block, &key);
#if defined(AES_NOHW_CBC)
    for (size_t blocks : block_counts) {
      SCOPED_TRACE(blocks);
      CHECK_ABI(aes_nohw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                block, AES_DECRYPT);
    }
#endif

    if (bsaes_capable()) {
      aes_nohw_set_encrypt_key(kKey, bits, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        if (blocks != 0) {
          CHECK_ABI(bsaes_ctr32_encrypt_blocks, buf, buf, blocks, &key, block);
        }
      }

      aes_nohw_set_decrypt_key(kKey, bits, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(bsaes_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_DECRYPT);
      }
    }

    if (vpaes_capable()) {
      CHECK_ABI(vpaes_set_encrypt_key, kKey, bits, &key);
      CHECK_ABI(vpaes_encrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(vpaes_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_ENCRYPT);
#if defined(VPAES_CTR32)
        CHECK_ABI(vpaes_ctr32_encrypt_blocks, buf, buf, blocks, &key, block);
#endif
      }

      CHECK_ABI(vpaes_set_decrypt_key, kKey, bits, &key);
      CHECK_ABI(vpaes_decrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(vpaes_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_DECRYPT);
      }
    }

    if (hwaes_capable()) {
      CHECK_ABI(aes_hw_set_encrypt_key, kKey, bits, &key);
      CHECK_ABI(aes_hw_encrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(aes_hw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_ENCRYPT);
        CHECK_ABI(aes_hw_ctr32_encrypt_blocks, buf, buf, blocks, &key, block);
#if defined(HWAES_ECB)
        CHECK_ABI(aes_hw_ecb_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  AES_ENCRYPT);
#endif
      }

      CHECK_ABI(aes_hw_set_decrypt_key, kKey, bits, &key);
      CHECK_ABI(aes_hw_decrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(aes_hw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_DECRYPT);
#if defined(HWAES_ECB)
        CHECK_ABI(aes_hw_ecb_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  AES_DECRYPT);
#endif
      }
    }
  }
}
#endif  // SUPPORTS_ABI_TEST