55db667c62
This patches vpaes-armv8.pl to add vpaes_ctr32_encrypt_blocks. CTR mode is by far the most important mode these days. It should have access to _vpaes_encrypt_2x, which gives a considerable speed boost. Also exclude vpaes_ecb_* as they're not even used.

For iOS, this change is completely a no-op. iOS ARMv8 always has crypto extensions, and we already statically drop all other AES implementations.

Android ARMv8 is *not* required to have crypto extensions, but every ARMv8 device I've seen has them. For those, it is a no-op performance-wise and a win on size. vpaes appears to be about 5.6KiB smaller than the tables. ARMv8 always makes SIMD (NEON) available, so we can statically drop aes_nohw.

In theory, however, crypto-less Android ARMv8 is possible. Today such chips get a variable-time AES. This CL fixes this, but the performance story is complex.

The Raspberry Pi 3 is not Android but has a Cortex-A53 chip without crypto extensions. (But the official images are 32-bit, so even this is slightly artificial...) There, vpaes is a performance win.

Raspberry Pi 3, Model B+, Cortex-A53
Before:
Did 265000 AES-128-GCM (16 bytes) seal operations in 1003312us (264125.2 ops/sec): 4.2 MB/s
Did 44000 AES-128-GCM (256 bytes) seal operations in 1002141us (43906.0 ops/sec): 11.2 MB/s
Did 9394 AES-128-GCM (1350 bytes) seal operations in 1032104us (9101.8 ops/sec): 12.3 MB/s
Did 1562 AES-128-GCM (8192 bytes) seal operations in 1008982us (1548.1 ops/sec): 12.7 MB/s
After:
Did 277000 AES-128-GCM (16 bytes) seal operations in 1001884us (276479.1 ops/sec): 4.4 MB/s
Did 52000 AES-128-GCM (256 bytes) seal operations in 1001480us (51923.2 ops/sec): 13.3 MB/s
Did 11000 AES-128-GCM (1350 bytes) seal operations in 1007979us (10912.9 ops/sec): 14.7 MB/s
Did 2013 AES-128-GCM (8192 bytes) seal operations in 1085545us (1854.4 ops/sec): 15.2 MB/s

The Pixel 3 has a Cortex-A75 with crypto extensions, so it would never run this code. However, artificially ignoring them gives another data point (ARM documentation[*] suggests the extensions are still optional on a Cortex-A75.) Sadly, vpaes no longer wins on perf over aes_nohw. But, it is constant-time:

Pixel 3, AES/PMULL extensions ignored, Cortex-A75:
Before:
Did 2102000 AES-128-GCM (16 bytes) seal operations in 1000378us (2101205.7 ops/sec): 33.6 MB/s
Did 358000 AES-128-GCM (256 bytes) seal operations in 1002658us (357051.0 ops/sec): 91.4 MB/s
Did 75000 AES-128-GCM (1350 bytes) seal operations in 1012830us (74049.9 ops/sec): 100.0 MB/s
Did 13000 AES-128-GCM (8192 bytes) seal operations in 1036524us (12541.9 ops/sec): 102.7 MB/s
After:
Did 1453000 AES-128-GCM (16 bytes) seal operations in 1000213us (1452690.6 ops/sec): 23.2 MB/s
Did 285000 AES-128-GCM (256 bytes) seal operations in 1002227us (284366.7 ops/sec): 72.8 MB/s
Did 60000 AES-128-GCM (1350 bytes) seal operations in 1016106us (59049.0 ops/sec): 79.7 MB/s
Did 11000 AES-128-GCM (8192 bytes) seal operations in 1094184us (10053.2 ops/sec): 82.4 MB/s

Note the numbers above run with PMULL off, so the slow GHASH is dampening the regression. If we test aes_nohw and vpaes paired with PMULL on, the 20% perf hit becomes a 31% hit. The PMULL-less variant is more likely to represent a real chip.

This is consistent with upstream's note in the comment, though it is unclear if 20% is the right order of magnitude: "these results are worse than scalar compiler-generated code, but it's constant-time and therefore preferred".

[*] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100458_0301_00_en/lau1442495529696.html

Bug: 246
Change-Id: If1dc87f5131fce742052498295476fbae4628dbf
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/35026
Commit-Queue: David Benjamin <davidben@google.com>
Reviewed-by: Adam Langley <agl@google.com>
296 lines
10 KiB
C++
/* Copyright (c) 2015, Google Inc.
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
 * SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION
 * OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
 * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <memory>
#include <string>
#include <vector>

#include <gtest/gtest.h>

#include <openssl/aes.h>

#include "internal.h"
#include "../../internal.h"
#include "../../test/abi_test.h"
#include "../../test/file_test.h"
#include "../../test/test_util.h"
#include "../../test/wycheproof_util.h"


static void TestRaw(FileTest *t) {
  std::vector<uint8_t> key, plaintext, ciphertext;
  ASSERT_TRUE(t->GetBytes(&key, "Key"));
  ASSERT_TRUE(t->GetBytes(&plaintext, "Plaintext"));
  ASSERT_TRUE(t->GetBytes(&ciphertext, "Ciphertext"));

  ASSERT_EQ(static_cast<unsigned>(AES_BLOCK_SIZE), plaintext.size());
  ASSERT_EQ(static_cast<unsigned>(AES_BLOCK_SIZE), ciphertext.size());

  AES_KEY aes_key;
  ASSERT_EQ(0, AES_set_encrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test encryption.
  uint8_t block[AES_BLOCK_SIZE];
  AES_encrypt(plaintext.data(), block, &aes_key);
  EXPECT_EQ(Bytes(ciphertext), Bytes(block));

  // Test in-place encryption.
  OPENSSL_memcpy(block, plaintext.data(), AES_BLOCK_SIZE);
  AES_encrypt(block, block, &aes_key);
  EXPECT_EQ(Bytes(ciphertext), Bytes(block));

  ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test decryption.
  AES_decrypt(ciphertext.data(), block, &aes_key);
  EXPECT_EQ(Bytes(plaintext), Bytes(block));

  // Test in-place decryption.
  OPENSSL_memcpy(block, ciphertext.data(), AES_BLOCK_SIZE);
  AES_decrypt(block, block, &aes_key);
  EXPECT_EQ(Bytes(plaintext), Bytes(block));
}

static void TestKeyWrap(FileTest *t) {
  // All test vectors use the default IV, so test both with implicit and
  // explicit IV.
  //
  // TODO(davidben): Find test vectors that use a different IV.
  static const uint8_t kDefaultIV[] = {
      0xa6, 0xa6, 0xa6, 0xa6, 0xa6, 0xa6, 0xa6, 0xa6,
  };

  std::vector<uint8_t> key, plaintext, ciphertext;
  ASSERT_TRUE(t->GetBytes(&key, "Key"));
  ASSERT_TRUE(t->GetBytes(&plaintext, "Plaintext"));
  ASSERT_TRUE(t->GetBytes(&ciphertext, "Ciphertext"));

  ASSERT_EQ(plaintext.size() + 8, ciphertext.size())
      << "Invalid Plaintext and Ciphertext lengths.";

  // Test encryption.
  AES_KEY aes_key;
  ASSERT_EQ(0, AES_set_encrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test with implicit IV.
  std::unique_ptr<uint8_t[]> buf(new uint8_t[ciphertext.size()]);
  int len = AES_wrap_key(&aes_key, nullptr /* iv */, buf.get(),
                         plaintext.data(), plaintext.size());
  ASSERT_GE(len, 0);
  EXPECT_EQ(Bytes(ciphertext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test with explicit IV.
  OPENSSL_memset(buf.get(), 0, ciphertext.size());
  len = AES_wrap_key(&aes_key, kDefaultIV, buf.get(), plaintext.data(),
                     plaintext.size());
  ASSERT_GE(len, 0);
  EXPECT_EQ(Bytes(ciphertext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test decryption.
  ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes_key));

  // Test with implicit IV.
  buf.reset(new uint8_t[plaintext.size()]);
  len = AES_unwrap_key(&aes_key, nullptr /* iv */, buf.get(), ciphertext.data(),
                       ciphertext.size());
  ASSERT_GE(len, 0);
  EXPECT_EQ(Bytes(plaintext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test with explicit IV.
  OPENSSL_memset(buf.get(), 0, plaintext.size());
  len = AES_unwrap_key(&aes_key, kDefaultIV, buf.get(), ciphertext.data(),
                       ciphertext.size());
  ASSERT_GE(len, 0);
  // Also compare the output, as the other three cases above do.
  EXPECT_EQ(Bytes(plaintext), Bytes(buf.get(), static_cast<size_t>(len)));

  // Test corrupted ciphertext.
  ciphertext[0] ^= 1;
  EXPECT_EQ(-1, AES_unwrap_key(&aes_key, nullptr /* iv */, buf.get(),
                               ciphertext.data(), ciphertext.size()));
}

TEST(AESTest, TestVectors) {
  FileTestGTest("crypto/fipsmodule/aes/aes_tests.txt", [](FileTest *t) {
    if (t->GetParameter() == "Raw") {
      TestRaw(t);
    } else if (t->GetParameter() == "KeyWrap") {
      TestKeyWrap(t);
    } else {
      ADD_FAILURE() << "Unknown mode " << t->GetParameter();
    }
  });
}

TEST(AESTest, WycheproofKeyWrap) {
  FileTestGTest("third_party/wycheproof_testvectors/kw_test.txt",
                [](FileTest *t) {
    std::string key_size;
    ASSERT_TRUE(t->GetInstruction(&key_size, "keySize"));
    std::vector<uint8_t> ct, key, msg;
    ASSERT_TRUE(t->GetBytes(&ct, "ct"));
    ASSERT_TRUE(t->GetBytes(&key, "key"));
    ASSERT_TRUE(t->GetBytes(&msg, "msg"));
    ASSERT_EQ(static_cast<unsigned>(atoi(key_size.c_str())), key.size() * 8);
    WycheproofResult result;
    ASSERT_TRUE(GetWycheproofResult(t, &result));

    if (result != WycheproofResult::kInvalid) {
      ASSERT_GE(ct.size(), 8u);

      AES_KEY aes;
      ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes));
      std::vector<uint8_t> out(ct.size() - 8);
      int len = AES_unwrap_key(&aes, nullptr, out.data(), ct.data(), ct.size());
      ASSERT_EQ(static_cast<int>(out.size()), len);
      EXPECT_EQ(Bytes(msg), Bytes(out));

      out.resize(msg.size() + 8);
      ASSERT_EQ(0, AES_set_encrypt_key(key.data(), 8 * key.size(), &aes));
      len = AES_wrap_key(&aes, nullptr, out.data(), msg.data(), msg.size());
      ASSERT_EQ(static_cast<int>(out.size()), len);
      EXPECT_EQ(Bytes(ct), Bytes(out));
    } else {
      AES_KEY aes;
      ASSERT_EQ(0, AES_set_decrypt_key(key.data(), 8 * key.size(), &aes));
      std::vector<uint8_t> out(ct.size() < 8 ? 0 : ct.size() - 8);
      int len = AES_unwrap_key(&aes, nullptr, out.data(), ct.data(), ct.size());
      EXPECT_EQ(-1, len);
    }
  });
}

TEST(AESTest, WrapBadLengths) {
  uint8_t key[128/8] = {0};
  AES_KEY aes;
  ASSERT_EQ(0, AES_set_encrypt_key(key, 128, &aes));

  // Input lengths to |AES_wrap_key| must be a multiple of 8 and at least 16.
  static const size_t kLengths[] = {0, 1,  2,  3,  4,  5,  6,  7, 8,
                                    9, 10, 11, 12, 13, 14, 15, 20};
  for (size_t len : kLengths) {
    SCOPED_TRACE(len);
    std::vector<uint8_t> in(len);
    std::vector<uint8_t> out(len + 8);
    EXPECT_EQ(-1,
              AES_wrap_key(&aes, nullptr, out.data(), in.data(), in.size()));
  }
}

#if defined(SUPPORTS_ABI_TEST)
TEST(AESTest, ABI) {
  for (int bits : {128, 192, 256}) {
    SCOPED_TRACE(bits);
    const uint8_t kKey[256/8] = {0};
    AES_KEY key;
    uint8_t block[AES_BLOCK_SIZE];
    uint8_t buf[AES_BLOCK_SIZE * 64] = {0};
    std::vector<int> block_counts;
    if (bits == 128) {
      block_counts = {0, 1, 2, 3, 4, 8, 16, 31};
    } else {
      // Unwind tests are very slow. Assume that the various input sizes do not
      // differ significantly by round count for ABI purposes.
      block_counts = {0, 1, 8};
    }

    CHECK_ABI(aes_nohw_set_encrypt_key, kKey, bits, &key);
    CHECK_ABI(aes_nohw_encrypt, block, block, &key);
#if defined(AES_NOHW_CBC)
    for (size_t blocks : block_counts) {
      SCOPED_TRACE(blocks);
      CHECK_ABI(aes_nohw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                block, AES_ENCRYPT);
    }
#endif

    CHECK_ABI(aes_nohw_set_decrypt_key, kKey, bits, &key);
    CHECK_ABI(aes_nohw_decrypt, block, block, &key);
#if defined(AES_NOHW_CBC)
    for (size_t blocks : block_counts) {
      SCOPED_TRACE(blocks);
      CHECK_ABI(aes_nohw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                block, AES_DECRYPT);
    }
#endif

    if (bsaes_capable()) {
      aes_nohw_set_encrypt_key(kKey, bits, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        if (blocks != 0) {
          CHECK_ABI(bsaes_ctr32_encrypt_blocks, buf, buf, blocks, &key, block);
        }
      }

      aes_nohw_set_decrypt_key(kKey, bits, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(bsaes_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_DECRYPT);
      }
    }

    if (vpaes_capable()) {
      CHECK_ABI(vpaes_set_encrypt_key, kKey, bits, &key);
      CHECK_ABI(vpaes_encrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(vpaes_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_ENCRYPT);
#if defined(VPAES_CTR32)
        CHECK_ABI(vpaes_ctr32_encrypt_blocks, buf, buf, blocks, &key, block);
#endif
      }

      CHECK_ABI(vpaes_set_decrypt_key, kKey, bits, &key);
      CHECK_ABI(vpaes_decrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(vpaes_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_DECRYPT);
      }
    }

    if (hwaes_capable()) {
      CHECK_ABI(aes_hw_set_encrypt_key, kKey, bits, &key);
      CHECK_ABI(aes_hw_encrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(aes_hw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_ENCRYPT);
        CHECK_ABI(aes_hw_ctr32_encrypt_blocks, buf, buf, blocks, &key, block);
#if defined(HWAES_ECB)
        CHECK_ABI(aes_hw_ecb_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  AES_ENCRYPT);
#endif
      }

      CHECK_ABI(aes_hw_set_decrypt_key, kKey, bits, &key);
      CHECK_ABI(aes_hw_decrypt, block, block, &key);
      for (size_t blocks : block_counts) {
        SCOPED_TRACE(blocks);
        CHECK_ABI(aes_hw_cbc_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  block, AES_DECRYPT);
#if defined(HWAES_ECB)
        CHECK_ABI(aes_hw_ecb_encrypt, buf, buf, AES_BLOCK_SIZE * blocks, &key,
                  AES_DECRYPT);
#endif
      }
    }
  }
}
#endif  // SUPPORTS_ABI_TEST