boringssl/crypto
Andres Erbsen 46304abf7d ec/p256.c: fiat-crypto field arithmetic (64, 32)
The fiat-crypto-generated code uses the Montgomery form implementation
strategy, for both 32-bit and 64-bit code.

64-bit throughput seems slower, but the difference is smaller than noise between repetitions (-2%?)

32-bit throughput has decreased significantly for ECDH (-40%). I am
attributing this to the change from varibale-time scalar multiplication
to constant-time scalar multiplication. Due to the same bottleneck,
ECDSA verification still uses the old code (otherwise there would have
been a 60% throughput decrease). On the other hand, ECDSA signing
throughput has increased slightly (+10%), perhaps due to the use of a
precomputed table of multiples of the base point.

64-bit benchmarks (Google Cloud Haswell):

with this change:
Did 9126 ECDH P-256 operations in 1009572us (9039.5 ops/sec)
Did 23000 ECDSA P-256 signing operations in 1039832us (22119.0 ops/sec)
Did 8820 ECDSA P-256 verify operations in 1024242us (8611.2 ops/sec)

master (40e8c921ca):
Did 9340 ECDH P-256 operations in 1017975us (9175.1 ops/sec)
Did 23000 ECDSA P-256 signing operations in 1039820us (22119.2 ops/sec)
Did 8688 ECDSA P-256 verify operations in 1021108us (8508.4 ops/sec)

benchmarks on ARMv7 (LG Nexus 4):

with this change:
Did 150 ECDH P-256 operations in 1029726us (145.7 ops/sec)
Did 506 ECDSA P-256 signing operations in 1065192us (475.0 ops/sec)
Did 363 ECDSA P-256 verify operations in 1033298us (351.3 ops/sec)

master (2fce1beda0):
Did 245 ECDH P-256 operations in 1017518us (240.8 ops/sec)
Did 473 ECDSA P-256 signing operations in 1086281us (435.4 ops/sec)
Did 360 ECDSA P-256 verify operations in 1003846us (358.6 ops/sec)

64-bit tables converted as follows:

import re, sys, math

p = 2**256 - 2**224 + 2**192 + 2**96 - 1
R = 2**256

def convert(t):
    x0, s1, x1, s2, x2, s3, x3 = t.groups()
    v = int(x0, 0) + 2**64 * (int(x1, 0) + 2**64*(int(x2,0) + 2**64*(int(x3, 0)) ))
    w = v*R%p
    y0 = hex(w%(2**64))
    y1 = hex((w>>64)%(2**64))
    y2 = hex((w>>(2*64))%(2**64))
    y3 = hex((w>>(3*64))%(2**64))
    ww = int(y0, 0) + 2**64 * (int(y1, 0) + 2**64*(int(y2,0) + 2**64*(int(y3, 0)) ))
    if ww != v*R%p:
        print(x0,x1,x2,x3)
        print(hex(v))
        print(y0,y1,y2,y3)
        print(hex(w))
        print(hex(ww))
        assert 0
    return '{'+y0+s1+y1+s2+y2+s3+y3+'}'

fe_re = re.compile('{'+r'(\s*,\s*)'.join(r'(\d+|0x[abcdefABCDEF0123456789]+)' for i in range(4)) + '}')
print (re.sub(fe_re, convert, sys.stdin.read()).rstrip('\n'))

32-bit tables converted from 64-bit tables

Change-Id: I52d6e5504fcb6ca2e8b0ee13727f4500c80c1799
Reviewed-on: https://boringssl-review.googlesource.com/23244
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-12-11 17:55:46 +00:00
..
asn1 Use uint32_t for unicode code points. 2017-12-08 17:51:34 +00:00
base64 Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
bio Add missing errno.h include to bio_test.cc 2017-12-04 01:32:37 +00:00
bn_extra Remove the buggy RSA parser. 2017-10-24 17:39:46 +00:00
buf Always process handshake records in full. 2017-10-17 14:53:11 +00:00
bytestring Revert "Support high tag numbers in CBS/CBB." 2017-11-30 21:57:17 +00:00
chacha Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
cipher_extra Explicit fallthrough on switch 2017-09-20 19:58:25 +00:00
cmac Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
conf Add more compatibility symbols for Node. 2017-11-03 01:31:50 +00:00
curve25519 Move curve25519 code to third_party/fiat. 2017-11-03 22:23:59 +00:00
dh Fx DH_set0_pqg. 2017-10-05 18:50:48 +00:00
digest_extra Export EVP_parse_digest_algorithm and add EVP_marshal_digest_algorithm. 2017-09-25 20:44:13 +00:00
dsa Remove DSA_sign_setup too. 2017-11-22 21:01:11 +00:00
ec_extra Support high tag numbers in CBS/CBB. 2017-11-22 22:34:05 +00:00
ecdh Check EC_POINT/EC_GROUP compatibility more accurately. 2017-10-28 08:02:50 +00:00
ecdsa_extra Remove ECDSA_sign_setup and friends. 2017-11-22 20:23:40 +00:00
engine Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
err Reimplement OBJ_txt2obj and add a lower-level function. 2017-11-27 21:29:00 +00:00
evp Add some missing OpenSSL 1.1.0 accessors. 2017-11-22 18:43:38 +00:00
fipsmodule ec/p256.c: fiat-crypto field arithmetic (64, 32) 2017-12-11 17:55:46 +00:00
hkdf Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
hmac_extra Convert a number of tests to GTest. 2017-06-01 17:02:13 +00:00
lhash Unexport more of lhash. 2017-10-25 04:17:18 +00:00
obj Also add a decoupled OBJ_obj2txt. 2017-11-30 18:21:48 +00:00
pem Clear some _CRT_SECURE_NO_WARNINGS warnings. 2017-10-25 04:14:28 +00:00
perlasm Revert assembly changes in "Hide CPU capability symbols in C." 2017-10-30 20:39:57 +00:00
pkcs7 Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
pkcs8 Export EVP_parse_digest_algorithm and add EVP_marshal_digest_algorithm. 2017-09-25 20:44:13 +00:00
poly1305 Remove custom memcpy and memset from poly1305_vec. 2017-11-10 20:53:30 +00:00
pool Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
rand_extra Remove CHROMIUM_ROLLING_MAGENTA_TO_ZIRCON scaffolding. 2017-09-18 21:34:32 +00:00
rc4
rsa_extra Make BN_generate_dsa_nonce internally constant-time. 2017-11-20 16:18:30 +00:00
stack Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
test Clarify ERR_print_errors_* clear the error queue. 2017-09-05 17:31:25 +00:00
x509 Use uint32_t for unicode code points. 2017-12-08 17:51:34 +00:00
x509v3 Pretty-print large INTEGERs and ENUMERATEDs in hex. 2017-11-27 18:38:50 +00:00
CMakeLists.txt Move curve25519 code to third_party/fiat. 2017-11-03 22:23:59 +00:00
compiler_test.cc Test that nullptr has the obvious memory representation. 2017-07-28 17:39:28 +00:00
constant_time_test.cc Switch constant-time functions to using |crypto_word_t|. 2017-04-21 22:06:05 +00:00
cpu-aarch64-linux.c Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
cpu-arm-linux.c Add CRYPTO_needs_hwcap2_workaround. 2017-09-18 14:05:46 +00:00
cpu-arm.c
cpu-intel.c Use unsigned integers for masks. 2017-10-30 18:39:58 +00:00
cpu-ppc64le.c Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
crypto.c Hide CPU capability symbols in C. 2017-10-23 18:36:49 +00:00
ex_data.c Unexport more of lhash. 2017-10-25 04:17:18 +00:00
internal.h Tidy up alignof #defines. 2017-09-25 14:20:54 +00:00
mem.c Remove now unnecessary _POSIX_C_SOURCE bits to work around macOS bug. 2017-10-02 20:02:22 +00:00
refcount_c11.c Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
refcount_lock.c Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
refcount_test.cc Convert various tests to GTest. 2017-05-23 22:34:09 +00:00
thread_none.c Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
thread_pthread.c Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
thread_test.cc Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
thread_win.c Run the comment converter on libcrypto. 2017-08-18 21:49:04 +00:00
thread.c