boringssl/crypto
Andres Erbsen 46304abf7d ec/p256.c: fiat-crypto field arithmetic (64, 32)
The fiat-crypto-generated code uses the Montgomery form implementation
strategy, for both 32-bit and 64-bit code.

64-bit throughput seems slower, but the difference is smaller than noise between repetitions (-2%?)

32-bit throughput has decreased significantly for ECDH (-40%). I am
attributing this to the change from varibale-time scalar multiplication
to constant-time scalar multiplication. Due to the same bottleneck,
ECDSA verification still uses the old code (otherwise there would have
been a 60% throughput decrease). On the other hand, ECDSA signing
throughput has increased slightly (+10%), perhaps due to the use of a
precomputed table of multiples of the base point.

64-bit benchmarks (Google Cloud Haswell):

with this change:
Did 9126 ECDH P-256 operations in 1009572us (9039.5 ops/sec)
Did 23000 ECDSA P-256 signing operations in 1039832us (22119.0 ops/sec)
Did 8820 ECDSA P-256 verify operations in 1024242us (8611.2 ops/sec)

master (40e8c921ca):
Did 9340 ECDH P-256 operations in 1017975us (9175.1 ops/sec)
Did 23000 ECDSA P-256 signing operations in 1039820us (22119.2 ops/sec)
Did 8688 ECDSA P-256 verify operations in 1021108us (8508.4 ops/sec)

benchmarks on ARMv7 (LG Nexus 4):

with this change:
Did 150 ECDH P-256 operations in 1029726us (145.7 ops/sec)
Did 506 ECDSA P-256 signing operations in 1065192us (475.0 ops/sec)
Did 363 ECDSA P-256 verify operations in 1033298us (351.3 ops/sec)

master (2fce1beda0):
Did 245 ECDH P-256 operations in 1017518us (240.8 ops/sec)
Did 473 ECDSA P-256 signing operations in 1086281us (435.4 ops/sec)
Did 360 ECDSA P-256 verify operations in 1003846us (358.6 ops/sec)

64-bit tables converted as follows:

import re, sys, math

p = 2**256 - 2**224 + 2**192 + 2**96 - 1
R = 2**256

def convert(t):
    x0, s1, x1, s2, x2, s3, x3 = t.groups()
    v = int(x0, 0) + 2**64 * (int(x1, 0) + 2**64*(int(x2,0) + 2**64*(int(x3, 0)) ))
    w = v*R%p
    y0 = hex(w%(2**64))
    y1 = hex((w>>64)%(2**64))
    y2 = hex((w>>(2*64))%(2**64))
    y3 = hex((w>>(3*64))%(2**64))
    ww = int(y0, 0) + 2**64 * (int(y1, 0) + 2**64*(int(y2,0) + 2**64*(int(y3, 0)) ))
    if ww != v*R%p:
        print(x0,x1,x2,x3)
        print(hex(v))
        print(y0,y1,y2,y3)
        print(hex(w))
        print(hex(ww))
        assert 0
    return '{'+y0+s1+y1+s2+y2+s3+y3+'}'

fe_re = re.compile('{'+r'(\s*,\s*)'.join(r'(\d+|0x[abcdefABCDEF0123456789]+)' for i in range(4)) + '}')
print (re.sub(fe_re, convert, sys.stdin.read()).rstrip('\n'))

32-bit tables converted from 64-bit tables

Change-Id: I52d6e5504fcb6ca2e8b0ee13727f4500c80c1799
Reviewed-on: https://boringssl-review.googlesource.com/23244
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
2017-12-11 17:55:46 +00:00
..
asn1 Use uint32_t for unicode code points. 2017-12-08 17:51:34 +00:00
base64
bio Add missing errno.h include to bio_test.cc 2017-12-04 01:32:37 +00:00
bn_extra
buf
bytestring Revert "Support high tag numbers in CBS/CBB." 2017-11-30 21:57:17 +00:00
chacha
cipher_extra
cmac
conf Add more compatibility symbols for Node. 2017-11-03 01:31:50 +00:00
curve25519 Move curve25519 code to third_party/fiat. 2017-11-03 22:23:59 +00:00
dh
digest_extra
dsa Remove DSA_sign_setup too. 2017-11-22 21:01:11 +00:00
ec_extra Support high tag numbers in CBS/CBB. 2017-11-22 22:34:05 +00:00
ecdh
ecdsa_extra Remove ECDSA_sign_setup and friends. 2017-11-22 20:23:40 +00:00
engine
err Reimplement OBJ_txt2obj and add a lower-level function. 2017-11-27 21:29:00 +00:00
evp Add some missing OpenSSL 1.1.0 accessors. 2017-11-22 18:43:38 +00:00
fipsmodule ec/p256.c: fiat-crypto field arithmetic (64, 32) 2017-12-11 17:55:46 +00:00
hkdf
hmac_extra
lhash
obj Also add a decoupled OBJ_obj2txt. 2017-11-30 18:21:48 +00:00
pem
perlasm
pkcs7
pkcs8
poly1305 Remove custom memcpy and memset from poly1305_vec. 2017-11-10 20:53:30 +00:00
pool
rand_extra
rc4
rsa_extra Make BN_generate_dsa_nonce internally constant-time. 2017-11-20 16:18:30 +00:00
stack
test
x509 Use uint32_t for unicode code points. 2017-12-08 17:51:34 +00:00
x509v3 Pretty-print large INTEGERs and ENUMERATEDs in hex. 2017-11-27 18:38:50 +00:00
CMakeLists.txt Move curve25519 code to third_party/fiat. 2017-11-03 22:23:59 +00:00
compiler_test.cc
constant_time_test.cc
cpu-aarch64-linux.c
cpu-arm-linux.c
cpu-arm.c
cpu-intel.c
cpu-ppc64le.c
crypto.c
ex_data.c
internal.h
mem.c
refcount_c11.c
refcount_lock.c
refcount_test.cc
thread_none.c
thread_pthread.c
thread_test.cc
thread_win.c
thread.c