boringssl

Author	SHA1	Message	Date
David Benjamin	8370fb6b41	Implement constant-time generic multiplication. This is slower, but constant-time. It intentionally omits the signed digit optimization because we cannot be sure the doubling case will be unreachable for all curves. This is a fallback generic implementation for curves which we must support for compatibility but which are not common or important enough to justify curve-specific work. Before: Did 814 ECDH P-384 operations in 1085384us (750.0 ops/sec) Did 1430 ECDSA P-384 signing operations in 1081988us (1321.6 ops/sec) Did 308 ECDH P-521 operations in 1057741us (291.2 ops/sec) Did 539 ECDSA P-521 signing operations in 1049797us (513.4 ops/sec) After: Did 715 ECDH P-384 operations in 1080161us (661.9 ops/sec) Did 1188 ECDSA P-384 verify operations in 1069567us (1110.7 ops/sec) Did 275 ECDH P-521 operations in 1060503us (259.3 ops/sec) Did 506 ECDSA P-521 signing operations in 1084739us (466.5 ops/sec) But we're still faster than the old BIGNUM implementation. EC_FELEM more than paid for both the loss of points_make_affine and this CL. Bug: 239 Change-Id: I65d71a731aad16b523928ee47618822d503ea704 Reviewed-on: https://boringssl-review.googlesource.com/27708 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-04-27 20:11:29 +00:00
David Benjamin	041dd68cec	Clear mallocs in ec_wNAF_mul. EC_POINT is split into the existing public EC_POINT (where the caller is sanity-checked about group mismatches) and the low-level EC_RAW_POINT (which, like EC_FELEM and EC_SCALAR, assumes that is your problem and is a plain old struct). Having both EC_POINT and EC_RAW_POINT is a little silly, but we're going to want different type signatures for functions which return void anyway (my plan is to lift a non-BIGNUM get_affine_coordinates up through the ECDSA and ECDH code), so I think it's fine. This wasn't strictly necessary, but wnaf.c is a lot tidier now. Perf is a wash; once we get up to this layer, it's only 8 entries in the table so not particularly interesting. Bug: 239 Change-Id: I8ace749393d359f42649a5bb0734597bb7c07a2e Reviewed-on: https://boringssl-review.googlesource.com/27706 Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> Reviewed-by: Adam Langley <agl@google.com>	2018-04-27 19:44:58 +00:00
David Benjamin	32e0d10069	Add EC_FELEM for EC_POINTs and related temporaries. This introduces EC_FELEM, which is analogous to EC_SCALAR. It is used for EC_POINT's representation in the generic EC_METHOD, as well as random operations on tuned EC_METHODs that are still implemented generically. Unlike EC_SCALAR, EC_FELEM's exact representation is awkwardly specific to the EC_METHOD, analogous to how the old values were BIGNUMs but may or may not have been in Montgomery form. This is kind of a nuisance, but no more than before. (If p224-64.c were easily convertible to Montgomery form, we could say |EC_FELEM| is always in Montgomery form. If we exposed the internal add and double implementations in each of the curves, we could give |EC_POINT| an |EC_METHOD|-specific representation and |EC_FELEM| is purely a |EC_GFp_mont_method| type. I'll leave this for later.) The generic add and doubling formulas are aligned with the formulas proved in fiat-crypto. Those only applied to a = -3, so I've proved a generic one in https://github.com/mit-plv/fiat-crypto/pull/356, in case someone uses a custom curve. The new formulas are verified, constant-time, and swap a multiply for a square. As expressed in fiat-crypto they do use more temporaries, but this seems to be fine with stack-allocated EC_FELEMs. (We can try to help the compiler later, but benchmarks below suggest this isn't necessary.) Unlike BIGNUM, EC_FELEM can be stack-allocated. It also captures the bounds in the type system and, in particular, that the width is correct, which will make it easier to select a point in constant-time in the future. (Indeed the old code did not always have the correct width. Its point formula involved halving and implemented this in variable time and variable width.) 
Before: Did 77274 ECDH P-256 operations in 10046087us (7692.0 ops/sec) Did 5959 ECDH P-384 operations in 10031701us (594.0 ops/sec) Did 10815 ECDSA P-384 signing operations in 10087892us (1072.1 ops/sec) Did 8976 ECDSA P-384 verify operations in 10071038us (891.3 ops/sec) Did 2600 ECDH P-521 operations in 10091688us (257.6 ops/sec) Did 4590 ECDSA P-521 signing operations in 10055195us (456.5 ops/sec) Did 3811 ECDSA P-521 verify operations in 10003574us (381.0 ops/sec) After: Did 77736 ECDH P-256 operations in 10029858us (7750.5 ops/sec) [+0.8%] Did 7519 ECDH P-384 operations in 10068076us (746.8 ops/sec) [+25.7%] Did 13335 ECDSA P-384 signing operations in 10029962us (1329.5 ops/sec) [+24.0%] Did 11021 ECDSA P-384 verify operations in 10088600us (1092.4 ops/sec) [+22.6%] Did 2912 ECDH P-521 operations in 10001325us (291.2 ops/sec) [+13.0%] Did 5150 ECDSA P-521 signing operations in 10027462us (513.6 ops/sec) [+12.5%] Did 4264 ECDSA P-521 verify operations in 10069694us (423.4 ops/sec) [+11.1%] This more than pays for removing points_make_affine previously and even speeds up ECDH P-256 slightly. (The point-on-curve check uses the generic code.) Next is to push the stack-allocating up to ec_wNAF_mul, followed by a constant-time single-point multiplication. Bug: 239 Change-Id: I44a2dff7c52522e491d0f8cffff64c4ab5cd353c Reviewed-on: https://boringssl-review.googlesource.com/27668 Reviewed-by: Adam Langley <agl@google.com>	2018-04-25 16:39:58 +00:00
David Benjamin	5c0e0cec83	Remove Z = 1 special-case in generic point_get_affine. As the point may be the output of some private key operation, whether Z accidentally hit one is secret. Bug: 239 Change-Id: I7db34cd3b5dd5ca4b96980e8993a9b4eda49eb88 Reviewed-on: https://boringssl-review.googlesource.com/27664 Reviewed-by: Adam Langley <alangley@gmail.com>	2018-04-24 16:16:53 +00:00
David Benjamin	364a51ec3a	Abstract scalar inversion in EC_METHOD. This introduces a hook for the OpenSSL assembly. Change-Id: I35e0588f0ed5bed375b12f738d16c9f46ceedeea Reviewed-on: https://boringssl-review.googlesource.com/27592 Reviewed-by: Adam Langley <alangley@gmail.com>	2018-04-24 16:13:24 +00:00
David Benjamin	f4b708cc1e	Add a function which folds BN_MONT_CTX_{new,set} together. These empty states aren't any use to either caller or implementor. Change-Id: If0b748afeeb79e4a1386182e61c5b5ecf838de62 Reviewed-on: https://boringssl-review.googlesource.com/25254 Reviewed-by: Adam Langley <agl@google.com>	2018-02-02 20:23:25 +00:00
Andres Erbsen	46304abf7d	ec/p256.c: fiat-crypto field arithmetic (64, 32) The fiat-crypto-generated code uses the Montgomery form implementation strategy, for both 32-bit and 64-bit code. 64-bit throughput seems slower, but the difference is smaller than noise between repetitions (-2%?) 32-bit throughput has decreased significantly for ECDH (-40%). I am attributing this to the change from variable-time scalar multiplication to constant-time scalar multiplication. Due to the same bottleneck, ECDSA verification still uses the old code (otherwise there would have been a 60% throughput decrease). On the other hand, ECDSA signing throughput has increased slightly (+10%), perhaps due to the use of a precomputed table of multiples of the base point. 64-bit benchmarks (Google Cloud Haswell): with this change: Did 9126 ECDH P-256 operations in 1009572us (9039.5 ops/sec) Did 23000 ECDSA P-256 signing operations in 1039832us (22119.0 ops/sec) Did 8820 ECDSA P-256 verify operations in 1024242us (8611.2 ops/sec) master (`40e8c921ca`): Did 9340 ECDH P-256 operations in 1017975us (9175.1 ops/sec) Did 23000 ECDSA P-256 signing operations in 1039820us (22119.2 ops/sec) Did 8688 ECDSA P-256 verify operations in 1021108us (8508.4 ops/sec) benchmarks on ARMv7 (LG Nexus 4): with this change: Did 150 ECDH P-256 operations in 1029726us (145.7 ops/sec) Did 506 ECDSA P-256 signing operations in 1065192us (475.0 ops/sec) Did 363 ECDSA P-256 verify operations in 1033298us (351.3 ops/sec) master (`2fce1beda0`): Did 245 ECDH P-256 operations in 1017518us (240.8 ops/sec) Did 473 ECDSA P-256 signing operations in 1086281us (435.4 ops/sec) Did 360 ECDSA P-256 verify operations in 1003846us (358.6 ops/sec) 64-bit tables converted as follows:
    import re, sys, math

    # P-256 field prime and Montgomery radix.
    p = 2**256 - 2**224 + 2**192 + 2**96 - 1
    R = 2**256

    def convert(t):
        # t matches one field element: four 64-bit limbs, least significant first.
        x0, s1, x1, s2, x2, s3, x3 = t.groups()
        v = int(x0, 0) + 2**64 * (int(x1, 0) + 2**64*(int(x2,0) + 2**64*(int(x3, 0)) ))
        # Convert to Montgomery form and split back into limbs.
        w = v*R%p
        y0 = hex(w%(2**64))
        y1 = hex((w>>64)%(2**64))
        y2 = hex((w>>(2*64))%(2**64))
        y3 = hex((w>>(3*64))%(2**64))
        # Sanity check: the limbs must reassemble to v*R mod p.
        ww = int(y0, 0) + 2**64 * (int(y1, 0) + 2**64*(int(y2,0) + 2**64*(int(y3, 0)) ))
        if ww != v*R%p:
            print(x0,x1,x2,x3)
            print(hex(v))
            print(y0,y1,y2,y3)
            print(hex(w))
            print(hex(ww))
            assert 0
        return '{'+y0+s1+y1+s2+y2+s3+y3+'}'

    fe_re = re.compile('{'+r'(\s*,\s*)'.join(r'(\d+|0x[abcdefABCDEF0123456789]+)' for i in range(4)) + '}')
    print (re.sub(fe_re, convert, sys.stdin.read()).rstrip('\n'))
32-bit tables converted from 64-bit tables Change-Id: I52d6e5504fcb6ca2e8b0ee13727f4500c80c1799 Reviewed-on: https://boringssl-review.googlesource.com/23244 Commit-Queue: Adam Langley <agl@google.com> Reviewed-by: Adam Langley <agl@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-12-11 17:55:46 +00:00
David Benjamin	51073ce055	Refcount EC_GROUP. I really need to resurrect the CL to make them entirely static (https://crbug.com/boringssl/20), but, in the meantime, to make replacing the EC_METHOD pointer in EC_POINT with EC_GROUP not completely insane, make them refcounted. OpenSSL did not do this because their EC_GROUPs are mutable (EC_GROUP_set_asn1_flag and EC_GROUP_set_point_conversion_form). Ours are immutable but for the two-function dance around custom curves (more of OpenSSL's habit of making their objects too complex), which is good enough to refcount. Change-Id: I3650993737a97da0ddcf0e5fb7a15876e724cadc Reviewed-on: https://boringssl-review.googlesource.com/22244 Reviewed-by: Adam Langley <agl@google.com> Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-10-27 17:48:27 +00:00
David Benjamin	808f832917	Run the comment converter on libcrypto. crypto/{asn1,x509,x509v3,pem} were skipped as they are still OpenSSL style. Change-Id: I3cd9a60e1cb483a981aca325041f3fbce294247c Reviewed-on: https://boringssl-review.googlesource.com/19504 Reviewed-by: Adam Langley <agl@google.com> Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-08-18 21:49:04 +00:00
Adam Langley	aacb72c1b7	Move ec/ and ecdsa/ into fipsmodule/ The names in the P-224 code collided with the P-256 code and thus many of the functions and constants in the P-224 code have been prefixed. Change-Id: I6bcd304640c539d0483d129d5eaf1702894929a8 Reviewed-on: https://boringssl-review.googlesource.com/15847 Reviewed-by: David Benjamin <davidben@google.com> Commit-Queue: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>	2017-05-04 20:27:23 +00:00

10 Commits
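
The table conversion described in commit 46304abf7d multiplies each P-256 field element by R = 2^256 modulo p and splits the result into four 64-bit limbs. The following is a minimal sketch of that step for a single integer, together with the inverse direction as a round-trip check. The helper names `to_mont_limbs` and `from_mont_limbs` are hypothetical and appear in neither the commit nor BoringSSL, and the inverse uses Python's three-argument `pow` with a negative exponent, which assumes Python 3.8 or later.

```python
# P-256 field prime and Montgomery radix, as given in the commit's script.
P = 2**256 - 2**224 + 2**192 + 2**96 - 1
R = 2**256
MASK64 = 2**64 - 1


def to_mont_limbs(v):
    """Return v*R mod P as four 64-bit limbs, least significant first.

    Hypothetical helper for illustration only.
    """
    w = v * R % P
    return [(w >> (64 * i)) & MASK64 for i in range(4)]


def from_mont_limbs(limbs):
    """Reassemble the limbs and multiply by R^-1 mod P (inverse of the above).

    Hypothetical helper for illustration only; needs Python 3.8+ for pow(R, -1, P).
    """
    w = sum(limb << (64 * i) for i, limb in enumerate(limbs))
    return w * pow(R, -1, P) % P


if __name__ == "__main__":
    # Print the Montgomery representation of 1 as hex limbs.
    print([hex(limb) for limb in to_mont_limbs(1)])
```

Converting a value and converting it back should return the original integer for any input below p, which is the same kind of consistency check the commit's script performs with its `ww != v*R%p` assertion.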