
Add EC_FELEM for EC_POINTs and related temporaries.

This introduces EC_FELEM, which is analogous to EC_SCALAR. It is used for EC_POINT's representation in the generic EC_METHOD, as well as random operations on tuned EC_METHODs that are still implemented generically.

Unlike EC_SCALAR, EC_FELEM's exact representation is awkwardly specific to the EC_METHOD, analogous to how the old values were BIGNUMs but may or may not have been in Montgomery form. This is kind of a nuisance, but no more than before. (If p224-64.c were easily convertible to Montgomery form, we could say |EC_FELEM| is always in Montgomery form. If we exposed the internal add and double implementations in each of the curves, we could give |EC_POINT| an |EC_METHOD|-specific representation and |EC_FELEM| would be purely an |EC_GFp_mont_method| type. I'll leave this for later.)

The generic add and doubling formulas are aligned with the formulas proved in fiat-crypto. Those only applied to a = -3, so I've proved a generic one in https://github.com/mit-plv/fiat-crypto/pull/356, in case someone uses a custom curve. The new formulas are verified, constant-time, and swap a multiply for a square. As expressed in fiat-crypto they do use more temporaries, but this seems to be fine with stack-allocated EC_FELEMs. (We can try to help the compiler later, but benchmarks below suggest this isn't necessary.)

Unlike BIGNUM, EC_FELEM can be stack-allocated. It also captures the bounds in the type system and, in particular, that the width is correct, which will make it easier to select a point in constant-time in the future. (Indeed the old code did not always have the correct width. Its point formula involved halving and implemented this in variable time and variable width.)

Before:
Did 77274 ECDH P-256 operations in 10046087us (7692.0 ops/sec)
Did 5959 ECDH P-384 operations in 10031701us (594.0 ops/sec)
Did 10815 ECDSA P-384 signing operations in 10087892us (1072.1 ops/sec)
Did 8976 ECDSA P-384 verify operations in 10071038us (891.3 ops/sec)
Did 2600 ECDH P-521 operations in 10091688us (257.6 ops/sec)
Did 4590 ECDSA P-521 signing operations in 10055195us (456.5 ops/sec)
Did 3811 ECDSA P-521 verify operations in 10003574us (381.0 ops/sec)

After:
Did 77736 ECDH P-256 operations in 10029858us (7750.5 ops/sec) [+0.8%]
Did 7519 ECDH P-384 operations in 10068076us (746.8 ops/sec) [+25.7%]
Did 13335 ECDSA P-384 signing operations in 10029962us (1329.5 ops/sec) [+24.0%]
Did 11021 ECDSA P-384 verify operations in 10088600us (1092.4 ops/sec) [+22.6%]
Did 2912 ECDH P-521 operations in 10001325us (291.2 ops/sec) [+13.0%]
Did 5150 ECDSA P-521 signing operations in 10027462us (513.6 ops/sec) [+12.5%]
Did 4264 ECDSA P-521 verify operations in 10069694us (423.4 ops/sec) [+11.1%]

This more than pays for removing points_make_affine previously and even speeds up ECDH P-256 slightly. (The point-on-curve check uses the generic code.)

Next is to push the stack-allocating up to ec_wNAF_mul, followed by a constant-time single-point multiplication.

Bug: 239
Change-Id: I44a2dff7c52522e491d0f8cffff64c4ab5cd353c
Reviewed-on: https://boringssl-review.googlesource.com/27668
Reviewed-by: Adam Langley <agl@google.com>
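The remark that a known, fixed width makes constant-time selection easier can be illustrated with a small model. This is a hypothetical sketch, not BoringSSL code: the names `LIMBS`, `to_limbs`, and `felem_select` and the choice of four 64-bit limbs are assumptions made for illustration. Python integers give no timing guarantees, so the sketch only shows the data flow (a mask applied to every limb, with no branch on the secret bit) that a fixed-width type makes possible.

```python
# Hypothetical model of a fixed-width field element: always LIMBS words of
# 64 bits, regardless of the numeric value (unlike a variable-width BIGNUM).
LIMBS = 4              # assumption: 4 x 64-bit limbs, e.g. for a 256-bit field
WORD_MASK = 2**64 - 1


def to_limbs(value):
    """Split an integer into exactly LIMBS little-endian 64-bit words."""
    assert 0 <= value < 2 ** (64 * LIMBS)
    return [(value >> (64 * i)) & WORD_MASK for i in range(LIMBS)]


def felem_select(bit, a, b):
    """Return a if bit == 1, else b, touching every limb of both inputs."""
    assert bit in (0, 1) and len(a) == len(b) == LIMBS
    mask = (-bit) & WORD_MASK          # all ones if bit == 1, else all zeros
    return [(x & mask) | (y & ~mask & WORD_MASK) for x, y in zip(a, b)]


small = to_limbs(5)                    # still LIMBS words, zero-padded
large = to_limbs(2**255 + 12345)
print(felem_select(1, small, large) == small)
print(felem_select(0, small, large) == large)
```

Because both operands always have the same number of limbs, the loop does the same amount of work whichever value is selected. With variable-width values the operand lengths themselves could differ, which is the problem the commit message describes in the old code.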
6 years ago
ec/p256.c: fiat-crypto field arithmetic (64, 32)

The fiat-crypto-generated code uses the Montgomery form implementation strategy, for both 32-bit and 64-bit code.

64-bit throughput seems slower, but the difference is smaller than noise between repetitions (-2%?)

32-bit throughput has decreased significantly for ECDH (-40%). I am attributing this to the change from variable-time scalar multiplication to constant-time scalar multiplication. Due to the same bottleneck, ECDSA verification still uses the old code (otherwise there would have been a 60% throughput decrease). On the other hand, ECDSA signing throughput has increased slightly (+10%), perhaps due to the use of a precomputed table of multiples of the base point.

64-bit benchmarks (Google Cloud Haswell):

with this change:
Did 9126 ECDH P-256 operations in 1009572us (9039.5 ops/sec)
Did 23000 ECDSA P-256 signing operations in 1039832us (22119.0 ops/sec)
Did 8820 ECDSA P-256 verify operations in 1024242us (8611.2 ops/sec)

master (40e8c921cab5cce2bc10722ecf4ebe0e380cf6c8):
Did 9340 ECDH P-256 operations in 1017975us (9175.1 ops/sec)
Did 23000 ECDSA P-256 signing operations in 1039820us (22119.2 ops/sec)
Did 8688 ECDSA P-256 verify operations in 1021108us (8508.4 ops/sec)

benchmarks on ARMv7 (LG Nexus 4):

with this change:
Did 150 ECDH P-256 operations in 1029726us (145.7 ops/sec)
Did 506 ECDSA P-256 signing operations in 1065192us (475.0 ops/sec)
Did 363 ECDSA P-256 verify operations in 1033298us (351.3 ops/sec)

master (2fce1beda0f7e74e2d687860f807cf0b8d8056a4):
Did 245 ECDH P-256 operations in 1017518us (240.8 ops/sec)
Did 473 ECDSA P-256 signing operations in 1086281us (435.4 ops/sec)
Did 360 ECDSA P-256 verify operations in 1003846us (358.6 ops/sec)

64-bit tables converted as follows:

    import re, sys, math

    p = 2**256 - 2**224 + 2**192 + 2**96 - 1
    R = 2**256

    def convert(t):
        x0, s1, x1, s2, x2, s3, x3 = t.groups()
        v = int(x0, 0) + 2**64 * (int(x1, 0) + 2**64*(int(x2,0) + 2**64*(int(x3, 0)) ))
        w = v*R%p
        y0 = hex(w%(2**64))
        y1 = hex((w>>64)%(2**64))
        y2 = hex((w>>(2*64))%(2**64))
        y3 = hex((w>>(3*64))%(2**64))
        ww = int(y0, 0) + 2**64 * (int(y1, 0) + 2**64*(int(y2,0) + 2**64*(int(y3, 0)) ))
        if ww != v*R%p:
            print(x0,x1,x2,x3)
            print(hex(v))
            print(y0,y1,y2,y3)
            print(hex(w))
            print(hex(ww))
            assert 0
        return '{'+y0+s1+y1+s2+y2+s3+y3+'}'

    fe_re = re.compile('{'+r'(\s*,\s*)'.join(r'(\d+|0x[abcdefABCDEF0123456789]+)' for i in range(4)) + '}')
    print (re.sub(fe_re, convert, sys.stdin.read()).rstrip('\n'))

32-bit tables converted from 64-bit tables

Change-Id: I52d6e5504fcb6ca2e8b0ee13727f4500c80c1799
Reviewed-on: https://boringssl-review.googlesource.com/23244
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: Adam Langley <agl@google.com>
CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org>
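The table conversion in this commit message multiplies each value by R = 2^256 modulo p, which is what "Montgomery form" means here. The following is a minimal sketch of that representation using plain big-integer arithmetic. The helper names `to_mont`, `from_mont`, `mont_mul`, and `limbs64` are hypothetical and are not taken from fiat-crypto or BoringSSL, and the sketch does not perform word-by-word Montgomery reduction as real field code would. `pow(R, -1, p)` assumes Python 3.8 or later.

```python
p = 2**256 - 2**224 + 2**192 + 2**96 - 1   # P-256 field prime, as in the script
R = 2**256
R_INV = pow(R, -1, p)                       # modular inverse of R (Python 3.8+)


def to_mont(v):
    """Map v to its Montgomery representation v*R mod p."""
    return v * R % p


def from_mont(w):
    """Map a Montgomery-form value back to the ordinary residue."""
    return w * R_INV % p


def mont_mul(a, b):
    """Multiply two Montgomery-form values; the result stays in that form."""
    return a * b * R_INV % p


def limbs64(w):
    """Four little-endian 64-bit words, matching the table layout."""
    return [hex((w >> (64 * i)) % 2**64) for i in range(4)]


x, y = 3, 2**200 + 7
print(from_mont(to_mont(x)) == x)
print(mont_mul(to_mont(x), to_mont(y)) == to_mont(x * y % p))
print(limbs64(to_mont(1)))
```

The second check is why a table stored in Montgomery form can be fed directly to Montgomery-form field multiplication: products of converted values are the converted product, so no per-operation conversion is needed.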
7 years ago
/* Copyright (c) 2017, Google Inc.
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
 * SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION
 * OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
 * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */

#if !defined(_GNU_SOURCE)
#define _GNU_SOURCE  // needed for syscall() on Linux.
#endif

#include <openssl/crypto.h>

#include <stdlib.h>

#include <openssl/digest.h>
#include <openssl/hmac.h>
#include <openssl/sha.h>

#include "../internal.h"

#include "aes/aes.c"
#include "aes/key_wrap.c"
#include "aes/mode_wrappers.c"
#include "bn/add.c"
#include "bn/asm/x86_64-gcc.c"
#include "bn/bn.c"
#include "bn/bytes.c"
#include "bn/cmp.c"
#include "bn/ctx.c"
#include "bn/div.c"
#include "bn/div_extra.c"
#include "bn/exponentiation.c"
#include "bn/gcd.c"
#include "bn/gcd_extra.c"
#include "bn/generic.c"
#include "bn/jacobi.c"
#include "bn/montgomery.c"
#include "bn/montgomery_inv.c"
#include "bn/mul.c"
#include "bn/prime.c"
#include "bn/random.c"
#include "bn/rsaz_exp.c"
#include "bn/shift.c"
#include "bn/sqrt.c"
#include "cipher/aead.c"
#include "cipher/cipher.c"
#include "cipher/e_aes.c"
#include "cipher/e_des.c"
#include "des/des.c"
#include "digest/digest.c"
#include "digest/digests.c"
#include "ecdh/ecdh.c"
#include "ecdsa/ecdsa.c"
#include "ec/ec.c"
#include "ec/ec_key.c"
#include "ec/ec_montgomery.c"
#include "ec/felem.c"
#include "ec/oct.c"
#include "ec/p224-64.c"
#include "../../third_party/fiat/p256.c"
#include "ec/p256-x86_64.c"
#include "ec/scalar.c"
#include "ec/simple.c"
#include "ec/simple_mul.c"
#include "ec/util.c"
#include "ec/wnaf.c"
#include "hmac/hmac.c"
#include "md4/md4.c"
#include "md5/md5.c"
#include "modes/cbc.c"
#include "modes/ccm.c"
#include "modes/cfb.c"
#include "modes/ctr.c"
#include "modes/gcm.c"
#include "modes/ofb.c"
#include "modes/polyval.c"
#include "rand/ctrdrbg.c"
#include "rand/rand.c"
#include "rand/urandom.c"
#include "rsa/blinding.c"
#include "rsa/padding.c"
#include "rsa/rsa.c"
#include "rsa/rsa_impl.c"
#include "self_check/self_check.c"
#include "sha/sha1-altivec.c"
#include "sha/sha1.c"
#include "sha/sha256.c"
#include "sha/sha512.c"
#include "tls/kdf.c"

#if defined(BORINGSSL_FIPS)

#if !defined(OPENSSL_ASAN)
// These symbols are filled in by delocate.go. They point to the start and end
// of the module, and the location of the integrity hash, respectively.
extern const uint8_t BORINGSSL_bcm_text_start[];
extern const uint8_t BORINGSSL_bcm_text_end[];
extern const uint8_t BORINGSSL_bcm_text_hash[];
#endif

static void __attribute__((constructor))
BORINGSSL_bcm_power_on_self_test(void) {
  CRYPTO_library_init();

#if !defined(OPENSSL_ASAN)
  // Integrity tests cannot run under ASAN because it involves reading the full
  // .text section, which triggers the global-buffer overflow detection.
  const uint8_t *const start = BORINGSSL_bcm_text_start;
  const uint8_t *const end = BORINGSSL_bcm_text_end;

  static const uint8_t kHMACKey[64] = {0};
  uint8_t result[SHA512_DIGEST_LENGTH];

  unsigned result_len;
  if (!HMAC(EVP_sha512(), kHMACKey, sizeof(kHMACKey), start, end - start,
            result, &result_len) ||
      result_len != sizeof(result)) {
    fprintf(stderr, "HMAC failed.\n");
    goto err;
  }

  const uint8_t *expected = BORINGSSL_bcm_text_hash;

  if (!check_test(expected, result, sizeof(result), "FIPS integrity test")) {
    goto err;
  }
#endif

  if (!BORINGSSL_self_test()) {
    goto err;
  }

  return;

err:
  BORINGSSL_FIPS_abort();
}

void BORINGSSL_FIPS_abort(void) {
  for (;;) {
    abort();
    exit(1);
  }
}

#endif  // BORINGSSL_FIPS
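
The integrity check performed by the constructor above can be modeled outside the module with the Python standard library. This is a sketch under stated assumptions: `module_text` is a placeholder for the bytes between BORINGSSL_bcm_text_start and BORINGSSL_bcm_text_end, and `module_digest` and `integrity_ok` are hypothetical names. Only the HMAC-SHA-512 construction and the 64-byte all-zero key come from the C code. `hmac.compare_digest` is used here in place of the module's `check_test` helper, whose exact behavior is not shown in this file.

```python
import hashlib
import hmac

HMAC_KEY = bytes(64)   # mirrors `static const uint8_t kHMACKey[64] = {0};`


def module_digest(module_text):
    """HMAC-SHA-512 over the module bytes with the all-zero key."""
    return hmac.new(HMAC_KEY, module_text, hashlib.sha512).digest()


def integrity_ok(module_text, expected):
    """True if the recomputed digest matches the stored one."""
    return hmac.compare_digest(module_digest(module_text), expected)


module_text = b"\x90" * 32             # placeholder, not real module code
expected = module_digest(module_text)  # stands in for the stored hash
print(len(expected))
print(integrity_ok(module_text, expected))
print(integrity_ok(module_text + b"\x00", expected))
```

The key is public and fixed, so the HMAC here acts as a keyed checksum over the module text rather than as an authentication secret. Any change to the hashed bytes changes the digest and makes the comparison fail, which in the C code leads to BORINGSSL_FIPS_abort().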