
Enable upstream's Poly1305 code.

The C implementation is still our existing C implementation, but slightly
tweaked to fit with upstream's init/block/emits convention.

I've tested this by looking at code coverage in kcachegrind and

  valgrind --tool=callgrind --dump-instr=yes --collect-jumps=yes

(NB: valgrind 3.11.0 is needed for AVX2. And even that only does 64-bit AVX2,
so we can't get coverage for the 32-bit code yet. But I had to disable that
anyway.)

This was paired with a hacked up version of poly1305_test that would repeat
tests with different ia32cap and armcap values. This isn't checked in, but we
badly need a story for testing all the different variants.

I'm not happy with upstream's code in either the C/asm boundary or how it
dispatches between different versions, but just debugging the code has been a
significant time investment. I'd hoped to extract the SIMD parts and do the
rest in C, but I think we need to focus on testing first (and use that to
guide what modifications would help). For now, this version seems to work at
least.

The x86 (not x86_64) AVX2 code needs to be disabled because it's broken. It
also seems pretty unnecessary.
https://rt.openssl.org/Ticket/Display.html?id=4346

Otherwise it seems to work and buys us a decent performance improvement.
Notably, my Nexus 6P is finally faster at ChaCha20-Poly1305 than my Nexus 4!

bssl speed numbers follow:

x86
---
Old:
Did 1554000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000536us (1553167.5 ops/sec): 24.9 MB/s
Did 136000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1003947us (135465.3 ops/sec): 182.9 MB/s
Did 30000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1022990us (29325.8 ops/sec): 240.2 MB/s
Did 1888000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000206us (1887611.2 ops/sec): 30.2 MB/s
Did 173000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1003036us (172476.4 ops/sec): 232.8 MB/s
Did 30000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1027759us (29189.7 ops/sec): 239.1 MB/s
New:
Did 2030000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000507us (2028971.3 ops/sec): 32.5 MB/s
Did 404000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1000287us (403884.1 ops/sec): 545.2 MB/s
Did 83000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1001258us (82895.7 ops/sec): 679.1 MB/s
Did 2018000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000006us (2017987.9 ops/sec): 32.3 MB/s
Did 360000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001962us (359295.1 ops/sec): 485.0 MB/s
Did 85000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1002479us (84789.8 ops/sec): 694.6 MB/s

x86_64, no AVX2
---
Old:
Did 2023000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000258us (2022478.2 ops/sec): 32.4 MB/s
Did 466000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1002619us (464782.7 ops/sec): 627.5 MB/s
Did 90000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1001133us (89898.1 ops/sec): 736.4 MB/s
Did 2238000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000175us (2237608.4 ops/sec): 35.8 MB/s
Did 483000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001348us (482349.8 ops/sec): 651.2 MB/s
Did 90000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1003141us (89718.2 ops/sec): 735.0 MB/s
New:
Did 2558000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000275us (2557296.7 ops/sec): 40.9 MB/s
Did 510000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1001810us (509078.6 ops/sec): 687.3 MB/s
Did 115000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1006457us (114262.2 ops/sec): 936.0 MB/s
Did 2818000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000187us (2817473.1 ops/sec): 45.1 MB/s
Did 418000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001140us (417524.0 ops/sec): 563.7 MB/s
Did 91000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1002539us (90769.5 ops/sec): 743.6 MB/s

x86_64, AVX2
---
Old:
Did 2516000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000115us (2515710.7 ops/sec): 40.3 MB/s
Did 774000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1000300us (773767.9 ops/sec): 1044.6 MB/s
Did 171000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1004373us (170255.5 ops/sec): 1394.7 MB/s
Did 2580000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000144us (2579628.5 ops/sec): 41.3 MB/s
Did 769000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000472us (768637.2 ops/sec): 1037.7 MB/s
Did 169000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1000320us (168945.9 ops/sec): 1384.0 MB/s
New:
Did 3240000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000114us (3239630.7 ops/sec): 51.8 MB/s
Did 932000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1000059us (931945.0 ops/sec): 1258.1 MB/s
Did 217000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1003282us (216290.1 ops/sec): 1771.8 MB/s
Did 3187000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000100us (3186681.3 ops/sec): 51.0 MB/s
Did 926000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000071us (925934.3 ops/sec): 1250.0 MB/s
Did 215000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1000479us (214897.1 ops/sec): 1760.4 MB/s

arm, Nexus 4
---
Old:
Did 430248 ChaCha20-Poly1305 (16 bytes) seal operations in 1000153us (430182.2 ops/sec): 6.9 MB/s
Did 115250 ChaCha20-Poly1305 (1350 bytes) seal operations in 1000549us (115186.8 ops/sec): 155.5 MB/s
Did 27000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1030124us (26210.4 ops/sec): 214.7 MB/s
Did 451750 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000549us (451502.1 ops/sec): 7.2 MB/s
Did 118000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1001557us (117816.6 ops/sec): 159.1 MB/s
Did 27000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1024263us (26360.4 ops/sec): 215.9 MB/s
New:
Did 553644 ChaCha20-Poly1305 (16 bytes) seal operations in 1000183us (553542.7 ops/sec): 8.9 MB/s
Did 126000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1000396us (125950.1 ops/sec): 170.0 MB/s
Did 27000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1000336us (26990.9 ops/sec): 221.1 MB/s
Did 559000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1001465us (558182.3 ops/sec): 8.9 MB/s
Did 124000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1000824us (123897.9 ops/sec): 167.3 MB/s
Did 28000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1034854us (27057.0 ops/sec): 221.7 MB/s

aarch64, Nexus 6P
---
Old:
Did 358000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000358us (357871.9 ops/sec): 5.7 MB/s
Did 45000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1022386us (44014.7 ops/sec): 59.4 MB/s
Did 8657 ChaCha20-Poly1305 (8192 bytes) seal operations in 1063722us (8138.4 ops/sec): 66.7 MB/s
Did 350000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000074us (349974.1 ops/sec): 5.6 MB/s
Did 44000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1007907us (43654.8 ops/sec): 58.9 MB/s
Did 8525 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1042644us (8176.3 ops/sec): 67.0 MB/s
New:
Did 713000 ChaCha20-Poly1305 (16 bytes) seal operations in 1000190us (712864.6 ops/sec): 11.4 MB/s
Did 180000 ChaCha20-Poly1305 (1350 bytes) seal operations in 1004249us (179238.4 ops/sec): 242.0 MB/s
Did 41000 ChaCha20-Poly1305 (8192 bytes) seal operations in 1005811us (40763.1 ops/sec): 333.9 MB/s
Did 775000 ChaCha20-Poly1305-Old (16 bytes) seal operations in 1000719us (774443.2 ops/sec): 12.4 MB/s
Did 182000 ChaCha20-Poly1305-Old (1350 bytes) seal operations in 1003529us (181360.0 ops/sec): 244.8 MB/s
Did 41000 ChaCha20-Poly1305-Old (8192 bytes) seal operations in 1010576us (40570.9 ops/sec): 332.4 MB/s

Change-Id: Iaa4ab86ac1174b79833077963cc3616cfb08e686
Reviewed-on: https://boringssl-review.googlesource.com/7226
Reviewed-by: Adam Langley <agl@google.com>
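For reading the speed numbers: the two derived fields on each line appear to follow directly from the raw counts, with MB meaning 10^6 bytes. The sketch below recomputes them. The helper names are hypothetical and not part of bssl, and the formulas are inferred from the figures in this commit message, not taken from the tool's source.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not part of bssl): operations per second from the
// operation count and the elapsed time in microseconds.
static double ops_per_sec(uint64_t ops, uint64_t elapsed_us) {
  assert(elapsed_us != 0);
  return (double)ops * 1e6 / (double)elapsed_us;
}

// Hypothetical helper: throughput in MB/s, assuming 1 MB = 10^6 bytes, which
// is what the figures in the commit message are consistent with.
static double mb_per_sec(uint64_t ops, uint64_t elapsed_us,
                         uint64_t chunk_bytes) {
  return ops_per_sec(ops, elapsed_us) * (double)chunk_bytes / 1e6;
}
```

For the new x86 1350-byte line, `mb_per_sec(404000, 1000287, 1350)` comes out at about 545.2, matching the reported value.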
/* Copyright (c) 2014, Google Inc.
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
 * SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION
 * OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
 * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */

// This implementation of poly1305 is by Andrew Moon
// (https://github.com/floodyberry/poly1305-donna) and released as public
// domain.

#include <openssl/poly1305.h>

#include <string.h>

#include <openssl/cpu.h>

#include "internal.h"
#include "../internal.h"


#if defined(OPENSSL_WINDOWS) || !defined(OPENSSL_X86_64)

// We can assume little-endian.
static uint32_t U8TO32_LE(const uint8_t *m) {
  uint32_t r;
  OPENSSL_memcpy(&r, m, sizeof(r));
  return r;
}

static void U32TO8_LE(uint8_t *m, uint32_t v) {
  OPENSSL_memcpy(m, &v, sizeof(v));
}

static uint64_t mul32x32_64(uint32_t a, uint32_t b) { return (uint64_t)a * b; }

struct poly1305_state_st {
  uint32_t r0, r1, r2, r3, r4;
  uint32_t s1, s2, s3, s4;
  uint32_t h0, h1, h2, h3, h4;
  uint8_t buf[16];
  unsigned int buf_used;
  uint8_t key[16];
};

static inline struct poly1305_state_st *poly1305_aligned_state(
    poly1305_state *state) {
  return (struct poly1305_state_st *)(((uintptr_t)state + 63) & ~63);
}

// poly1305_update updates |state| given some amount of input data. This
// function may only be called with a |len| that is not a multiple of 16 at the
// end of the data. Otherwise the input must be buffered into 16 byte blocks.
static void poly1305_update(struct poly1305_state_st *state, const uint8_t *in,
                            size_t len) {
  uint32_t t0, t1, t2, t3;
  uint64_t t[5];
  uint32_t b;
  uint64_t c;
  size_t j;
  uint8_t mp[16];

  if (len < 16) {
    goto poly1305_donna_atmost15bytes;
  }

poly1305_donna_16bytes:
  t0 = U8TO32_LE(in);
  t1 = U8TO32_LE(in + 4);
  t2 = U8TO32_LE(in + 8);
  t3 = U8TO32_LE(in + 12);

  in += 16;
  len -= 16;

  state->h0 += t0 & 0x3ffffff;
  state->h1 += ((((uint64_t)t1 << 32) | t0) >> 26) & 0x3ffffff;
  state->h2 += ((((uint64_t)t2 << 32) | t1) >> 20) & 0x3ffffff;
  state->h3 += ((((uint64_t)t3 << 32) | t2) >> 14) & 0x3ffffff;
  state->h4 += (t3 >> 8) | (1 << 24);

poly1305_donna_mul:
  t[0] = mul32x32_64(state->h0, state->r0) + mul32x32_64(state->h1, state->s4) +
         mul32x32_64(state->h2, state->s3) + mul32x32_64(state->h3, state->s2) +
         mul32x32_64(state->h4, state->s1);
  t[1] = mul32x32_64(state->h0, state->r1) + mul32x32_64(state->h1, state->r0) +
         mul32x32_64(state->h2, state->s4) + mul32x32_64(state->h3, state->s3) +
         mul32x32_64(state->h4, state->s2);
  t[2] = mul32x32_64(state->h0, state->r2) + mul32x32_64(state->h1, state->r1) +
         mul32x32_64(state->h2, state->r0) + mul32x32_64(state->h3, state->s4) +
         mul32x32_64(state->h4, state->s3);
  t[3] = mul32x32_64(state->h0, state->r3) + mul32x32_64(state->h1, state->r2) +
         mul32x32_64(state->h2, state->r1) + mul32x32_64(state->h3, state->r0) +
         mul32x32_64(state->h4, state->s4);
  t[4] = mul32x32_64(state->h0, state->r4) + mul32x32_64(state->h1, state->r3) +
         mul32x32_64(state->h2, state->r2) + mul32x32_64(state->h3, state->r1) +
         mul32x32_64(state->h4, state->r0);

  state->h0 = (uint32_t)t[0] & 0x3ffffff;
  c = (t[0] >> 26);
  t[1] += c;
  state->h1 = (uint32_t)t[1] & 0x3ffffff;
  b = (uint32_t)(t[1] >> 26);
  t[2] += b;
  state->h2 = (uint32_t)t[2] & 0x3ffffff;
  b = (uint32_t)(t[2] >> 26);
  t[3] += b;
  state->h3 = (uint32_t)t[3] & 0x3ffffff;
  b = (uint32_t)(t[3] >> 26);
  t[4] += b;
  state->h4 = (uint32_t)t[4] & 0x3ffffff;
  b = (uint32_t)(t[4] >> 26);
  state->h0 += b * 5;

  if (len >= 16) {
    goto poly1305_donna_16bytes;
  }

// final bytes
poly1305_donna_atmost15bytes:
  if (!len) {
    return;
  }

  for (j = 0; j < len; j++) {
    mp[j] = in[j];
  }
  mp[j++] = 1;
  for (; j < 16; j++) {
    mp[j] = 0;
  }
  len = 0;

  t0 = U8TO32_LE(mp + 0);
  t1 = U8TO32_LE(mp + 4);
  t2 = U8TO32_LE(mp + 8);
  t3 = U8TO32_LE(mp + 12);

  state->h0 += t0 & 0x3ffffff;
  state->h1 += ((((uint64_t)t1 << 32) | t0) >> 26) & 0x3ffffff;
  state->h2 += ((((uint64_t)t2 << 32) | t1) >> 20) & 0x3ffffff;
  state->h3 += ((((uint64_t)t3 << 32) | t2) >> 14) & 0x3ffffff;
  state->h4 += (t3 >> 8);

  goto poly1305_donna_mul;
}

void CRYPTO_poly1305_init(poly1305_state *statep, const uint8_t key[32]) {
  struct poly1305_state_st *state = poly1305_aligned_state(statep);
  uint32_t t0, t1, t2, t3;

#if defined(OPENSSL_POLY1305_NEON)
  if (CRYPTO_is_NEON_capable()) {
    CRYPTO_poly1305_init_neon(statep, key);
    return;
  }
#endif

  t0 = U8TO32_LE(key + 0);
  t1 = U8TO32_LE(key + 4);
  t2 = U8TO32_LE(key + 8);
  t3 = U8TO32_LE(key + 12);

  // precompute multipliers
  state->r0 = t0 & 0x3ffffff;
  t0 >>= 26;
  t0 |= t1 << 6;
  state->r1 = t0 & 0x3ffff03;
  t1 >>= 20;
  t1 |= t2 << 12;
  state->r2 = t1 & 0x3ffc0ff;
  t2 >>= 14;
  t2 |= t3 << 18;
  state->r3 = t2 & 0x3f03fff;
  t3 >>= 8;
  state->r4 = t3 & 0x00fffff;

  state->s1 = state->r1 * 5;
  state->s2 = state->r2 * 5;
  state->s3 = state->r3 * 5;
  state->s4 = state->r4 * 5;

  // init state
  state->h0 = 0;
  state->h1 = 0;
  state->h2 = 0;
  state->h3 = 0;
  state->h4 = 0;

  state->buf_used = 0;
  OPENSSL_memcpy(state->key, key + 16, sizeof(state->key));
}

void CRYPTO_poly1305_update(poly1305_state *statep, const uint8_t *in,
                            size_t in_len) {
  unsigned int i;
  struct poly1305_state_st *state = poly1305_aligned_state(statep);

#if defined(OPENSSL_POLY1305_NEON)
  if (CRYPTO_is_NEON_capable()) {
    CRYPTO_poly1305_update_neon(statep, in, in_len);
    return;
  }
#endif

  if (state->buf_used) {
    unsigned todo = 16 - state->buf_used;
    if (todo > in_len) {
      todo = (unsigned)in_len;
    }
    for (i = 0; i < todo; i++) {
      state->buf[state->buf_used + i] = in[i];
    }
    state->buf_used += todo;
    in_len -= todo;
    in += todo;

    if (state->buf_used == 16) {
      poly1305_update(state, state->buf, 16);
      state->buf_used = 0;
    }
  }

  if (in_len >= 16) {
    size_t todo = in_len & ~0xf;
    poly1305_update(state, in, todo);
    in += todo;
    in_len &= 0xf;
  }

  if (in_len) {
    for (i = 0; i < in_len; i++) {
      state->buf[i] = in[i];
    }
    state->buf_used = (unsigned)in_len;
  }
}

void CRYPTO_poly1305_finish(poly1305_state *statep, uint8_t mac[16]) {
  struct poly1305_state_st *state = poly1305_aligned_state(statep);
  uint64_t f0, f1, f2, f3;
  uint32_t g0, g1, g2, g3, g4;
  uint32_t b, nb;

#if defined(OPENSSL_POLY1305_NEON)
  if (CRYPTO_is_NEON_capable()) {
    CRYPTO_poly1305_finish_neon(statep, mac);
    return;
  }
#endif

  if (state->buf_used) {
    poly1305_update(state, state->buf, state->buf_used);
  }

  b = state->h0 >> 26;
  state->h0 = state->h0 & 0x3ffffff;
  state->h1 += b;
  b = state->h1 >> 26;
  state->h1 = state->h1 & 0x3ffffff;
  state->h2 += b;
  b = state->h2 >> 26;
  state->h2 = state->h2 & 0x3ffffff;
  state->h3 += b;
  b = state->h3 >> 26;
  state->h3 = state->h3 & 0x3ffffff;
  state->h4 += b;
  b = state->h4 >> 26;
  state->h4 = state->h4 & 0x3ffffff;
  state->h0 += b * 5;

  g0 = state->h0 + 5;
  b = g0 >> 26;
  g0 &= 0x3ffffff;
  g1 = state->h1 + b;
  b = g1 >> 26;
  g1 &= 0x3ffffff;
  g2 = state->h2 + b;
  b = g2 >> 26;
  g2 &= 0x3ffffff;
  g3 = state->h3 + b;
  b = g3 >> 26;
  g3 &= 0x3ffffff;
  g4 = state->h4 + b - (1 << 26);

  b = (g4 >> 31) - 1;
  nb = ~b;
  state->h0 = (state->h0 & nb) | (g0 & b);
  state->h1 = (state->h1 & nb) | (g1 & b);
  state->h2 = (state->h2 & nb) | (g2 & b);
  state->h3 = (state->h3 & nb) | (g3 & b);
  state->h4 = (state->h4 & nb) | (g4 & b);

  f0 = ((state->h0) | (state->h1 << 26)) + (uint64_t)U8TO32_LE(&state->key[0]);
  f1 = ((state->h1 >> 6) | (state->h2 << 20)) +
       (uint64_t)U8TO32_LE(&state->key[4]);
  f2 = ((state->h2 >> 12) | (state->h3 << 14)) +
       (uint64_t)U8TO32_LE(&state->key[8]);
  f3 = ((state->h3 >> 18) | (state->h4 << 8)) +
       (uint64_t)U8TO32_LE(&state->key[12]);

  U32TO8_LE(&mac[0], f0);
  f1 += (f0 >> 32);
  U32TO8_LE(&mac[4], f1);
  f2 += (f1 >> 32);
  U32TO8_LE(&mac[8], f2);
  f3 += (f2 >> 32);
  U32TO8_LE(&mac[12], f3);
}

#endif  // OPENSSL_WINDOWS || !OPENSSL_X86_64
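
The file above keeps the accumulator and key in five 26-bit limbs and picks the final reduced value with a mask, not a branch. The following standalone sketch isolates those two ideas so they can be checked outside BoringSSL. The names `load32_le`, `store32_le`, `split_limbs`, `join_limbs` and `select_limb` are hypothetical and do not exist in the file. The shifts and masks are copied from `poly1305_update` and `CRYPTO_poly1305_finish`, but the 2^128 padding bit, the key addition and the multiplication are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: read 4 bytes as a little-endian word without relying
// on host byte order (the file above assumes a little-endian host instead).
static uint32_t load32_le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

// Hypothetical helper: write a word as 4 little-endian bytes.
static void store32_le(uint8_t *p, uint32_t v) {
  p[0] = (uint8_t)v;
  p[1] = (uint8_t)(v >> 8);
  p[2] = (uint8_t)(v >> 16);
  p[3] = (uint8_t)(v >> 24);
}

// Split a 16-byte block into limbs covering bits 0-25, 26-51, 52-77, 78-103
// and 104-127, with the same shifts and masks as the block load in
// poly1305_update. The top limb holds only 24 bits.
static void split_limbs(const uint8_t block[16], uint32_t limb[5]) {
  uint32_t t0 = load32_le(block), t1 = load32_le(block + 4);
  uint32_t t2 = load32_le(block + 8), t3 = load32_le(block + 12);
  limb[0] = t0 & 0x3ffffff;
  limb[1] = (uint32_t)(((((uint64_t)t1 << 32) | t0) >> 26) & 0x3ffffff);
  limb[2] = (uint32_t)(((((uint64_t)t2 << 32) | t1) >> 20) & 0x3ffffff);
  limb[3] = (uint32_t)(((((uint64_t)t3 << 32) | t2) >> 14) & 0x3ffffff);
  limb[4] = t3 >> 8;
}

// Inverse of split_limbs, using the recombination shifts from
// CRYPTO_poly1305_finish. Limbs must already be fully carried.
static void join_limbs(const uint32_t limb[5], uint8_t block[16]) {
  for (int i = 0; i < 4; i++) {
    assert(limb[i] <= 0x3ffffff);
  }
  assert(limb[4] <= 0xffffff);
  store32_le(block, limb[0] | (limb[1] << 26));
  store32_le(block + 4, (limb[1] >> 6) | (limb[2] << 20));
  store32_le(block + 8, (limb[2] >> 12) | (limb[3] << 14));
  store32_le(block + 12, (limb[3] >> 18) | (limb[4] << 8));
}

// Branch-free select as in CRYPTO_poly1305_finish: |g4| is the top limb of
// the trial subtraction. If its sign bit is clear the subtraction did not
// underflow and |g| is returned; otherwise |h| is kept.
static uint32_t select_limb(uint32_t h, uint32_t g, uint32_t g4) {
  uint32_t b = (g4 >> 31) - 1;  // all ones if the sign bit is clear, else 0
  return (h & ~b) | (g & b);
}
```

Splitting a block and joining it again should give back the original 16 bytes, which is a quick way to convince yourself that the limb boundaries line up.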