|
12345678910111213141516171819202122232425262728293031323334 |
- ## PQ SIDH/SIKE implementation using AVX512IFMA instructions
-
- Using the AVX512IFMA (vpmadd52luq and vpmadd52huq) specifically designed for
- prime field arithmetic allows a projected speedup of up to 4X on supporting
- processors, when those become available.
-
- ### Current status
-
- * Tested for correctness with Intel SDE
- * EphemeralKeyGeneration_A and EphemeralKeyGeneration_B with P751 are implemented
- * Using "standins": 3X performance gain on Xeon Gold (with two FMA units)
- * Optimizations are 3-fold
- * Finite field *𝔽~p~* multiplication by performing a single horizontal Montgomery multiplication
- * Quadratic finite field *𝔽~p²~* multiplication and square by performing 3/4 horizontal Montgomery multiplications in parallel
- * A pair of quadratic finite field *𝔽~p²~* multiplications (where applicable) by performing 8 vertical Montgomery multiplications in parallel
- * AVX512 add/sub are also implemented
-
- ### How to test?
-
- The Makefile generates to executables: sidh_ifma can be run with Intel SDE to
- check for correctness. sidh_standin produces incorrect results, because it
- replaces the IFMA instrutions with FMA instructions and can be executed on a
- machine with AVX512 support to estimate performance.
-
- ### TODO
-
- * EphemeralSecretAgreement_A and EphemeralSecretAgreement_B
- * SIKE
- * P503
- * Using vertical representation throughout for greater speedups
-
- ### License
-
- Available under the original [SIKE](https://github.com/Microsoft/PQCrypto-SIKE) license
|