
PQ SIDH/SIKE implementation using AVX512IFMA instructions

Using the AVX512IFMA instructions (vpmadd52luq and vpmadd52huq), which are designed specifically for prime field arithmetic, allows a projected speedup of up to 4X on supporting processors, once those become available.
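For readers unfamiliar with the two instructions, the following scalar C model shows what a single 64-bit lane computes. It is an illustrative sketch, not code from this repository: the helper names madd52lo and madd52hi are hypothetical, and unsigned __int128 is a GCC/Clang extension.

```c
#include <stdint.h>

#define MASK52 ((1ULL << 52) - 1)

/* Model of one lane of vpmadd52luq: multiply the low 52 bits of b and c
 * (a 104-bit product) and add the LOW 52 bits of it to the accumulator. */
static uint64_t madd52lo(uint64_t acc, uint64_t b, uint64_t c) {
    unsigned __int128 p = (unsigned __int128)(b & MASK52) * (c & MASK52);
    return acc + ((uint64_t)p & MASK52);
}

/* Model of one lane of vpmadd52huq: same 104-bit product, but the HIGH
 * 52 bits (bits 52..103) are added to the accumulator. */
static uint64_t madd52hi(uint64_t acc, uint64_t b, uint64_t c) {
    unsigned __int128 p = (unsigned __int128)(b & MASK52) * (c & MASK52);
    return acc + (uint64_t)(p >> 52);
}
```

With b = c = 2^52 - 1 and a zero accumulator, the product is 2^104 - 2^53 + 1, so madd52lo returns 1 and madd52hi returns 2^52 - 2. The 12 spare bits in each 64-bit lane let several partial products accumulate before carries need to be propagated.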

Current status

  • Tested for correctness with Intel SDE
  • EphemeralKeyGeneration_A and EphemeralKeyGeneration_B with P751 are implemented
  • Using FMA “stand-ins” in place of the IFMA instructions: 3X performance gain on Xeon Gold (with two FMA units)
  • Optimizations are 3-fold
    • Finite field 𝔽~p~ multiplication by performing a single horizontal Montgomery multiplication
    • Quadratic finite field 𝔽~p²~ multiplication and squaring by performing 3/4 horizontal Montgomery multiplications in parallel
    • A pair of quadratic finite field 𝔽~p²~ multiplications (where applicable) by performing 8 vertical Montgomery multiplications in parallel
    • AVX512 add/sub are also implemented
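As a rough illustration of the building block named in the list above, here is a single-limb Montgomery multiplication in radix 2^52, followed by a loop that models a vertical batch of eight independent multiplications (one per 64-bit lane of a 512-bit register). This is a simplified sketch under stated assumptions, not the repository's code: the real P751 operands span many 52-bit limbs, all function names are hypothetical, the reading of "vertical" as one independent multiplication per lane is an assumption, and unsigned __int128 is a GCC/Clang extension.

```c
#include <stdint.h>

#define LIMB_BITS 52
#define MASK52 ((1ULL << LIMB_BITS) - 1)
#define LANES 8 /* a 512-bit register holds eight 64-bit lanes */

typedef unsigned __int128 u128; /* GCC/Clang extension */

/* Returns -n^{-1} mod 2^52 for odd n, using Newton iteration mod 2^64. */
static uint64_t neg_inv52(uint64_t n) {
    uint64_t inv = n;           /* correct to 3 bits, since n*n = 1 (mod 8) */
    for (int i = 0; i < 5; i++) /* each step doubles the correct bits */
        inv *= 2 - n * inv;
    return (0 - inv) & MASK52;
}

/* Montgomery product a*b*R^{-1} mod n with R = 2^52.
 * Requires odd n < 2^52 and a, b < n. */
static uint64_t mont_mul52(uint64_t a, uint64_t b, uint64_t n, uint64_t nprime) {
    u128 t = (u128)a * b;
    /* m is chosen so that t + m*n is divisible by 2^52 */
    uint64_t m = (((uint64_t)t & MASK52) * nprime) & MASK52;
    uint64_t r = (uint64_t)((t + (u128)m * n) >> LIMB_BITS);
    return r >= n ? r - n : r; /* final conditional subtraction */
}

/* Conversion into and out of the Montgomery domain. */
static uint64_t to_mont52(uint64_t a, uint64_t n) {
    return (uint64_t)(((u128)a << LIMB_BITS) % n);
}
static uint64_t from_mont52(uint64_t a, uint64_t n, uint64_t nprime) {
    return mont_mul52(a, 1, n, nprime);
}

/* Vertical batch model: lane i carries its own independent multiplication. */
static void mont_mul52_x8(uint64_t out[LANES], const uint64_t a[LANES],
                          const uint64_t b[LANES], uint64_t n, uint64_t nprime) {
    for (int i = 0; i < LANES; i++)
        out[i] = mont_mul52(a[i], b[i], n, nprime);
}
```

In a horizontal layout, by contrast, the lanes of one register would hold different limbs of the same operand, so a single multiplication is spread across the vector rather than eight multiplications running side by side.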

How to test?

The Makefile generates two executables. sidh_ifma can be run with Intel SDE to check for correctness. sidh_standin produces incorrect results because it replaces the IFMA instructions with FMA instructions, but it can be executed on a machine with AVX512 support to estimate performance.

TODO

  • EphemeralSecretAgreement_A and EphemeralSecretAgreement_B
  • SIKE
  • P503
  • Using vertical representation throughout for greater speedups

License

Available under the original SIKE license