This contains Montgomery (pseudo)addition, doubling, and tripling.
The formulas are slightly amended from the usual Montgomery arithmetic to allow
projective curve coefficients.
Split the field arithmetic code into three types:
- ExtensionFieldElement :: represents an element of F_{p^2}
- PrimeFieldElement :: represents an element of F_{p}
- fp751Element :: internal type holding an element of F_{p}
PrimeFieldElement and fp751Element differ in that PrimeFieldElement assigns a
particular interpretation (Montgomery form) to its data, while fp751Element
is just the raw data with no interpretation attached.
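The layering can be sketched roughly as follows. The limb count (twelve 64-bit words for a 751-bit value), the field names A and B, and the reading of an F_{p^2} element as a + b*i are assumptions made for the example, not taken from the text above.

```go
package main

import "fmt"

// Assumption: 751 bits fit in twelve 64-bit limbs.
const fp751NumWords = 12

// fp751Element is raw storage: limbs with no interpretation attached.
type fp751Element [fp751NumWords]uint64

// PrimeFieldElement wraps the raw limbs and interprets them as an
// element of F_p in Montgomery form.
type PrimeFieldElement struct {
	A fp751Element
}

// ExtensionFieldElement holds two F_p coordinates, read here as a + b*i
// (assumed representation of F_{p^2}).
type ExtensionFieldElement struct {
	A, B fp751Element
}

func main() {
	var x PrimeFieldElement
	var y ExtensionFieldElement
	fmt.Println(len(x.A), len(y.A), len(y.B))
}
```

Keeping the raw type separate means low-level routines can take *fp751Element without committing to whether the limbs are in Montgomery form.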
For an assembly-implemented function to operate on some data, it's necessary
to pass pointers to the data into the assembly implementation. However, when
the Go compiler sees a pointer passed into a function whose body it cannot
inspect, it cannot prove that the function does not keep the pointer. It
therefore assumes that the data escapes the local scope, and allocates it on
the heap. To avoid allocations in the hot path, mark the assembly function
declarations with //go:noescape, which tells the compiler that the inputs
don't escape.
This ports the fpadd751_asm and fpsub751_asm functions from the MSR
implementation to Go, and adds property tests and benchmarks. I chose these
functions to start with because they use no stack, so there's no need to
interact with Go's stack handling.
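To give an idea of the kind of property test meant here, the following is a minimal pure-Go sketch, not the ported assembly. It implements only the multi-limb carry chain (no reduction modulo the prime, unlike fpadd751_asm) and checks it against math/big on random inputs. The names element, addLimbs, toBig and checkAdd, and the little-endian 12-limb layout, are assumptions for the example.

```go
package main

import (
	"fmt"
	"math/big"
	"math/bits"
	"math/rand"
)

const numWords = 12 // assumption: twelve 64-bit limbs, least significant first

type element [numWords]uint64

// addLimbs sets z = x + y mod 2^768 and returns the final carry.
// It performs no reduction modulo the field prime.
func addLimbs(z, x, y *element) uint64 {
	var carry uint64
	for i := 0; i < numWords; i++ {
		z[i], carry = bits.Add64(x[i], y[i], carry)
	}
	return carry
}

// toBig converts little-endian limbs to a big.Int.
func toBig(x *element) *big.Int {
	n := new(big.Int)
	for i := numWords - 1; i >= 0; i-- {
		n.Lsh(n, 64)
		n.Or(n, new(big.Int).SetUint64(x[i]))
	}
	return n
}

// checkAdd reports whether addLimbs agrees with math/big on random inputs.
func checkAdd(trials int, seed int64) bool {
	rng := rand.New(rand.NewSource(seed))
	modulus := new(big.Int).Lsh(big.NewInt(1), 64*numWords)
	for t := 0; t < trials; t++ {
		var x, y, z element
		for i := range x {
			x[i], y[i] = rng.Uint64(), rng.Uint64()
		}
		carry := addLimbs(&z, &x, &y)
		want := new(big.Int).Add(toBig(&x), toBig(&y))
		wantCarry := new(big.Int).Rsh(want, 64*numWords).Uint64()
		want.Mod(want, modulus)
		if toBig(&z).Cmp(want) != 0 || carry != wantCarry {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("addLimbs matches math/big:", checkAdd(1000, 1))
}
```

The same reference-implementation comparison can be pointed at an assembly routine by swapping addLimbs for the assembly-backed function and adjusting the expected value for its reduction.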
Some care and trickery is required because Go's assembler silently rewrites
`MOVQ $0, AX` as `xor eax, eax`, which clears the carry flag (a plain `mov`
would leave the flags untouched). Otherwise the assembly closely follows the
MSR implementation, so the two are easy to diff.