7efbbf4745
cSIDH-511: ( #26 )
Implementation of Commutative Supersingular Isogeny Diffie-Hellman,
based on the "A Faster Way to the CSIDH" paper (2018/782).
* For fast isogeny calculation, the implementation converts a curve
from Montgomery to Edwards form. All calculations are done on the
Edwards curve and the result is then converted back to Montgomery form.
* As multiplication in the field Fp511 is the most expensive operation,
the implementation contains multiple multiplication routines. The
fastest one is written in assembly and uses the BMI2 and ADOX/ADCX
instructions available on modern CPUs. A slower implementation is
also included for older CPUs.
* Benchmarks (Intel SkyLake):
BenchmarkGeneratePrivate 6459 172213 ns/op 0 B/op 0 allocs/op
BenchmarkGenerateKeyPair 25 45800356 ns/op 0 B/op 0 allocs/op
BenchmarkValidate 297 3915983 ns/op 0 B/op 0 allocs/op
BenchmarkValidateRandom 184683 6231 ns/op 0 B/op 0 allocs/op
BenchmarkValidateGenerated 25 48481306 ns/op 0 B/op 0 allocs/op
BenchmarkDerive 19 60928763 ns/op 0 B/op 0 allocs/op
BenchmarkDeriveGenerated 8 137342421 ns/op 0 B/op 0 allocs/op
BenchmarkXMul 2311 494267 ns/op 1 B/op 0 allocs/op
BenchmarkXAdd 2396754 501 ns/op 0 B/op 0 allocs/op
BenchmarkXDbl 2072690 571 ns/op 0 B/op 0 allocs/op
BenchmarkIsom 78004 15171 ns/op 0 B/op 0 allocs/op
BenchmarkFp512Sub 224635152 5.33 ns/op 0 B/op 0 allocs/op
BenchmarkFp512Mul 246633255 4.90 ns/op 0 B/op 0 allocs/op
BenchmarkCSwap 233228547 5.10 ns/op 0 B/op 0 allocs/op
BenchmarkAddRdc 87348240 12.6 ns/op 0 B/op 0 allocs/op
BenchmarkSubRdc 95112787 11.7 ns/op 0 B/op 0 allocs/op
BenchmarkModExpRdc 25436 46878 ns/op 0 B/op 0 allocs/op
BenchmarkMulBmiAsm 19527573 60.1 ns/op 0 B/op 0 allocs/op
BenchmarkMulGeneric 7117650 164 ns/op 0 B/op 0 allocs/op
* The Go code performs very similarly to the C implementation.
Results from sidh_torturer (4e2996e12d68364761064341cbe1d1b47efafe23)
github.com:henrydcase/sidh-torture/csidh
| TestName |Go | C |
|------------------|----------|----------|
|TestSharedSecret | 57.95774 | 57.91092 |
|TestKeyGeneration | 62.23614 | 58.12980 |
|TestSharedSecret | 55.28988 | 57.23132 |
|TestKeyGeneration | 61.68745 | 58.66396 |
|TestSharedSecret | 63.19408 | 58.64774 |
|TestKeyGeneration | 62.34022 | 61.62539 |
|TestSharedSecret | 62.85453 | 68.74503 |
|TestKeyGeneration | 52.58518 | 58.40115 |
|TestSharedSecret | 50.77081 | 61.91699 |
|TestKeyGeneration | 59.91843 | 61.09266 |
|TestSharedSecret | 59.97962 | 62.98151 |
|TestKeyGeneration | 64.57525 | 56.22863 |
|TestSharedSecret | 56.40521 | 55.77447 |
|TestKeyGeneration | 67.85850 | 58.52604 |
|TestSharedSecret | 60.54290 | 65.14052 |
|TestKeyGeneration | 65.45766 | 58.42823 |
On average, the Go implementation is 2% faster.
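As a rough illustration of the Montgomery/Edwards round trip described above, the sketch below uses the standard correspondence between a projective Montgomery coefficient (A : C) and twisted Edwards coefficients (a : d) = (A + 2C : A - 2C). The toy prime 419 and all function names are assumptions made for readability. This is not the code from the commit, which works over a 511-bit prime.

```go
package main

// p is a toy prime (4*3*5*7 - 1), an assumption chosen only to keep the
// arithmetic readable. All inputs below are expected to be reduced mod p.
const p uint64 = 419

func addMod(x, y uint64) uint64 { return (x + y) % p }
func subMod(x, y uint64) uint64 { return (x + p - y) % p }
func mulMod(x, y uint64) uint64 { return (x * y) % p }

// montToEdwards maps a projective Montgomery coefficient (A : C) to
// projective twisted Edwards coefficients (a : d) = (A + 2C : A - 2C).
func montToEdwards(A, C uint64) (a, d uint64) {
	twoC := addMod(C, C)
	return addMod(A, twoC), subMod(A, twoC)
}

// edwardsToMont maps (a : d) back to (A : C) = (2(a + d) : a - d).
func edwardsToMont(a, d uint64) (A, C uint64) {
	s := addMod(a, d)
	return addMod(s, s), subMod(a, d)
}

// sameProjective reports whether (A1 : C1) and (A2 : C2) describe the same
// projective coefficient, i.e. whether A1*C2 == A2*C1 mod p.
func sameProjective(A1, C1, A2, C2 uint64) bool {
	return mulMod(A1, C2) == mulMod(A2, C1)
}
```

Mapping there and back yields (4A : 4C), which is the same projective coefficient as (A : C), so the conversion itself needs no field inversion.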
2019-11-25 15:03:29 +00:00
15f6ee16b9
SHA-3: Fixes crash when cloning Shake state
2019-05-26 17:29:15 +01:00
9b3c0190b0
Updates P34 strategy calculation
2019-05-23 18:32:28 +01:00
7298b650cc
Adds go.mod ( #21 )
* Reset Makefile after adding go.mod
* Remove ``build`` directory
* Simplifies makefile
* shake: Make xorIn and copyOut platform specific
2019-05-15 18:03:35 +01:00
49bf0db8fd
SHAKE: Don't use function pointers ( #20 )
* The xorIn and copyOut function pointers cause input and output data
to be moved to the heap, which degrades the performance of calling code.
* This change removes those function pointers. The unaligned
implementation is now always used, as it is faster (but it may crash
on some systems).
* Benchmark compares generic vs unaligned xorIn and copyOut
benchmark old ns/op new ns/op delta
BenchmarkPermutationFunction-4 463 815 +76.03%
BenchmarkShake128_MTU-4 4443 8180 +84.11%
BenchmarkShake256_MTU-4 4739 9060 +91.18%
BenchmarkShake256_16x-4 71886 132629 +84.50%
BenchmarkShake256_1MiB-4 3695138 6649012 +79.94%
BenchmarkCShake128_448_16x-4 21210 24611 +16.03%
BenchmarkCShake128_1MiB-4 3009342 3396496 +12.87%
BenchmarkCShake256_448_16x-4 26034 27785 +6.73%
BenchmarkCShake256_1MiB-4 3654713 3829404 +4.78%
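For readers unfamiliar with the two helpers, here is a minimal sketch of what a portable (generic) xorIn and copyOut could look like when called directly rather than through function-valued fields. The names come from the commit message. The bodies, the `[25]uint64` state type and the requirement that buffer lengths are multiples of 8 are assumptions for illustration, not the repository's code.

```go
package main

import "encoding/binary"

// xorIn XORs buf into the leading 64-bit lanes of the Keccak state,
// reading each lane as little-endian. len(buf) is assumed to be a
// multiple of 8 and at most 200 bytes.
func xorIn(state *[25]uint64, buf []byte) {
	for i := 0; i < len(buf)/8; i++ {
		state[i] ^= binary.LittleEndian.Uint64(buf[8*i:])
	}
}

// copyOut writes the leading lanes of the state into out as
// little-endian bytes. len(out) is assumed to be a multiple of 8
// and at most 200 bytes.
func copyOut(state *[25]uint64, out []byte) {
	for i := 0; i < len(out)/8; i++ {
		binary.LittleEndian.PutUint64(out[8*i:], state[i])
	}
}
```

This is the portable variant only. The unaligned variant that the commit settles on is not shown here.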
2019-05-14 17:08:33 +01:00
e6439f96ab
Adds cSHAKE with 0 alloc interface ( #19 )
2019-05-14 01:19:29 +01:00
6f9706df01
CTR-DRBG: Use hardware acceleration on X86 ( #18 )
benchmark old ns/op new ns/op delta
BenchmarkInit-4 3403 397 -88.33%
BenchmarkRead-4 14535 1560 -89.27%
2019-04-09 23:50:21 +01:00
71624cdc4c
Improvements to makefile
2019-04-09 17:30:30 +01:00
b184944242
Nits for SIDH
2019-04-09 17:09:34 +01:00
08f7315b64
DRBG: Speed improvements
* CTR-DRBG doesn't call "NewCipher" for block encryption
* Changes the CTR-DRBG API so that the read operation implements io.Reader
Benchmark results:
----------------------
benchmark old ns/op new ns/op delta
BenchmarkInit-4 1118 3579 +220.13%
BenchmarkRead-4 5343 14589 +173.05%
benchmark old allocs new allocs delta
BenchmarkInit-4 15 0 -100.00%
BenchmarkRead-4 67 0 -100.00%
benchmark old bytes new bytes delta
BenchmarkInit-4 1824 0 -100.00%
BenchmarkRead-4 9488 0 -100.00%
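To show the shape of an io.Reader-based generator that allocates nothing per call, here is a heavily simplified sketch. The type `ctrReader` and the constructor `newCtrReader` are hypothetical names. Building the block cipher once with crypto/aes is an assumption of this sketch (the commit may avoid "NewCipher" altogether), and the state update and reseeding required by a real CTR-DRBG are left out, so this must not be used to generate real randomness.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"io"
)

// ctrReader is a hypothetical, simplified counter-mode generator.
// It is NOT a complete CTR-DRBG: there is no update or reseed step.
type ctrReader struct {
	block   cipher.Block        // created once and reused by every Read
	counter [aes.BlockSize]byte // big-endian block counter
	scratch [aes.BlockSize]byte // reused output block, avoids allocations
}

// Compile-time check that *ctrReader satisfies io.Reader.
var _ io.Reader = (*ctrReader)(nil)

// newCtrReader builds the generator from a 16, 24 or 32 byte AES key.
func newCtrReader(key []byte) (*ctrReader, error) {
	b, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return &ctrReader{block: b}, nil
}

// increment adds one to the counter, treating it as a big-endian integer.
func (r *ctrReader) increment() {
	for i := len(r.counter) - 1; i >= 0; i-- {
		r.counter[i]++
		if r.counter[i] != 0 {
			return
		}
	}
}

// Read fills p with encrypted counter blocks and never returns an error.
func (r *ctrReader) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) {
		r.increment()
		r.block.Encrypt(r.scratch[:], r.counter[:])
		n += copy(p[n:], r.scratch[:])
	}
	return n, nil
}
```

Because the type satisfies io.Reader, it can be handed to any code that expects a byte source, for example io.ReadFull.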
2019-04-09 14:37:59 +01:00
e66cc99401
Improves comment
2019-02-19 14:44:11 +00:00
b47a731959
Run tests on ARM64 ( #11 )
2019-02-16 21:29:20 +00:00
90f8cba329
SIDH: Update ( #9 )
* Change license to BSD-3
* SIDH: Multiple developments
2018-12-03 23:07:01 +00:00
ea2ffa2d61
PERF: sidh-p503: Split sub and add into 2 uops instead of 3 ( #8 )
The performance improvement comes from the fact that on Skylake
"add mem, reg" splits into 2 uops: one arithmetic uop and one
for loading the value from memory.
However, changing the operand order to "add reg, mem" splits into 3 uops:
one for the arithmetic op, one for the load and an additional one for
storing the result back.
Using separate instructions for loading/storing helps to parallelize
execution (the load/store and the arithmetic instruction are executed
in parallel if possible).
For details, see: https://www.agner.org/optimize/instruction_tables.pdf
New: BenchmarkFp503StrongReduce-4 300000000 5.57 ns/op
Old: BenchmarkFp503StrongReduce-4 200000000 8.60 ns/op
This improves just one function, but more functions can be improved.
2018-11-18 20:57:29 +00:00
e9ddb6fb45
sidh/csidh: use SSE for performing CSWAP ( #6 )
* Makefile
* makefile: tools for profiling
* sidh: use SIMD for performing CSWAP
Loads data into 128-bit XMM registers and performs conditional swap.
This is probably less useful for SIDH, but will be useful for cSIDH
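The conditional swap mentioned above can be sketched in portable Go with a mask instead of a branch. The SIMD version in the commit applies the same idea to 128-bit XMM registers. The function name `cswap`, the 8-limb operand size (512 bits) and the `choice` parameter are assumptions for illustration.

```go
package main

// cswap swaps x and y when choice is 1 and leaves both untouched when
// choice is 0. It uses no data-dependent branch: the decision is turned
// into an all-zeros or all-ones mask that is applied to every limb.
func cswap(x, y *[8]uint64, choice uint8) {
	mask := -uint64(choice & 1) // 0x00...00 or 0xff...ff
	for i := range x {
		t := mask & (x[i] ^ y[i])
		x[i] ^= t
		y[i] ^= t
	}
}
```

Since every limb is read and written regardless of `choice`, the memory access pattern does not depend on the secret bit.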
2018-10-29 15:41:09 +00:00
a456dc4dd9
readme: License
2018-10-25 15:22:28 +01:00
ae57368c7b
License BS for sha3
2018-10-25 15:22:28 +01:00
00c16fe97e
License bullshit
2018-10-25 15:22:28 +01:00
65bbafeef5
script used for calculating sliding window strategy in SIDH P34
2018-10-25 15:22:28 +01:00
0531c3479b
Update README.md
2018-10-25 15:22:28 +01:00
1e34845d00
complete rewrite for SIDH and SIKE. adds p503 ( #5 )
2018-10-25 15:22:28 +01:00
d6fc82531f
Doc
2018-10-25 15:22:28 +01:00
b769c88767
Improves some comments and hardcodes precomputed value ( #4 )
* Improves some comments and hardcodes precomputed value
* Tests curve coefficients recovery
2018-10-25 15:22:28 +01:00
51688dc4bb
makefile: adds bench target
2018-10-25 15:18:54 +01:00
35e326cf2c
Merge branch 'master' of github.com:henrydcase/nobscrypto
2018-08-03 14:39:10 +01:00
10fb1a7164
x448: Export shared secret size
Changes x448Bytes variable to SharedSecretSize
2018-08-03 14:37:38 +01:00
c88bbf0f75
x448: Export shared secret size ( #3 )
Changes x448Bytes variable to SharedSecretSize
2018-08-03 14:36:45 +01:00
2ff456da90
Temporarily adds simple x448 implementation
2018-08-02 23:45:28 +01:00
fc932264c3
Merge pull request #2 from henrydcase/x448
Temporarily adds simple x448 implementation
2018-08-02 23:44:22 +01:00
22e3d2373f
adds code coverage
2018-07-31 20:26:50 +01:00
ddbd866ee5
additional comments
2018-07-31 20:21:32 +01:00
dc58ebcd23
makefile formatting
2018-07-31 19:14:49 +01:00
771516ce3f
fixes sike tests
2018-07-31 19:14:39 +01:00
2a25a09b4a
improves makefile
2018-07-31 18:20:27 +01:00
34805fc1fb
Improves Makefile
2018-07-31 18:00:55 +01:00
73c9938c59
Use ADCB instead of SBBL in checkLessThanThree238
2018-07-31 17:10:03 +01:00
958dae0be7
tls: git ignore
2018-07-27 17:11:53 +01:00
2fc873ca64
creates package ready to move to tls-tris
2018-07-27 00:38:21 +01:00
105532aa09
sidh: move p751 implementation to p751 folder
2018-07-27 00:09:34 +01:00
431c20d5ff
readme: sike/sidh
2018-07-23 23:23:34 +01:00
a4d12ceaae
adds SIKE and SIDH
2018-07-23 23:18:38 +01:00
bd9a3f2b6b
Temporarily change sha3 import location
2018-07-05 15:51:09 +01:00
4d0f3e5293
AES-256 CTR_DRBG
2018-06-24 09:50:06 +01:00
4b06c1b314
go fmt
2018-06-23 16:48:54 +01:00
8cf7cfdc8d
SM3 and cSHAKE
2018-06-23 16:34:45 +01:00
94bf28a208
first commit
2018-05-31 00:24:43 +01:00