Add MQDSS AVX2 implementations (#288)
* Add AVX2 version of mqdss
* Fix duplicate consistency
parent ea5a83f7a8
commit 90630db2eb
@@ -16,3 +16,12 @@ auxiliary-submitters:
implementations:
    - name: clean
      version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
    - name: avx2
      version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
      supported_platforms:
          - architecture: x86_64
            required_flags:
                - avx2
          - architecture: x86
            required_flags:
                - avx2
116
crypto_sign/mqdss-48/avx2/LICENSE
Normal file
@@ -0,0 +1,116 @@
|
||||
CC0 1.0 Universal
|
||||
|
||||
Statement of Purpose
|
||||
|
||||
The laws of most jurisdictions throughout the world automatically confer
|
||||
exclusive Copyright and Related Rights (defined below) upon the creator and
|
||||
subsequent owner(s) (each and all, an "owner") of an original work of
|
||||
authorship and/or a database (each, a "Work").
|
||||
|
||||
Certain owners wish to permanently relinquish those rights to a Work for the
|
||||
purpose of contributing to a commons of creative, cultural and scientific
|
||||
works ("Commons") that the public can reliably and without fear of later
|
||||
claims of infringement build upon, modify, incorporate in other works, reuse
|
||||
and redistribute as freely as possible in any form whatsoever and for any
|
||||
purposes, including without limitation commercial purposes. These owners may
|
||||
contribute to the Commons to promote the ideal of a free culture and the
|
||||
further production of creative, cultural and scientific works, or to gain
|
||||
reputation or greater distribution for their Work in part through the use and
|
||||
efforts of others.
|
||||
|
||||
For these and/or other purposes and motivations, and without any expectation
|
||||
of additional consideration or compensation, the person associating CC0 with a
|
||||
Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
|
||||
and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
|
||||
and publicly distribute the Work under its terms, with knowledge of his or her
|
||||
Copyright and Related Rights in the Work and the meaning and intended legal
|
||||
effect of CC0 on those rights.
|
||||
|
||||
1. Copyright and Related Rights. A Work made available under CC0 may be
|
||||
protected by copyright and related or neighboring rights ("Copyright and
|
||||
Related Rights"). Copyright and Related Rights include, but are not limited
|
||||
to, the following:
|
||||
|
||||
i. the right to reproduce, adapt, distribute, perform, display, communicate,
|
||||
and translate a Work;
|
||||
|
||||
ii. moral rights retained by the original author(s) and/or performer(s);
|
||||
|
||||
iii. publicity and privacy rights pertaining to a person's image or likeness
|
||||
depicted in a Work;
|
||||
|
||||
iv. rights protecting against unfair competition in regards to a Work,
|
||||
subject to the limitations in paragraph 4(a), below;
|
||||
|
||||
v. rights protecting the extraction, dissemination, use and reuse of data in
|
||||
a Work;
|
||||
|
||||
vi. database rights (such as those arising under Directive 96/9/EC of the
|
||||
European Parliament and of the Council of 11 March 1996 on the legal
|
||||
protection of databases, and under any national implementation thereof,
|
||||
including any amended or successor version of such directive); and
|
||||
|
||||
vii. other similar, equivalent or corresponding rights throughout the world
|
||||
based on applicable law or treaty, and any national implementations thereof.
|
||||
|
||||
2. Waiver. To the greatest extent permitted by, but not in contravention of,
|
||||
applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
|
||||
unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
|
||||
and Related Rights and associated claims and causes of action, whether now
|
||||
known or unknown (including existing as well as future claims and causes of
|
||||
action), in the Work (i) in all territories worldwide, (ii) for the maximum
|
||||
duration provided by applicable law or treaty (including future time
|
||||
extensions), (iii) in any current or future medium and for any number of
|
||||
copies, and (iv) for any purpose whatsoever, including without limitation
|
||||
commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
|
||||
the Waiver for the benefit of each member of the public at large and to the
|
||||
detriment of Affirmer's heirs and successors, fully intending that such Waiver
|
||||
shall not be subject to revocation, rescission, cancellation, termination, or
|
||||
any other legal or equitable action to disrupt the quiet enjoyment of the Work
|
||||
by the public as contemplated by Affirmer's express Statement of Purpose.
|
||||
|
||||
3. Public License Fallback. Should any part of the Waiver for any reason be
|
||||
judged legally invalid or ineffective under applicable law, then the Waiver
|
||||
shall be preserved to the maximum extent permitted taking into account
|
||||
Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
|
||||
is so judged Affirmer hereby grants to each affected person a royalty-free,
|
||||
non transferable, non sublicensable, non exclusive, irrevocable and
|
||||
unconditional license to exercise Affirmer's Copyright and Related Rights in
|
||||
the Work (i) in all territories worldwide, (ii) for the maximum duration
|
||||
provided by applicable law or treaty (including future time extensions), (iii)
|
||||
in any current or future medium and for any number of copies, and (iv) for any
|
||||
purpose whatsoever, including without limitation commercial, advertising or
|
||||
promotional purposes (the "License"). The License shall be deemed effective as
|
||||
of the date CC0 was applied by Affirmer to the Work. Should any part of the
|
||||
License for any reason be judged legally invalid or ineffective under
|
||||
applicable law, such partial invalidity or ineffectiveness shall not
|
||||
invalidate the remainder of the License, and in such case Affirmer hereby
|
||||
affirms that he or she will not (i) exercise any of his or her remaining
|
||||
Copyright and Related Rights in the Work or (ii) assert any associated claims
|
||||
and causes of action with respect to the Work, in either case contrary to
|
||||
Affirmer's express Statement of Purpose.
|
||||
|
||||
4. Limitations and Disclaimers.
|
||||
|
||||
a. No trademark or patent rights held by Affirmer are waived, abandoned,
|
||||
surrendered, licensed or otherwise affected by this document.
|
||||
|
||||
b. Affirmer offers the Work as-is and makes no representations or warranties
|
||||
of any kind concerning the Work, express, implied, statutory or otherwise,
|
||||
including without limitation warranties of title, merchantability, fitness
|
||||
for a particular purpose, non infringement, or the absence of latent or
|
||||
other defects, accuracy, or the present or absence of errors, whether or not
|
||||
discoverable, all to the greatest extent permissible under applicable law.
|
||||
|
||||
c. Affirmer disclaims responsibility for clearing rights of other persons
|
||||
that may apply to the Work or any use thereof, including without limitation
|
||||
any person's Copyright and Related Rights in the Work. Further, Affirmer
|
||||
disclaims responsibility for obtaining any necessary consents, permissions
|
||||
or other rights required for any use of the Work.
|
||||
|
||||
d. Affirmer understands and acknowledges that Creative Commons is not a
|
||||
party to this document and has no duty or obligation with respect to this
|
||||
CC0 or use of the Work.
|
||||
|
||||
For more information, please see
|
||||
<http://creativecommons.org/publicdomain/zero/1.0/>
|
22
crypto_sign/mqdss-48/avx2/Makefile
Normal file
@@ -0,0 +1,22 @@
# This Makefile can be used with GNU Make or BSD Make

LIB=libmqdss-48_avx2.a

HEADERS = params.h gf31.h mq.h api.h
OBJECTS = gf31.o mq.o sign.o

CFLAGS=-O3 -Wall -Wconversion -Wextra -Wpedantic -Wvla -Werror \
       -Wmissing-prototypes -Wredundant-decls -std=c99 -mavx2 \
       -I../../../common $(EXTRAFLAGS)

all: $(LIB)

%.o: %.c $(HEADERS)
	$(CC) $(CFLAGS) -c -o $@ $<

$(LIB): $(OBJECTS)
	$(AR) -r $@ $(OBJECTS)

clean:
	$(RM) $(OBJECTS)
	$(RM) $(LIB)
19
crypto_sign/mqdss-48/avx2/Makefile.Microsoft_nmake
Normal file
@@ -0,0 +1,19 @@
# This Makefile can be used with Microsoft Visual Studio's nmake using the command:
#    nmake /f Makefile.Microsoft_nmake

LIBRARY=libmqdss-48_avx2.lib
OBJECTS=gf31.obj mq.obj sign.obj

CFLAGS=/nologo /O2 /I ..\..\..\common /W4 /WX /arch:AVX2

all: $(LIBRARY)

# Make sure objects are recompiled if headers change.
$(OBJECTS): *.h

$(LIBRARY): $(OBJECTS)
	LIB.EXE /NOLOGO /WX /OUT:$@ $**

clean:
	-DEL $(OBJECTS)
	-DEL $(LIBRARY)
47
crypto_sign/mqdss-48/avx2/api.h
Normal file
@@ -0,0 +1,47 @@
|
||||
#ifndef PQCLEAN_MQDSS48_AVX2_API_H
|
||||
#define PQCLEAN_MQDSS48_AVX2_API_H
|
||||
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
|
||||
#define PQCLEAN_MQDSS48_AVX2_CRYPTO_ALGNAME "MQDSS-48"
|
||||
|
||||
#define PQCLEAN_MQDSS48_AVX2_CRYPTO_SECRETKEYBYTES 16
|
||||
#define PQCLEAN_MQDSS48_AVX2_CRYPTO_PUBLICKEYBYTES 46
|
||||
#define PQCLEAN_MQDSS48_AVX2_CRYPTO_BYTES 28400
|
||||
|
||||
/*
|
||||
* Generates an MQDSS key pair.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_keypair(
|
||||
uint8_t *pk, uint8_t *sk);
|
||||
|
||||
/**
|
||||
* Returns an array containing a detached signature.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_signature(
|
||||
uint8_t *sig, size_t *siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk);
|
||||
|
||||
/**
|
||||
* Verifies a detached signature and message under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_verify(
|
||||
const uint8_t *sig, size_t siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *pk);
|
||||
|
||||
/**
|
||||
* Returns an array containing the signature followed by the message.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign(
|
||||
uint8_t *sm, size_t *smlen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk);
|
||||
|
||||
/**
|
||||
* Verifies a given signature-message pair under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_open(
|
||||
uint8_t *m, size_t *mlen,
|
||||
const uint8_t *sm, size_t smlen, const uint8_t *pk);
|
||||
|
||||
#endif
|
123
crypto_sign/mqdss-48/avx2/gf31.c
Normal file
@@ -0,0 +1,123 @@
|
||||
#include "params.h"
|
||||
#include "fips202.h"
|
||||
#include "gf31.h"
|
||||
#include <immintrin.h>
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
||||
/* Given a vector of N elements in the range [0, 31], this reduces the elements
   to the range [0, 30] by mapping 31 to 0 (i.e. reduction mod 31) */
|
||||
void PQCLEAN_MQDSS48_AVX2_vgf31_unique(gf31 *out, gf31 *in) {
|
||||
__m256i x;
|
||||
__m256i _w31 = _mm256_set1_epi16(31);
|
||||
int i;
|
||||
|
||||
for (i = 0; i < (N >> 4); ++i) {
|
||||
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
|
||||
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
|
||||
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
|
||||
}
|
||||
}
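As a reading aid, the masking trick used above can be modelled one lane at a time in scalar C. This is only a sketch and not part of the commit; the name `gf31_unique_scalar` is hypothetical. A lane equal to 31 produces an all-ones compare mask, so XOR with `31 & mask` clears it to 0, while every other lane is XORed with 0 and left unchanged.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of the AVX2 routine:
   maps 31 to 0 and leaves values in [0, 30] unchanged. */
uint16_t gf31_unique_scalar(uint16_t x) {
    /* 0xFFFF when x == 31, 0x0000 otherwise (mirrors _mm256_cmpeq_epi16). */
    uint16_t mask = (uint16_t)(0u - (unsigned)(x == 31));
    /* XOR with 31 only where the mask is set (mirrors and + xor). */
    return (uint16_t)(x ^ (31u & mask));
}
```

For example, `gf31_unique_scalar(31)` yields 0 and `gf31_unique_scalar(30)` yields 30.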
|
||||
|
||||
/* This function acts on vectors with N gf31 elements.
   It performs one reduction step and guarantees output in [0, 30],
   but requires input to be in [0, 32768). */
|
||||
void PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in) {
|
||||
__m256i x;
|
||||
__m256i _w2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
|
||||
__m256i _w31 = _mm256_set1_epi16(31);
|
||||
int i;
|
||||
|
||||
for (i = 0; i < (N >> 4); ++i) {
|
||||
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
|
||||
x = _mm256_sub_epi16(x, _mm256_mullo_epi16(_w31, _mm256_mulhi_epi16(x, _w2114)));
|
||||
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
|
||||
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
|
||||
}
|
||||
}
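The reduction step above multiplies by 2114 and keeps the high 16 bits, which approximates division by 31 because 2114 * 31 = 65534 is just below 2^16. The following scalar sketch models a single lane under that reading; the function names are hypothetical and the code is not taken from the commit. For inputs in [0, 32768) the quotient estimate is either floor(x / 31) or one less, so the intermediate remainder lies in [0, 31] and the final masking step maps 31 to 0.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane; valid for x in [0, 32768), where the
   signed high multiply of the AVX2 code and this unsigned one agree. */
uint16_t gf31_shorten_unique_scalar(uint16_t x) {
    uint32_t q = ((uint32_t)x * 2114u) >> 16;   /* like _mm256_mulhi_epi16 */
    uint16_t r = (uint16_t)(x - 31u * q);       /* like mullo + sub: r in [0, 31] */
    uint16_t mask = (uint16_t)(0u - (unsigned)(r == 31));
    return (uint16_t)(r ^ (31u & mask));        /* map 31 to 0 */
}

/* Returns 1 if the model agrees with x % 31 on every input in [0, 32768). */
int gf31_reduce_agrees_with_mod(void) {
    uint32_t x;
    for (x = 0; x < 32768u; x++) {
        if (gf31_shorten_unique_scalar((uint16_t)x) != x % 31u) {
            return 0;
        }
    }
    return 1;
}
```

The exhaustive check is cheap enough to run as a sanity test whenever the constant or the input range is changed.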
|
||||
|
||||
/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
|
||||
them in a vector of 16-bit elements */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen) {
|
||||
size_t i = 0, j;
|
||||
shake256ctx shakestate;
|
||||
uint8_t shakeblock[SHAKE256_RATE];
|
||||
|
||||
shake256_absorb(&shakestate, seed, seedlen);
|
||||
|
||||
while (i < len) {
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
|
||||
if ((shakeblock[j] & 31) != 31) {
|
||||
out[i] = (shakeblock[j] & 31);
|
||||
i++;
|
||||
}
|
||||
}
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
}
|
||||
|
||||
/* Given a seed, samples len gf31 elements, transposed into signed range,
   i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
   This is used for the expansion of F, which wants packed elements. */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen) {
|
||||
size_t i = 0, j;
|
||||
shake256ctx shakestate;
|
||||
uint8_t shakeblock[SHAKE256_RATE];
|
||||
|
||||
shake256_absorb(&shakestate, seed, seedlen);
|
||||
|
||||
while (i < len) {
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
|
||||
if ((shakeblock[j] & 31) != 31) {
|
||||
out[i] = (signed char)((shakeblock[j] & 31) - 15);
|
||||
i++;
|
||||
}
|
||||
}
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
|
||||
}
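Both sampling routines above use rejection sampling: each output byte of the XOF is masked to 5 bits and the value 31 is discarded, so accepted samples are uniform over [0, 30]. The sketch below isolates that logic from SHAKE-256 by reading from a plain byte buffer. The helper names and the buffer-based interface are assumptions made for illustration; they do not exist in the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: consumes bytes from `stream` (standing in for
   SHAKE-256 output) and writes up to `len` accepted samples to `out`.
   Returns the number of samples actually written. */
size_t gf31_sample_from_bytes(uint16_t *out, size_t len,
                              const uint8_t *stream, size_t streamlen) {
    size_t i = 0, j;
    for (j = 0; j < streamlen && i < len; j++) {
        uint8_t v = (uint8_t)(stream[j] & 31);  /* keep the low 5 bits: [0, 31] */
        if (v != 31) {                          /* reject 31 to stay uniform */
            out[i++] = v;
        }
    }
    return i;
}

/* Signed variant: shifts an accepted sample from [0, 30] to [-15, 15]. */
signed char gf31_to_signed(uint16_t v) {
    return (signed char)((int)v - 15);
}
```

A caller that needs exactly `len` samples would keep requesting more stream bytes until the return values add up to `len`, which is what the `while (i < len)` loops in the source do one SHAKE block at a time.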
|
||||
|
||||
/* Unpacks an array of packed GF31 elements to one element per gf31.
|
||||
Assumes that there is sufficient empty space available at the end of the
|
||||
array to unpack. Can perform in-place. */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n) {
|
||||
size_t i;
|
||||
size_t j = ((n * 5) >> 3) - 1;
|
||||
unsigned int d = 0;
|
||||
|
||||
for (i = n; i > 0; i--) {
|
||||
out[i - 1] = (gf31)((in[j] >> d) & 31);
|
||||
d += 5;
|
||||
if (d > 8) {
|
||||
d -= 8;
|
||||
j--;
|
||||
out[i - 1] = (gf31)(out[i - 1] ^ ((in[j] << (5 - d)) & 31));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
   Assumes that the output buffer is large enough to hold the packed values.
   Can perform in-place. */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n) {
|
||||
unsigned int i = 0;
|
||||
unsigned int j;
|
||||
int d = 3;
|
||||
|
||||
/* There will be ceil(5n / 8) output blocks */
|
||||
memset(out, 0, (size_t)((5 * n + 7) & ~7U) >> 3);
|
||||
|
||||
for (j = 0; j < n; j++) {
|
||||
if (d < 0) {
|
||||
d += 8;
|
||||
out[i] = (uint8_t)((out[i] & (255 << (d - 3))) |
|
||||
((in[j] >> (8 - d)) & ~(255 << (d - 3))));
|
||||
i++;
|
||||
}
|
||||
out[i] = (uint8_t)((out[i] & ~(31 << d)) | ((in[j] << d) & (31 << d)));
|
||||
d -= 5;
|
||||
}
|
||||
}
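The packing format stores each element in 5 bits, so n elements occupy ceil(5n / 8) bytes. The sketch below shows one straightforward way to do this with a bit accumulator, most significant bit first. It is an illustration only: the names `pack5` and `unpack5` are hypothetical, the bit order is believed but not guaranteed to match the routines above, and unlike them it does not support in-place operation.

```c
#include <stddef.h>
#include <stdint.h>

/* Packs n 5-bit values MSB-first into ceil(5n / 8) bytes. */
void pack5(uint8_t *out, const uint16_t *in, size_t n) {
    uint32_t acc = 0;   /* holds the `bits` not-yet-written low bits */
    unsigned bits = 0;
    size_t i, k = 0;
    for (i = 0; i < n; i++) {
        acc = (acc << 5) | (in[i] & 31u);
        bits += 5;
        while (bits >= 8) {                 /* emit the top 8 pending bits */
            out[k++] = (uint8_t)(acc >> (bits - 8));
            bits -= 8;
        }
        acc &= (1u << bits) - 1u;           /* keep only the pending bits */
    }
    if (bits > 0) {                         /* zero-pad a final partial byte */
        out[k] = (uint8_t)(acc << (8 - bits));
    }
}

/* Unpacks n 5-bit values from an MSB-first packed byte array. */
void unpack5(uint16_t *out, const uint8_t *in, size_t n) {
    uint32_t acc = 0;
    unsigned bits = 0;
    size_t i, k = 0;
    for (i = 0; i < n; i++) {
        if (bits < 5) {                     /* refill with one more byte */
            acc = (acc << 8) | in[k++];
            bits += 8;
        }
        out[i] = (uint16_t)((acc >> (bits - 5)) & 31u);
        bits -= 5;
        acc &= (1u << bits) - 1u;
    }
}
```

With N = 48 the packed length is 30 bytes, which is the value `NPACKED_BYTES` evaluates to in `params.h`.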
|
36
crypto_sign/mqdss-48/avx2/gf31.h
Normal file
@@ -0,0 +1,36 @@
|
||||
#ifndef MQDSS_GF31_H
|
||||
#define MQDSS_GF31_H
|
||||
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
|
||||
typedef unsigned short gf31;
|
||||
|
||||
/* Given a vector of elements in the range [0, 31], this reduces the elements
   to the range [0, 30] by mapping 31 to 0 (i.e. reduction mod 31) */
|
||||
void PQCLEAN_MQDSS48_AVX2_vgf31_unique(gf31 *out, gf31 *in);
|
||||
|
||||
/* Given a vector of 16-bit integers in [0, 32768), this performs one reduction
   step and reduces the elements to the range [0, 30] (i.e. reduction mod 31) */
|
||||
void PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in);
|
||||
|
||||
/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
|
||||
them in a vector of 16-bit elements */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen);
|
||||
|
||||
/* Given a seed, samples len gf31 elements, transposed into signed range,
   i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
   This is used for the expansion of F, which wants packed elements. */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen);
|
||||
|
||||
/* Unpacks an array of packed GF31 elements to one element per gf31.
|
||||
Assumes that there is sufficient empty space available at the end of the
|
||||
array to unpack. Can perform in-place. */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n);
|
||||
|
||||
/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
   Assumes that the output buffer is large enough to hold the packed values.
   Can perform in-place. */
|
||||
void PQCLEAN_MQDSS48_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n);
|
||||
|
||||
#endif
|
251
crypto_sign/mqdss-48/avx2/mq.c
Normal file
@@ -0,0 +1,251 @@
|
||||
#include "mq.h"
|
||||
#include "params.h"
|
||||
#include <immintrin.h>
|
||||
#include <stdio.h>
|
||||
|
||||
static inline __m256i reduce_16(__m256i r, __m256i _w31, __m256i _w2114) {
|
||||
__m256i exp = _mm256_mulhi_epi16(r, _w2114);
|
||||
return _mm256_sub_epi16(r, _mm256_mullo_epi16(_w31, exp));
|
||||
}
|
||||
|
||||
/* Computes all products x_i * x_j, returns in reduced form */
|
||||
inline static
|
||||
void generate_quadratic_terms( unsigned char *xij, const gf31 *x ) {
|
||||
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
|
||||
__m256i mask_31 = _mm256_set1_epi16( 31 );
|
||||
__m256i xi[4];
|
||||
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
|
||||
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
|
||||
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
|
||||
xi[3] = _mm256_setzero_si256();
|
||||
|
||||
__m256i xixj[4];
|
||||
xixj[0] = _mm256_setzero_si256();
|
||||
xixj[1] = _mm256_setzero_si256();
|
||||
xixj[2] = _mm256_setzero_si256();
|
||||
xixj[3] = _mm256_setzero_si256();
|
||||
|
||||
int k = 0;
|
||||
for (int i = 0; i < 32; i++) {
|
||||
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
|
||||
k += i + 1;
|
||||
}
|
||||
|
||||
for (int i = 32; i < N; i++) {
|
||||
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
|
||||
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
|
||||
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
|
||||
k += i + 1;
|
||||
}
|
||||
}
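The routine above lays the products out row by row in a triangle: row i holds x_i * x_j for j = 0..i, and the offset advances by i + 1 per row, so the term (i, j) sits at index i(i + 1)/2 + j. A scalar sketch of that layout follows. It is an assumption-laden model rather than the commit's code: the name `quadratic_terms_scalar` is hypothetical, and it stores canonical residues via `% 31`, whereas the vector code applies a single reduction step and writes whole 32-byte vectors (which appears to be why its callers allocate some slack beyond the meaningful bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model: writes x[i] * x[j] mod 31 for all j <= i in
   row-major triangular order and returns the number of terms written,
   which is n * (n + 1) / 2. */
size_t quadratic_terms_scalar(uint8_t *xij, const uint16_t *x, size_t n) {
    size_t i, j, k = 0;
    for (i = 0; i < n; i++) {
        for (j = 0; j <= i; j++) {
            xij[k++] = (uint8_t)(((unsigned)x[i] * x[j]) % 31u);
        }
    }
    return k;
}
```

For N = 48 this gives 1176 terms, which matches the N * (N + 1) / 2 factor inside `F_LEN` in `params.h`.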
|
||||
|
||||
/* Computes all terms (x_i * y_j) + (x_j * y_i), returns in reduced form */
|
||||
inline static
|
||||
void generate_xiyj_p_xjyi_terms( unsigned char *xij, const gf31 *x, const gf31 *y ) {
|
||||
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
|
||||
__m256i mask_31 = _mm256_set1_epi16( 31 );
|
||||
__m256i xiyi[4];
|
||||
xiyi[0] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y)), 1 ));
|
||||
xiyi[1] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 16)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 16)), 1 ));
|
||||
xiyi[2] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 32)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 32)), 1 ));
|
||||
xiyi[3] = _mm256_setzero_si256();
|
||||
|
||||
__m256i xixj[4];
|
||||
xixj[0] = _mm256_setzero_si256();
|
||||
xixj[1] = _mm256_setzero_si256();
|
||||
xixj[2] = _mm256_setzero_si256();
|
||||
xixj[3] = _mm256_setzero_si256();
|
||||
|
||||
int k = 0;
|
||||
for (int i = 0; i < 32; i++) {
|
||||
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
|
||||
k += i + 1;
|
||||
}
|
||||
|
||||
for (int i = 32; i < N; i++) {
|
||||
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
|
||||
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
|
||||
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
|
||||
k += i + 1;
|
||||
}
|
||||
}
|
||||
|
||||
#define EVAL_YMM_0(xx) {\
|
||||
__m128i tmp = _mm256_castsi256_si128(xx); \
|
||||
for (int macro_i = 0; macro_i < 8; macro_i++) { \
|
||||
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
|
||||
tmp = _mm_srli_si128(tmp, 2); \
|
||||
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
|
||||
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
|
||||
F += 32; \
|
||||
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
|
||||
} \
|
||||
} \
|
||||
}
|
||||
|
||||
#define EVAL_YMM_1(xx) {\
|
||||
__m128i tmp = _mm256_extracti128_si256(xx, 1); \
|
||||
for (int macro_i = 0; macro_i < 8; macro_i++) { \
|
||||
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
|
||||
tmp = _mm_srli_si128(tmp, 2); \
|
||||
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
|
||||
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
|
||||
F += 32; \
|
||||
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
|
||||
} \
|
||||
} \
|
||||
}
|
||||
|
||||
#define REDUCE_(yy) { \
|
||||
(yy)[0] = reduce_16((yy)[0], mask_reduce, mask_2114); \
|
||||
(yy)[1] = reduce_16((yy)[1], mask_reduce, mask_2114); \
|
||||
(yy)[2] = reduce_16((yy)[2], mask_reduce, mask_2114); \
|
||||
}
|
||||
|
||||
/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
|
||||
in reduced 5-bit representation). Expects the coefficients in F to be in
|
||||
signed representation (i.e. [-15, 15], packed bytewise).
|
||||
Outputs M gf31 elements in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS48_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F) {
|
||||
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
|
||||
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);
|
||||
|
||||
__m256i xi[4];
|
||||
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
|
||||
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
|
||||
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
|
||||
xi[3] = _mm256_setzero_si256();
|
||||
|
||||
__m256i _zero = _mm256_setzero_si256();
|
||||
xi[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[0])), xi[0]);
|
||||
xi[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[1])), xi[1]);
|
||||
xi[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[2])), xi[2]);
|
||||
|
||||
__m256i x1 = _mm256_packs_epi16(xi[0], xi[1]);
|
||||
x1 = _mm256_permute4x64_epi64(x1, 0xd8); // 3,1,2,0
|
||||
__m256i x2 = _mm256_packs_epi16(xi[2], xi[3]);
|
||||
x2 = _mm256_permute4x64_epi64(x2, 0xd8); // 3,1,2,0
|
||||
|
||||
__m256i yy[M / 16];
|
||||
yy[0] = _zero;
|
||||
yy[1] = _zero;
|
||||
yy[2] = _zero;
|
||||
|
||||
EVAL_YMM_0(x1)
|
||||
EVAL_YMM_1(x1)
|
||||
EVAL_YMM_0(x2)
|
||||
REDUCE_(yy)
|
||||
|
||||
__m256i xixj[38];
|
||||
generate_quadratic_terms( (unsigned char *) xixj, x );
|
||||
for (int i = 0 ; i < 36 ; i += 2) {
|
||||
EVAL_YMM_0(xixj[i])
|
||||
EVAL_YMM_1(xixj[i])
|
||||
EVAL_YMM_0(xixj[i + 1])
|
||||
EVAL_YMM_1(xixj[i + 1])
|
||||
REDUCE_(yy)
|
||||
}
|
||||
EVAL_YMM_0(xixj[36]) {
|
||||
__m128i tmp = _mm256_extracti128_si256(xixj[36], 1);
|
||||
for (int i = 0; i < 4; i++) {
|
||||
__m256i _xi = _mm256_broadcastw_epi16(tmp);
|
||||
tmp = _mm_srli_si128(tmp, 2);
|
||||
for (int j = 0; j < (N / 16); j++) {
|
||||
__m256i coeff = _mm256_loadu_si256((__m256i const *) F);
|
||||
F += 32;
|
||||
yy[j] = _mm256_add_epi16(yy[j], _mm256_maddubs_epi16(_xi, coeff));
|
||||
}
|
||||
}
|
||||
}
|
||||
REDUCE_(yy)
|
||||
|
||||
yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
|
||||
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
|
||||
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);
|
||||
|
||||
for (int i = 0; i < (N / 16); ++i) {
|
||||
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
|
||||
}
|
||||
}
|
||||
|
||||
/* Evaluates the bilinear polar form of the MQ function (i.e. G) on a vector of
|
||||
N gf31 elements x (expected to be in reduced 5-bit representation). Expects
|
||||
the coefficients in F to be in signed representation (i.e. [-15, 15], packed
|
||||
bytewise). Outputs M gf31 elements in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS48_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F) {
|
||||
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
|
||||
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);
|
||||
__m256i _zero = _mm256_setzero_si256();
|
||||
|
||||
__m256i yy[(M / 16)];
|
||||
yy[0] = _zero;
|
||||
yy[1] = _zero;
|
||||
yy[2] = _zero;
|
||||
|
||||
F += N * M;
|
||||
|
||||
__m256i xixj[38];
|
||||
generate_xiyj_p_xjyi_terms( (unsigned char *) xixj, x, y );
|
||||
for (int i = 0 ; i < 36 ; i += 2) {
|
||||
EVAL_YMM_0(xixj[i])
|
||||
EVAL_YMM_1(xixj[i])
|
||||
EVAL_YMM_0(xixj[i + 1])
|
||||
EVAL_YMM_1(xixj[i + 1])
|
||||
REDUCE_(yy)
|
||||
}
|
||||
EVAL_YMM_0(xixj[36]) {
|
||||
__m128i tmp = _mm256_extracti128_si256(xixj[36], 1);
|
||||
for (int i = 0; i < 4; i++) {
|
||||
__m256i _xi = _mm256_broadcastw_epi16(tmp);
|
||||
tmp = _mm_srli_si128(tmp, 2);
|
||||
for (int j = 0; j < (N / 16); j++) {
|
||||
__m256i coeff = _mm256_loadu_si256((__m256i const *) F);
|
||||
F += 32;
|
||||
yy[j] = _mm256_add_epi16(yy[j], _mm256_maddubs_epi16(_xi, coeff));
|
||||
}
|
||||
}
|
||||
}
|
||||
REDUCE_(yy)
|
||||
|
||||
yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
|
||||
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
|
||||
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);
|
||||
|
||||
for (int i = 0; i < (N / 16); ++i) {
|
||||
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
|
||||
}
|
||||
}
|
18
crypto_sign/mqdss-48/avx2/mq.h
Normal file
@@ -0,0 +1,18 @@
|
||||
#ifndef MQDSS_MQ_H
|
||||
#define MQDSS_MQ_H
|
||||
|
||||
#include "gf31.h"
|
||||
|
||||
/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
|
||||
in reduced 5-bit representation). Expects the coefficients in F to be in
|
||||
signed representation (i.e. [-15, 15], packed bytewise).
|
||||
Outputs M gf31 elements in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS48_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F);
|
||||
|
||||
/* Evaluates the bilinear polar form of the MQ function (i.e. G) on a vector of
|
||||
N gf31 elements x (expected to be in reduced 5-bit representation). Expects
|
||||
the coefficients in F to be in signed representation (i.e. [-15, 15], packed
|
||||
bytewise). Outputs M gf31 elements in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS48_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F);
|
||||
|
||||
#endif
|
25
crypto_sign/mqdss-48/avx2/params.h
Normal file
@@ -0,0 +1,25 @@
|
||||
#ifndef MQDSS_PARAMS_H
|
||||
#define MQDSS_PARAMS_H
|
||||
|
||||
#define N 48
|
||||
#define M N
|
||||
#define F_LEN (M * (((N * (N + 1)) >> 1) + N)) /* Number of elements in F */
|
||||
|
||||
#define ROUNDS 184
|
||||
|
||||
/* Number of bytes that N, M and F_LEN elements require when packed into a byte
|
||||
array, 5-bit elements packed continuously. */
|
||||
/* Assumes N and M to be multiples of 8 */
|
||||
#define NPACKED_BYTES ((N * 5) >> 3)
|
||||
#define MPACKED_BYTES ((M * 5) >> 3)
|
||||
#define FPACKED_BYTES ((F_LEN * 5) >> 3)
|
||||
|
||||
#define HASH_BYTES 32
|
||||
#define SEED_BYTES 16
|
||||
#define PK_BYTES (SEED_BYTES + MPACKED_BYTES)
|
||||
#define SK_BYTES SEED_BYTES
|
||||
|
||||
// R, sigma_0, ROUNDS * (t1, r{0,1}, e1, c, rho)
|
||||
#define SIG_LEN (2 * HASH_BYTES + ROUNDS * (2*NPACKED_BYTES + MPACKED_BYTES + HASH_BYTES + HASH_BYTES))
|
||||
|
||||
#endif
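The derived sizes in this header can be checked by hand against the constants exported in `api.h`. The sketch below recomputes them with the same formulas; the `EX_`-prefixed macros and `ex_` functions are hypothetical names chosen to avoid clashing with the real header, and the block is a worked calculation rather than code from the commit.

```c
#include <stddef.h>

#define EX_N 48
#define EX_M EX_N
#define EX_ROUNDS 184
#define EX_HASH_BYTES 32
#define EX_SEED_BYTES 16

/* Bytes needed for `elems` 5-bit elements (elems assumed a multiple of 8). */
size_t ex_packed_bytes(size_t elems) {
    return (elems * 5) >> 3;
}

/* Number of coefficients in F: M * (N(N+1)/2 quadratic + N linear terms). */
size_t ex_f_len(void) {
    return (size_t)EX_M * ((((size_t)EX_N * (EX_N + 1)) >> 1) + EX_N);
}

/* Public key: seed followed by M packed elements. */
size_t ex_pk_bytes(void) {
    return EX_SEED_BYTES + ex_packed_bytes(EX_M);
}

/* Signature: two hashes plus, per round, two N-vectors, one M-vector
   and two hash-sized values. */
size_t ex_sig_len(void) {
    return 2 * EX_HASH_BYTES
           + EX_ROUNDS * (2 * ex_packed_bytes(EX_N) + ex_packed_bytes(EX_M)
                          + 2 * EX_HASH_BYTES);
}
```

Working through the numbers: 48 elements pack into 30 bytes, the public key is 16 + 30 = 46 bytes, and the signature is 64 + 184 * 154 = 28400 bytes, in agreement with `PQCLEAN_MQDSS48_AVX2_CRYPTO_PUBLICKEYBYTES` and `PQCLEAN_MQDSS48_AVX2_CRYPTO_BYTES`.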
|
389
crypto_sign/mqdss-48/avx2/sign.c
Normal file
@@ -0,0 +1,389 @@
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "api.h"
|
||||
#include "fips202.h"
|
||||
#include "gf31.h"
|
||||
#include "mq.h"
|
||||
#include "params.h"
|
||||
#include "randombytes.h"
|
||||
|
||||
/* Takes an array of len bytes and computes a hash digest.
|
||||
This is used as a hash function in the Fiat-Shamir transform. */
|
||||
static void H(unsigned char *out, const unsigned char *in, const size_t len) {
|
||||
shake256(out, HASH_BYTES, in, len);
|
||||
}
|
||||
|
||||
/* Takes two arrays of N packed elements and an array of M packed elements,
|
||||
and computes a HASH_BYTES commitment. */
|
||||
static void com_0(unsigned char *c,
|
||||
const unsigned char *rho,
|
||||
const unsigned char *inn, const unsigned char *inn2,
|
||||
const unsigned char *inm) {
|
||||
unsigned char buffer[HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES];
|
||||
memcpy(buffer, rho, HASH_BYTES);
|
||||
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
|
||||
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inn2, NPACKED_BYTES);
|
||||
memcpy(buffer + HASH_BYTES + 2 * NPACKED_BYTES, inm, MPACKED_BYTES);
|
||||
shake256(c, HASH_BYTES, buffer, HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES);
|
||||
}
|
||||
|
||||
/* Takes an array of N packed elements and an array of M packed elements,
|
||||
and computes a HASH_BYTES commitment. */
|
||||
static void com_1(unsigned char *c,
|
||||
const unsigned char *rho,
|
||||
const unsigned char *inn, const unsigned char *inm) {
|
||||
unsigned char buffer[HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES];
|
||||
memcpy(buffer, rho, HASH_BYTES);
|
||||
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
|
||||
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inm, MPACKED_BYTES);
|
||||
shake256(c, HASH_BYTES, buffer, HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES);
|
||||
}
|
||||
|
||||
/*
|
||||
* Generates an MQDSS key pair.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {
|
||||
signed char F[F_LEN];
|
||||
unsigned char skbuf[SEED_BYTES * 2];
|
||||
gf31 sk_gf31[N];
|
||||
gf31 pk_gf31[M];
|
||||
|
||||
// Expand sk to obtain a seed for F and the secret input s.
|
||||
// We also expand to obtain a value for sampling r0, t0 and e0 during
|
||||
// signature generation, but that is not relevant here.
|
||||
randombytes(sk, SEED_BYTES);
|
||||
shake256(skbuf, SEED_BYTES * 2, sk, SEED_BYTES);
|
||||
|
||||
memcpy(pk, skbuf, SEED_BYTES);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
|
||||
PQCLEAN_MQDSS48_AVX2_MQ(pk_gf31, sk_gf31, F);
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_unique(pk_gf31, pk_gf31);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns an array containing a detached signature.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_signature(
|
||||
uint8_t *sig, size_t *siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk) {
|
||||
|
||||
signed char F[F_LEN];
|
||||
unsigned char skbuf[SEED_BYTES * 4];
|
||||
gf31 pk_gf31[M];
|
||||
unsigned char pk[SEED_BYTES + MPACKED_BYTES];
|
||||
// Concatenated for convenient hashing.
|
||||
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
|
||||
unsigned char *D = D_sigma0_h0_sigma1;
|
||||
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
|
||||
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
|
||||
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
|
||||
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
|
||||
shake256ctx shakestate;
|
||||
unsigned char shakeblock[SHAKE256_RATE];
|
||||
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
|
||||
unsigned char rnd_seed[HASH_BYTES + SEED_BYTES];
|
||||
unsigned char rho[2 * ROUNDS * HASH_BYTES];
|
||||
unsigned char *rho0 = rho;
|
||||
unsigned char *rho1 = rho + ROUNDS * HASH_BYTES;
|
||||
gf31 sk_gf31[N];
|
||||
gf31 rnd[(2 * N + M) * ROUNDS]; // Concatenated for easy RNG.
|
||||
gf31 *r0 = rnd;
|
||||
gf31 *t0 = rnd + N * ROUNDS;
|
||||
gf31 *e0 = rnd + 2 * N * ROUNDS;
|
||||
gf31 r1[N * ROUNDS];
|
||||
gf31 t1[N * ROUNDS];
|
||||
gf31 e1[M * ROUNDS];
|
||||
gf31 gx[M * ROUNDS];
|
||||
unsigned char packbuf0[NPACKED_BYTES];
|
||||
unsigned char packbuf1[NPACKED_BYTES];
|
||||
unsigned char packbuf2[MPACKED_BYTES];
|
||||
unsigned char c[HASH_BYTES * ROUNDS * 2];
|
||||
gf31 alpha;
|
||||
int alpha_count = 0;
|
||||
int b;
|
||||
int i, j;
|
||||
shake256incctx state;
|
||||
|
||||
shake256(skbuf, SEED_BYTES * 4, sk, SEED_BYTES);
|
||||
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(F, F_LEN, skbuf, SEED_BYTES);
|
||||
|
||||
shake256_inc_init(&state);
|
||||
shake256_inc_absorb(&state, sk, SEED_BYTES);
|
||||
shake256_inc_absorb(&state, m, mlen);
|
||||
shake256_inc_finalize(&state);
|
||||
shake256_inc_squeeze(sig, HASH_BYTES, &state); // Compute R.
|
||||
shake256_inc_ctx_release(&state);
|
||||
|
||||
memcpy(pk, skbuf, SEED_BYTES);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
|
||||
PQCLEAN_MQDSS48_AVX2_MQ(pk_gf31, sk_gf31, F);
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_unique(pk_gf31, pk_gf31);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);
|
||||
|
||||
shake256_inc_init(&state);
|
||||
shake256_inc_absorb(&state, pk, PK_BYTES);
|
||||
shake256_inc_absorb(&state, sig, HASH_BYTES);
|
||||
shake256_inc_absorb(&state, m, mlen);
|
||||
shake256_inc_finalize(&state);
|
||||
shake256_inc_squeeze(D, HASH_BYTES, &state);
|
||||
shake256_inc_ctx_release(&state);
|
||||
|
||||
sig += HASH_BYTES; // Compensate for prefixed R.
|
||||
|
||||
memcpy(rnd_seed, skbuf + 2 * SEED_BYTES, SEED_BYTES);
|
||||
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
|
||||
shake256(rho, 2 * ROUNDS * HASH_BYTES, rnd_seed, SEED_BYTES + HASH_BYTES);
|
||||
|
||||
memcpy(rnd_seed, skbuf + 3 * SEED_BYTES, SEED_BYTES);
|
||||
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nrand(rnd, (2 * N + M) * ROUNDS, rnd_seed, SEED_BYTES + HASH_BYTES);
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
for (j = 0; j < N; j++) {
|
||||
r1[j + i * N] = (gf31)(31 + sk_gf31[j] - r0[j + i * N]);
|
||||
}
|
||||
PQCLEAN_MQDSS48_AVX2_G(gx + i * M, t0 + i * N, r1 + i * N, F);
|
||||
}
|
||||
for (i = 0; i < ROUNDS * M; i++) {
|
||||
gx[i] = (gf31)(gx[i] + e0[i]);
|
||||
}
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, r0 + i * N, N);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf1, t0 + i * N, N);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf2, e0 + i * M, M);
|
||||
com_0(c + HASH_BYTES * (2 * i + 0), rho0 + i * HASH_BYTES, packbuf0, packbuf1, packbuf2);
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(r1 + i * N, r1 + i * N);
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(gx + i * M, gx + i * M);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, r1 + i * N, N);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf1, gx + i * M, M);
|
||||
com_1(c + HASH_BYTES * (2 * i + 1), rho1 + i * HASH_BYTES, packbuf0, packbuf1);
|
||||
}
|
||||
|
||||
H(sigma0, c, HASH_BYTES * ROUNDS * 2); // Compute sigma_0.
|
||||
shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
|
||||
memcpy(h0, shakeblock, HASH_BYTES);
|
||||
|
||||
memcpy(sig, sigma0, HASH_BYTES);
|
||||
sig += HASH_BYTES; // Compensate for sigma_0.
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
do {
|
||||
alpha = shakeblock[alpha_count] & 31;
|
||||
alpha_count++;
|
||||
if (alpha_count == SHAKE256_RATE) {
|
||||
alpha_count = 0;
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
}
|
||||
} while (alpha == 31);
|
||||
for (j = 0; j < N; j++) {
|
||||
t1[i * N + j] = (gf31)(alpha * r0[j + i * N] - t0[j + i * N] + 31);
|
||||
}
|
||||
PQCLEAN_MQDSS48_AVX2_MQ(e1 + i * M, r0 + i * N, F);
|
||||
for (j = 0; j < N; j++) {
|
||||
e1[i * N + j] = (gf31)(alpha * e1[j + i * M] - e0[j + i * M] + 31);
|
||||
}
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(t1 + i * N, t1 + i * N);
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(e1 + i * N, e1 + i * N);
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(t1packed, t1, N * ROUNDS);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(e1packed, e1, M * ROUNDS);
|
||||
|
||||
memcpy(sig, t1packed, NPACKED_BYTES * ROUNDS);
|
||||
sig += NPACKED_BYTES * ROUNDS;
|
||||
memcpy(sig, e1packed, MPACKED_BYTES * ROUNDS);
|
||||
sig += MPACKED_BYTES * ROUNDS;
|
||||
|
||||
shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
b = (h1[(i >> 3)] >> (i & 7)) & 1;
|
||||
if (b == 0) {
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(sig, r0 + i * N, N);
|
||||
} else if (b == 1) {
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(sig, r1 + i * N, N);
|
||||
}
|
||||
memcpy(sig + NPACKED_BYTES, c + HASH_BYTES * (2 * i + (1 - b)), HASH_BYTES);
|
||||
memcpy(sig + NPACKED_BYTES + HASH_BYTES, rho + (i + b * ROUNDS) * HASH_BYTES, HASH_BYTES);
|
||||
sig += NPACKED_BYTES + 2 * HASH_BYTES;
|
||||
}
|
||||
|
||||
*siglen = SIG_LEN;
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Verifies a detached signature and message under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_verify(
|
||||
const uint8_t *sig, size_t siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *pk) {
|
||||
|
||||
gf31 r[N];
|
||||
gf31 t[N];
|
||||
gf31 e[M];
|
||||
signed char F[F_LEN];
|
||||
gf31 pk_gf31[M];
|
||||
// Concatenated for convenient hashing.
|
||||
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
|
||||
unsigned char *D = D_sigma0_h0_sigma1;
|
||||
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
|
||||
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
|
||||
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
|
||||
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
|
||||
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
|
||||
unsigned char c[HASH_BYTES * ROUNDS * 2];
|
||||
memset(c, 0, HASH_BYTES * 2);
|
||||
gf31 x[N];
|
||||
gf31 y[M];
|
||||
gf31 z[M];
|
||||
unsigned char packbuf0[NPACKED_BYTES];
|
||||
unsigned char packbuf1[MPACKED_BYTES];
|
||||
shake256ctx shakestate;
|
||||
unsigned char shakeblock[SHAKE256_RATE];
|
||||
int i, j;
|
||||
gf31 alpha;
|
||||
int alpha_count = 0;
|
||||
int b;
|
||||
shake256incctx state;
|
||||
|
||||
if (siglen != SIG_LEN) {
|
||||
return -1;
|
||||
}
|
||||
|
||||
shake256_inc_init(&state);
|
||||
shake256_inc_absorb(&state, pk, PK_BYTES);
|
||||
shake256_inc_absorb(&state, sig, HASH_BYTES);
|
||||
shake256_inc_absorb(&state, m, mlen);
|
||||
shake256_inc_finalize(&state);
|
||||
shake256_inc_squeeze(D, HASH_BYTES, &state);
|
||||
shake256_inc_ctx_release(&state);
|
||||
|
||||
sig += HASH_BYTES;
|
||||
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
|
||||
pk += SEED_BYTES;
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nunpack(pk_gf31, pk, M);
|
||||
|
||||
memcpy(sigma0, sig, HASH_BYTES);
|
||||
|
||||
shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
|
||||
memcpy(h0, shakeblock, HASH_BYTES);
|
||||
|
||||
sig += HASH_BYTES;
|
||||
|
||||
memcpy(t1packed, sig, ROUNDS * NPACKED_BYTES);
|
||||
sig += ROUNDS * NPACKED_BYTES;
|
||||
memcpy(e1packed, sig, ROUNDS * MPACKED_BYTES);
|
||||
sig += ROUNDS * MPACKED_BYTES;
|
||||
|
||||
shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
do {
|
||||
alpha = shakeblock[alpha_count] & 31;
|
||||
alpha_count++;
|
||||
if (alpha_count == SHAKE256_RATE) {
|
||||
alpha_count = 0;
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
}
|
||||
} while (alpha == 31);
|
||||
b = (h1[(i >> 3)] >> (i & 7)) & 1;
|
||||
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nunpack(r, sig, N);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nunpack(t, t1packed + NPACKED_BYTES * i, N);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_nunpack(e, e1packed + MPACKED_BYTES * i, M);
|
||||
|
||||
if (b == 0) {
|
||||
PQCLEAN_MQDSS48_AVX2_MQ(y, r, F);
|
||||
for (j = 0; j < N; j++) {
|
||||
x[j] = (gf31)(alpha * r[j] - t[j] + 31);
|
||||
}
|
||||
for (j = 0; j < N; j++) {
|
||||
y[j] = (gf31)(alpha * y[j] - e[j] + 31);
|
||||
}
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(x, x);
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(y, y);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, x, N);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf1, y, M);
|
||||
com_0(c + HASH_BYTES * (2 * i + 0), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0, packbuf1);
|
||||
} else {
|
||||
PQCLEAN_MQDSS48_AVX2_MQ(y, r, F);
|
||||
PQCLEAN_MQDSS48_AVX2_G(z, t, r, F);
|
||||
for (j = 0; j < N; j++) {
|
||||
y[j] = (gf31)(alpha * (31 + pk_gf31[j] - y[j]) - z[j] - e[j] + 62);
|
||||
}
|
||||
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(y, y);
|
||||
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, y, M);
|
||||
com_1(c + HASH_BYTES * (2 * i + 1), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0);
|
||||
}
|
||||
memcpy(c + HASH_BYTES * (2 * i + (1 - b)), sig + NPACKED_BYTES, HASH_BYTES);
|
||||
sig += NPACKED_BYTES + 2 * HASH_BYTES;
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
|
||||
H(c, c, HASH_BYTES * ROUNDS * 2);
|
||||
if (memcmp(c, sigma0, HASH_BYTES) != 0) {
|
||||
return -1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns an array containing the signature followed by the message.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign(
|
||||
uint8_t *sm, size_t *smlen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk) {
|
||||
size_t siglen;
|
||||
|
||||
PQCLEAN_MQDSS48_AVX2_crypto_sign_signature(
|
||||
sm, &siglen, m, mlen, sk);
|
||||
|
||||
memmove(sm + SIG_LEN, m, mlen);
|
||||
*smlen = siglen + mlen;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Verifies a given signature-message pair under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS48_AVX2_crypto_sign_open(
|
||||
uint8_t *m, size_t *mlen,
|
||||
const uint8_t *sm, size_t smlen, const uint8_t *pk) {
|
||||
/* The API caller does not necessarily know what size a signature should be
|
||||
but MQDSS signatures are always exactly SIG_LEN. */
|
||||
if (smlen < SIG_LEN) {
|
||||
memset(m, 0, smlen);
|
||||
*mlen = 0;
|
||||
return -1;
|
||||
}
|
||||
|
||||
*mlen = smlen - SIG_LEN;
|
||||
|
||||
if (PQCLEAN_MQDSS48_AVX2_crypto_sign_verify(
|
||||
sm, SIG_LEN, sm + SIG_LEN, *mlen, pk)) {
|
||||
memset(m, 0, smlen);
|
||||
*mlen = 0;
|
||||
return -1;
|
||||
}
|
||||
|
||||
/* If verification was successful, move the message to the right place. */
|
||||
memmove(m, sm + SIG_LEN, *mlen);
|
||||
|
||||
return 0;
|
||||
}
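Review note: the signing and verification loops in sign.c derive each round's challenge `alpha` by rejection sampling. A byte of SHAKE256 output is masked to 5 bits and discarded when the result is 31, so accepted values are uniform over [0, 30]. The sketch below isolates that step. The helper name `sample_gf31_from_bytes` and the fixed byte buffer standing in for the SHAKE256 stream are assumptions made for illustration and are not part of this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in the PR): fills out[0..len) with values in
   [0, 30] taken from the low 5 bits of successive stream bytes, skipping
   any byte whose low 5 bits equal 31. Returns the number of elements
   written, which is smaller than len if the stream runs out first. */
static size_t sample_gf31_from_bytes(uint16_t *out, size_t len,
                                     const uint8_t *stream, size_t streamlen) {
    size_t i = 0;
    for (size_t j = 0; j < streamlen && i < len; j++) {
        uint8_t v = (uint8_t)(stream[j] & 31);
        if (v != 31) { /* reject 31 so accepted values stay uniform mod 31 */
            out[i++] = v;
        }
    }
    return i;
}

/* Self-check on a fixed buffer (a stand-in for hash output). Returns 0. */
static int sample_gf31_demo(void) {
    const uint8_t stream[] = {0x1F, 0x3E, 0xFF, 0x20, 0x5D};
    uint16_t out[3];
    size_t got = sample_gf31_from_bytes(out, 3, stream, sizeof stream);
    /* 0x1F and 0xFF mask to 31 and are skipped. */
    assert(got == 3);
    assert(out[0] == 30); /* 0x3E & 31 */
    assert(out[1] == 0);  /* 0x20 & 31 */
    assert(out[2] == 29); /* 0x5D & 31 */
    return 0;
}
```

In the real code the byte stream is refilled from SHAKE256 whenever a block is exhausted, so the sampler never runs out of input. The sketch returns a short count instead.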
|
@ -1,4 +1,3 @@
|
||||
#include <assert.h>
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
@ -16,3 +16,12 @@ auxiliary-submitters:
|
||||
implementations:
|
||||
- name: clean
|
||||
version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
|
||||
- name: avx2
|
||||
version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
|
||||
supported_platforms:
|
||||
- architecture: x86_64
|
||||
required_flags:
|
||||
- avx2
|
||||
- architecture: x86
|
||||
required_flags:
|
||||
- avx2
|
||||
|
116
crypto_sign/mqdss-64/avx2/LICENSE
Normal file
@ -0,0 +1,116 @@
|
||||
CC0 1.0 Universal
|
||||
|
||||
Statement of Purpose
|
||||
|
||||
The laws of most jurisdictions throughout the world automatically confer
|
||||
exclusive Copyright and Related Rights (defined below) upon the creator and
|
||||
subsequent owner(s) (each and all, an "owner") of an original work of
|
||||
authorship and/or a database (each, a "Work").
|
||||
|
||||
Certain owners wish to permanently relinquish those rights to a Work for the
|
||||
purpose of contributing to a commons of creative, cultural and scientific
|
||||
works ("Commons") that the public can reliably and without fear of later
|
||||
claims of infringement build upon, modify, incorporate in other works, reuse
|
||||
and redistribute as freely as possible in any form whatsoever and for any
|
||||
purposes, including without limitation commercial purposes. These owners may
|
||||
contribute to the Commons to promote the ideal of a free culture and the
|
||||
further production of creative, cultural and scientific works, or to gain
|
||||
reputation or greater distribution for their Work in part through the use and
|
||||
efforts of others.
|
||||
|
||||
For these and/or other purposes and motivations, and without any expectation
|
||||
of additional consideration or compensation, the person associating CC0 with a
|
||||
Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
|
||||
and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
|
||||
and publicly distribute the Work under its terms, with knowledge of his or her
|
||||
Copyright and Related Rights in the Work and the meaning and intended legal
|
||||
effect of CC0 on those rights.
|
||||
|
||||
1. Copyright and Related Rights. A Work made available under CC0 may be
|
||||
protected by copyright and related or neighboring rights ("Copyright and
|
||||
Related Rights"). Copyright and Related Rights include, but are not limited
|
||||
to, the following:
|
||||
|
||||
i. the right to reproduce, adapt, distribute, perform, display, communicate,
|
||||
and translate a Work;
|
||||
|
||||
ii. moral rights retained by the original author(s) and/or performer(s);
|
||||
|
||||
iii. publicity and privacy rights pertaining to a person's image or likeness
|
||||
depicted in a Work;
|
||||
|
||||
iv. rights protecting against unfair competition in regards to a Work,
|
||||
subject to the limitations in paragraph 4(a), below;
|
||||
|
||||
v. rights protecting the extraction, dissemination, use and reuse of data in
|
||||
a Work;
|
||||
|
||||
vi. database rights (such as those arising under Directive 96/9/EC of the
|
||||
European Parliament and of the Council of 11 March 1996 on the legal
|
||||
protection of databases, and under any national implementation thereof,
|
||||
including any amended or successor version of such directive); and
|
||||
|
||||
vii. other similar, equivalent or corresponding rights throughout the world
|
||||
based on applicable law or treaty, and any national implementations thereof.
|
||||
|
||||
2. Waiver. To the greatest extent permitted by, but not in contravention of,
|
||||
applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
|
||||
unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
|
||||
and Related Rights and associated claims and causes of action, whether now
|
||||
known or unknown (including existing as well as future claims and causes of
|
||||
action), in the Work (i) in all territories worldwide, (ii) for the maximum
|
||||
duration provided by applicable law or treaty (including future time
|
||||
extensions), (iii) in any current or future medium and for any number of
|
||||
copies, and (iv) for any purpose whatsoever, including without limitation
|
||||
commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
|
||||
the Waiver for the benefit of each member of the public at large and to the
|
||||
detriment of Affirmer's heirs and successors, fully intending that such Waiver
|
||||
shall not be subject to revocation, rescission, cancellation, termination, or
|
||||
any other legal or equitable action to disrupt the quiet enjoyment of the Work
|
||||
by the public as contemplated by Affirmer's express Statement of Purpose.
|
||||
|
||||
3. Public License Fallback. Should any part of the Waiver for any reason be
|
||||
judged legally invalid or ineffective under applicable law, then the Waiver
|
||||
shall be preserved to the maximum extent permitted taking into account
|
||||
Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
|
||||
is so judged Affirmer hereby grants to each affected person a royalty-free,
|
||||
non transferable, non sublicensable, non exclusive, irrevocable and
|
||||
unconditional license to exercise Affirmer's Copyright and Related Rights in
|
||||
the Work (i) in all territories worldwide, (ii) for the maximum duration
|
||||
provided by applicable law or treaty (including future time extensions), (iii)
|
||||
in any current or future medium and for any number of copies, and (iv) for any
|
||||
purpose whatsoever, including without limitation commercial, advertising or
|
||||
promotional purposes (the "License"). The License shall be deemed effective as
|
||||
of the date CC0 was applied by Affirmer to the Work. Should any part of the
|
||||
License for any reason be judged legally invalid or ineffective under
|
||||
applicable law, such partial invalidity or ineffectiveness shall not
|
||||
invalidate the remainder of the License, and in such case Affirmer hereby
|
||||
affirms that he or she will not (i) exercise any of his or her remaining
|
||||
Copyright and Related Rights in the Work or (ii) assert any associated claims
|
||||
and causes of action with respect to the Work, in either case contrary to
|
||||
Affirmer's express Statement of Purpose.
|
||||
|
||||
4. Limitations and Disclaimers.
|
||||
|
||||
a. No trademark or patent rights held by Affirmer are waived, abandoned,
|
||||
surrendered, licensed or otherwise affected by this document.
|
||||
|
||||
b. Affirmer offers the Work as-is and makes no representations or warranties
|
||||
of any kind concerning the Work, express, implied, statutory or otherwise,
|
||||
including without limitation warranties of title, merchantability, fitness
|
||||
for a particular purpose, non infringement, or the absence of latent or
|
||||
other defects, accuracy, or the present or absence of errors, whether or not
|
||||
discoverable, all to the greatest extent permissible under applicable law.
|
||||
|
||||
c. Affirmer disclaims responsibility for clearing rights of other persons
|
||||
that may apply to the Work or any use thereof, including without limitation
|
||||
any person's Copyright and Related Rights in the Work. Further, Affirmer
|
||||
disclaims responsibility for obtaining any necessary consents, permissions
|
||||
or other rights required for any use of the Work.
|
||||
|
||||
d. Affirmer understands and acknowledges that Creative Commons is not a
|
||||
party to this document and has no duty or obligation with respect to this
|
||||
CC0 or use of the Work.
|
||||
|
||||
For more information, please see
|
||||
<http://creativecommons.org/publicdomain/zero/1.0/>
|
22
crypto_sign/mqdss-64/avx2/Makefile
Normal file
@ -0,0 +1,22 @@
|
||||
# This Makefile can be used with GNU Make or BSD Make
|
||||
|
||||
LIB=libmqdss-64_avx2.a
|
||||
|
||||
HEADERS = params.h gf31.h mq.h api.h
|
||||
OBJECTS = gf31.o mq.o sign.o
|
||||
|
||||
CFLAGS=-O3 -Wall -Wconversion -Wextra -Wpedantic -Wvla -Werror \
|
||||
-Wmissing-prototypes -Wredundant-decls -std=c99 -mavx2 \
|
||||
-I../../../common $(EXTRAFLAGS)
|
||||
|
||||
all: $(LIB)
|
||||
|
||||
%.o: %.c $(HEADERS)
|
||||
$(CC) $(CFLAGS) -c -o $@ $<
|
||||
|
||||
$(LIB): $(OBJECTS)
|
||||
$(AR) -r $@ $(OBJECTS)
|
||||
|
||||
clean:
|
||||
$(RM) $(OBJECTS)
|
||||
$(RM) $(LIB)
|
19
crypto_sign/mqdss-64/avx2/Makefile.Microsoft_nmake
Normal file
@ -0,0 +1,19 @@
|
||||
# This Makefile can be used with Microsoft Visual Studio's nmake using the command:
|
||||
# nmake /f Makefile.Microsoft_nmake
|
||||
|
||||
LIBRARY=libmqdss-64_avx2.lib
|
||||
OBJECTS=gf31.obj mq.obj sign.obj
|
||||
|
||||
CFLAGS=/nologo /O2 /I ..\..\..\common /W4 /WX /arch:AVX2
|
||||
|
||||
all: $(LIBRARY)
|
||||
|
||||
# Make sure objects are recompiled if headers change.
|
||||
$(OBJECTS): *.h
|
||||
|
||||
$(LIBRARY): $(OBJECTS)
|
||||
LIB.EXE /NOLOGO /WX /OUT:$@ $**
|
||||
|
||||
clean:
|
||||
-DEL $(OBJECTS)
|
||||
-DEL $(LIBRARY)
|
47
crypto_sign/mqdss-64/avx2/api.h
Normal file
@ -0,0 +1,47 @@
|
||||
#ifndef PQCLEAN_MQDSS64_AVX2_API_H
|
||||
#define PQCLEAN_MQDSS64_AVX2_API_H
|
||||
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
|
||||
#define PQCLEAN_MQDSS64_AVX2_CRYPTO_ALGNAME "MQDSS-64"
|
||||
|
||||
#define PQCLEAN_MQDSS64_AVX2_CRYPTO_SECRETKEYBYTES 24
|
||||
#define PQCLEAN_MQDSS64_AVX2_CRYPTO_PUBLICKEYBYTES 64
|
||||
#define PQCLEAN_MQDSS64_AVX2_CRYPTO_BYTES 59928
|
||||
|
||||
/*
|
||||
* Generates an MQDSS key pair.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_keypair(
|
||||
uint8_t *pk, uint8_t *sk);
|
||||
|
||||
/**
|
||||
* Returns an array containing a detached signature.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_signature(
|
||||
uint8_t *sig, size_t *siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk);
|
||||
|
||||
/**
|
||||
* Verifies a detached signature and message under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_verify(
|
||||
const uint8_t *sig, size_t siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *pk);
|
||||
|
||||
/**
|
||||
* Returns an array containing the signature followed by the message.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign(
|
||||
uint8_t *sm, size_t *smlen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk);
|
||||
|
||||
/**
|
||||
* Verifies a given signature-message pair under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_open(
|
||||
uint8_t *m, size_t *mlen,
|
||||
const uint8_t *sm, size_t smlen, const uint8_t *pk);
|
||||
|
||||
#endif
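Review note: the three size constants in api.h can be cross-checked against the size formulas used in the mqdss-48 params.h earlier in this diff. The mqdss-64 params.h is not visible in this excerpt, so the sketch assumes it uses the same formulas with N = M = 64, HASH_BYTES = 48, SEED_BYTES = 24 and ROUNDS = 277. These values are assumptions, chosen because they reproduce the numbers in api.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed mqdss-64 parameters (not shown in this diff excerpt). The derived
   sizes follow the formulas from the mqdss-48 params.h. */
enum {
    N = 64,
    M = 64,
    ROUNDS = 277,
    HASH_BYTES = 48,
    SEED_BYTES = 24,
    NPACKED_BYTES = (N * 5) >> 3, /* 5 bits per GF(31) element */
    MPACKED_BYTES = (M * 5) >> 3,
    PK_BYTES = SEED_BYTES + MPACKED_BYTES,
    SK_BYTES = SEED_BYTES,
    /* R, sigma_0, ROUNDS * (t1, r{0,1}, e1, c, rho) */
    SIG_LEN = 2 * HASH_BYTES
              + ROUNDS * (2 * NPACKED_BYTES + MPACKED_BYTES + 2 * HASH_BYTES)
};

/* Returns 1 if the derived sizes equal the constants declared in api.h. */
static int mqdss64_sizes_match_api(void) {
    return SK_BYTES == 24 && PK_BYTES == 64 && SIG_LEN == 59928;
}
```

Worked by hand: NPACKED_BYTES = MPACKED_BYTES = 40, so each round contributes 80 + 40 + 96 = 216 bytes, and 96 + 277 * 216 = 59928.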
|
128
crypto_sign/mqdss-64/avx2/gf31.c
Normal file
@ -0,0 +1,128 @@
|
||||
#include "params.h"
|
||||
#include "fips202.h"
|
||||
#include "gf31.h"
|
||||
#include <assert.h>
|
||||
#include <immintrin.h>
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
||||
/* Given a vector of N elements in the range [0, 31], this reduces the elements
|
||||
to the range [0, 30] by mapping 31 to 0 (i.e. reduction mod 31) */
|
||||
void PQCLEAN_MQDSS64_AVX2_vgf31_unique(gf31 *out, gf31 *in) {
|
||||
__m256i x;
|
||||
__m256i _w31 = _mm256_set1_epi16(31);
|
||||
int i;
|
||||
|
||||
for (i = 0; i < (N >> 4); ++i) {
|
||||
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
|
||||
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
|
||||
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
|
||||
}
|
||||
}
|
||||
|
||||
/* This function acts on vectors with 64 gf31 elements.
|
||||
It performs one reduction step and guarantees output in [0, 30],
|
||||
but requires input to be in [0, 32768). */
|
||||
void PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in) {
|
||||
__m256i x;
|
||||
__m256i _w2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
|
||||
__m256i _w31 = _mm256_set1_epi16(31);
|
||||
int i;
|
||||
|
||||
for (i = 0; i < (N >> 4); ++i) {
|
||||
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
|
||||
x = _mm256_sub_epi16(x, _mm256_mullo_epi16(_w31, _mm256_mulhi_epi16(x, _w2114)));
|
||||
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
|
||||
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
|
||||
}
|
||||
}
|
||||
|
||||
/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
|
||||
them in a vector of 16-bit elements */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen) {
|
||||
size_t i = 0, j;
|
||||
shake256ctx shakestate;
|
||||
uint8_t shakeblock[SHAKE256_RATE];
|
||||
|
||||
shake256_absorb(&shakestate, seed, seedlen);
|
||||
|
||||
while (i < len) {
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
|
||||
if ((shakeblock[j] & 31) != 31) {
|
||||
out[i] = (shakeblock[j] & 31);
|
||||
i++;
|
||||
}
|
||||
}
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
}
|
||||
|
||||
/* Given a seed, samples len gf31 elements, shifted into signed range,
|
||||
i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
|
||||
This is used for the expansion of F, which wants packed elements. */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen) {
|
||||
size_t i = 0, j;
|
||||
shake256ctx shakestate;
|
||||
uint8_t shakeblock[SHAKE256_RATE];
|
||||
|
||||
shake256_absorb(&shakestate, seed, seedlen);
|
||||
|
||||
while (i < len) {
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
|
||||
if ((shakeblock[j] & 31) != 31) {
|
||||
out[i] = (signed char)((shakeblock[j] & 31) - 15);
|
||||
i++;
|
||||
}
|
||||
}
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
|
||||
}
|
||||
|
||||
/* Unpacks an array of packed GF31 elements to one element per gf31.
|
||||
Assumes that there is sufficient empty space available at the end of the
|
||||
array to unpack. Can perform in-place. */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n) {
|
||||
size_t i;
|
||||
size_t j = ((n * 5) >> 3) - 1;
|
||||
unsigned int d = 0;
|
||||
|
||||
for (i = n; i > 0; i--) {
|
||||
out[i - 1] = (gf31)((in[j] >> d) & 31);
|
||||
d += 5;
|
||||
if (d > 8) {
|
||||
d -= 8;
|
||||
j--;
|
||||
out[i - 1] = (gf31)(out[i - 1] ^ ((in[j] << (5 - d)) & 31));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
|
||||
Assumes that there is sufficient space available to pack.
|
||||
Can perform in-place. */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n) {
|
||||
unsigned int i = 0;
|
||||
unsigned int j;
|
||||
int d = 3;
|
||||
|
||||
for (j = 0; j < n; j++) {
|
||||
assert(in[j] < 31);
|
||||
}
|
||||
|
||||
/* There will be ceil(5n / 8) output bytes */
|
||||
memset(out, 0, (size_t)((5 * n + 7) & ~7U) >> 3);
|
||||
|
||||
for (j = 0; j < n; j++) {
|
||||
if (d < 0) {
|
||||
d += 8;
|
||||
out[i] = (uint8_t)((out[i] & (255 << (d - 3))) |
|
||||
((in[j] >> (8 - d)) & ~(255 << (d - 3))));
|
||||
i++;
|
||||
}
|
||||
out[i] = (uint8_t)((out[i] & ~(31 << d)) | ((in[j] << d) & (31 << d)));
|
||||
d -= 5;
|
||||
}
|
||||
}
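Review note: `vgf31_shorten_unique` reduces mod 31 without a division. It multiplies by 2114 (which is floor(2^16 / 31)), keeps the high 16 bits as an estimate of the quotient, subtracts 31 times that estimate, and finally maps a leftover 31 to 0. The scalar model below is a sketch of that arithmetic for one element. The function names are hypothetical, and it uses an unsigned multiply, which agrees with the signed `_mm256_mulhi_epi16` only for inputs below 32768.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of the AVX2 reduction.
   Requires x < 32768; returns x mod 31 in [0, 30]. */
static uint16_t gf31_reduce_scalar(uint16_t x) {
    /* High 16 bits of x * 2114: equals floor(x / 31), or one less when x is
       a positive multiple of 31. */
    uint16_t q = (uint16_t)(((uint32_t)x * 2114u) >> 16);
    /* One reduction step leaves a value in [0, 31]. */
    uint16_t r = (uint16_t)(x - 31u * q);
    /* Map 31 to 0, mirroring the cmpeq/and/xor sequence in the vector code. */
    return (uint16_t)(r ^ (r == 31 ? 31u : 0u));
}

/* Exhaustive check over the supported input range. Returns 0 on success. */
static int gf31_reduce_check(void) {
    for (uint32_t x = 0; x < 32768u; x++) {
        if (gf31_reduce_scalar((uint16_t)x) != x % 31u) {
            return 1;
        }
    }
    return 0;
}
```

The input bound matters because 31 * 2114 = 65534, so the quotient estimate undershoots by 2x / (31 * 65536), which stays below 1/31 only while x < 32768.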
|
36
crypto_sign/mqdss-64/avx2/gf31.h
Normal file
@ -0,0 +1,36 @@
|
||||
#ifndef MQDSS_GF31_H
|
||||
#define MQDSS_GF31_H
|
||||
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
|
||||
typedef unsigned short gf31;
|
||||
|
||||
/* Given a vector of elements in the range [0, 31], this reduces the elements
|
||||
to the range [0, 30] by mapping 31 to 0 (i.e. reduction mod 31) */
|
||||
void PQCLEAN_MQDSS64_AVX2_vgf31_unique(gf31 *out, gf31 *in);
|
||||
|
||||
/* Given a vector of 16-bit integers in [0, 32767] (see gf31.c), this reduces the
|
||||
elements to the range [0, 30], mapping 31 to 0 (i.e. reduction mod 31) */
|
||||
void PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in);
|
||||
|
||||
/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
|
||||
them in a vector of 16-bit elements */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen);
|
||||
|
||||
/* Given a seed, samples len gf31 elements, shifted into signed range,
|
||||
i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
|
||||
This is used for the expansion of F, which wants packed elements. */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen);
|
||||
|
||||
/* Unpacks an array of packed GF31 elements to one element per gf31.
|
||||
Assumes that there is sufficient empty space available at the end of the
|
||||
array to unpack. Can perform in-place. */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n);
|
||||
|
||||
/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
|
||||
Assumes that there is sufficient space available to pack.
|
||||
Can perform in-place. */
|
||||
void PQCLEAN_MQDSS64_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n);
|
||||
|
||||
#endif
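Review note: `gf31_npack` and `gf31_nunpack` store each GF(31) element in 5 bits, so 8 elements occupy 5 bytes. The sketch below shows the same idea with a small bit accumulator, writing the most significant bits first. The names `pack5` and `unpack5` are hypothetical, and the code is an independent illustration rather than the routine from this commit. It is intended to follow the same MSB-first layout, but it has not been tested against the PR's functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: packs n values (each < 32) into ceil(5n / 8) bytes,
   most significant bits first. */
static void pack5(uint8_t *out, const uint16_t *in, size_t n) {
    uint32_t acc = 0; /* pending bits, right-aligned */
    int bits = 0;     /* number of pending bits in acc */
    size_t k = 0;
    for (size_t j = 0; j < n; j++) {
        acc = (acc << 5) | (in[j] & 31u);
        bits += 5;
        while (bits >= 8) {
            bits -= 8;
            out[k++] = (uint8_t)(acc >> bits);
            acc &= (1u << bits) - 1u;
        }
    }
    if (bits > 0) {
        out[k] = (uint8_t)(acc << (8 - bits)); /* left-align the tail */
    }
}

/* Hypothetical helper: inverse of pack5. Reads ceil(5n / 8) bytes. */
static void unpack5(uint16_t *out, const uint8_t *in, size_t n) {
    uint32_t acc = 0;
    int bits = 0;
    size_t k = 0;
    for (size_t j = 0; j < n; j++) {
        while (bits < 5) {
            acc = (acc << 8) | in[k++];
            bits += 8;
        }
        bits -= 5;
        out[j] = (uint16_t)((acc >> bits) & 31u);
        acc &= (1u << bits) - 1u;
    }
}

/* Round-trip check on 8 elements (5 bytes). Returns 0 on success. */
static int pack5_roundtrip_check(void) {
    const uint16_t in[8] = {30, 0, 0, 0, 0, 0, 0, 1};
    uint8_t packed[5];
    uint16_t back[8];
    pack5(packed, in, 8);
    unpack5(back, packed, 8);
    if (packed[0] != 0xF0 || packed[4] != 0x01) {
        return 1;
    }
    for (size_t i = 0; i < 8; i++) {
        if (back[i] != in[i]) {
            return 1;
        }
    }
    return 0;
}
```

Unlike the PR's `gf31_nunpack`, which walks backwards so that it can unpack in place, this sketch walks forwards and needs separate input and output buffers.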
|
239
crypto_sign/mqdss-64/avx2/mq.c
Normal file
@ -0,0 +1,239 @@
|
||||
#include "mq.h"
|
||||
#include "params.h"
|
||||
#include <immintrin.h>
|
||||
#include <stdio.h>
|
||||
|
||||
static inline __m256i reduce_16(__m256i r, __m256i _w31, __m256i _w2114) {
|
||||
__m256i exp = _mm256_mulhi_epi16(r, _w2114);
|
||||
return _mm256_sub_epi16(r, _mm256_mullo_epi16(_w31, exp));
|
||||
}
|
||||
|
||||
/* Computes all products x_i * x_j, returns in reduced form */
|
||||
inline static
|
||||
void generate_quadratic_terms( unsigned char *xij, const gf31 *x ) {
|
||||
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
|
||||
__m256i mask_31 = _mm256_set1_epi16( 31 );
|
||||
__m256i xi[4];
|
||||
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
|
||||
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
|
||||
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
|
||||
xi[3] = _mm256_loadu_si256((__m256i const *) (x + 48));
|
||||
|
||||
__m256i xixj[4];
|
||||
xixj[0] = _mm256_setzero_si256();
|
||||
xixj[1] = _mm256_setzero_si256();
|
||||
xixj[2] = _mm256_setzero_si256();
|
||||
xixj[3] = _mm256_setzero_si256();
|
||||
|
||||
int k = 0;
|
||||
for (int i = 0; i < 32; i++) {
|
||||
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
|
||||
k += i + 1;
|
||||
}
|
||||
|
||||
for (int i = 32; i < N; i++) {
|
||||
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
|
||||
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
|
||||
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
|
||||
k += i + 1;
|
||||
}
|
||||
}
|
||||
|
||||
/* Computes all terms (x_i * y_j) + (x_j * y_i), returns in reduced form */
|
||||
inline static
|
||||
void generate_xiyj_p_xjyi_terms( unsigned char *xij, const gf31 *x, const gf31 *y ) {
|
||||
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
|
||||
__m256i mask_31 = _mm256_set1_epi16( 31 );
|
||||
__m256i xiyi[4];
|
||||
xiyi[0] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y)), 1 ));
|
||||
xiyi[1] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 16)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 16)), 1 ));
|
||||
xiyi[2] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 32)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 32)), 1 ));
|
||||
xiyi[3] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 48)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 48)), 1 ));
|
||||
|
||||
__m256i xixj[4];
|
||||
xixj[0] = _mm256_setzero_si256();
|
||||
xixj[1] = _mm256_setzero_si256();
|
||||
xixj[2] = _mm256_setzero_si256();
|
||||
xixj[3] = _mm256_setzero_si256();
|
||||
|
||||
int k = 0;
|
||||
for (int i = 0; i < 32; i++) {
|
||||
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
|
||||
k += i + 1;
|
||||
}
|
||||
|
||||
for (int i = 32; i < N; i++) {
|
||||
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
|
||||
for (int j = 0; j <= (i >> 4); j++) {
|
||||
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
|
||||
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
|
||||
}
|
||||
|
||||
__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
|
||||
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
|
||||
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
|
||||
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
|
||||
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
|
||||
k += i + 1;
|
||||
}
|
||||
}
|
||||
|
||||
#define EVAL_YMM_0(xx) {\
|
||||
__m128i tmp = _mm256_castsi256_si128(xx); \
|
||||
for (int macro_i = 0; macro_i < 8; macro_i++) { \
|
||||
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
|
||||
tmp = _mm_srli_si128(tmp, 2); \
|
||||
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
|
||||
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
|
||||
F += 32; \
|
||||
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
|
||||
} \
|
||||
} \
|
||||
}
|
||||
|
||||
#define EVAL_YMM_1(xx) {\
|
||||
__m128i tmp = _mm256_extracti128_si256(xx, 1); \
|
||||
for (int macro_i = 0; macro_i < 8; macro_i++) { \
|
||||
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
|
||||
tmp = _mm_srli_si128(tmp, 2); \
|
||||
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
|
||||
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
|
||||
F += 32; \
|
||||
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
|
||||
} \
|
||||
} \
|
||||
}
|
||||
|
||||
#define REDUCE_(yy) { \
|
||||
(yy)[0] = reduce_16((yy)[0], mask_reduce, mask_2114); \
|
||||
(yy)[1] = reduce_16((yy)[1], mask_reduce, mask_2114); \
|
||||
(yy)[2] = reduce_16((yy)[2], mask_reduce, mask_2114); \
|
||||
(yy)[3] = reduce_16((yy)[3], mask_reduce, mask_2114); \
|
||||
}
|
||||
|
||||
|
||||
/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
|
||||
in reduced 5-bit representation). Expects the coefficients in F to be in
|
||||
signed representation (i.e. [-15, 15], packed bytewise).
|
||||
Outputs M gf31 elements in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS64_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F) {
|
||||
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
|
||||
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);
|
||||
|
||||
__m256i xi[4];
|
||||
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
|
||||
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
|
||||
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
|
||||
xi[3] = _mm256_loadu_si256((__m256i const *) (x + 48));
|
||||
|
||||
__m256i _zero = _mm256_setzero_si256();
|
||||
xi[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[0])), xi[0]);
|
||||
xi[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[1])), xi[1]);
|
||||
xi[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[2])), xi[2]);
|
||||
xi[3] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[3])), xi[3]);
|
||||
|
||||
__m256i x1 = _mm256_packs_epi16(xi[0], xi[1]);
|
||||
x1 = _mm256_permute4x64_epi64(x1, 0xd8); // 3,1,2,0
|
||||
__m256i x2 = _mm256_packs_epi16(xi[2], xi[3]);
|
||||
x2 = _mm256_permute4x64_epi64(x2, 0xd8); // 3,1,2,0
|
||||
|
||||
__m256i yy[M / 16];
|
||||
yy[0] = _zero;
|
||||
yy[1] = _zero;
|
||||
yy[2] = _zero;
|
||||
yy[3] = _zero;
|
||||
|
||||
EVAL_YMM_0(x1)
|
||||
EVAL_YMM_1(x1)
|
||||
EVAL_YMM_0(x2)
|
||||
EVAL_YMM_1(x2)
|
||||
REDUCE_(yy)
|
||||
|
||||
__m256i xixj[65];
|
||||
generate_quadratic_terms( (unsigned char *) xixj, x );
|
||||
for (int i = 0 ; i < 64 ; i += 2) {
|
||||
EVAL_YMM_0(xixj[i])
|
||||
EVAL_YMM_1(xixj[i])
|
||||
EVAL_YMM_0(xixj[i + 1])
|
||||
EVAL_YMM_1(xixj[i + 1])
|
||||
REDUCE_(yy)
|
||||
}
|
||||
EVAL_YMM_0(xixj[64])
|
||||
EVAL_YMM_1(xixj[64])
|
||||
REDUCE_(yy)
|
||||
|
||||
yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
|
||||
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
|
||||
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);
|
||||
yy[3] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[3])), yy[3]);
|
||||
|
||||
for (int i = 0; i < (M / 16); ++i) {
|
||||
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
|
||||
}
|
||||
}
|
||||
|
||||
/* Evaluates the bilinear polar form of the MQ function (i.e. G) on two
vectors of N gf31 elements x and y (expected to be in reduced 5-bit
representation). Expects the coefficients in F to be in signed
representation (i.e. [-15, 15], packed bytewise). Outputs M gf31 elements
in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS64_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F) {
|
||||
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
|
||||
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);
|
||||
__m256i _zero = _mm256_setzero_si256();
|
||||
|
||||
__m256i yy[(M / 16)];
|
||||
yy[0] = _zero;
|
||||
yy[1] = _zero;
|
||||
yy[2] = _zero;
|
||||
yy[3] = _zero;
|
||||
|
||||
F += N * M;
|
||||
|
||||
__m256i xixj[65];
|
||||
generate_xiyj_p_xjyi_terms( (unsigned char *) xixj, x, y );
|
||||
for (int i = 0 ; i < 64 ; i += 2) {
|
||||
EVAL_YMM_0(xixj[i])
|
||||
EVAL_YMM_1(xixj[i])
|
||||
EVAL_YMM_0(xixj[i + 1])
|
||||
EVAL_YMM_1(xixj[i + 1])
|
||||
REDUCE_(yy)
|
||||
}
|
||||
EVAL_YMM_0(xixj[64])
|
||||
EVAL_YMM_1(xixj[64])
|
||||
REDUCE_(yy)
|
||||
|
||||
yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
|
||||
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
|
||||
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);
|
||||
yy[3] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[3])), yy[3]);
|
||||
|
||||
for (int i = 0; i < (M / 16); ++i) {
|
||||
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
|
||||
}
|
||||
}
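The reduce_16 helper at the top of mq.c multiplies by 2114, which equals floor(2^16 / 31), keeps the high 16 bits as an approximate quotient, and subtracts 31 times that quotient. A scalar model of one 16-bit lane is sketched below. The names `sketch_mulhi16` and `sketch_reduce16` are hypothetical and this is an illustration of the arithmetic, not code from the repository.

```c
#include <stdint.h>

/* Floor of (a * b) / 2^16 for signed 16-bit inputs, i.e. what one lane of
   _mm256_mulhi_epi16 yields. Avoids right-shifting negative values. */
static int16_t sketch_mulhi16(int16_t a, int16_t b) {
    int32_t p = (int32_t)a * (int32_t)b;
    int32_t q = p / 65536;
    if (p % 65536 < 0) {
        q -= 1; /* C division truncates toward zero; adjust down to floor */
    }
    return (int16_t)q;
}

/* Scalar model of reduce_16: r - 31 * floor(r * 2114 / 2^16). */
static int16_t sketch_reduce16(int16_t r) {
    int16_t q = sketch_mulhi16(r, 2114);
    return (int16_t)(r - 31 * q);
}
```

The result is congruent to the input mod 31 but is not always in the unique range [0, 30]. Under this model an input of 62 comes out as 31, and -32768 comes out as -1, which is in line with the later steps in MQ and G that add 31 to negative lanes before storing.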
|
18
crypto_sign/mqdss-64/avx2/mq.h
Normal file
@ -0,0 +1,18 @@
|
||||
#ifndef MQDSS_MQ_H
|
||||
#define MQDSS_MQ_H
|
||||
|
||||
#include "gf31.h"
|
||||
|
||||
/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
|
||||
in reduced 5-bit representation). Expects the coefficients in F to be in
|
||||
signed representation (i.e. [-15, 15], packed bytewise).
|
||||
Outputs M gf31 elements in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS64_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F);
|
||||
|
||||
/* Evaluates the bilinear polar form of the MQ function (i.e. G) on two
vectors of N gf31 elements x and y (expected to be in reduced 5-bit
representation). Expects the coefficients in F to be in signed
representation (i.e. [-15, 15], packed bytewise). Outputs M gf31 elements
in unique 16-bit representation as fx. */
|
||||
void PQCLEAN_MQDSS64_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F);
|
||||
|
||||
#endif
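The header above describes G as the bilinear polar form of the MQ function, built from the terms (x_i * y_j) + (x_j * y_i). A small scalar sketch can make the relation G(x, y) = F(x + y) - F(x) - F(y) concrete. Everything in it is an assumption for illustration: the toy sizes, the coefficient tables `toy_quad` and `toy_lin`, and the function names are hypothetical, and the coefficient layout is not the packed layout that the AVX2 code expects.

```c
#include <stdint.h>

enum { TOY_N = 3, TOY_M = 2, TOY_Q = 31 };

/* Hypothetical toy coefficients in [0, 30]; toy_quad[k][i][j] is used only
   for j <= i. */
static const uint8_t toy_quad[TOY_M][TOY_N][TOY_N] = {
    {{1, 0, 0}, {2, 3, 0}, {4, 5, 6}},
    {{7, 0, 0}, {30, 9, 0}, {11, 0, 13}}
};
static const uint8_t toy_lin[TOY_M][TOY_N] = {{1, 2, 3}, {29, 0, 15}};

/* F(x): quadratic plus linear terms, reduced mod 31. */
static void toy_mq(uint8_t *fx, const uint8_t *x) {
    for (int k = 0; k < TOY_M; k++) {
        uint32_t acc = 0;
        for (int i = 0; i < TOY_N; i++) {
            acc += (uint32_t)toy_lin[k][i] * x[i];
            for (int j = 0; j <= i; j++) {
                acc += (uint32_t)toy_quad[k][i][j] * x[i] * x[j];
            }
        }
        fx[k] = (uint8_t)(acc % TOY_Q);
    }
}

/* G(x, y): same quadratic coefficients applied to x_i*y_j + x_j*y_i. */
static void toy_g(uint8_t *gxy, const uint8_t *x, const uint8_t *y) {
    for (int k = 0; k < TOY_M; k++) {
        uint32_t acc = 0;
        for (int i = 0; i < TOY_N; i++) {
            for (int j = 0; j <= i; j++) {
                uint32_t cross = (uint32_t)x[i] * y[j] + (uint32_t)x[j] * y[i];
                acc += (uint32_t)toy_quad[k][i][j] * cross;
            }
        }
        gxy[k] = (uint8_t)(acc % TOY_Q);
    }
}
```

The linear terms cancel in F(x + y) - F(x) - F(y), which is why only the quadratic coefficients appear in toy_g; this matches the way PQCLEAN_MQDSS64_AVX2_G skips the first N * M coefficients of F.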
|
25
crypto_sign/mqdss-64/avx2/params.h
Normal file
@ -0,0 +1,25 @@
|
||||
#ifndef MQDSS_PARAMS_H
|
||||
#define MQDSS_PARAMS_H
|
||||
|
||||
#define N 64
|
||||
#define M N
|
||||
#define F_LEN (M * (((N * (N + 1)) >> 1) + N)) /* Number of elements in F */
|
||||
|
||||
#define ROUNDS 277
|
||||
|
||||
/* Number of bytes that N, M and F_LEN elements require when packed into a byte
|
||||
array, 5-bit elements packed continuously. */
|
||||
/* Assumes N and M to be multiples of 8 */
|
||||
#define NPACKED_BYTES ((N * 5) >> 3)
|
||||
#define MPACKED_BYTES ((M * 5) >> 3)
|
||||
#define FPACKED_BYTES ((F_LEN * 5) >> 3)
|
||||
|
||||
#define HASH_BYTES 48
|
||||
#define SEED_BYTES 24
|
||||
#define PK_BYTES (SEED_BYTES + MPACKED_BYTES)
|
||||
#define SK_BYTES SEED_BYTES
|
||||
|
||||
// R, sigma_0, ROUNDS * (t1, r{0,1}, e1, c, rho)
|
||||
#define SIG_LEN (2 * HASH_BYTES + ROUNDS * (2*NPACKED_BYTES + MPACKED_BYTES + HASH_BYTES + HASH_BYTES))
|
||||
|
||||
#endif
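The derived sizes in params.h can be checked by evaluating the macros for the mqdss-64 values. The sketch below mirrors the formulas with hypothetical EX_ prefixed names so that it does not clash with the real macros; the numeric values in the comments are computed from the definitions above.

```c
/* Mirror of the params.h formulas under hypothetical EX_ names. */
enum {
    EX_N = 64,
    EX_M = EX_N,
    EX_ROUNDS = 277,
    EX_HASH_BYTES = 48,
    EX_SEED_BYTES = 24,
    /* 64 * (2080 + 64) = 137216 coefficients in F */
    EX_F_LEN = EX_M * (((EX_N * (EX_N + 1)) >> 1) + EX_N),
    EX_NPACKED_BYTES = (EX_N * 5) >> 3,     /* 40 */
    EX_MPACKED_BYTES = (EX_M * 5) >> 3,     /* 40 */
    EX_FPACKED_BYTES = (EX_F_LEN * 5) >> 3, /* 85760 */
    EX_PK_BYTES = EX_SEED_BYTES + EX_MPACKED_BYTES, /* 64 */
    EX_SK_BYTES = EX_SEED_BYTES,                    /* 24 */
    /* R, sigma_0, then per round: t1, r, e1, c, rho */
    EX_SIG_LEN = 2 * EX_HASH_BYTES
                 + EX_ROUNDS * (2 * EX_NPACKED_BYTES + EX_MPACKED_BYTES
                                + 2 * EX_HASH_BYTES) /* 59928 */
};
```

The signature therefore carries 216 bytes per round on top of the 96 bytes for R and sigma_0.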
|
389
crypto_sign/mqdss-64/avx2/sign.c
Normal file
@ -0,0 +1,389 @@
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "api.h"
|
||||
#include "fips202.h"
|
||||
#include "gf31.h"
|
||||
#include "mq.h"
|
||||
#include "params.h"
|
||||
#include "randombytes.h"
|
||||
|
||||
/* Takes an array of len bytes and computes a hash digest.
|
||||
This is used as a hash function in the Fiat-Shamir transform. */
|
||||
static void H(unsigned char *out, const unsigned char *in, const size_t len) {
|
||||
shake256(out, HASH_BYTES, in, len);
|
||||
}
|
||||
|
||||
/* Takes two arrays of N packed elements and an array of M packed elements,
|
||||
and computes a HASH_BYTES commitment. */
|
||||
static void com_0(unsigned char *c,
|
||||
const unsigned char *rho,
|
||||
const unsigned char *inn, const unsigned char *inn2,
|
||||
const unsigned char *inm) {
|
||||
unsigned char buffer[HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES];
|
||||
memcpy(buffer, rho, HASH_BYTES);
|
||||
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
|
||||
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inn2, NPACKED_BYTES);
|
||||
memcpy(buffer + HASH_BYTES + 2 * NPACKED_BYTES, inm, MPACKED_BYTES);
|
||||
shake256(c, HASH_BYTES, buffer, HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES);
|
||||
}
|
||||
|
||||
/* Takes an array of N packed elements and an array of M packed elements,
|
||||
and computes a HASH_BYTES commitment. */
|
||||
static void com_1(unsigned char *c,
|
||||
const unsigned char *rho,
|
||||
const unsigned char *inn, const unsigned char *inm) {
|
||||
unsigned char buffer[HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES];
|
||||
memcpy(buffer, rho, HASH_BYTES);
|
||||
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
|
||||
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inm, MPACKED_BYTES);
|
||||
shake256(c, HASH_BYTES, buffer, HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES);
|
||||
}
|
||||
|
||||
/*
|
||||
* Generates an MQDSS key pair.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {
|
||||
signed char F[F_LEN];
|
||||
unsigned char skbuf[SEED_BYTES * 2];
|
||||
gf31 sk_gf31[N];
|
||||
gf31 pk_gf31[M];
|
||||
|
||||
// Expand sk to obtain a seed for F and the secret input s.
|
||||
// We also expand to obtain a value for sampling r0, t0 and e0 during
|
||||
// signature generation, but that is not relevant here.
|
||||
randombytes(sk, SEED_BYTES);
|
||||
shake256(skbuf, SEED_BYTES * 2, sk, SEED_BYTES);
|
||||
|
||||
memcpy(pk, skbuf, SEED_BYTES);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
|
||||
PQCLEAN_MQDSS64_AVX2_MQ(pk_gf31, sk_gf31, F);
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_unique(pk_gf31, pk_gf31);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns an array containing a detached signature.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_signature(
|
||||
uint8_t *sig, size_t *siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk) {
|
||||
|
||||
signed char F[F_LEN];
|
||||
unsigned char skbuf[SEED_BYTES * 4];
|
||||
gf31 pk_gf31[M];
|
||||
unsigned char pk[SEED_BYTES + MPACKED_BYTES];
|
||||
// Concatenated for convenient hashing.
|
||||
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
|
||||
unsigned char *D = D_sigma0_h0_sigma1;
|
||||
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
|
||||
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
|
||||
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
|
||||
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
|
||||
shake256ctx shakestate;
|
||||
unsigned char shakeblock[SHAKE256_RATE];
|
||||
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
|
||||
unsigned char rnd_seed[HASH_BYTES + SEED_BYTES];
|
||||
unsigned char rho[2 * ROUNDS * HASH_BYTES];
|
||||
unsigned char *rho0 = rho;
|
||||
unsigned char *rho1 = rho + ROUNDS * HASH_BYTES;
|
||||
gf31 sk_gf31[N];
|
||||
gf31 rnd[(2 * N + M) * ROUNDS]; // Concatenated for easy RNG.
|
||||
gf31 *r0 = rnd;
|
||||
gf31 *t0 = rnd + N * ROUNDS;
|
||||
gf31 *e0 = rnd + 2 * N * ROUNDS;
|
||||
gf31 r1[N * ROUNDS];
|
||||
gf31 t1[N * ROUNDS];
|
||||
gf31 e1[M * ROUNDS];
|
||||
gf31 gx[M * ROUNDS];
|
||||
unsigned char packbuf0[NPACKED_BYTES];
|
||||
unsigned char packbuf1[NPACKED_BYTES];
|
||||
unsigned char packbuf2[MPACKED_BYTES];
|
||||
unsigned char c[HASH_BYTES * ROUNDS * 2];
|
||||
gf31 alpha;
|
||||
int alpha_count = 0;
|
||||
int b;
|
||||
int i, j;
|
||||
shake256incctx state;
|
||||
|
||||
shake256(skbuf, SEED_BYTES * 4, sk, SEED_BYTES);
|
||||
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(F, F_LEN, skbuf, SEED_BYTES);
|
||||
|
||||
shake256_inc_init(&state);
|
||||
shake256_inc_absorb(&state, sk, SEED_BYTES);
|
||||
shake256_inc_absorb(&state, m, mlen);
|
||||
shake256_inc_finalize(&state);
|
||||
shake256_inc_squeeze(sig, HASH_BYTES, &state); // Compute R.
|
||||
shake256_inc_ctx_release(&state);
|
||||
|
||||
memcpy(pk, skbuf, SEED_BYTES);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
|
||||
PQCLEAN_MQDSS64_AVX2_MQ(pk_gf31, sk_gf31, F);
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_unique(pk_gf31, pk_gf31);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);
|
||||
|
||||
shake256_inc_init(&state);
|
||||
shake256_inc_absorb(&state, pk, PK_BYTES);
|
||||
shake256_inc_absorb(&state, sig, HASH_BYTES);
|
||||
shake256_inc_absorb(&state, m, mlen);
|
||||
shake256_inc_finalize(&state);
|
||||
shake256_inc_squeeze(D, HASH_BYTES, &state);
|
||||
shake256_inc_ctx_release(&state);
|
||||
|
||||
sig += HASH_BYTES; // Compensate for prefixed R.
|
||||
|
||||
memcpy(rnd_seed, skbuf + 2 * SEED_BYTES, SEED_BYTES);
|
||||
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
|
||||
shake256(rho, 2 * ROUNDS * HASH_BYTES, rnd_seed, SEED_BYTES + HASH_BYTES);
|
||||
|
||||
memcpy(rnd_seed, skbuf + 3 * SEED_BYTES, SEED_BYTES);
|
||||
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nrand(rnd, (2 * N + M) * ROUNDS, rnd_seed, SEED_BYTES + HASH_BYTES);
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
for (j = 0; j < N; j++) {
|
||||
r1[j + i * N] = (gf31)(31 + sk_gf31[j] - r0[j + i * N]);
|
||||
}
|
||||
PQCLEAN_MQDSS64_AVX2_G(gx + i * M, t0 + i * N, r1 + i * N, F);
|
||||
}
|
||||
for (i = 0; i < ROUNDS * M; i++) {
|
||||
gx[i] = (gf31)(gx[i] + e0[i]);
|
||||
}
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, r0 + i * N, N);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf1, t0 + i * N, N);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf2, e0 + i * M, M);
|
||||
com_0(c + HASH_BYTES * (2 * i + 0), rho0 + i * HASH_BYTES, packbuf0, packbuf1, packbuf2);
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(r1 + i * N, r1 + i * N);
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(gx + i * M, gx + i * M);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, r1 + i * N, N);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf1, gx + i * M, M);
|
||||
com_1(c + HASH_BYTES * (2 * i + 1), rho1 + i * HASH_BYTES, packbuf0, packbuf1);
|
||||
}
|
||||
|
||||
H(sigma0, c, HASH_BYTES * ROUNDS * 2); // Compute sigma_0.
|
||||
shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
|
||||
memcpy(h0, shakeblock, HASH_BYTES);
|
||||
|
||||
memcpy(sig, sigma0, HASH_BYTES);
|
||||
sig += HASH_BYTES; // Compensate for sigma_0.
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
do {
|
||||
alpha = shakeblock[alpha_count] & 31;
|
||||
alpha_count++;
|
||||
if (alpha_count == SHAKE256_RATE) {
|
||||
alpha_count = 0;
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
}
|
||||
} while (alpha == 31);
|
||||
for (j = 0; j < N; j++) {
|
||||
t1[i * N + j] = (gf31)(alpha * r0[j + i * N] - t0[j + i * N] + 31);
|
||||
}
|
||||
PQCLEAN_MQDSS64_AVX2_MQ(e1 + i * M, r0 + i * N, F);
|
||||
for (j = 0; j < N; j++) {
|
||||
e1[i * N + j] = (gf31)(alpha * e1[j + i * M] - e0[j + i * M] + 31);
|
||||
}
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(t1 + i * N, t1 + i * N);
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(e1 + i * N, e1 + i * N);
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(t1packed, t1, N * ROUNDS);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(e1packed, e1, M * ROUNDS);
|
||||
|
||||
memcpy(sig, t1packed, NPACKED_BYTES * ROUNDS);
|
||||
sig += NPACKED_BYTES * ROUNDS;
|
||||
memcpy(sig, e1packed, MPACKED_BYTES * ROUNDS);
|
||||
sig += MPACKED_BYTES * ROUNDS;
|
||||
|
||||
shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
b = (h1[(i >> 3)] >> (i & 7)) & 1;
|
||||
if (b == 0) {
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(sig, r0 + i * N, N);
|
||||
} else if (b == 1) {
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(sig, r1 + i * N, N);
|
||||
}
|
||||
memcpy(sig + NPACKED_BYTES, c + HASH_BYTES * (2 * i + (1 - b)), HASH_BYTES);
|
||||
memcpy(sig + NPACKED_BYTES + HASH_BYTES, rho + (i + b * ROUNDS) * HASH_BYTES, HASH_BYTES);
|
||||
sig += NPACKED_BYTES + 2 * HASH_BYTES;
|
||||
}
|
||||
|
||||
*siglen = SIG_LEN;
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Verifies a detached signature and message under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_verify(
|
||||
const uint8_t *sig, size_t siglen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *pk) {
|
||||
|
||||
gf31 r[N];
|
||||
gf31 t[N];
|
||||
gf31 e[M];
|
||||
signed char F[F_LEN];
|
||||
gf31 pk_gf31[M];
|
||||
// Concatenated for convenient hashing.
|
||||
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
|
||||
unsigned char *D = D_sigma0_h0_sigma1;
|
||||
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
|
||||
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
|
||||
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
|
||||
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
|
||||
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
|
||||
unsigned char c[HASH_BYTES * ROUNDS * 2];
|
||||
memset(c, 0, HASH_BYTES * 2);
|
||||
gf31 x[N];
|
||||
gf31 y[M];
|
||||
gf31 z[M];
|
||||
unsigned char packbuf0[NPACKED_BYTES];
|
||||
unsigned char packbuf1[MPACKED_BYTES];
|
||||
shake256ctx shakestate;
|
||||
unsigned char shakeblock[SHAKE256_RATE];
|
||||
int i, j;
|
||||
gf31 alpha;
|
||||
int alpha_count = 0;
|
||||
int b;
|
||||
shake256incctx state;
|
||||
|
||||
if (siglen != SIG_LEN) {
|
||||
return -1;
|
||||
}
|
||||
|
||||
shake256_inc_init(&state);
|
||||
shake256_inc_absorb(&state, pk, PK_BYTES);
|
||||
shake256_inc_absorb(&state, sig, HASH_BYTES);
|
||||
shake256_inc_absorb(&state, m, mlen);
|
||||
shake256_inc_finalize(&state);
|
||||
shake256_inc_squeeze(D, HASH_BYTES, &state);
|
||||
shake256_inc_ctx_release(&state);
|
||||
|
||||
sig += HASH_BYTES;
|
||||
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
|
||||
pk += SEED_BYTES;
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nunpack(pk_gf31, pk, M);
|
||||
|
||||
memcpy(sigma0, sig, HASH_BYTES);
|
||||
|
||||
shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
|
||||
memcpy(h0, shakeblock, HASH_BYTES);
|
||||
|
||||
sig += HASH_BYTES;
|
||||
|
||||
memcpy(t1packed, sig, ROUNDS * NPACKED_BYTES);
|
||||
sig += ROUNDS * NPACKED_BYTES;
|
||||
memcpy(e1packed, sig, ROUNDS * MPACKED_BYTES);
|
||||
sig += ROUNDS * MPACKED_BYTES;
|
||||
|
||||
shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));
|
||||
|
||||
for (i = 0; i < ROUNDS; i++) {
|
||||
do {
|
||||
alpha = shakeblock[alpha_count] & 31;
|
||||
alpha_count++;
|
||||
if (alpha_count == SHAKE256_RATE) {
|
||||
alpha_count = 0;
|
||||
shake256_squeezeblocks(shakeblock, 1, &shakestate);
|
||||
}
|
||||
} while (alpha == 31);
|
||||
b = (h1[(i >> 3)] >> (i & 7)) & 1;
|
||||
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nunpack(r, sig, N);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nunpack(t, t1packed + NPACKED_BYTES * i, N);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_nunpack(e, e1packed + MPACKED_BYTES * i, M);
|
||||
|
||||
if (b == 0) {
|
||||
PQCLEAN_MQDSS64_AVX2_MQ(y, r, F);
|
||||
for (j = 0; j < N; j++) {
|
||||
x[j] = (gf31)(alpha * r[j] - t[j] + 31);
|
||||
}
|
||||
for (j = 0; j < N; j++) {
|
||||
y[j] = (gf31)(alpha * y[j] - e[j] + 31);
|
||||
}
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(x, x);
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(y, y);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, x, N);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf1, y, M);
|
||||
com_0(c + HASH_BYTES * (2 * i + 0), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0, packbuf1);
|
||||
} else {
|
||||
PQCLEAN_MQDSS64_AVX2_MQ(y, r, F);
|
||||
PQCLEAN_MQDSS64_AVX2_G(z, t, r, F);
|
||||
for (j = 0; j < N; j++) {
|
||||
y[j] = (gf31)(alpha * (31 + pk_gf31[j] - y[j]) - z[j] - e[j] + 62);
|
||||
}
|
||||
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(y, y);
|
||||
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, y, M);
|
||||
com_1(c + HASH_BYTES * (2 * i + 1), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0);
|
||||
}
|
||||
memcpy(c + HASH_BYTES * (2 * i + (1 - b)), sig + NPACKED_BYTES, HASH_BYTES);
|
||||
sig += NPACKED_BYTES + 2 * HASH_BYTES;
|
||||
}
|
||||
shake256_ctx_release(&shakestate);
|
||||
|
||||
H(c, c, HASH_BYTES * ROUNDS * 2);
|
||||
if (memcmp(c, sigma0, HASH_BYTES) != 0) {
|
||||
return -1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns an array containing the signature followed by the message.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign(
|
||||
uint8_t *sm, size_t *smlen,
|
||||
const uint8_t *m, size_t mlen, const uint8_t *sk) {
|
||||
size_t siglen;
|
||||
|
||||
PQCLEAN_MQDSS64_AVX2_crypto_sign_signature(
|
||||
sm, &siglen, m, mlen, sk);
|
||||
|
||||
memmove(sm + SIG_LEN, m, mlen);
|
||||
*smlen = siglen + mlen;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* Verifies a given signature-message pair under a given public key.
|
||||
*/
|
||||
int PQCLEAN_MQDSS64_AVX2_crypto_sign_open(
|
||||
uint8_t *m, size_t *mlen,
|
||||
const uint8_t *sm, size_t smlen, const uint8_t *pk) {
|
||||
/* The API caller does not necessarily know what size a signature should be
|
||||
but MQDSS signatures are always exactly SIG_LEN. */
|
||||
if (smlen < SIG_LEN) {
|
||||
memset(m, 0, smlen);
|
||||
*mlen = 0;
|
||||
return -1;
|
||||
}
|
||||
|
||||
*mlen = smlen - SIG_LEN;
|
||||
|
||||
if (PQCLEAN_MQDSS64_AVX2_crypto_sign_verify(
|
||||
sm, SIG_LEN, sm + SIG_LEN, *mlen, pk)) {
|
||||
memset(m, 0, smlen);
|
||||
*mlen = 0;
|
||||
return -1;
|
||||
}
|
||||
|
||||
/* If verification was successful, move the message to the right place. */
|
||||
memmove(m, sm + SIG_LEN, *mlen);
|
||||
|
||||
return 0;
|
||||
}
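Both the signing and verification loops in sign.c derive the per-round challenge alpha by masking a squeezed byte to 5 bits and rejecting the value 31, and they read the second challenge b as bit i of h1. The sketch below isolates those two steps. The helper names are hypothetical, and a fixed byte buffer stands in for the SHAKE256 output that the real code squeezes block by block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: draw count values in [0, 30] from a byte stream by
   masking each byte to 5 bits and rejecting 31. Returns the number of bytes
   consumed, or 0 if the stream runs out first. */
static size_t sketch_sample_alphas(uint8_t *alphas, size_t count,
                                   const uint8_t *stream, size_t streamlen) {
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        uint8_t a;
        do {
            if (pos == streamlen) {
                return 0; /* not enough input; real code squeezes more */
            }
            a = stream[pos++] & 31;
        } while (a == 31);
        alphas[i] = a;
    }
    return pos;
}

/* Bit i of the challenge string h1, least significant bit of each byte
   first, as in b = (h1[(i >> 3)] >> (i & 7)) & 1. */
static int sketch_challenge_bit(const uint8_t *h1, size_t i) {
    return (h1[i >> 3] >> (i & 7)) & 1;
}
```

Rejecting 31 keeps alpha uniform over the 31 field elements at the cost of discarding roughly one byte in 32. The h1 buffer size ((ROUNDS + 7) & ~7) >> 3 evaluates to 35 bytes for ROUNDS = 277, enough for one bit per round.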
|
@ -1,4 +1,3 @@
|
||||
#include <assert.h>
|
||||
#include <stddef.h>
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
20
test/duplicate_consistency/mqdss-48_clean.yml
Normal file
@ -0,0 +1,20 @@
|
||||
consistency_checks:
|
||||
- source:
|
||||
scheme: mqdss-48
|
||||
implementation: avx2
|
||||
files:
|
||||
- api.h
|
||||
- mq.h
|
||||
- LICENSE
|
||||
- mq.h
|
||||
- sign.c
|
||||
- params.h
|
||||
- source:
|
||||
scheme: mqdss-64
|
||||
implementation: clean
|
||||
files:
|
||||
- gf31.c
|
||||
- gf31.h
|
||||
- LICENSE
|
||||
- mq.c
|
||||
- mq.h
|
@ -9,3 +9,14 @@ consistency_checks:
|
||||
- mq.c
|
||||
- mq.h
|
||||
- sign.c
|
||||
- source:
|
||||
scheme: mqdss-64
|
||||
implementation: avx2
|
||||
files:
|
||||
- api.h
|
||||
- mq.h
|
||||
- LICENSE
|
||||
- mq.h
|
||||
- sign.c
|
||||
- params.h
|
||||
|
||||
|
@ -40,6 +40,7 @@ def test_testvectors(implementation, impl_path, test_dir, init, destr):
|
||||
implementation.name,
|
||||
'.exe' if os.name == 'nt' else ''
|
||||
))],
|
||||
print_output=False,
|
||||
).replace('\r', '')
|
||||
assert(implementation.scheme.metadata()['testvectors-sha256'].lower()
|
||||
== hashlib.sha256(out.encode('utf-8')).hexdigest().lower())
|
||||
|