Explorar el Código

Add MQDSS AVX2 implementations (#288)

* Add AVX2 version of mqdss

* Fix duplicate consistency
master
Thom Wiggers hace 4 años
committed by GitHub
padre
commit
90630db2eb
No se encontró ninguna clave conocida en la base de datos para esta firma ID de clave GPG: 4AEE18F83AFDEB23
Se han modificado 27 ficheros con 2135 adiciones y 2 borrados
  1. +9
    -0
      crypto_sign/mqdss-48/META.yml
  2. +116
    -0
      crypto_sign/mqdss-48/avx2/LICENSE
  3. +22
    -0
      crypto_sign/mqdss-48/avx2/Makefile
  4. +19
    -0
      crypto_sign/mqdss-48/avx2/Makefile.Microsoft_nmake
  5. +47
    -0
      crypto_sign/mqdss-48/avx2/api.h
  6. +123
    -0
      crypto_sign/mqdss-48/avx2/gf31.c
  7. +36
    -0
      crypto_sign/mqdss-48/avx2/gf31.h
  8. +251
    -0
      crypto_sign/mqdss-48/avx2/mq.c
  9. +18
    -0
      crypto_sign/mqdss-48/avx2/mq.h
  10. +25
    -0
      crypto_sign/mqdss-48/avx2/params.h
  11. +389
    -0
      crypto_sign/mqdss-48/avx2/sign.c
  12. +0
    -1
      crypto_sign/mqdss-48/clean/sign.c
  13. +9
    -0
      crypto_sign/mqdss-64/META.yml
  14. +116
    -0
      crypto_sign/mqdss-64/avx2/LICENSE
  15. +22
    -0
      crypto_sign/mqdss-64/avx2/Makefile
  16. +19
    -0
      crypto_sign/mqdss-64/avx2/Makefile.Microsoft_nmake
  17. +47
    -0
      crypto_sign/mqdss-64/avx2/api.h
  18. +128
    -0
      crypto_sign/mqdss-64/avx2/gf31.c
  19. +36
    -0
      crypto_sign/mqdss-64/avx2/gf31.h
  20. +239
    -0
      crypto_sign/mqdss-64/avx2/mq.c
  21. +18
    -0
      crypto_sign/mqdss-64/avx2/mq.h
  22. +25
    -0
      crypto_sign/mqdss-64/avx2/params.h
  23. +389
    -0
      crypto_sign/mqdss-64/avx2/sign.c
  24. +0
    -1
      crypto_sign/mqdss-64/clean/sign.c
  25. +20
    -0
      test/duplicate_consistency/mqdss-48_clean.yml
  26. +11
    -0
      test/duplicate_consistency/mqdss-64_clean.yml
  27. +1
    -0
      test/test_testvectors.py

+ 9
- 0
crypto_sign/mqdss-48/META.yml Ver fichero

@@ -16,3 +16,12 @@ auxiliary-submitters:
implementations:
- name: clean
version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
- name: avx2
version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
supported_platforms:
- architecture: x86_64
required_flags:
- avx2
- architecture: x86
required_flags:
- avx2

+ 116
- 0
crypto_sign/mqdss-48/avx2/LICENSE Ver fichero

@@ -0,0 +1,116 @@
CC0 1.0 Universal

Statement of Purpose

The laws of most jurisdictions throughout the world automatically confer
exclusive Copyright and Related Rights (defined below) upon the creator and
subsequent owner(s) (each and all, an "owner") of an original work of
authorship and/or a database (each, a "Work").

Certain owners wish to permanently relinquish those rights to a Work for the
purpose of contributing to a commons of creative, cultural and scientific
works ("Commons") that the public can reliably and without fear of later
claims of infringement build upon, modify, incorporate in other works, reuse
and redistribute as freely as possible in any form whatsoever and for any
purposes, including without limitation commercial purposes. These owners may
contribute to the Commons to promote the ideal of a free culture and the
further production of creative, cultural and scientific works, or to gain
reputation or greater distribution for their Work in part through the use and
efforts of others.

For these and/or other purposes and motivations, and without any expectation
of additional consideration or compensation, the person associating CC0 with a
Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
and publicly distribute the Work under its terms, with knowledge of his or her
Copyright and Related Rights in the Work and the meaning and intended legal
effect of CC0 on those rights.

1. Copyright and Related Rights. A Work made available under CC0 may be
protected by copyright and related or neighboring rights ("Copyright and
Related Rights"). Copyright and Related Rights include, but are not limited
to, the following:

i. the right to reproduce, adapt, distribute, perform, display, communicate,
and translate a Work;

ii. moral rights retained by the original author(s) and/or performer(s);

iii. publicity and privacy rights pertaining to a person's image or likeness
depicted in a Work;

iv. rights protecting against unfair competition in regards to a Work,
subject to the limitations in paragraph 4(a), below;

v. rights protecting the extraction, dissemination, use and reuse of data in
a Work;

vi. database rights (such as those arising under Directive 96/9/EC of the
European Parliament and of the Council of 11 March 1996 on the legal
protection of databases, and under any national implementation thereof,
including any amended or successor version of such directive); and

vii. other similar, equivalent or corresponding rights throughout the world
based on applicable law or treaty, and any national implementations thereof.

2. Waiver. To the greatest extent permitted by, but not in contravention of,
applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
and Related Rights and associated claims and causes of action, whether now
known or unknown (including existing as well as future claims and causes of
action), in the Work (i) in all territories worldwide, (ii) for the maximum
duration provided by applicable law or treaty (including future time
extensions), (iii) in any current or future medium and for any number of
copies, and (iv) for any purpose whatsoever, including without limitation
commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
the Waiver for the benefit of each member of the public at large and to the
detriment of Affirmer's heirs and successors, fully intending that such Waiver
shall not be subject to revocation, rescission, cancellation, termination, or
any other legal or equitable action to disrupt the quiet enjoyment of the Work
by the public as contemplated by Affirmer's express Statement of Purpose.

3. Public License Fallback. Should any part of the Waiver for any reason be
judged legally invalid or ineffective under applicable law, then the Waiver
shall be preserved to the maximum extent permitted taking into account
Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
is so judged Affirmer hereby grants to each affected person a royalty-free,
non transferable, non sublicensable, non exclusive, irrevocable and
unconditional license to exercise Affirmer's Copyright and Related Rights in
the Work (i) in all territories worldwide, (ii) for the maximum duration
provided by applicable law or treaty (including future time extensions), (iii)
in any current or future medium and for any number of copies, and (iv) for any
purpose whatsoever, including without limitation commercial, advertising or
promotional purposes (the "License"). The License shall be deemed effective as
of the date CC0 was applied by Affirmer to the Work. Should any part of the
License for any reason be judged legally invalid or ineffective under
applicable law, such partial invalidity or ineffectiveness shall not
invalidate the remainder of the License, and in such case Affirmer hereby
affirms that he or she will not (i) exercise any of his or her remaining
Copyright and Related Rights in the Work or (ii) assert any associated claims
and causes of action with respect to the Work, in either case contrary to
Affirmer's express Statement of Purpose.

4. Limitations and Disclaimers.

a. No trademark or patent rights held by Affirmer are waived, abandoned,
surrendered, licensed or otherwise affected by this document.

b. Affirmer offers the Work as-is and makes no representations or warranties
of any kind concerning the Work, express, implied, statutory or otherwise,
including without limitation warranties of title, merchantability, fitness
for a particular purpose, non infringement, or the absence of latent or
other defects, accuracy, or the present or absence of errors, whether or not
discoverable, all to the greatest extent permissible under applicable law.

c. Affirmer disclaims responsibility for clearing rights of other persons
that may apply to the Work or any use thereof, including without limitation
any person's Copyright and Related Rights in the Work. Further, Affirmer
disclaims responsibility for obtaining any necessary consents, permissions
or other rights required for any use of the Work.

d. Affirmer understands and acknowledges that Creative Commons is not a
party to this document and has no duty or obligation with respect to this
CC0 or use of the Work.

For more information, please see
<http://creativecommons.org/publicdomain/zero/1.0/>

+ 22
- 0
crypto_sign/mqdss-48/avx2/Makefile Ver fichero

@@ -0,0 +1,22 @@
# This Makefile can be used with GNU Make or BSD Make

LIB=libmqdss-48_avx2.a

HEADERS = params.h gf31.h mq.h api.h
OBJECTS = gf31.o mq.o sign.o

CFLAGS=-O3 -Wall -Wconversion -Wextra -Wpedantic -Wvla -Werror \
-Wmissing-prototypes -Wredundant-decls -std=c99 -mavx2 \
-I../../../common $(EXTRAFLAGS)

all: $(LIB)

%.o: %.c $(HEADERS)
$(CC) $(CFLAGS) -c -o $@ $<

$(LIB): $(OBJECTS)
$(AR) -r $@ $(OBJECTS)

clean:
$(RM) $(OBJECTS)
$(RM) $(LIB)

+ 19
- 0
crypto_sign/mqdss-48/avx2/Makefile.Microsoft_nmake Ver fichero

@@ -0,0 +1,19 @@
# This Makefile can be used with Microsoft Visual Studio's nmake using the command:
# nmake /f Makefile.Microsoft_nmake

LIBRARY=libmqdss-48_avx2.lib
OBJECTS=gf31.obj mq.obj sign.obj

CFLAGS=/nologo /O2 /I ..\..\..\common /W4 /WX /arch:AVX2

all: $(LIBRARY)

# Make sure objects are recompiled if headers change.
$(OBJECTS): *.h

$(LIBRARY): $(OBJECTS)
LIB.EXE /NOLOGO /WX /OUT:$@ $**

clean:
-DEL $(OBJECTS)
-DEL $(LIBRARY)

+ 47
- 0
crypto_sign/mqdss-48/avx2/api.h Ver fichero

@@ -0,0 +1,47 @@
#ifndef PQCLEAN_MQDSS48_AVX2_API_H
#define PQCLEAN_MQDSS48_AVX2_API_H

#include <stddef.h>
#include <stdint.h>

#define PQCLEAN_MQDSS48_AVX2_CRYPTO_ALGNAME "MQDSS-48"

#define PQCLEAN_MQDSS48_AVX2_CRYPTO_SECRETKEYBYTES 16
#define PQCLEAN_MQDSS48_AVX2_CRYPTO_PUBLICKEYBYTES 46
#define PQCLEAN_MQDSS48_AVX2_CRYPTO_BYTES 28400

/*
* Generates an MQDSS key pair.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_keypair(
uint8_t *pk, uint8_t *sk);

/**
* Returns an array containing a detached signature.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_signature(
uint8_t *sig, size_t *siglen,
const uint8_t *m, size_t mlen, const uint8_t *sk);

/**
* Verifies a detached signature and message under a given public key.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_verify(
const uint8_t *sig, size_t siglen,
const uint8_t *m, size_t mlen, const uint8_t *pk);

/**
* Returns an array containing the signature followed by the message.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign(
uint8_t *sm, size_t *smlen,
const uint8_t *m, size_t mlen, const uint8_t *sk);

/**
* Verifies a given signature-message pair under a given public key.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_open(
uint8_t *m, size_t *mlen,
const uint8_t *sm, size_t smlen, const uint8_t *pk);

#endif

+ 123
- 0
crypto_sign/mqdss-48/avx2/gf31.c Ver fichero

@@ -0,0 +1,123 @@
#include "params.h"
#include "fips202.h"
#include "gf31.h"
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Given a vector of N elements in the range [0, 31], this reduces the elements
to the range [0, 30] by mapping 31 to 0 (i.e reduction mod 31) */
void PQCLEAN_MQDSS48_AVX2_vgf31_unique(gf31 *out, gf31 *in) {
__m256i x;
__m256i _w31 = _mm256_set1_epi16(31);
int i;

for (i = 0; i < (N >> 4); ++i) {
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
}
}

/* This function acts on vectors with 64 gf31 elements.
It performs one reduction step and guarantees output in [0, 30],
but requires input to be in [0, 32768). */
void PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in) {
__m256i x;
__m256i _w2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
__m256i _w31 = _mm256_set1_epi16(31);
int i;

for (i = 0; i < (N >> 4); ++i) {
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
x = _mm256_sub_epi16(x, _mm256_mullo_epi16(_w31, _mm256_mulhi_epi16(x, _w2114)));
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
}
}

/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
them in a vector of 16-bit elements */
void PQCLEAN_MQDSS48_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen) {
size_t i = 0, j;
shake256ctx shakestate;
uint8_t shakeblock[SHAKE256_RATE];

shake256_absorb(&shakestate, seed, seedlen);

while (i < len) {
shake256_squeezeblocks(shakeblock, 1, &shakestate);
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
if ((shakeblock[j] & 31) != 31) {
out[i] = (shakeblock[j] & 31);
i++;
}
}
}
shake256_ctx_release(&shakestate);
}

/* Given a seed, samples len gf31 elements, transposed into unsigned range,
i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
This is used for the expansion of F, which wants packed elements. */
void PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen) {
size_t i = 0, j;
shake256ctx shakestate;
uint8_t shakeblock[SHAKE256_RATE];

shake256_absorb(&shakestate, seed, seedlen);

while (i < len) {
shake256_squeezeblocks(shakeblock, 1, &shakestate);
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
if ((shakeblock[j] & 31) != 31) {
out[i] = (signed char)((shakeblock[j] & 31) - 15);
i++;
}
}
}
shake256_ctx_release(&shakestate);

}

/* Unpacks an array of packed GF31 elements to one element per gf31.
Assumes that there is sufficient empty space available at the end of the
array to unpack. Can perform in-place. */
void PQCLEAN_MQDSS48_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n) {
size_t i;
size_t j = ((n * 5) >> 3) - 1;
unsigned int d = 0;

for (i = n; i > 0; i--) {
out[i - 1] = (gf31)((in[j] >> d) & 31);
d += 5;
if (d > 8) {
d -= 8;
j--;
out[i - 1] = (gf31)(out[i - 1] ^ ((in[j] << (5 - d)) & 31));
}
}
}

/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
Assumes that there is sufficient space available to unpack.
Can perform in-place. */
void PQCLEAN_MQDSS48_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n) {
unsigned int i = 0;
unsigned int j;
int d = 3;

/* There will be ceil(5n / 8) output blocks */
memset(out, 0, (size_t)((5 * n + 7) & ~7U) >> 3);

for (j = 0; j < n; j++) {
if (d < 0) {
d += 8;
out[i] = (uint8_t)((out[i] & (255 << (d - 3))) |
((in[j] >> (8 - d)) & ~(255 << (d - 3))));
i++;
}
out[i] = (uint8_t)((out[i] & ~(31 << d)) | ((in[j] << d) & (31 << d)));
d -= 5;
}
}

+ 36
- 0
crypto_sign/mqdss-48/avx2/gf31.h Ver fichero

@@ -0,0 +1,36 @@
#ifndef MQDSS_GF31_H
#define MQDSS_GF31_H

#include <stddef.h>
#include <stdint.h>

typedef unsigned short gf31;

/* Given a vector of elements in the range [0, 31], this reduces the elements
to the range [0, 30] by mapping 31 to 0 (i.e reduction mod 31) */
void PQCLEAN_MQDSS48_AVX2_vgf31_unique(gf31 *out, gf31 *in);

/* Given a vector of 16-bit integers (i.e. in [0, 65535], this reduces the
elements to the range [0, 30] by mapping 31 to 0 (i.e reduction mod 31) */
void PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in);

/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
them in a vector of 16-bit elements */
void PQCLEAN_MQDSS48_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen);

/* Given a seed, samples len gf31 elements, transposed into unsigned range,
i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
This is used for the expansion of F, which wants packed elements. */
void PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen);

/* Unpacks an array of packed GF31 elements to one element per gf31.
Assumes that there is sufficient empty space available at the end of the
array to unpack. Can perform in-place. */
void PQCLEAN_MQDSS48_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n);

/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
Assumes that there is sufficient space available to unpack.
Can perform in-place. */
void PQCLEAN_MQDSS48_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n);

#endif

+ 251
- 0
crypto_sign/mqdss-48/avx2/mq.c Ver fichero

@@ -0,0 +1,251 @@
#include "mq.h"
#include "params.h"
#include <immintrin.h>
#include <stdio.h>

static inline __m256i reduce_16(__m256i r, __m256i _w31, __m256i _w2114) {
__m256i exp = _mm256_mulhi_epi16(r, _w2114);
return _mm256_sub_epi16(r, _mm256_mullo_epi16(_w31, exp));
}

/* Computes all products x_i * x_j, returns in reduced form */
inline static
void generate_quadratic_terms( unsigned char *xij, const gf31 *x ) {
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
__m256i mask_31 = _mm256_set1_epi16( 31 );
__m256i xi[4];
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
xi[3] = _mm256_setzero_si256();

__m256i xixj[4];
xixj[0] = _mm256_setzero_si256();
xixj[1] = _mm256_setzero_si256();
xixj[2] = _mm256_setzero_si256();
xixj[3] = _mm256_setzero_si256();

int k = 0;
for (int i = 0; i < 32; i++) {
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
k += i + 1;
}

for (int i = 32; i < N; i++) {
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
k += i + 1;
}
}

/* Computes all terms (x_i * y_j) + (x_j * y_i), returns in reduced form */
inline static
void generate_xiyj_p_xjyi_terms( unsigned char *xij, const gf31 *x, const gf31 *y ) {
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
__m256i mask_31 = _mm256_set1_epi16( 31 );
__m256i xiyi[4];
xiyi[0] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y)), 1 ));
xiyi[1] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 16)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 16)), 1 ));
xiyi[2] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 32)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 32)), 1 ));
xiyi[3] = _mm256_setzero_si256();

__m256i xixj[4];
xixj[0] = _mm256_setzero_si256();
xixj[1] = _mm256_setzero_si256();
xixj[2] = _mm256_setzero_si256();
xixj[3] = _mm256_setzero_si256();

int k = 0;
for (int i = 0; i < 32; i++) {
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
k += i + 1;
}

for (int i = 32; i < N; i++) {
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
k += i + 1;
}
}

#define EVAL_YMM_0(xx) {\
__m128i tmp = _mm256_castsi256_si128(xx); \
for (int macro_i = 0; macro_i < 8; macro_i++) { \
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
tmp = _mm_srli_si128(tmp, 2); \
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
F += 32; \
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
} \
} \
}

#define EVAL_YMM_1(xx) {\
__m128i tmp = _mm256_extracti128_si256(xx, 1); \
for (int macro_i = 0; macro_i < 8; macro_i++) { \
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
tmp = _mm_srli_si128(tmp, 2); \
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
F += 32; \
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
} \
} \
}

#define REDUCE_(yy) { \
(yy)[0] = reduce_16((yy)[0], mask_reduce, mask_2114); \
(yy)[1] = reduce_16((yy)[1], mask_reduce, mask_2114); \
(yy)[2] = reduce_16((yy)[2], mask_reduce, mask_2114); \
}

/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
in reduced 5-bit representation). Expects the coefficients in F to be in
signed representation (i.e. [-15, 15], packed bytewise).
Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS48_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F) {
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);

__m256i xi[4];
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
xi[3] = _mm256_setzero_si256();

__m256i _zero = _mm256_setzero_si256();
xi[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[0])), xi[0]);
xi[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[1])), xi[1]);
xi[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[2])), xi[2]);

__m256i x1 = _mm256_packs_epi16(xi[0], xi[1]);
x1 = _mm256_permute4x64_epi64(x1, 0xd8); // 3,1,2,0
__m256i x2 = _mm256_packs_epi16(xi[2], xi[3]);
x2 = _mm256_permute4x64_epi64(x2, 0xd8); // 3,1,2,0

__m256i yy[M / 16];
yy[0] = _zero;
yy[1] = _zero;
yy[2] = _zero;

EVAL_YMM_0(x1)
EVAL_YMM_1(x1)
EVAL_YMM_0(x2)
REDUCE_(yy)

__m256i xixj[38];
generate_quadratic_terms( (unsigned char *) xixj, x );
for (int i = 0 ; i < 36 ; i += 2) {
EVAL_YMM_0(xixj[i])
EVAL_YMM_1(xixj[i])
EVAL_YMM_0(xixj[i + 1])
EVAL_YMM_1(xixj[i + 1])
REDUCE_(yy)
}
EVAL_YMM_0(xixj[36]) {
__m128i tmp = _mm256_extracti128_si256(xixj[36], 1);
for (int i = 0; i < 4; i++) {
__m256i _xi = _mm256_broadcastw_epi16(tmp);
tmp = _mm_srli_si128(tmp, 2);
for (int j = 0; j < (N / 16); j++) {
__m256i coeff = _mm256_loadu_si256((__m256i const *) F);
F += 32;
yy[j] = _mm256_add_epi16(yy[j], _mm256_maddubs_epi16(_xi, coeff));
}
}
}
REDUCE_(yy)

yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);

for (int i = 0; i < (N / 16); ++i) {
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
}
}

/* Evaluates the bilinear polar form of the MQ function (i.e. G) on a vector of
N gf31 elements x (expected to be in reduced 5-bit representation). Expects
the coefficients in F to be in signed representation (i.e. [-15, 15], packed
bytewise). Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS48_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F) {
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);
__m256i _zero = _mm256_setzero_si256();

__m256i yy[(M / 16)];
yy[0] = _zero;
yy[1] = _zero;
yy[2] = _zero;

F += N * M;

__m256i xixj[38];
generate_xiyj_p_xjyi_terms( (unsigned char *) xixj, x, y );
for (int i = 0 ; i < 36 ; i += 2) {
EVAL_YMM_0(xixj[i])
EVAL_YMM_1(xixj[i])
EVAL_YMM_0(xixj[i + 1])
EVAL_YMM_1(xixj[i + 1])
REDUCE_(yy)
}
EVAL_YMM_0(xixj[36]) {
__m128i tmp = _mm256_extracti128_si256(xixj[36], 1);
for (int i = 0; i < 4; i++) {
__m256i _xi = _mm256_broadcastw_epi16(tmp);
tmp = _mm_srli_si128(tmp, 2);
for (int j = 0; j < (N / 16); j++) {
__m256i coeff = _mm256_loadu_si256((__m256i const *) F);
F += 32;
yy[j] = _mm256_add_epi16(yy[j], _mm256_maddubs_epi16(_xi, coeff));
}
}
}
REDUCE_(yy)

yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);

for (int i = 0; i < (N / 16); ++i) {
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
}
}

+ 18
- 0
crypto_sign/mqdss-48/avx2/mq.h Ver fichero

@@ -0,0 +1,18 @@
#ifndef MQDSS_MQ_H
#define MQDSS_MQ_H

#include "gf31.h"

/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
in reduced 5-bit representation). Expects the coefficients in F to be in
signed representation (i.e. [-15, 15], packed bytewise).
Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS48_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F);

/* Evaluates the bilinear polar form of the MQ function (i.e. G) on a vector of
N gf31 elements x (expected to be in reduced 5-bit representation). Expects
the coefficients in F to be in signed representation (i.e. [-15, 15], packed
bytewise). Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS48_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F);

#endif

+ 25
- 0
crypto_sign/mqdss-48/avx2/params.h Ver fichero

@@ -0,0 +1,25 @@
#ifndef MQDSS_PARAMS_H
#define MQDSS_PARAMS_H

#define N 48
#define M N
#define F_LEN (M * (((N * (N + 1)) >> 1) + N)) /* Number of elements in F */

#define ROUNDS 184

/* Number of bytes that N, M and F_LEN elements require when packed into a byte
array, 5-bit elements packed continuously. */
/* Assumes N and M to be multiples of 8 */
#define NPACKED_BYTES ((N * 5) >> 3)
#define MPACKED_BYTES ((M * 5) >> 3)
#define FPACKED_BYTES ((F_LEN * 5) >> 3)

#define HASH_BYTES 32
#define SEED_BYTES 16
#define PK_BYTES (SEED_BYTES + MPACKED_BYTES)
#define SK_BYTES SEED_BYTES

// R, sigma_0, ROUNDS * (t1, r{0,1}, e1, c, rho)
#define SIG_LEN (2 * HASH_BYTES + ROUNDS * (2*NPACKED_BYTES + MPACKED_BYTES + HASH_BYTES + HASH_BYTES))

#endif

+ 389
- 0
crypto_sign/mqdss-48/avx2/sign.c Ver fichero

@@ -0,0 +1,389 @@
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#include "api.h"
#include "fips202.h"
#include "gf31.h"
#include "mq.h"
#include "params.h"
#include "randombytes.h"

/* Takes an array of len bytes and computes a hash digest.
This is used as a hash function in the Fiat-Shamir transform. */
static void H(unsigned char *out, const unsigned char *in, const size_t len) {
shake256(out, HASH_BYTES, in, len);
}

/* Takes two arrays of N packed elements and an array of M packed elements,
and computes a HASH_BYTES commitment. */
static void com_0(unsigned char *c,
const unsigned char *rho,
const unsigned char *inn, const unsigned char *inn2,
const unsigned char *inm) {
unsigned char buffer[HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES];
memcpy(buffer, rho, HASH_BYTES);
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inn2, NPACKED_BYTES);
memcpy(buffer + HASH_BYTES + 2 * NPACKED_BYTES, inm, MPACKED_BYTES);
shake256(c, HASH_BYTES, buffer, HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES);
}

/* Takes an array of N packed elements and an array of M packed elements,
and computes a HASH_BYTES commitment. */
static void com_1(unsigned char *c,
const unsigned char *rho,
const unsigned char *inn, const unsigned char *inm) {
unsigned char buffer[HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES];
memcpy(buffer, rho, HASH_BYTES);
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inm, MPACKED_BYTES);
shake256(c, HASH_BYTES, buffer, HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES);
}

/*
* Generates an MQDSS key pair.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {
signed char F[F_LEN];
unsigned char skbuf[SEED_BYTES * 2];
gf31 sk_gf31[N];
gf31 pk_gf31[M];

// Expand sk to obtain a seed for F and the secret input s.
// We also expand to obtain a value for sampling r0, t0 and e0 during
// signature generation, but that is not relevant here.
randombytes(sk, SEED_BYTES);
shake256(skbuf, SEED_BYTES * 2, sk, SEED_BYTES);

memcpy(pk, skbuf, SEED_BYTES);
PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
PQCLEAN_MQDSS48_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
PQCLEAN_MQDSS48_AVX2_MQ(pk_gf31, sk_gf31, F);
PQCLEAN_MQDSS48_AVX2_vgf31_unique(pk_gf31, pk_gf31);
PQCLEAN_MQDSS48_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);

return 0;
}

/**
* Returns an array containing a detached signature.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_signature(
uint8_t *sig, size_t *siglen,
const uint8_t *m, size_t mlen, const uint8_t *sk) {

signed char F[F_LEN];
unsigned char skbuf[SEED_BYTES * 4];
gf31 pk_gf31[M];
unsigned char pk[SEED_BYTES + MPACKED_BYTES];
// Concatenated for convenient hashing.
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
unsigned char *D = D_sigma0_h0_sigma1;
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
shake256ctx shakestate;
unsigned char shakeblock[SHAKE256_RATE];
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
unsigned char rnd_seed[HASH_BYTES + SEED_BYTES];
unsigned char rho[2 * ROUNDS * HASH_BYTES];
unsigned char *rho0 = rho;
unsigned char *rho1 = rho + ROUNDS * HASH_BYTES;
gf31 sk_gf31[N];
gf31 rnd[(2 * N + M) * ROUNDS]; // Concatenated for easy RNG.
gf31 *r0 = rnd;
gf31 *t0 = rnd + N * ROUNDS;
gf31 *e0 = rnd + 2 * N * ROUNDS;
gf31 r1[N * ROUNDS];
gf31 t1[N * ROUNDS];
gf31 e1[M * ROUNDS];
gf31 gx[M * ROUNDS];
unsigned char packbuf0[NPACKED_BYTES];
unsigned char packbuf1[NPACKED_BYTES];
unsigned char packbuf2[MPACKED_BYTES];
unsigned char c[HASH_BYTES * ROUNDS * 2];
gf31 alpha;
int alpha_count = 0;
int b;
int i, j;
shake256incctx state;

shake256(skbuf, SEED_BYTES * 4, sk, SEED_BYTES);

PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(F, F_LEN, skbuf, SEED_BYTES);

shake256_inc_init(&state);
shake256_inc_absorb(&state, sk, SEED_BYTES);
shake256_inc_absorb(&state, m, mlen);
shake256_inc_finalize(&state);
shake256_inc_squeeze(sig, HASH_BYTES, &state); // Compute R.
shake256_inc_ctx_release(&state);

memcpy(pk, skbuf, SEED_BYTES);
PQCLEAN_MQDSS48_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
PQCLEAN_MQDSS48_AVX2_MQ(pk_gf31, sk_gf31, F);
PQCLEAN_MQDSS48_AVX2_vgf31_unique(pk_gf31, pk_gf31);
PQCLEAN_MQDSS48_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);

shake256_inc_init(&state);
shake256_inc_absorb(&state, pk, PK_BYTES);
shake256_inc_absorb(&state, sig, HASH_BYTES);
shake256_inc_absorb(&state, m, mlen);
shake256_inc_finalize(&state);
shake256_inc_squeeze(D, HASH_BYTES, &state);
shake256_inc_ctx_release(&state);

sig += HASH_BYTES; // Compensate for prefixed R.

memcpy(rnd_seed, skbuf + 2 * SEED_BYTES, SEED_BYTES);
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
shake256(rho, 2 * ROUNDS * HASH_BYTES, rnd_seed, SEED_BYTES + HASH_BYTES);

memcpy(rnd_seed, skbuf + 3 * SEED_BYTES, SEED_BYTES);
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
PQCLEAN_MQDSS48_AVX2_gf31_nrand(rnd, (2 * N + M) * ROUNDS, rnd_seed, SEED_BYTES + HASH_BYTES);

for (i = 0; i < ROUNDS; i++) {
for (j = 0; j < N; j++) {
r1[j + i * N] = (gf31)(31 + sk_gf31[j] - r0[j + i * N]);
}
PQCLEAN_MQDSS48_AVX2_G(gx + i * M, t0 + i * N, r1 + i * N, F);
}
for (i = 0; i < ROUNDS * M; i++) {
gx[i] = (gf31)(gx[i] + e0[i]);
}
for (i = 0; i < ROUNDS; i++) {
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, r0 + i * N, N);
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf1, t0 + i * N, N);
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf2, e0 + i * M, M);
com_0(c + HASH_BYTES * (2 * i + 0), rho0 + i * HASH_BYTES, packbuf0, packbuf1, packbuf2);
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(r1 + i * N, r1 + i * N);
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(gx + i * M, gx + i * M);
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, r1 + i * N, N);
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf1, gx + i * M, M);
com_1(c + HASH_BYTES * (2 * i + 1), rho1 + i * HASH_BYTES, packbuf0, packbuf1);
}

H(sigma0, c, HASH_BYTES * ROUNDS * 2); // Compute sigma_0.
shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
shake256_squeezeblocks(shakeblock, 1, &shakestate);

memcpy(h0, shakeblock, HASH_BYTES);

memcpy(sig, sigma0, HASH_BYTES);
sig += HASH_BYTES; // Compensate for sigma_0.

for (i = 0; i < ROUNDS; i++) {
do {
alpha = shakeblock[alpha_count] & 31;
alpha_count++;
if (alpha_count == SHAKE256_RATE) {
alpha_count = 0;
shake256_squeezeblocks(shakeblock, 1, &shakestate);
}
} while (alpha == 31);
for (j = 0; j < N; j++) {
t1[i * N + j] = (gf31)(alpha * r0[j + i * N] - t0[j + i * N] + 31);
}
PQCLEAN_MQDSS48_AVX2_MQ(e1 + i * M, r0 + i * N, F);
for (j = 0; j < N; j++) {
e1[i * N + j] = (gf31)(alpha * e1[j + i * M] - e0[j + i * M] + 31);
}
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(t1 + i * N, t1 + i * N);
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(e1 + i * N, e1 + i * N);
}
shake256_ctx_release(&shakestate);

PQCLEAN_MQDSS48_AVX2_gf31_npack(t1packed, t1, N * ROUNDS);
PQCLEAN_MQDSS48_AVX2_gf31_npack(e1packed, e1, M * ROUNDS);

memcpy(sig, t1packed, NPACKED_BYTES * ROUNDS);
sig += NPACKED_BYTES * ROUNDS;
memcpy(sig, e1packed, MPACKED_BYTES * ROUNDS);
sig += MPACKED_BYTES * ROUNDS;

shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));

for (i = 0; i < ROUNDS; i++) {
b = (h1[(i >> 3)] >> (i & 7)) & 1;
if (b == 0) {
PQCLEAN_MQDSS48_AVX2_gf31_npack(sig, r0 + i * N, N);
} else if (b == 1) {
PQCLEAN_MQDSS48_AVX2_gf31_npack(sig, r1 + i * N, N);
}
memcpy(sig + NPACKED_BYTES, c + HASH_BYTES * (2 * i + (1 - b)), HASH_BYTES);
memcpy(sig + NPACKED_BYTES + HASH_BYTES, rho + (i + b * ROUNDS) * HASH_BYTES, HASH_BYTES);
sig += NPACKED_BYTES + 2 * HASH_BYTES;
}

*siglen = SIG_LEN;
return 0;
}

/**
* Verifies a detached signature and message under a given public key.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_verify(
const uint8_t *sig, size_t siglen,
const uint8_t *m, size_t mlen, const uint8_t *pk) {

gf31 r[N];
gf31 t[N];
gf31 e[M];
signed char F[F_LEN];
gf31 pk_gf31[M];
// Concatenated for convenient hashing.
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
unsigned char *D = D_sigma0_h0_sigma1;
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
unsigned char c[HASH_BYTES * ROUNDS * 2];
memset(c, 0, HASH_BYTES * 2);
gf31 x[N];
gf31 y[M];
gf31 z[M];
unsigned char packbuf0[NPACKED_BYTES];
unsigned char packbuf1[MPACKED_BYTES];
shake256ctx shakestate;
unsigned char shakeblock[SHAKE256_RATE];
int i, j;
gf31 alpha;
int alpha_count = 0;
int b;
shake256incctx state;

if (siglen != SIG_LEN) {
return -1;
}

shake256_inc_init(&state);
shake256_inc_absorb(&state, pk, PK_BYTES);
shake256_inc_absorb(&state, sig, HASH_BYTES);
shake256_inc_absorb(&state, m, mlen);
shake256_inc_finalize(&state);
shake256_inc_squeeze(D, HASH_BYTES, &state);
shake256_inc_ctx_release(&state);

sig += HASH_BYTES;

PQCLEAN_MQDSS48_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
pk += SEED_BYTES;
PQCLEAN_MQDSS48_AVX2_gf31_nunpack(pk_gf31, pk, M);

memcpy(sigma0, sig, HASH_BYTES);

shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
shake256_squeezeblocks(shakeblock, 1, &shakestate);

memcpy(h0, shakeblock, HASH_BYTES);

sig += HASH_BYTES;

memcpy(t1packed, sig, ROUNDS * NPACKED_BYTES);
sig += ROUNDS * NPACKED_BYTES;
memcpy(e1packed, sig, ROUNDS * MPACKED_BYTES);
sig += ROUNDS * MPACKED_BYTES;

shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));

for (i = 0; i < ROUNDS; i++) {
do {
alpha = shakeblock[alpha_count] & 31;
alpha_count++;
if (alpha_count == SHAKE256_RATE) {
alpha_count = 0;
shake256_squeezeblocks(shakeblock, 1, &shakestate);
}
} while (alpha == 31);
b = (h1[(i >> 3)] >> (i & 7)) & 1;

PQCLEAN_MQDSS48_AVX2_gf31_nunpack(r, sig, N);
PQCLEAN_MQDSS48_AVX2_gf31_nunpack(t, t1packed + NPACKED_BYTES * i, N);
PQCLEAN_MQDSS48_AVX2_gf31_nunpack(e, e1packed + MPACKED_BYTES * i, M);

if (b == 0) {
PQCLEAN_MQDSS48_AVX2_MQ(y, r, F);
for (j = 0; j < N; j++) {
x[j] = (gf31)(alpha * r[j] - t[j] + 31);
}
for (j = 0; j < N; j++) {
y[j] = (gf31)(alpha * y[j] - e[j] + 31);
}
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(x, x);
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(y, y);
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, x, N);
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf1, y, M);
com_0(c + HASH_BYTES * (2 * i + 0), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0, packbuf1);
} else {
PQCLEAN_MQDSS48_AVX2_MQ(y, r, F);
PQCLEAN_MQDSS48_AVX2_G(z, t, r, F);
for (j = 0; j < N; j++) {
y[j] = (gf31)(alpha * (31 + pk_gf31[j] - y[j]) - z[j] - e[j] + 62);
}
PQCLEAN_MQDSS48_AVX2_vgf31_shorten_unique(y, y);
PQCLEAN_MQDSS48_AVX2_gf31_npack(packbuf0, y, M);
com_1(c + HASH_BYTES * (2 * i + 1), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0);
}
memcpy(c + HASH_BYTES * (2 * i + (1 - b)), sig + NPACKED_BYTES, HASH_BYTES);
sig += NPACKED_BYTES + 2 * HASH_BYTES;
}
shake256_ctx_release(&shakestate);

H(c, c, HASH_BYTES * ROUNDS * 2);
if (memcmp(c, sigma0, HASH_BYTES) != 0) {
return -1;
}

return 0;
}

/**
* Returns an array containing the signature followed by the message.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign(
uint8_t *sm, size_t *smlen,
const uint8_t *m, size_t mlen, const uint8_t *sk) {
size_t siglen;

PQCLEAN_MQDSS48_AVX2_crypto_sign_signature(
sm, &siglen, m, mlen, sk);

memmove(sm + SIG_LEN, m, mlen);
*smlen = siglen + mlen;

return 0;
}

/**
* Verifies a given signature-message pair under a given public key.
*/
int PQCLEAN_MQDSS48_AVX2_crypto_sign_open(
uint8_t *m, size_t *mlen,
const uint8_t *sm, size_t smlen, const uint8_t *pk) {
/* The API caller does not necessarily know what size a signature should be
but MQDSS signatures are always exactly SIG_LEN. */
if (smlen < SIG_LEN) {
memset(m, 0, smlen);
*mlen = 0;
return -1;
}

*mlen = smlen - SIG_LEN;

if (PQCLEAN_MQDSS48_AVX2_crypto_sign_verify(
sm, SIG_LEN, sm + SIG_LEN, *mlen, pk)) {
memset(m, 0, smlen);
*mlen = 0;
return -1;
}

/* If verification was successful, move the message to the right place. */
memmove(m, sm + SIG_LEN, *mlen);

return 0;
}

+ 0
- 1
crypto_sign/mqdss-48/clean/sign.c Ver fichero

@@ -1,4 +1,3 @@
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>


+ 9
- 0
crypto_sign/mqdss-64/META.yml Ver fichero

@@ -16,3 +16,12 @@ auxiliary-submitters:
implementations:
- name: clean
version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
- name: avx2
version: https://github.com/joostrijneveld/MQDSS/commit/00608d7610262ff07b1834885d32bc3fd27ef5e1
supported_platforms:
- architecture: x86_64
required_flags:
- avx2
- architecture: x86
required_flags:
- avx2

+ 116
- 0
crypto_sign/mqdss-64/avx2/LICENSE Ver fichero

@@ -0,0 +1,116 @@
CC0 1.0 Universal

Statement of Purpose

The laws of most jurisdictions throughout the world automatically confer
exclusive Copyright and Related Rights (defined below) upon the creator and
subsequent owner(s) (each and all, an "owner") of an original work of
authorship and/or a database (each, a "Work").

Certain owners wish to permanently relinquish those rights to a Work for the
purpose of contributing to a commons of creative, cultural and scientific
works ("Commons") that the public can reliably and without fear of later
claims of infringement build upon, modify, incorporate in other works, reuse
and redistribute as freely as possible in any form whatsoever and for any
purposes, including without limitation commercial purposes. These owners may
contribute to the Commons to promote the ideal of a free culture and the
further production of creative, cultural and scientific works, or to gain
reputation or greater distribution for their Work in part through the use and
efforts of others.

For these and/or other purposes and motivations, and without any expectation
of additional consideration or compensation, the person associating CC0 with a
Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
and publicly distribute the Work under its terms, with knowledge of his or her
Copyright and Related Rights in the Work and the meaning and intended legal
effect of CC0 on those rights.

1. Copyright and Related Rights. A Work made available under CC0 may be
protected by copyright and related or neighboring rights ("Copyright and
Related Rights"). Copyright and Related Rights include, but are not limited
to, the following:

i. the right to reproduce, adapt, distribute, perform, display, communicate,
and translate a Work;

ii. moral rights retained by the original author(s) and/or performer(s);

iii. publicity and privacy rights pertaining to a person's image or likeness
depicted in a Work;

iv. rights protecting against unfair competition in regards to a Work,
subject to the limitations in paragraph 4(a), below;

v. rights protecting the extraction, dissemination, use and reuse of data in
a Work;

vi. database rights (such as those arising under Directive 96/9/EC of the
European Parliament and of the Council of 11 March 1996 on the legal
protection of databases, and under any national implementation thereof,
including any amended or successor version of such directive); and

vii. other similar, equivalent or corresponding rights throughout the world
based on applicable law or treaty, and any national implementations thereof.

2. Waiver. To the greatest extent permitted by, but not in contravention of,
applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
and Related Rights and associated claims and causes of action, whether now
known or unknown (including existing as well as future claims and causes of
action), in the Work (i) in all territories worldwide, (ii) for the maximum
duration provided by applicable law or treaty (including future time
extensions), (iii) in any current or future medium and for any number of
copies, and (iv) for any purpose whatsoever, including without limitation
commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
the Waiver for the benefit of each member of the public at large and to the
detriment of Affirmer's heirs and successors, fully intending that such Waiver
shall not be subject to revocation, rescission, cancellation, termination, or
any other legal or equitable action to disrupt the quiet enjoyment of the Work
by the public as contemplated by Affirmer's express Statement of Purpose.

3. Public License Fallback. Should any part of the Waiver for any reason be
judged legally invalid or ineffective under applicable law, then the Waiver
shall be preserved to the maximum extent permitted taking into account
Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
is so judged Affirmer hereby grants to each affected person a royalty-free,
non transferable, non sublicensable, non exclusive, irrevocable and
unconditional license to exercise Affirmer's Copyright and Related Rights in
the Work (i) in all territories worldwide, (ii) for the maximum duration
provided by applicable law or treaty (including future time extensions), (iii)
in any current or future medium and for any number of copies, and (iv) for any
purpose whatsoever, including without limitation commercial, advertising or
promotional purposes (the "License"). The License shall be deemed effective as
of the date CC0 was applied by Affirmer to the Work. Should any part of the
License for any reason be judged legally invalid or ineffective under
applicable law, such partial invalidity or ineffectiveness shall not
invalidate the remainder of the License, and in such case Affirmer hereby
affirms that he or she will not (i) exercise any of his or her remaining
Copyright and Related Rights in the Work or (ii) assert any associated claims
and causes of action with respect to the Work, in either case contrary to
Affirmer's express Statement of Purpose.

4. Limitations and Disclaimers.

a. No trademark or patent rights held by Affirmer are waived, abandoned,
surrendered, licensed or otherwise affected by this document.

b. Affirmer offers the Work as-is and makes no representations or warranties
of any kind concerning the Work, express, implied, statutory or otherwise,
including without limitation warranties of title, merchantability, fitness
for a particular purpose, non infringement, or the absence of latent or
other defects, accuracy, or the present or absence of errors, whether or not
discoverable, all to the greatest extent permissible under applicable law.

c. Affirmer disclaims responsibility for clearing rights of other persons
that may apply to the Work or any use thereof, including without limitation
any person's Copyright and Related Rights in the Work. Further, Affirmer
disclaims responsibility for obtaining any necessary consents, permissions
or other rights required for any use of the Work.

d. Affirmer understands and acknowledges that Creative Commons is not a
party to this document and has no duty or obligation with respect to this
CC0 or use of the Work.

For more information, please see
<http://creativecommons.org/publicdomain/zero/1.0/>

+ 22
- 0
crypto_sign/mqdss-64/avx2/Makefile Ver fichero

@@ -0,0 +1,22 @@
# This Makefile can be used with GNU Make or BSD Make

LIB=libmqdss-64_avx2.a

HEADERS = params.h gf31.h mq.h api.h
OBJECTS = gf31.o mq.o sign.o

CFLAGS=-O3 -Wall -Wconversion -Wextra -Wpedantic -Wvla -Werror \
-Wmissing-prototypes -Wredundant-decls -std=c99 -mavx2 \
-I../../../common $(EXTRAFLAGS)

all: $(LIB)

%.o: %.c $(HEADERS)
$(CC) $(CFLAGS) -c -o $@ $<

$(LIB): $(OBJECTS)
$(AR) -r $@ $(OBJECTS)

clean:
$(RM) $(OBJECTS)
$(RM) $(LIB)

+ 19
- 0
crypto_sign/mqdss-64/avx2/Makefile.Microsoft_nmake Ver fichero

@@ -0,0 +1,19 @@
# This Makefile can be used with Microsoft Visual Studio's nmake using the command:
# nmake /f Makefile.Microsoft_nmake

LIBRARY=libmqdss-64_clean.lib
OBJECTS=gf31.obj mq.obj sign.obj

CFLAGS=/nologo /O2 /I ..\..\..\common /W4 /WX /arch:AVX2

all: $(LIBRARY)

# Make sure objects are recompiled if headers change.
$(OBJECTS): *.h

$(LIBRARY): $(OBJECTS)
LIB.EXE /NOLOGO /WX /OUT:$@ $**

clean:
-DEL $(OBJECTS)
-DEL $(LIBRARY)

+ 47
- 0
crypto_sign/mqdss-64/avx2/api.h Ver fichero

@@ -0,0 +1,47 @@
#ifndef PQCLEAN_MQDSS64_AVX2_API_H
#define PQCLEAN_MQDSS64_AVX2_API_H

#include <stddef.h>
#include <stdint.h>

#define PQCLEAN_MQDSS64_AVX2_CRYPTO_ALGNAME "MQDSS-64"

#define PQCLEAN_MQDSS64_AVX2_CRYPTO_SECRETKEYBYTES 24
#define PQCLEAN_MQDSS64_AVX2_CRYPTO_PUBLICKEYBYTES 64
#define PQCLEAN_MQDSS64_AVX2_CRYPTO_BYTES 59928

/*
* Generates an MQDSS key pair.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_keypair(
uint8_t *pk, uint8_t *sk);

/**
* Returns an array containing a detached signature.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_signature(
uint8_t *sig, size_t *siglen,
const uint8_t *m, size_t mlen, const uint8_t *sk);

/**
* Verifies a detached signature and message under a given public key.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_verify(
const uint8_t *sig, size_t siglen,
const uint8_t *m, size_t mlen, const uint8_t *pk);

/**
* Returns an array containing the signature followed by the message.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign(
uint8_t *sm, size_t *smlen,
const uint8_t *m, size_t mlen, const uint8_t *sk);

/**
* Verifies a given signature-message pair under a given public key.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_open(
uint8_t *m, size_t *mlen,
const uint8_t *sm, size_t smlen, const uint8_t *pk);

#endif

+ 128
- 0
crypto_sign/mqdss-64/avx2/gf31.c Ver fichero

@@ -0,0 +1,128 @@
#include "params.h"
#include "fips202.h"
#include "gf31.h"
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Given a vector of N elements in the range [0, 31], this reduces the elements
to the range [0, 30] by mapping 31 to 0 (i.e reduction mod 31) */
void PQCLEAN_MQDSS64_AVX2_vgf31_unique(gf31 *out, gf31 *in) {
__m256i x;
__m256i _w31 = _mm256_set1_epi16(31);
int i;

for (i = 0; i < (N >> 4); ++i) {
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
}
}

/* This function acts on vectors with 64 gf31 elements.
It performs one reduction step and guarantees output in [0, 30],
but requires input to be in [0, 32768). */
void PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in) {
__m256i x;
__m256i _w2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
__m256i _w31 = _mm256_set1_epi16(31);
int i;

for (i = 0; i < (N >> 4); ++i) {
x = _mm256_loadu_si256((__m256i const *) (in + 16 * i));
x = _mm256_sub_epi16(x, _mm256_mullo_epi16(_w31, _mm256_mulhi_epi16(x, _w2114)));
x = _mm256_xor_si256(x, _mm256_and_si256(_w31, _mm256_cmpeq_epi16(x, _w31)));
_mm256_storeu_si256((__m256i *)(out + i * 16), x);
}
}

/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
them in a vector of 16-bit elements */
void PQCLEAN_MQDSS64_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen) {
size_t i = 0, j;
shake256ctx shakestate;
uint8_t shakeblock[SHAKE256_RATE];

shake256_absorb(&shakestate, seed, seedlen);

while (i < len) {
shake256_squeezeblocks(shakeblock, 1, &shakestate);
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
if ((shakeblock[j] & 31) != 31) {
out[i] = (shakeblock[j] & 31);
i++;
}
}
}
shake256_ctx_release(&shakestate);
}

/* Given a seed, samples len gf31 elements, transposed into unsigned range,
i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
This is used for the expansion of F, which wants packed elements. */
void PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen) {
size_t i = 0, j;
shake256ctx shakestate;
uint8_t shakeblock[SHAKE256_RATE];

shake256_absorb(&shakestate, seed, seedlen);

while (i < len) {
shake256_squeezeblocks(shakeblock, 1, &shakestate);
for (j = 0; j < SHAKE256_RATE && i < len; j++) {
if ((shakeblock[j] & 31) != 31) {
out[i] = (signed char)((shakeblock[j] & 31) - 15);
i++;
}
}
}
shake256_ctx_release(&shakestate);

}

/* Unpacks an array of packed GF31 elements to one element per gf31.
Assumes that there is sufficient empty space available at the end of the
array to unpack. Can perform in-place. */
void PQCLEAN_MQDSS64_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n) {
size_t i;
size_t j = ((n * 5) >> 3) - 1;
unsigned int d = 0;

for (i = n; i > 0; i--) {
out[i - 1] = (gf31)((in[j] >> d) & 31);
d += 5;
if (d > 8) {
d -= 8;
j--;
out[i - 1] = (gf31)(out[i - 1] ^ ((in[j] << (5 - d)) & 31));
}
}
}

/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
Assumes that there is sufficient space available to unpack.
Can perform in-place. */
void PQCLEAN_MQDSS64_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n) {
unsigned int i = 0;
unsigned int j;
int d = 3;

for (j = 0; j < n; j++) {
assert(in[j] < 31);
}

/* There will be ceil(5n / 8) output blocks */
memset(out, 0, (size_t)((5 * n + 7) & ~7U) >> 3);

for (j = 0; j < n; j++) {
if (d < 0) {
d += 8;
out[i] = (uint8_t)((out[i] & (255 << (d - 3))) |
((in[j] >> (8 - d)) & ~(255 << (d - 3))));
i++;
}
out[i] = (uint8_t)((out[i] & ~(31 << d)) | ((in[j] << d) & (31 << d)));
d -= 5;
}
}

+ 36
- 0
crypto_sign/mqdss-64/avx2/gf31.h Ver fichero

@@ -0,0 +1,36 @@
#ifndef MQDSS_GF31_H
#define MQDSS_GF31_H

#include <stddef.h>
#include <stdint.h>

typedef unsigned short gf31;

/* Given a vector of elements in the range [0, 31], this reduces the elements
to the range [0, 30] by mapping 31 to 0 (i.e reduction mod 31) */
void PQCLEAN_MQDSS64_AVX2_vgf31_unique(gf31 *out, gf31 *in);

/* Given a vector of 16-bit integers (i.e. in [0, 65535], this reduces the
elements to the range [0, 30] by mapping 31 to 0 (i.e reduction mod 31) */
void PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(gf31 *out, gf31 *in);

/* Given a seed, samples len gf31 elements (in the range [0, 30]), and places
them in a vector of 16-bit elements */
void PQCLEAN_MQDSS64_AVX2_gf31_nrand(gf31 *out, size_t len, const uint8_t *seed, size_t seedlen);

/* Given a seed, samples len gf31 elements, transposed into unsigned range,
i.e. in the range [-15, 15], and places them in an array of 8-bit integers.
This is used for the expansion of F, which wants packed elements. */
void PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(signed char *out, size_t len, const uint8_t *seed, size_t seedlen);

/* Unpacks an array of packed GF31 elements to one element per gf31.
Assumes that there is sufficient empty space available at the end of the
array to unpack. Can perform in-place. */
void PQCLEAN_MQDSS64_AVX2_gf31_nunpack(gf31 *out, const uint8_t *in, size_t n);

/* Packs an array of GF31 elements from gf31's to concatenated 5-bit values.
Assumes that there is sufficient space available to unpack.
Can perform in-place. */
void PQCLEAN_MQDSS64_AVX2_gf31_npack(uint8_t *out, const gf31 *in, size_t n);

#endif

+ 239
- 0
crypto_sign/mqdss-64/avx2/mq.c Ver fichero

@@ -0,0 +1,239 @@
#include "mq.h"
#include "params.h"
#include <immintrin.h>
#include <stdio.h>

static inline __m256i reduce_16(__m256i r, __m256i _w31, __m256i _w2114) {
__m256i exp = _mm256_mulhi_epi16(r, _w2114);
return _mm256_sub_epi16(r, _mm256_mullo_epi16(_w31, exp));
}

/* Computes all products x_i * x_j, returns in reduced form */
inline static
void generate_quadratic_terms( unsigned char *xij, const gf31 *x ) {
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
__m256i mask_31 = _mm256_set1_epi16( 31 );
__m256i xi[4];
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
xi[3] = _mm256_loadu_si256((__m256i const *) (x + 48));

__m256i xixj[4];
xixj[0] = _mm256_setzero_si256();
xixj[1] = _mm256_setzero_si256();
xixj[2] = _mm256_setzero_si256();
xixj[3] = _mm256_setzero_si256();

int k = 0;
for (int i = 0; i < 32; i++) {
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
k += i + 1;
}

for (int i = 32; i < N; i++) {
__m256i br_xi = _mm256_set1_epi16( (short)x[i] );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_mullo_epi16( xi[j], br_xi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
k += i + 1;
}
}

/* Computes all terms (x_i * y_j) + (x_j * y_i), returns in reduced form */
inline static
void generate_xiyj_p_xjyi_terms( unsigned char *xij, const gf31 *x, const gf31 *y ) {
__m256i mask_2114 = _mm256_set1_epi16( 2114 );
__m256i mask_31 = _mm256_set1_epi16( 31 );
__m256i xiyi[4];
xiyi[0] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y)), 1 ));
xiyi[1] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 16)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 16)), 1 ));
xiyi[2] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 32)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 32)), 1 ));
xiyi[3] = _mm256_xor_si256(_mm256_loadu_si256((__m256i const *) (x + 48)), _mm256_slli_si256( _mm256_loadu_si256((__m256i const *) (y + 48)), 1 ));

__m256i xixj[4];
xixj[0] = _mm256_setzero_si256();
xixj[1] = _mm256_setzero_si256();
xixj[2] = _mm256_setzero_si256();
xixj[3] = _mm256_setzero_si256();

int k = 0;
for (int i = 0; i < 32; i++) {
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r = _mm256_packs_epi16(xixj[0], xixj[1]);
r = _mm256_permute4x64_epi64(r, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r );
k += i + 1;
}

for (int i = 32; i < N; i++) {
__m256i br_yixi = _mm256_set1_epi16( (short)((x[i] << 8)^y[i]) );
for (int j = 0; j <= (i >> 4); j++) {
xixj[j] = _mm256_maddubs_epi16( xiyi[j], br_yixi );
xixj[j] = reduce_16( xixj[j], mask_31, mask_2114 );
}

__m256i r0 = _mm256_packs_epi16(xixj[0], xixj[1]);
r0 = _mm256_permute4x64_epi64(r0, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + k ), r0 );
__m256i r1 = _mm256_packs_epi16(xixj[2], xixj[3]);
r1 = _mm256_permute4x64_epi64(r1, 0xd8); // 3,1,2,0
_mm256_storeu_si256( (__m256i *)( xij + 32 + k ), r1 );
k += i + 1;
}
}

#define EVAL_YMM_0(xx) {\
__m128i tmp = _mm256_castsi256_si128(xx); \
for (int macro_i = 0; macro_i < 8; macro_i++) { \
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
tmp = _mm_srli_si128(tmp, 2); \
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
F += 32; \
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
} \
} \
}

#define EVAL_YMM_1(xx) {\
__m128i tmp = _mm256_extracti128_si256(xx, 1); \
for (int macro_i = 0; macro_i < 8; macro_i++) { \
__m256i _xi = _mm256_broadcastw_epi16(tmp); \
tmp = _mm_srli_si128(tmp, 2); \
for (int macro_j = 0; macro_j < (N/16); macro_j++) { \
__m256i coeff = _mm256_loadu_si256((__m256i const *) F); \
F += 32; \
yy[macro_j] = _mm256_add_epi16(yy[macro_j], _mm256_maddubs_epi16(_xi, coeff)); \
} \
} \
}

#define REDUCE_(yy) { \
(yy)[0] = reduce_16((yy)[0], mask_reduce, mask_2114); \
(yy)[1] = reduce_16((yy)[1], mask_reduce, mask_2114); \
(yy)[2] = reduce_16((yy)[2], mask_reduce, mask_2114); \
(yy)[3] = reduce_16((yy)[3], mask_reduce, mask_2114); \
}


/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
in reduced 5-bit representation). Expects the coefficients in F to be in
signed representation (i.e. [-15, 15], packed bytewise).
Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS64_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F) {
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);

__m256i xi[4];
xi[0] = _mm256_loadu_si256((__m256i const *) (x));
xi[1] = _mm256_loadu_si256((__m256i const *) (x + 16));
xi[2] = _mm256_loadu_si256((__m256i const *) (x + 32));
xi[3] = _mm256_loadu_si256((__m256i const *) (x + 48));

__m256i _zero = _mm256_setzero_si256();
xi[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[0])), xi[0]);
xi[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[1])), xi[1]);
xi[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[2])), xi[2]);
xi[3] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_zero, xi[3])), xi[3]);

__m256i x1 = _mm256_packs_epi16(xi[0], xi[1]);
x1 = _mm256_permute4x64_epi64(x1, 0xd8); // 3,1,2,0
__m256i x2 = _mm256_packs_epi16(xi[2], xi[3]);
x2 = _mm256_permute4x64_epi64(x2, 0xd8); // 3,1,2,0

__m256i yy[M / 16];
yy[0] = _zero;
yy[1] = _zero;
yy[2] = _zero;
yy[3] = _zero;

EVAL_YMM_0(x1)
EVAL_YMM_1(x1)
EVAL_YMM_0(x2)
EVAL_YMM_1(x2)
REDUCE_(yy)

__m256i xixj[65];
generate_quadratic_terms( (unsigned char *) xixj, x );
for (int i = 0 ; i < 64 ; i += 2) {
EVAL_YMM_0(xixj[i])
EVAL_YMM_1(xixj[i])
EVAL_YMM_0(xixj[i + 1])
EVAL_YMM_1(xixj[i + 1])
REDUCE_(yy)
}
EVAL_YMM_0(xixj[64])
EVAL_YMM_1(xixj[64])
REDUCE_(yy)

yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);
yy[3] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[3])), yy[3]);

for (int i = 0; i < (N / 16); ++i) {
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
}
}

/* Evaluates the bilinear polar form of the MQ function (i.e. G) on a vector of
N gf31 elements x (expected to be in reduced 5-bit representation). Expects
the coefficients in F to be in signed representation (i.e. [-15, 15], packed
bytewise). Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS64_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F) {
__m256i mask_2114 = _mm256_set1_epi32(2114 * 65536 + 2114);
__m256i mask_reduce = _mm256_srli_epi16(_mm256_cmpeq_epi16(mask_2114, mask_2114), 11);
__m256i _zero = _mm256_setzero_si256();

__m256i yy[(M / 16)];
yy[0] = _zero;
yy[1] = _zero;
yy[2] = _zero;
yy[3] = _zero;

F += N * M;

__m256i xixj[65];
generate_xiyj_p_xjyi_terms( (unsigned char *) xixj, x, y );
for (int i = 0 ; i < 64 ; i += 2) {
EVAL_YMM_0(xixj[i])
EVAL_YMM_1(xixj[i])
EVAL_YMM_0(xixj[i + 1])
EVAL_YMM_1(xixj[i + 1])
REDUCE_(yy)
}
EVAL_YMM_0(xixj[64])
EVAL_YMM_1(xixj[64])
REDUCE_(yy)

yy[0] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[0])), yy[0]);
yy[1] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[1])), yy[1]);
yy[2] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[2])), yy[2]);
yy[3] = _mm256_add_epi16(_mm256_and_si256(mask_reduce, _mm256_cmpgt_epi16(_mm256_setzero_si256(), yy[3])), yy[3]);

for (int i = 0; i < (N / 16); ++i) {
_mm256_storeu_si256((__m256i *)(fx + i * 16), yy[i]);
}
}

+ 18
- 0
crypto_sign/mqdss-64/avx2/mq.h Ver fichero

@@ -0,0 +1,18 @@
#ifndef MQDSS_MQ_H
#define MQDSS_MQ_H

#include "gf31.h"

/* Evaluates the MQ function on a vector of N gf31 elements x (expected to be
in reduced 5-bit representation). Expects the coefficients in F to be in
signed representation (i.e. [-15, 15], packed bytewise).
Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS64_AVX2_MQ(gf31 *fx, const gf31 *x, const signed char *F);

/* Evaluates the bilinear polar form of the MQ function (i.e. G) on a vector of
N gf31 elements x (expected to be in reduced 5-bit representation). Expects
the coefficients in F to be in signed representation (i.e. [-15, 15], packed
bytewise). Outputs M gf31 elements in unique 16-bit representation as fx. */
void PQCLEAN_MQDSS64_AVX2_G(gf31 *fx, const gf31 *x, const gf31 *y, const signed char *F);

#endif

+ 25
- 0
crypto_sign/mqdss-64/avx2/params.h Ver fichero

@@ -0,0 +1,25 @@
#ifndef MQDSS_PARAMS_H
#define MQDSS_PARAMS_H

#define N 64
#define M N
#define F_LEN (M * (((N * (N + 1)) >> 1) + N)) /* Number of elements in F */

#define ROUNDS 277

/* Number of bytes that N, M and F_LEN elements require when packed into a byte
array, 5-bit elements packed continuously. */
/* Assumes N and M to be multiples of 8 */
#define NPACKED_BYTES ((N * 5) >> 3)
#define MPACKED_BYTES ((M * 5) >> 3)
#define FPACKED_BYTES ((F_LEN * 5) >> 3)

#define HASH_BYTES 48
#define SEED_BYTES 24
#define PK_BYTES (SEED_BYTES + MPACKED_BYTES)
#define SK_BYTES SEED_BYTES

// R, sigma_0, ROUNDS * (t1, r{0,1}, e1, c, rho)
#define SIG_LEN (2 * HASH_BYTES + ROUNDS * (2*NPACKED_BYTES + MPACKED_BYTES + HASH_BYTES + HASH_BYTES))

#endif

+ 389
- 0
crypto_sign/mqdss-64/avx2/sign.c Ver fichero

@@ -0,0 +1,389 @@
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#include "api.h"
#include "fips202.h"
#include "gf31.h"
#include "mq.h"
#include "params.h"
#include "randombytes.h"

/* Takes an array of len bytes and computes a hash digest.
This is used as a hash function in the Fiat-Shamir transform. */
static void H(unsigned char *out, const unsigned char *in, const size_t len) {
shake256(out, HASH_BYTES, in, len);
}

/* Takes two arrays of N packed elements and an array of M packed elements,
and computes a HASH_BYTES commitment. */
static void com_0(unsigned char *c,
const unsigned char *rho,
const unsigned char *inn, const unsigned char *inn2,
const unsigned char *inm) {
unsigned char buffer[HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES];
memcpy(buffer, rho, HASH_BYTES);
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inn2, NPACKED_BYTES);
memcpy(buffer + HASH_BYTES + 2 * NPACKED_BYTES, inm, MPACKED_BYTES);
shake256(c, HASH_BYTES, buffer, HASH_BYTES + 2 * NPACKED_BYTES + MPACKED_BYTES);
}

/* Takes an array of N packed elements and an array of M packed elements,
and computes a HASH_BYTES commitment. */
static void com_1(unsigned char *c,
const unsigned char *rho,
const unsigned char *inn, const unsigned char *inm) {
unsigned char buffer[HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES];
memcpy(buffer, rho, HASH_BYTES);
memcpy(buffer + HASH_BYTES, inn, NPACKED_BYTES);
memcpy(buffer + HASH_BYTES + NPACKED_BYTES, inm, MPACKED_BYTES);
shake256(c, HASH_BYTES, buffer, HASH_BYTES + NPACKED_BYTES + MPACKED_BYTES);
}

/*
* Generates an MQDSS key pair.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_keypair(uint8_t *pk, uint8_t *sk) {
signed char F[F_LEN];
unsigned char skbuf[SEED_BYTES * 2];
gf31 sk_gf31[N];
gf31 pk_gf31[M];

// Expand sk to obtain a seed for F and the secret input s.
// We also expand to obtain a value for sampling r0, t0 and e0 during
// signature generation, but that is not relevant here.
randombytes(sk, SEED_BYTES);
shake256(skbuf, SEED_BYTES * 2, sk, SEED_BYTES);

memcpy(pk, skbuf, SEED_BYTES);
PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
PQCLEAN_MQDSS64_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
PQCLEAN_MQDSS64_AVX2_MQ(pk_gf31, sk_gf31, F);
PQCLEAN_MQDSS64_AVX2_vgf31_unique(pk_gf31, pk_gf31);
PQCLEAN_MQDSS64_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);

return 0;
}

/**
* Returns an array containing a detached signature.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_signature(
uint8_t *sig, size_t *siglen,
const uint8_t *m, size_t mlen, const uint8_t *sk) {

signed char F[F_LEN];
unsigned char skbuf[SEED_BYTES * 4];
gf31 pk_gf31[M];
unsigned char pk[SEED_BYTES + MPACKED_BYTES];
// Concatenated for convenient hashing.
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
unsigned char *D = D_sigma0_h0_sigma1;
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
shake256ctx shakestate;
unsigned char shakeblock[SHAKE256_RATE];
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
unsigned char rnd_seed[HASH_BYTES + SEED_BYTES];
unsigned char rho[2 * ROUNDS * HASH_BYTES];
unsigned char *rho0 = rho;
unsigned char *rho1 = rho + ROUNDS * HASH_BYTES;
gf31 sk_gf31[N];
gf31 rnd[(2 * N + M) * ROUNDS]; // Concatenated for easy RNG.
gf31 *r0 = rnd;
gf31 *t0 = rnd + N * ROUNDS;
gf31 *e0 = rnd + 2 * N * ROUNDS;
gf31 r1[N * ROUNDS];
gf31 t1[N * ROUNDS];
gf31 e1[M * ROUNDS];
gf31 gx[M * ROUNDS];
unsigned char packbuf0[NPACKED_BYTES];
unsigned char packbuf1[NPACKED_BYTES];
unsigned char packbuf2[MPACKED_BYTES];
unsigned char c[HASH_BYTES * ROUNDS * 2];
gf31 alpha;
int alpha_count = 0;
int b;
int i, j;
shake256incctx state;

shake256(skbuf, SEED_BYTES * 4, sk, SEED_BYTES);

PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(F, F_LEN, skbuf, SEED_BYTES);

shake256_inc_init(&state);
shake256_inc_absorb(&state, sk, SEED_BYTES);
shake256_inc_absorb(&state, m, mlen);
shake256_inc_finalize(&state);
shake256_inc_squeeze(sig, HASH_BYTES, &state); // Compute R.
shake256_inc_ctx_release(&state);

memcpy(pk, skbuf, SEED_BYTES);
PQCLEAN_MQDSS64_AVX2_gf31_nrand(sk_gf31, N, skbuf + SEED_BYTES, SEED_BYTES);
PQCLEAN_MQDSS64_AVX2_MQ(pk_gf31, sk_gf31, F);
PQCLEAN_MQDSS64_AVX2_vgf31_unique(pk_gf31, pk_gf31);
PQCLEAN_MQDSS64_AVX2_gf31_npack(pk + SEED_BYTES, pk_gf31, M);

shake256_inc_init(&state);
shake256_inc_absorb(&state, pk, PK_BYTES);
shake256_inc_absorb(&state, sig, HASH_BYTES);
shake256_inc_absorb(&state, m, mlen);
shake256_inc_finalize(&state);
shake256_inc_squeeze(D, HASH_BYTES, &state);
shake256_inc_ctx_release(&state);

sig += HASH_BYTES; // Compensate for prefixed R.

memcpy(rnd_seed, skbuf + 2 * SEED_BYTES, SEED_BYTES);
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
shake256(rho, 2 * ROUNDS * HASH_BYTES, rnd_seed, SEED_BYTES + HASH_BYTES);

memcpy(rnd_seed, skbuf + 3 * SEED_BYTES, SEED_BYTES);
memcpy(rnd_seed + SEED_BYTES, D, HASH_BYTES);
PQCLEAN_MQDSS64_AVX2_gf31_nrand(rnd, (2 * N + M) * ROUNDS, rnd_seed, SEED_BYTES + HASH_BYTES);

for (i = 0; i < ROUNDS; i++) {
for (j = 0; j < N; j++) {
r1[j + i * N] = (gf31)(31 + sk_gf31[j] - r0[j + i * N]);
}
PQCLEAN_MQDSS64_AVX2_G(gx + i * M, t0 + i * N, r1 + i * N, F);
}
for (i = 0; i < ROUNDS * M; i++) {
gx[i] = (gf31)(gx[i] + e0[i]);
}
for (i = 0; i < ROUNDS; i++) {
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, r0 + i * N, N);
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf1, t0 + i * N, N);
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf2, e0 + i * M, M);
com_0(c + HASH_BYTES * (2 * i + 0), rho0 + i * HASH_BYTES, packbuf0, packbuf1, packbuf2);
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(r1 + i * N, r1 + i * N);
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(gx + i * M, gx + i * M);
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, r1 + i * N, N);
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf1, gx + i * M, M);
com_1(c + HASH_BYTES * (2 * i + 1), rho1 + i * HASH_BYTES, packbuf0, packbuf1);
}

H(sigma0, c, HASH_BYTES * ROUNDS * 2); // Compute sigma_0.
shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
shake256_squeezeblocks(shakeblock, 1, &shakestate);

memcpy(h0, shakeblock, HASH_BYTES);

memcpy(sig, sigma0, HASH_BYTES);
sig += HASH_BYTES; // Compensate for sigma_0.

for (i = 0; i < ROUNDS; i++) {
do {
alpha = shakeblock[alpha_count] & 31;
alpha_count++;
if (alpha_count == SHAKE256_RATE) {
alpha_count = 0;
shake256_squeezeblocks(shakeblock, 1, &shakestate);
}
} while (alpha == 31);
for (j = 0; j < N; j++) {
t1[i * N + j] = (gf31)(alpha * r0[j + i * N] - t0[j + i * N] + 31);
}
PQCLEAN_MQDSS64_AVX2_MQ(e1 + i * M, r0 + i * N, F);
for (j = 0; j < N; j++) {
e1[i * N + j] = (gf31)(alpha * e1[j + i * M] - e0[j + i * M] + 31);
}
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(t1 + i * N, t1 + i * N);
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(e1 + i * N, e1 + i * N);
}
shake256_ctx_release(&shakestate);

PQCLEAN_MQDSS64_AVX2_gf31_npack(t1packed, t1, N * ROUNDS);
PQCLEAN_MQDSS64_AVX2_gf31_npack(e1packed, e1, M * ROUNDS);

memcpy(sig, t1packed, NPACKED_BYTES * ROUNDS);
sig += NPACKED_BYTES * ROUNDS;
memcpy(sig, e1packed, MPACKED_BYTES * ROUNDS);
sig += MPACKED_BYTES * ROUNDS;

shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));

for (i = 0; i < ROUNDS; i++) {
b = (h1[(i >> 3)] >> (i & 7)) & 1;
if (b == 0) {
PQCLEAN_MQDSS64_AVX2_gf31_npack(sig, r0 + i * N, N);
} else if (b == 1) {
PQCLEAN_MQDSS64_AVX2_gf31_npack(sig, r1 + i * N, N);
}
memcpy(sig + NPACKED_BYTES, c + HASH_BYTES * (2 * i + (1 - b)), HASH_BYTES);
memcpy(sig + NPACKED_BYTES + HASH_BYTES, rho + (i + b * ROUNDS) * HASH_BYTES, HASH_BYTES);
sig += NPACKED_BYTES + 2 * HASH_BYTES;
}

*siglen = SIG_LEN;
return 0;
}

/**
* Verifies a detached signature and message under a given public key.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_verify(
const uint8_t *sig, size_t siglen,
const uint8_t *m, size_t mlen, const uint8_t *pk) {

gf31 r[N];
gf31 t[N];
gf31 e[M];
signed char F[F_LEN];
gf31 pk_gf31[M];
// Concatenated for convenient hashing.
unsigned char D_sigma0_h0_sigma1[HASH_BYTES * 3 + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES)];
unsigned char *D = D_sigma0_h0_sigma1;
unsigned char *sigma0 = D_sigma0_h0_sigma1 + HASH_BYTES;
unsigned char *h0 = D_sigma0_h0_sigma1 + 2 * HASH_BYTES;
unsigned char *t1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES;
unsigned char *e1packed = D_sigma0_h0_sigma1 + 3 * HASH_BYTES + ROUNDS * NPACKED_BYTES;
unsigned char h1[((ROUNDS + 7) & ~7) >> 3];
unsigned char c[HASH_BYTES * ROUNDS * 2];
memset(c, 0, HASH_BYTES * 2);
gf31 x[N];
gf31 y[M];
gf31 z[M];
unsigned char packbuf0[NPACKED_BYTES];
unsigned char packbuf1[MPACKED_BYTES];
shake256ctx shakestate;
unsigned char shakeblock[SHAKE256_RATE];
int i, j;
gf31 alpha;
int alpha_count = 0;
int b;
shake256incctx state;

if (siglen != SIG_LEN) {
return -1;
}

shake256_inc_init(&state);
shake256_inc_absorb(&state, pk, PK_BYTES);
shake256_inc_absorb(&state, sig, HASH_BYTES);
shake256_inc_absorb(&state, m, mlen);
shake256_inc_finalize(&state);
shake256_inc_squeeze(D, HASH_BYTES, &state);
shake256_inc_ctx_release(&state);

sig += HASH_BYTES;

PQCLEAN_MQDSS64_AVX2_gf31_nrand_schar(F, F_LEN, pk, SEED_BYTES);
pk += SEED_BYTES;
PQCLEAN_MQDSS64_AVX2_gf31_nunpack(pk_gf31, pk, M);

memcpy(sigma0, sig, HASH_BYTES);

shake256_absorb(&shakestate, D_sigma0_h0_sigma1, 2 * HASH_BYTES);
shake256_squeezeblocks(shakeblock, 1, &shakestate);

memcpy(h0, shakeblock, HASH_BYTES);

sig += HASH_BYTES;

memcpy(t1packed, sig, ROUNDS * NPACKED_BYTES);
sig += ROUNDS * NPACKED_BYTES;
memcpy(e1packed, sig, ROUNDS * MPACKED_BYTES);
sig += ROUNDS * MPACKED_BYTES;

shake256(h1, ((ROUNDS + 7) & ~7) >> 3, D_sigma0_h0_sigma1, 3 * HASH_BYTES + ROUNDS * (NPACKED_BYTES + MPACKED_BYTES));

for (i = 0; i < ROUNDS; i++) {
do {
alpha = shakeblock[alpha_count] & 31;
alpha_count++;
if (alpha_count == SHAKE256_RATE) {
alpha_count = 0;
shake256_squeezeblocks(shakeblock, 1, &shakestate);
}
} while (alpha == 31);
b = (h1[(i >> 3)] >> (i & 7)) & 1;

PQCLEAN_MQDSS64_AVX2_gf31_nunpack(r, sig, N);
PQCLEAN_MQDSS64_AVX2_gf31_nunpack(t, t1packed + NPACKED_BYTES * i, N);
PQCLEAN_MQDSS64_AVX2_gf31_nunpack(e, e1packed + MPACKED_BYTES * i, M);

if (b == 0) {
PQCLEAN_MQDSS64_AVX2_MQ(y, r, F);
for (j = 0; j < N; j++) {
x[j] = (gf31)(alpha * r[j] - t[j] + 31);
}
for (j = 0; j < N; j++) {
y[j] = (gf31)(alpha * y[j] - e[j] + 31);
}
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(x, x);
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(y, y);
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, x, N);
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf1, y, M);
com_0(c + HASH_BYTES * (2 * i + 0), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0, packbuf1);
} else {
PQCLEAN_MQDSS64_AVX2_MQ(y, r, F);
PQCLEAN_MQDSS64_AVX2_G(z, t, r, F);
for (j = 0; j < N; j++) {
y[j] = (gf31)(alpha * (31 + pk_gf31[j] - y[j]) - z[j] - e[j] + 62);
}
PQCLEAN_MQDSS64_AVX2_vgf31_shorten_unique(y, y);
PQCLEAN_MQDSS64_AVX2_gf31_npack(packbuf0, y, M);
com_1(c + HASH_BYTES * (2 * i + 1), sig + HASH_BYTES + NPACKED_BYTES, sig, packbuf0);
}
memcpy(c + HASH_BYTES * (2 * i + (1 - b)), sig + NPACKED_BYTES, HASH_BYTES);
sig += NPACKED_BYTES + 2 * HASH_BYTES;
}
shake256_ctx_release(&shakestate);

H(c, c, HASH_BYTES * ROUNDS * 2);
if (memcmp(c, sigma0, HASH_BYTES) != 0) {
return -1;
}

return 0;
}

/**
* Returns an array containing the signature followed by the message.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign(
uint8_t *sm, size_t *smlen,
const uint8_t *m, size_t mlen, const uint8_t *sk) {
size_t siglen;

PQCLEAN_MQDSS64_AVX2_crypto_sign_signature(
sm, &siglen, m, mlen, sk);

memmove(sm + SIG_LEN, m, mlen);
*smlen = siglen + mlen;

return 0;
}

/**
* Verifies a given signature-message pair under a given public key.
*/
int PQCLEAN_MQDSS64_AVX2_crypto_sign_open(
uint8_t *m, size_t *mlen,
const uint8_t *sm, size_t smlen, const uint8_t *pk) {
/* The API caller does not necessarily know what size a signature should be
but MQDSS signatures are always exactly SIG_LEN. */
if (smlen < SIG_LEN) {
memset(m, 0, smlen);
*mlen = 0;
return -1;
}

*mlen = smlen - SIG_LEN;

if (PQCLEAN_MQDSS64_AVX2_crypto_sign_verify(
sm, SIG_LEN, sm + SIG_LEN, *mlen, pk)) {
memset(m, 0, smlen);
*mlen = 0;
return -1;
}

/* If verification was successful, move the message to the right place. */
memmove(m, sm + SIG_LEN, *mlen);

return 0;
}

+ 0
- 1
crypto_sign/mqdss-64/clean/sign.c Ver fichero

@@ -1,4 +1,3 @@
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>


+ 20
- 0
test/duplicate_consistency/mqdss-48_clean.yml Ver fichero

@@ -0,0 +1,20 @@
consistency_checks:
- source:
scheme: mqdss-48
implementation: avx2
files:
- api.h
- mq.h
- LICENSE
- mq.h
- sign.c
- params.h
- source:
scheme: mqdss-64
implementation: clean
files:
- gf31.c
- gf31.h
- LICENSE
- mq.c
- mq.h

+ 11
- 0
test/duplicate_consistency/mqdss-64_clean.yml Ver fichero

@@ -9,3 +9,14 @@ consistency_checks:
- mq.c
- mq.h
- sign.c
- source:
scheme: mqdss-64
implementation: avx2
files:
- api.h
- mq.h
- LICENSE
- mq.h
- sign.c
- params.h


+ 1
- 0
test/test_testvectors.py Ver fichero

@@ -40,6 +40,7 @@ def test_testvectors(implementation, impl_path, test_dir, init, destr):
implementation.name,
'.exe' if os.name == 'nt' else ''
))],
print_output=False,
).replace('\r', '')
assert(implementation.scheme.metadata()['testvectors-sha256'].lower()
== hashlib.sha256(out.encode('utf-8')).hexdigest().lower())


Cargando…
Cancelar
Guardar