boringssl/crypto/perlasm
David Benjamin b111f7a0e4 Rebase x86_64-xlate.pl atop master.
This functionally pulls in a number of changes from upstream, including:
4e3d2866b6e8e7a700ea22e05840a093bfd7a4b1
1eb12c437bbeb2c748291bcd23733d4a59d5d1ca
6a4ea0022c475bbc2c7ad98a6f05f6e2e850575b
c25278db8e4c21772a0cd81f7873e767cbc6d219
e0a651945cb5a70a2abd9902c0fd3e9759d35867
d405aa2ff265965c71ce7331cf0e49d634a06924
ce3d25d3e5a7e82fd59fd30dff7acc39baed8b5e
9ba96fbb2523cb12747c559c704c58bd8f9e7982

Notably, c25278db8e4c21772a0cd81f7873e767cbc6d219 makes it enable 'use strict'.

To avoid having to deal with complex conflicts, this was done by taking a diff
of our copy of the file with the point just before
c25278db8e4c21772a0cd81f7873e767cbc6d219, and reapplying the non-reverting
parts of our diff on top of upstream's current version.

Confirmed with generate_build_files.py that this makes no changes *except*
d405aa2ff265965c71ce7331cf0e49d634a06924 causes this sort of change throughout
chacha-x86_64.pl's nasm output:

@@ -1179,7 +1179,7 @@ $L$oop8x:
        vpslld  ymm14,ymm0,12
        vpsrld  ymm0,ymm0,20
        vpor    ymm0,ymm14,ymm0
-       vbroadcasti128  ymm14,YMMWORD[r11]
+       vbroadcasti128  ymm14,XMMWORD[r11]
        vpaddd  ymm13,ymm13,ymm5
        vpxor   ymm1,ymm13,ymm1
        vpslld  ymm15,ymm1,12

This appears to be correct. vbroadcasti128 takes a 128-bit-wide second
argument, so it wants XMMWORD, not YMMWORD. I suppose nasm just didn't care.

(Looking at a diff-diff may be a more useful way to review this CL.)

Change-Id: I61be0d225ddf13b5f05d1369ddda84b2f322ef9d
Reviewed-on: https://boringssl-review.googlesource.com/8392
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: Adam Langley <agl@google.com>
2016-06-22 19:54:14 +00:00
..
arm-xlate.pl Mark ARM assembly globals hidden uniformly in arm-xlate.pl. 2016-02-11 17:28:03 +00:00
cbc.pl
readme
x86_64-xlate.pl Rebase x86_64-xlate.pl atop master. 2016-06-22 19:54:14 +00:00
x86asm.pl
x86gas.pl ymm registers are not suffixed with w. 2016-02-23 23:18:53 +00:00
x86masm.pl
x86nasm.pl

The perl scripts in this directory are my 'hack' to generate
multiple different assembler formats via the one origional script.

The way to use this library is to start with adding the path to this directory
and then include it.

push(@INC,"perlasm","../../perlasm");
require "x86asm.pl";

The first thing we do is setup the file and type of assember

&asm_init($ARGV[0],$0);

The first argument is the 'type'.  Currently
'cpp', 'sol', 'a.out', 'elf' or 'win32'.
Argument 2 is the file name.

The reciprocal function is
&asm_finish() which should be called at the end.

There are 2 main 'packages'. x86ms.pl, which is the microsoft assembler,
and x86unix.pl which is the unix (gas) version.

Functions of interest are:
&external_label("des_SPtrans");	declare and external variable
&LB(reg);			Low byte for a register
&HB(reg);			High byte for a register
&BP(off,base,index,scale)	Byte pointer addressing
&DWP(off,base,index,scale)	Word pointer addressing
&stack_push(num)		Basically a 'sub esp, num*4' with extra
&stack_pop(num)			inverse of stack_push
&function_begin(name,extra)	Start a function with pushing of
				edi, esi, ebx and ebp.  extra is extra win32
				external info that may be required.
&function_begin_B(name,extra)	Same as norma function_begin but no pushing.
&function_end(name)		Call at end of function.
&function_end_A(name)		Standard pop and ret, for use inside functions
&function_end_B(name)		Call at end but with poping or 'ret'.
&swtmp(num)			Address on stack temp word.
&wparam(num)			Parameter number num, that was push
				in C convention.  This all works over pushes
				and pops.
&comment("hello there")		Put in a comment.
&label("loop")			Refer to a label, normally a jmp target.
&set_label("loop")		Set a label at this point.
&data_word(word)		Put in a word of data.

So how does this all hold together?  Given

int calc(int len, int *data)
	{
	int i,j=0;

	for (i=0; i<len; i++)
		{
		j+=other(data[i]);
		}
	}

So a very simple version of this function could be coded as

	push(@INC,"perlasm","../../perlasm");
	require "x86asm.pl";
	
	&asm_init($ARGV[0],"cacl.pl");

	&external_label("other");

	$tmp1=	"eax";
	$j=	"edi";
	$data=	"esi";
	$i=	"ebp";

	&comment("a simple function");
	&function_begin("calc");
	&mov(	$data,		&wparam(1)); # data
	&xor(	$j,		$j);
	&xor(	$i,		$i);

	&set_label("loop");
	&cmp(	$i,		&wparam(0));
	&jge(	&label("end"));

	&mov(	$tmp1,		&DWP(0,$data,$i,4));
	&push(	$tmp1);
	&call(	"other");
	&add(	$j,		"eax");
	&pop(	$tmp1);
	&inc(	$i);
	&jmp(	&label("loop"));

	&set_label("end");
	&mov(	"eax",		$j);

	&function_end("calc");

	&asm_finish();

The above example is very very unoptimised but gives an idea of how
things work.

There is also a cbc mode function generator in cbc.pl

&cbc(	$name,
	$encrypt_function_name,
	$decrypt_function_name,
	$true_if_byte_swap_needed,
	$parameter_number_for_iv,
	$parameter_number_for_encrypt_flag,
	$first_parameter_to_pass,
	$second_parameter_to_pass,
	$third_parameter_to_pass);

So for example, given
void BF_encrypt(BF_LONG *data,BF_KEY *key);
void BF_decrypt(BF_LONG *data,BF_KEY *key);
void BF_cbc_encrypt(unsigned char *in, unsigned char *out, long length,
        BF_KEY *ks, unsigned char *iv, int enc);

&cbc("BF_cbc_encrypt","BF_encrypt","BF_encrypt",1,4,5,3,-1,-1);

&cbc("des_ncbc_encrypt","des_encrypt","des_encrypt",0,4,5,3,5,-1);
&cbc("des_ede3_cbc_encrypt","des_encrypt3","des_decrypt3",0,6,7,3,4,5);