boringssl/crypto/perlasm
David Benjamin 293d9ee4e8 Support execute-only memory for AArch64 assembly.
Put data in .rodata and, rather than adr, use the combination of adrp :pg_hi21:
and add :lo12:. Unfortunately, iOS uses different syntax, so we must add more
transforms to arm-xlate.pl.

Tested manually by:

1. Use Android NDK r19-beta1

2. Follow usual instructions to configure CMake for aarch64, but pass
   -DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=lld -Wl,-execute-only".

3. Build. Confirm with readelf -l tool/bssl that .text is not marked
   readable.

4. Push the test binaries onto a Pixel 3. Test normally and with
   --cpu={none,neon,crypto}. I had to pass --gtest_filter=-*Thread* to
   crypto_test. There appears to be an issue with some runtime function
   that's unrelated to our assembly.

No measurable performance difference.

Going forward, to support this, we will need to apply similar changes to
all other AArch64 assembly. This is relatively straightforward, but may
be a little finicky for dual-AArch32/AArch64 files (aesv8-armx.pl).

Update-Note: Assembly syntax is a mess. There's a decent chance some
assembler will get offend.

Change-Id: Ib59b921d4cce76584320fefd23e6bb7ebd4847eb
Reviewed-on: https://boringssl-review.googlesource.com/c/33245
Reviewed-by: Adam Langley <agl@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
2018-11-19 19:58:15 +00:00
..
arm-xlate.pl Support execute-only memory for AArch64 assembly. 2018-11-19 19:58:15 +00:00
ppc-xlate.pl Don't include quotes in heredocs. 2018-09-14 16:51:00 +00:00
readme Remove filename argument to x86 asm_init. 2017-05-12 14:58:27 +00:00
x86_64-xlate.pl Automatically disable assembly with MSAN. 2018-09-07 21:12:37 +00:00
x86asm.pl Remove filename argument to x86 asm_init. 2017-05-12 14:58:27 +00:00
x86gas.pl Fixing assembly coverage reporting. 2017-05-11 16:55:29 +00:00
x86masm.pl Remove filename argument to x86 asm_init. 2017-05-12 14:58:27 +00:00
x86nasm.pl

The perl scripts in this directory are my 'hack' to generate
multiple different assembler formats via the one origional script.

The way to use this library is to start with adding the path to this directory
and then include it.

push(@INC,"perlasm","../../perlasm");
require "x86asm.pl";

The first thing we do is setup the file and type of assembler

&asm_init($ARGV[0]);

The first argument is the 'type'.  Currently
'cpp', 'sol', 'a.out', 'elf' or 'win32'.
Argument 2 is the file name.

The reciprocal function is
&asm_finish() which should be called at the end.

There are 2 main 'packages'. x86ms.pl, which is the Microsoft assembler,
and x86unix.pl which is the unix (gas) version.

Functions of interest are:
&external_label("des_SPtrans");	declare and external variable
&LB(reg);			Low byte for a register
&HB(reg);			High byte for a register
&BP(off,base,index,scale)	Byte pointer addressing
&DWP(off,base,index,scale)	Word pointer addressing
&stack_push(num)		Basically a 'sub esp, num*4' with extra
&stack_pop(num)			inverse of stack_push
&function_begin(name,extra)	Start a function with pushing of
				edi, esi, ebx and ebp.  extra is extra win32
				external info that may be required.
&function_begin_B(name,extra)	Same as normal function_begin but no pushing.
&function_end(name)		Call at end of function.
&function_end_A(name)		Standard pop and ret, for use inside functions
&function_end_B(name)		Call at end but with poping or 'ret'.
&swtmp(num)			Address on stack temp word.
&wparam(num)			Parameter number num, that was push
				in C convention.  This all works over pushes
				and pops.
&comment("hello there")		Put in a comment.
&label("loop")			Refer to a label, normally a jmp target.
&set_label("loop")		Set a label at this point.
&data_word(word)		Put in a word of data.

So how does this all hold together?  Given

int calc(int len, int *data)
	{
	int i,j=0;

	for (i=0; i<len; i++)
		{
		j+=other(data[i]);
		}
	}

So a very simple version of this function could be coded as

	push(@INC,"perlasm","../../perlasm");
	require "x86asm.pl";
	
	&asm_init($ARGV[0]);

	&external_label("other");

	$tmp1=	"eax";
	$j=	"edi";
	$data=	"esi";
	$i=	"ebp";

	&comment("a simple function");
	&function_begin("calc");
	&mov(	$data,		&wparam(1)); # data
	&xor(	$j,		$j);
	&xor(	$i,		$i);

	&set_label("loop");
	&cmp(	$i,		&wparam(0));
	&jge(	&label("end"));

	&mov(	$tmp1,		&DWP(0,$data,$i,4));
	&push(	$tmp1);
	&call(	"other");
	&add(	$j,		"eax");
	&pop(	$tmp1);
	&inc(	$i);
	&jmp(	&label("loop"));

	&set_label("end");
	&mov(	"eax",		$j);

	&function_end("calc");

	&asm_finish();

The above example is very very unoptimised but gives an idea of how
things work.