#!/usr/bin/env perl
#
# ====================================================================
# Written by Andy Polyakov <appro@fy.chalmers.se> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
# ====================================================================
#
# SHA256/512_Transform for Itanium.
#
# sha512_block runs in 1003 cycles on Itanium 2, which is almost 50%
# faster than gcc and >60%(!) faster than code generated by HP-UX
# compiler (yes, HP-UX is generating slower code, because unlike gcc,
# it failed to deploy "shift right pair," 'shrp' instruction, which
# substitutes for 64-bit rotate).
#
# 924 cycles long sha256_block outperforms gcc by over a factor of 2(!)
# and HP-UX compiler - by >40% (yes, gcc won sha512_block, but lost
# this one big time). Note that "formally" 924 is about 100 cycles
# too much. I mean it's 64 32-bit rounds vs. 80 virtually identical
# 64-bit ones and 1003*64/80 gives 802. Extra cycles, 2 per round,
# are spent on extra work to provide for 32-bit rotations. 32-bit
# rotations are still handled by 'shrp' instruction and for this
# reason lower 32 bits are deposited to upper half of 64-bit register
# prior to 'shrp' issue. And in order to minimize the amount of such
# operations, X[16] values are *maintained* with copies of lower
# halves in upper halves, which is why you'll spot such instructions
# as custom 'mux2', "parallel 32-bit add," 'padd4' and "parallel
# 32-bit unsigned right shift," 'pshr4.u' instructions here.
#
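
=pod

The 32-bit rotation trick described above can be illustrated in plain Perl.
This is only a sketch and not part of the generator: it assumes a perl built
with 64-bit integers, it models 'shrp' for shift counts between 1 and 63
only, and the sub names (shrp, rotr32, rotr32_via_shrp) are made up for the
illustration.

```perl
use strict;
use warnings;

# Model of IA-64 'shrp r=hi,lo,n': the low 64 bits of the 128-bit value
# hi:lo shifted right by n, for 0 < n < 64. Relies on 64-bit perl integers,
# where bits shifted past bit 63 are simply lost.
sub shrp {
    my ($hi, $lo, $n) = @_;
    return ($lo >> $n) | ($hi << (64 - $n));
}

# Reference 32-bit rotate right, for 0 < n < 32.
sub rotr32 {
    my ($x, $n) = @_;
    return (($x >> $n) | ($x << (32 - $n))) & 0xffffffff;
}

# Copy the lower half into the upper half, rotate the whole 64-bit value
# with shrp, and keep the lower 32 bits of the result.
sub rotr32_via_shrp {
    my ($x, $n) = @_;
    my $lo32 = $x & 0xffffffff;
    my $dup  = ($lo32 << 32) | $lo32;
    return shrp($dup, $dup, $n) & 0xffffffff;
}
```

For any 32-bit value and any count from 1 to 31, rotr32_via_shrp and rotr32
should agree, which is why keeping X[16] with duplicated halves saves the
per-round deposit.

=cut
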
# Rules of engagement.
#
# There is only one integer shifter meaning that if I have two rotate,
# deposit or extract instructions in adjacent bundles, they shall
# split [at run-time if they have to]. But note that variable and
# parallel shifts are performed by multi-media ALU and *are* pairable
# with rotates [and the like]. On the downside MMALU is rather slow: it
# takes 2 extra cycles before the result of integer operation is
# available *to* MMALU and 2(*) extra cycles before the result of MM
# operation is available "back" *to* integer ALU, not to mention that
# MMALU itself has 2 cycles latency. However! I explicitly scheduled
# these MM instructions to avoid MM stalls, so that all these extra
# latencies get "hidden" in instruction-level parallelism.
#
# (*) 2 cycles on Itanium 1 and 1 cycle on Itanium 2. But I schedule
#     for 2 in order to provide for best *overall* performance,
#     because on Itanium 1 stall on MM result is accompanied by
#     pipeline flush, which takes 6 cycles:-(
#
# Resulting performance numbers for 900MHz Itanium 2 system:
#
# The 'numbers' are in 1000s of bytes per second processed.
# type     16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
# sha1(*)   6210.14k   20376.30k   52447.83k   85870.05k  105478.12k
# sha256    7476.45k   20572.05k   41538.34k   56062.29k   62093.18k
# sha512    4996.56k   20026.28k   47597.20k   85278.79k  111501.31k
#
# (*) SHA1 numbers are for HP-UX compiler and are presented purely
#     for reference purposes. I bet it can be improved too...
#
# To generate code, pass the file name with either 256 or 512 in its
# name and compiler flags.

$output=shift;

if ($output =~ /512.*\.(?:s|asm)/) {
	$SZ=8;
	$BITS=8*$SZ;
	$LDW="ld8";
	$STW="st8";
	$ADD="add";
	$SHRU="shr.u";
	$TABLE="K512";
	$func="sha512_block_data_order";
	@Sigma0=(28,34,39);
	@Sigma1=(14,18,41);
	@sigma0=(1,  8, 7);
	@sigma1=(19,61, 6);
	$rounds=80;
} elsif ($output =~ /256.*\.(?:s|asm)/) {
	$SZ=4;
	$BITS=8*$SZ;
	$LDW="ld4";
	$STW="st4";
	$ADD="padd4";
	$SHRU="pshr4.u";
	$TABLE="K256";
	$func="sha256_block_data_order";
	@Sigma0=( 2,13,22);
	@Sigma1=( 6,11,25);
	@sigma0=( 7,18, 3);
	@sigma1=(17,19,10);
	$rounds=64;
} else { die "nonsense $output"; }
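
=pod

The file-name dispatch above can be tried out in isolation. The following
is a minimal sketch, not part of the generator: word_size_for is a
hypothetical helper, and it assumes the intent is to accept output names
that contain 512 or 256 followed by a .s or .asm extension.

```perl
use strict;
use warnings;

# Hypothetical helper: map an output file name to the word size in bytes
# (8 selects the SHA-512 code path, 4 selects SHA-256). Returns undef in
# scalar context when the name selects neither variant.
sub word_size_for {
    my ($output) = @_;
    return 8 if $output =~ /512.*\.(?:s|asm)/;
    return 4 if $output =~ /256.*\.(?:s|asm)/;
    return;
}
```

A name such as sha512-ia64.s would select the 64-bit path and
sha256-ia64.asm the 32-bit one; both names are examples only.

=cut
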

open STDOUT,">$output" or die "can't open $output: $!";

if ($^O eq "hpux") {
    $ADDP="addp4";
    for (@ARGV) { $ADDP="add" if (/(?:\+DD|\-mlp)64/); }
} else { $ADDP="add"; }
for (@ARGV)  {	$big_endian=1 if (/\-DB_ENDIAN/);
		$big_endian=0 if (/\-DL_ENDIAN/);  }
if (!defined($big_endian))
             {	$big_endian=(unpack('L',pack('N',1))==1);  }
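
=pod

The endianness selection above can be read as the sketch below. It is an
illustration only: want_big_endian is a hypothetical name, and the flag
list is passed in explicitly instead of being read from @ARGV.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the flag scan: the last -DB_ENDIAN or
# -DL_ENDIAN seen decides; with neither flag, the host byte order is used.
sub want_big_endian {
    my @flags = @_;
    my $big_endian;
    for (@flags) {
        $big_endian = 1 if /-DB_ENDIAN/;
        $big_endian = 0 if /-DL_ENDIAN/;
    }
    # 'N' packs a 32-bit value in big-endian order and 'L' unpacks it in
    # host order, so the round trip gives 1 only on a big-endian host.
    $big_endian = (unpack('L', pack('N', 1)) == 1) ? 1 : 0
        unless defined $big_endian;
    return $big_endian;
}
```

Calling want_big_endian with no arguments reports the byte order of the
machine running the script, which matters when cross-generating code.

=cut
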

$code=<<___;
.ident  \"$output, version 1.1\"
.ident  \"IA-64 ISA artwork by Andy Polyakov <appro\@fy.chalmers.se>\"
.explicit
.text

pfssave=r2;
lcsave=r3;
prsave=r14;
K=r15;
A=r16;	B=r17;	C=r18;	D=r19;
E=r20;	F=r21;	G=r22;	H=r23;
T1=r24;	T2=r25;
s0=r26;	s1=r27;	t0=r28;	t1=r29;
Ktbl=r30;
ctx=r31;	// 1st arg
input=r48;	// 2nd arg
num=r49;	// 3rd arg
sgm0=r50;	sgm1=r51;	// small constants
A_=r54;	B_=r55;	C_=r56;	D_=r57;
E_=r58;	F_=r59;	G_=r60;	H_=r61;

// void $func (SHA_CTX *ctx, const void *in,size_t num[,int host])
.global	$func#
.proc	$func#
.align	32
$func:
	.prologue
	.save	ar.pfs,pfssave
{ .mmi;	alloc	pfssave=ar.pfs,3,27,0,16
	$ADDP	ctx=0,r32		// 1st arg
	.save	ar.lc,lcsave
	mov	lcsave=ar.lc	}
{ .mmi;	$ADDP	input=0,r33		// 2nd arg
	mov	num=r34			// 3rd arg
	.save	pr,prsave
	mov	prsave=pr	};;

	.body
{ .mib;	add	r8=0*$SZ,ctx
	add	r9=1*$SZ,ctx
	brp.loop.imp	.L_first16,.L_first16_end-16	}
{ .mib;	add	r10=2*$SZ,ctx
	add	r11=3*$SZ,ctx
	brp.loop.imp	.L_rest,.L_rest_end-16		};;

// load A-H
.Lpic_point:
{ .mmi;	$LDW	A_=[r8],4*$SZ
	$LDW	B_=[r9],4*$SZ
	mov	Ktbl=ip		}
{ .mmi;	$LDW	C_=[r10],4*$SZ
	$LDW	D_=[r11],4*$SZ
	mov	sgm0=$sigma0[2]	};;
{ .mmi;	$LDW	E_=[r8]
	$LDW	F_=[r9]
	add	Ktbl=($TABLE#-.Lpic_point),Ktbl		}
{ .mmi;	$LDW	G_=[r10]
	$LDW	H_=[r11]
	cmp.ne	p0,p16=0,r0	};;	// used in sha256_block
___
$code.=<<___ if ($BITS==64);
{ .mii;	and	r8=7,input
	and	input=~7,input;;
	cmp.eq	p9,p0=1,r8	}
{ .mmi;	cmp.eq	p10,p0=2,r8
	cmp.eq	p11,p0=3,r8
	cmp.eq	p12,p0=4,r8	}
{ .mmi;	cmp.eq	p13,p0=5,r8
	cmp.eq	p14,p0=6,r8
	cmp.eq	p15,p0=7,r8	};;
___
$code.=<<___;
.L_outer:
.rotr	X[16]
{ .mmi;	mov	A=A_
	mov	B=B_
	mov	ar.lc=14	}
{ .mmi;	mov	C=C_
	mov	D=D_
	mov	E=E_		}
{ .mmi;	mov	F=F_
	mov	G=G_
	mov	ar.ec=2		}
{ .mmi;	ld1	X[15]=[input],$SZ		// eliminated in 64-bit
	mov	H=H_
	mov	sgm1=$sigma1[2]	};;

___
$t0="t0", $t1="t1", $code.=<<___ if ($BITS==32);
.align	32
.L_first16:
{ .mmi;		add	r9=1-$SZ,input
		add	r10=2-$SZ,input
		add	r11=3-$SZ,input	};;
{ .mmi;		ld1	r9=[r9]
		ld1	r10=[r10]
		dep.z	$t1=E,32,32	}
{ .mmi;		$LDW	K=[Ktbl],$SZ
		ld1	r11=[r11]
		zxt4	E=E		};;
{ .mii;		or	$t1=$t1,E
		dep	X[15]=X[15],r9,8,8
		dep	r11=r10,r11,8,8	};;
{ .mmi;		and	T1=F,E
		and	T2=A,B
		dep	X[15]=X[15],r11,16,16	}
{ .mmi;		andcm	r8=G,E
		and	r9=A,C
		mux2	$t0=A,0x44	};;	// copy lower half to upper
{ .mmi;	(p16)	ld1	X[15-1]=[input],$SZ	// prefetch
		xor	T1=T1,r8		// T1=((e & f) ^ (~e & g))
		_rotr	r11=$t1,$Sigma1[0] }	// ROTR(e,14)
{ .mib;		and	r10=B,C
		xor	T2=T2,r9	};;
___
$t0="A", $t1="E", $code.=<<___ if ($BITS==64);
// in 64-bit mode I load whole X[16] at once and take care of alignment...
{ .mmi;	add	r8=1*$SZ,input
	add	r9=2*$SZ,input
	add	r10=3*$SZ,input		};;
{ .mmb;	$LDW	X[15]=[input],4*$SZ
	$LDW	X[14]=[r8],4*$SZ
(p9)	br.cond.dpnt.many	.L1byte	};;
{ .mmb;	$LDW	X[13]=[r9],4*$SZ
	$LDW	X[12]=[r10],4*$SZ
(p10)	br.cond.dpnt.many	.L2byte	};;
{ .mmb;	$LDW	X[11]=[input],4*$SZ
	$LDW	X[10]=[r8],4*$SZ
(p11)	br.cond.dpnt.many	.L3byte	};;
{ .mmb;	$LDW	X[ 9]=[r9],4*$SZ
	$LDW	X[ 8]=[r10],4*$SZ
(p12)	br.cond.dpnt.many	.L4byte	};;
{ .mmb;	$LDW	X[ 7]=[input],4*$SZ
	$LDW	X[ 6]=[r8],4*$SZ
(p13)	br.cond.dpnt.many	.L5byte	};;
{ .mmb;	$LDW	X[ 5]=[r9],4*$SZ
	$LDW	X[ 4]=[r10],4*$SZ
(p14)	br.cond.dpnt.many	.L6byte	};;
{ .mmb;	$LDW	X[ 3]=[input],4*$SZ
	$LDW	X[ 2]=[r8],4*$SZ
(p15)	br.cond.dpnt.many	.L7byte	};;
{ .mmb;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	br.many	.L_first16		};;
.L1byte:
{ .mmi;	$LDW	X[13]=[r9],4*$SZ
	$LDW	X[12]=[r10],4*$SZ
	shrp	X[15]=X[15],X[14],56	};;
{ .mmi;	$LDW	X[11]=[input],4*$SZ
	$LDW	X[10]=[r8],4*$SZ
	shrp	X[14]=X[14],X[13],56	}
{ .mmi;	$LDW	X[ 9]=[r9],4*$SZ
	$LDW	X[ 8]=[r10],4*$SZ
	shrp	X[13]=X[13],X[12],56	};;
{ .mmi;	$LDW	X[ 7]=[input],4*$SZ
	$LDW	X[ 6]=[r8],4*$SZ
	shrp	X[12]=X[12],X[11],56	}
{ .mmi;	$LDW	X[ 5]=[r9],4*$SZ
	$LDW	X[ 4]=[r10],4*$SZ
	shrp	X[11]=X[11],X[10],56	};;
{ .mmi;	$LDW	X[ 3]=[input],4*$SZ
	$LDW	X[ 2]=[r8],4*$SZ
	shrp	X[10]=X[10],X[ 9],56	}
{ .mmi;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	shrp	X[ 9]=X[ 9],X[ 8],56	};;
{ .mii;	$LDW	T1=[input]
	shrp	X[ 8]=X[ 8],X[ 7],56
	shrp	X[ 7]=X[ 7],X[ 6],56	}
{ .mii;	shrp	X[ 6]=X[ 6],X[ 5],56
	shrp	X[ 5]=X[ 5],X[ 4],56	};;
{ .mii;	shrp	X[ 4]=X[ 4],X[ 3],56
	shrp	X[ 3]=X[ 3],X[ 2],56	}
{ .mii;	shrp	X[ 2]=X[ 2],X[ 1],56
	shrp	X[ 1]=X[ 1],X[ 0],56	}
{ .mib;	shrp	X[ 0]=X[ 0],T1,56
	br.many	.L_first16		};;
.L2byte:
{ .mmi;	$LDW	X[11]=[input],4*$SZ
	$LDW	X[10]=[r8],4*$SZ
	shrp	X[15]=X[15],X[14],48	}
{ .mmi;	$LDW	X[ 9]=[r9],4*$SZ
	$LDW	X[ 8]=[r10],4*$SZ
	shrp	X[14]=X[14],X[13],48	};;
{ .mmi;	$LDW	X[ 7]=[input],4*$SZ
	$LDW	X[ 6]=[r8],4*$SZ
	shrp	X[13]=X[13],X[12],48	}
{ .mmi;	$LDW	X[ 5]=[r9],4*$SZ
	$LDW	X[ 4]=[r10],4*$SZ
	shrp	X[12]=X[12],X[11],48	};;
{ .mmi;	$LDW	X[ 3]=[input],4*$SZ
	$LDW	X[ 2]=[r8],4*$SZ
	shrp	X[11]=X[11],X[10],48	}
{ .mmi;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	shrp	X[10]=X[10],X[ 9],48	};;
{ .mii;	$LDW	T1=[input]
	shrp	X[ 9]=X[ 9],X[ 8],48
	shrp	X[ 8]=X[ 8],X[ 7],48	}
{ .mii;	shrp	X[ 7]=X[ 7],X[ 6],48
	shrp	X[ 6]=X[ 6],X[ 5],48	};;
{ .mii;	shrp	X[ 5]=X[ 5],X[ 4],48
	shrp	X[ 4]=X[ 4],X[ 3],48	}
{ .mii;	shrp	X[ 3]=X[ 3],X[ 2],48
	shrp	X[ 2]=X[ 2],X[ 1],48	}
{ .mii;	shrp	X[ 1]=X[ 1],X[ 0],48
	shrp	X[ 0]=X[ 0],T1,48	}
{ .mfb;	br.many	.L_first16		};;
.L3byte:
{ .mmi;	$LDW	X[ 9]=[r9],4*$SZ
	$LDW	X[ 8]=[r10],4*$SZ
	shrp	X[15]=X[15],X[14],40	};;
{ .mmi;	$LDW	X[ 7]=[input],4*$SZ
	$LDW	X[ 6]=[r8],4*$SZ
	shrp	X[14]=X[14],X[13],40	}
{ .mmi;	$LDW	X[ 5]=[r9],4*$SZ
	$LDW	X[ 4]=[r10],4*$SZ
	shrp	X[13]=X[13],X[12],40	};;
{ .mmi;	$LDW	X[ 3]=[input],4*$SZ
	$LDW	X[ 2]=[r8],4*$SZ
	shrp	X[12]=X[12],X[11],40	}
{ .mmi;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	shrp	X[11]=X[11],X[10],40	};;
{ .mii;	$LDW	T1=[input]
	shrp	X[10]=X[10],X[ 9],40
	shrp	X[ 9]=X[ 9],X[ 8],40	}
{ .mii;	shrp	X[ 8]=X[ 8],X[ 7],40
	shrp	X[ 7]=X[ 7],X[ 6],40	};;
{ .mii;	shrp	X[ 6]=X[ 6],X[ 5],40
	shrp	X[ 5]=X[ 5],X[ 4],40	}
{ .mii;	shrp	X[ 4]=X[ 4],X[ 3],40
	shrp	X[ 3]=X[ 3],X[ 2],40	}
{ .mii;	shrp	X[ 2]=X[ 2],X[ 1],40
	shrp	X[ 1]=X[ 1],X[ 0],40	}
{ .mib;	shrp	X[ 0]=X[ 0],T1,40
	br.many	.L_first16		};;
.L4byte:
{ .mmi;	$LDW	X[ 7]=[input],4*$SZ
	$LDW	X[ 6]=[r8],4*$SZ
	shrp	X[15]=X[15],X[14],32	}
{ .mmi;	$LDW	X[ 5]=[r9],4*$SZ
	$LDW	X[ 4]=[r10],4*$SZ
	shrp	X[14]=X[14],X[13],32	};;
{ .mmi;	$LDW	X[ 3]=[input],4*$SZ
	$LDW	X[ 2]=[r8],4*$SZ
	shrp	X[13]=X[13],X[12],32	}
{ .mmi;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	shrp	X[12]=X[12],X[11],32	};;
{ .mii;	$LDW	T1=[input]
	shrp	X[11]=X[11],X[10],32
	shrp	X[10]=X[10],X[ 9],32	}
{ .mii;	shrp	X[ 9]=X[ 9],X[ 8],32
	shrp	X[ 8]=X[ 8],X[ 7],32	};;
{ .mii;	shrp	X[ 7]=X[ 7],X[ 6],32
	shrp	X[ 6]=X[ 6],X[ 5],32	}
{ .mii;	shrp	X[ 5]=X[ 5],X[ 4],32
	shrp	X[ 4]=X[ 4],X[ 3],32	}
{ .mii;	shrp	X[ 3]=X[ 3],X[ 2],32
	shrp	X[ 2]=X[ 2],X[ 1],32	}
{ .mii;	shrp	X[ 1]=X[ 1],X[ 0],32
	shrp	X[ 0]=X[ 0],T1,32	}
{ .mfb;	br.many	.L_first16		};;
.L5byte:
{ .mmi;	$LDW	X[ 5]=[r9],4*$SZ
	$LDW	X[ 4]=[r10],4*$SZ
	shrp	X[15]=X[15],X[14],24	};;
{ .mmi;	$LDW	X[ 3]=[input],4*$SZ
	$LDW	X[ 2]=[r8],4*$SZ
	shrp	X[14]=X[14],X[13],24	}
{ .mmi;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	shrp	X[13]=X[13],X[12],24	};;
{ .mii;	$LDW	T1=[input]
	shrp	X[12]=X[12],X[11],24
	shrp	X[11]=X[11],X[10],24	}
{ .mii;	shrp	X[10]=X[10],X[ 9],24
	shrp	X[ 9]=X[ 9],X[ 8],24	};;
{ .mii;	shrp	X[ 8]=X[ 8],X[ 7],24
	shrp	X[ 7]=X[ 7],X[ 6],24	}
{ .mii;	shrp	X[ 6]=X[ 6],X[ 5],24
	shrp	X[ 5]=X[ 5],X[ 4],24	}
{ .mii;	shrp	X[ 4]=X[ 4],X[ 3],24
	shrp	X[ 3]=X[ 3],X[ 2],24	}
{ .mii;	shrp	X[ 2]=X[ 2],X[ 1],24
	shrp	X[ 1]=X[ 1],X[ 0],24	}
{ .mib;	shrp	X[ 0]=X[ 0],T1,24
	br.many	.L_first16		};;
.L6byte:
{ .mmi;	$LDW	X[ 3]=[input],4*$SZ
	$LDW	X[ 2]=[r8],4*$SZ
	shrp	X[15]=X[15],X[14],16	}
{ .mmi;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	shrp	X[14]=X[14],X[13],16	};;
{ .mii;	$LDW	T1=[input]
	shrp	X[13]=X[13],X[12],16
	shrp	X[12]=X[12],X[11],16	}
{ .mii;	shrp	X[11]=X[11],X[10],16
	shrp	X[10]=X[10],X[ 9],16	};;
{ .mii;	shrp	X[ 9]=X[ 9],X[ 8],16
	shrp	X[ 8]=X[ 8],X[ 7],16	}
{ .mii;	shrp	X[ 7]=X[ 7],X[ 6],16
	shrp	X[ 6]=X[ 6],X[ 5],16	}
{ .mii;	shrp	X[ 5]=X[ 5],X[ 4],16
	shrp	X[ 4]=X[ 4],X[ 3],16	}
{ .mii;	shrp	X[ 3]=X[ 3],X[ 2],16
	shrp	X[ 2]=X[ 2],X[ 1],16	}
{ .mii;	shrp	X[ 1]=X[ 1],X[ 0],16
	shrp	X[ 0]=X[ 0],T1,16	}
{ .mfb;	br.many	.L_first16		};;
.L7byte:
{ .mmi;	$LDW	X[ 1]=[r9],4*$SZ
	$LDW	X[ 0]=[r10],4*$SZ
	shrp	X[15]=X[15],X[14],8	};;
{ .mii;	$LDW	T1=[input]
	shrp	X[14]=X[14],X[13],8
	shrp	X[13]=X[13],X[12],8	}
{ .mii;	shrp	X[12]=X[12],X[11],8
	shrp	X[11]=X[11],X[10],8	};;
{ .mii;	shrp	X[10]=X[10],X[ 9],8
	shrp	X[ 9]=X[ 9],X[ 8],8	}
{ .mii;	shrp	X[ 8]=X[ 8],X[ 7],8
	shrp	X[ 7]=X[ 7],X[ 6],8	}
{ .mii;	shrp	X[ 6]=X[ 6],X[ 5],8
	shrp	X[ 5]=X[ 5],X[ 4],8	}
{ .mii;	shrp	X[ 4]=X[ 4],X[ 3],8
	shrp	X[ 3]=X[ 3],X[ 2],8	}
{ .mii;	shrp	X[ 2]=X[ 2],X[ 1],8
	shrp	X[ 1]=X[ 1],X[ 0],8	}
{ .mib;	shrp	X[ 0]=X[ 0],T1,8
	br.many	.L_first16		};;

.align	32
.L_first16:
{ .mmi;		$LDW	K=[Ktbl],$SZ
		and	T1=F,E
		and	T2=A,B		}
{ .mmi;		//$LDW	X[15]=[input],$SZ	// X[i]=*input++
		andcm	r8=G,E
		and	r9=A,C		};;
{ .mmi;		xor	T1=T1,r8		//T1=((e & f) ^ (~e & g))
		and	r10=B,C
		_rotr	r11=$t1,$Sigma1[0] }	// ROTR(e,14)
{ .mmi;		xor	T2=T2,r9
		mux1	X[15]=X[15],\@rev };;	// eliminated in big-endian
___
$code.=<<___;
{ .mib;		add	T1=T1,H			// T1=Ch(e,f,g)+h
		_rotr	r8=$t1,$Sigma1[1] }	// ROTR(e,18)
{ .mib;		xor	T2=T2,r10		// T2=((a & b) ^ (a & c) ^ (b & c))
		mov	H=G		};;
{ .mib;		xor	r11=r8,r11
		_rotr	r9=$t1,$Sigma1[2] }	// ROTR(e,41)
{ .mib;		mov	G=F
		mov	F=E		};;
{ .mib;		xor	r9=r9,r11		// r9=Sigma1(e)
		_rotr	r10=$t0,$Sigma0[0] }	// ROTR(a,28)
{ .mib;		add	T1=T1,K			// T1=Ch(e,f,g)+h+K512[i]
		mov	E=D		};;
{ .mib;		add	T1=T1,r9		// T1+=Sigma1(e)
		_rotr	r11=$t0,$Sigma0[1] }	// ROTR(a,34)
{ .mib;		mov	D=C
		mov	C=B		};;
{ .mib;		add	T1=T1,X[15]		// T1+=X[i]
		_rotr	r8=$t0,$Sigma0[2] }	// ROTR(a,39)
{ .mib;		xor	r10=r10,r11
		mux2	X[15]=X[15],0x44 };;	// eliminated in 64-bit
{ .mmi;		xor	r10=r8,r10		// r10=Sigma0(a)
		mov	B=A
		add	A=T1,T2		};;
{ .mib;		add	E=E,T1
		add	A=A,r10			// T2=Maj(a,b,c)+Sigma0(a)
	br.ctop.sptk	.L_first16	};;
.L_first16_end:

{ .mii;	mov	ar.lc=$rounds-17
	mov	ar.ec=1			};;

.align	32
.L_rest:
.rotr	X[16]
{ .mib;		$LDW	K=[Ktbl],$SZ
		_rotr	r8=X[15-1],$sigma0[0] }	// ROTR(s0,1)
{ .mib; 	$ADD	X[15]=X[15],X[15-9]	// X[i&0xF]+=X[(i+9)&0xF]
		$SHRU	s0=X[15-1],sgm0	};;	// s0=X[(i+1)&0xF]>>7
{ .mib;		and	T1=F,E
		_rotr	r9=X[15-1],$sigma0[1] }	// ROTR(s0,8)
{ .mib;		andcm	r10=G,E
		$SHRU	s1=X[15-14],sgm1 };;	// s1=X[(i+14)&0xF]>>6
{ .mmi;		xor	T1=T1,r10		// T1=((e & f) ^ (~e & g))
		xor	r9=r8,r9
		_rotr	r10=X[15-14],$sigma1[0] };;// ROTR(s1,19)
{ .mib;		and	T2=A,B
		_rotr	r11=X[15-14],$sigma1[1] }// ROTR(s1,61)
{ .mib;		and	r8=A,C		};;
___
$t0="t0", $t1="t1", $code.=<<___ if ($BITS==32);
// I adhere to mmi; in order to hold Itanium 1 back and avoid a 6-cycle
// pipeline flush in the last bundle. Note that even on Itanium 2 the
// latter stalls for one clock cycle...
{ .mmi;		xor	s0=s0,r9		// s0=sigma0(X[(i+1)&0xF])
		dep.z	$t1=E,32,32	}
{ .mmi;		xor	r10=r11,r10
		zxt4	E=E		};;
{ .mmi;		or	$t1=$t1,E
		xor	s1=s1,r10		// s1=sigma1(X[(i+14)&0xF])
		mux2	$t0=A,0x44	};;	// copy lower half to upper
{ .mmi;		xor	T2=T2,r8
		_rotr	r9=$t1,$Sigma1[0] }	// ROTR(e,14)
{ .mmi;		and	r10=B,C
		add	T1=T1,H			// T1=Ch(e,f,g)+h
		$ADD	X[15]=X[15],s0	};;	// X[i&0xF]+=sigma0(X[(i+1)&0xF])
___
$t0="A", $t1="E", $code.=<<___ if ($BITS==64);
{ .mib;		xor	s0=s0,r9		// s0=sigma0(X[(i+1)&0xF])
		_rotr	r9=$t1,$Sigma1[0] }	// ROTR(e,14)
{ .mib;		xor	r10=r11,r10
		xor	T2=T2,r8	};;
{ .mib;		xor	s1=s1,r10		// s1=sigma1(X[(i+14)&0xF])
		add	T1=T1,H		}
{ .mib;		and	r10=B,C
		$ADD	X[15]=X[15],s0	};;	// X[i&0xF]+=sigma0(X[(i+1)&0xF])
___
$code.=<<___;
{ .mmi;		xor	T2=T2,r10		// T2=((a & b) ^ (a & c) ^ (b & c))
		mov	H=G
		_rotr	r8=$t1,$Sigma1[1] };;	// ROTR(e,18)
{ .mmi;		xor	r11=r8,r9
		$ADD	X[15]=X[15],s1		// X[i&0xF]+=sigma1(X[(i+14)&0xF])
		_rotr	r9=$t1,$Sigma1[2] }	// ROTR(e,41)
{ .mmi;		mov	G=F
		mov	F=E		};;
{ .mib;		xor	r9=r9,r11		// r9=Sigma1(e)
		_rotr	r10=$t0,$Sigma0[0] }	// ROTR(a,28)
{ .mib;		add	T1=T1,K			// T1=Ch(e,f,g)+h+K512[i]
		mov	E=D		};;
{ .mib;		add	T1=T1,r9		// T1+=Sigma1(e)
		_rotr	r11=$t0,$Sigma0[1] }	// ROTR(a,34)
{ .mib;		mov	D=C
		mov	C=B		};;
{ .mmi;		add	T1=T1,X[15]		// T1+=X[i]
		xor	r10=r10,r11
		_rotr	r8=$t0,$Sigma0[2] };;	// ROTR(a,39)
{ .mmi;		xor	r10=r8,r10		// r10=Sigma0(a)
		mov	B=A
		add	A=T1,T2		};;
{ .mib;		add	E=E,T1
		add	A=A,r10			// T2=Maj(a,b,c)+Sigma0(a)
	br.ctop.sptk	.L_rest	};;
.L_rest_end:

{ .mmi;	add	A_=A_,A
	add	B_=B_,B
	add	C_=C_,C			}
{ .mmi;	add	D_=D_,D
	add	E_=E_,E
	cmp.ltu	p16,p0=1,num		};;
{ .mmi;	add	F_=F_,F
	add	G_=G_,G
	add	H_=H_,H			}
{ .mmb;	add	Ktbl=-$SZ*$rounds,Ktbl
(p16)	add	num=-1,num
(p16)	br.dptk.many	.L_outer	};;

{ .mib;	add	r8=0*$SZ,ctx
	add	r9=1*$SZ,ctx		}
{ .mib;	add	r10=2*$SZ,ctx
	add	r11=3*$SZ,ctx		};;
{ .mmi;	$STW	[r8]=A_,4*$SZ
	$STW	[r9]=B_,4*$SZ
	mov	ar.lc=lcsave		}
{ .mmi;	$STW	[r10]=C_,4*$SZ
	$STW	[r11]=D_,4*$SZ
	mov	pr=prsave,0x1ffff	};;
{ .mmb;	$STW	[r8]=E_
	$STW	[r9]=F_			}
{ .mmb;	$STW	[r10]=G_
	$STW	[r11]=H_
	br.ret.sptk.many	b0	};;
.endp	$func#
___

$code =~ s/\`([^\`]*)\`/eval $1/gem;
$code =~ s/_rotr(\s+)([^=]+)=([^,]+),([0-9]+)/shrp$1$2=$3,$3,$4/gm;
if ($BITS==64) {
    $code =~ s/mux2(\s+)\S+/nop.i$1 0x0/gm;
    $code =~ s/mux1(\s+)\S+/nop.i$1 0x0/gm	if ($big_endian);
    $code =~ s/(shrp\s+X\[[^=]+)=([^,]+),([^,]+),([1-9]+)/$1=$3,$2,64-$4/gm
    						if (!$big_endian);
    $code =~ s/ld1(\s+)X\[\S+/nop.m$1 0x0/gm;
}

print $code;

print<<___ if ($BITS==32);
.align	64
.type	K256#,\@object
K256:	data4	0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
	data4	0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
	data4	0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
	data4	0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
	data4	0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
	data4	0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
	data4	0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
	data4	0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
	data4	0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
	data4	0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
	data4	0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
	data4	0xd192e819,0xd6990624,0xf40e3585,0x106aa070
	data4	0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
	data4	0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
	data4	0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
	data4	0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
.size	K256#,$SZ*$rounds
stringz	"SHA256 block transform for IA64, CRYPTOGAMS by <appro\@openssl.org>"
___
# SHA-512 round constants K[0..79] (FIPS 180-4), emitted only for the 64-bit variant.
print<<___ if ($BITS==64);
.align	64
.type	K512#,\@object
K512:	data8	0x428a2f98d728ae22,0x7137449123ef65cd
	data8	0xb5c0fbcfec4d3b2f,0xe9b5dba58189dbbc
	data8	0x3956c25bf348b538,0x59f111f1b605d019
	data8	0x923f82a4af194f9b,0xab1c5ed5da6d8118
	data8	0xd807aa98a3030242,0x12835b0145706fbe
	data8	0x243185be4ee4b28c,0x550c7dc3d5ffb4e2
	data8	0x72be5d74f27b896f,0x80deb1fe3b1696b1
	data8	0x9bdc06a725c71235,0xc19bf174cf692694
	data8	0xe49b69c19ef14ad2,0xefbe4786384f25e3
	data8	0x0fc19dc68b8cd5b5,0x240ca1cc77ac9c65
	data8	0x2de92c6f592b0275,0x4a7484aa6ea6e483
	data8	0x5cb0a9dcbd41fbd4,0x76f988da831153b5
	data8	0x983e5152ee66dfab,0xa831c66d2db43210
	data8	0xb00327c898fb213f,0xbf597fc7beef0ee4
	data8	0xc6e00bf33da88fc2,0xd5a79147930aa725
	data8	0x06ca6351e003826f,0x142929670a0e6e70
	data8	0x27b70a8546d22ffc,0x2e1b21385c26c926
	data8	0x4d2c6dfc5ac42aed,0x53380d139d95b3df
	data8	0x650a73548baf63de,0x766a0abb3c77b2a8
	data8	0x81c2c92e47edaee6,0x92722c851482353b
	data8	0xa2bfe8a14cf10364,0xa81a664bbc423001
	data8	0xc24b8b70d0f89791,0xc76c51a30654be30
	data8	0xd192e819d6ef5218,0xd69906245565a910
	data8	0xf40e35855771202a,0x106aa07032bbd1b8
	data8	0x19a4c116b8d2d0c8,0x1e376c085141ab53
	data8	0x2748774cdf8eeb99,0x34b0bcb5e19b48a8
	data8	0x391c0cb3c5c95a63,0x4ed8aa4ae3418acb
	data8	0x5b9cca4f7763e373,0x682e6ff3d6b2b8a3
	data8	0x748f82ee5defb2fc,0x78a5636f43172f60
	data8	0x84c87814a1f0ab72,0x8cc702081a6439ec
	data8	0x90befffa23631e28,0xa4506cebde82bde9
	data8	0xbef9a3f7b2c67915,0xc67178f2e372532b
	data8	0xca273eceea26619c,0xd186b8c721c0c207
	data8	0xeada7dd6cde0eb1e,0xf57d4f7fee6ed178
	data8	0x06f067aa72176fba,0x0a637dc5a2c898a6
	data8	0x113f9804bef90dae,0x1b710b35131c471b
	data8	0x28db77f523047d84,0x32caab7b40c72493
	data8	0x3c9ebe0a15c9bebc,0x431d67c49c100d4c
	data8	0x4cc5d4becb3e42b6,0x597f299cfc657e2a
	data8	0x5fcb6fab3ad6faec,0x6c44198c4a475817
.size	K512#,$SZ*$rounds
stringz	"SHA512 block transform for IA64, CRYPTOGAMS by <appro\@openssl.org>"
___
673