<?xml version="1.0"?>
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
          "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">

<refentry id="yasm_arch">

 <refentryinfo>
  <title>Yasm Supported Architectures</title>
  <date>October 2006</date>
  <productname>Yasm</productname>
  <author>
   <firstname>Peter</firstname>
   <surname>Johnson</surname>
   <affiliation>
    <address><email>peter@tortall.net</email></address>
   </affiliation>
  </author>

  <copyright>
   <year>2004</year>
   <year>2005</year>
   <year>2006</year>
   <year>2007</year>
   <holder>Peter Johnson</holder>
  </copyright>
 </refentryinfo>

 <refmeta>
  <refentrytitle>yasm_arch</refentrytitle>
  <manvolnum>7</manvolnum>
 </refmeta>

 <refnamediv>
  <refname>yasm_arch</refname>
  <refpurpose>Yasm Supported Target Architectures</refpurpose>
 </refnamediv>

 <refsynopsisdiv>
  <cmdsynopsis>
   <command>yasm</command>
   <arg choice="plain">
    <option>-a <replaceable>arch</replaceable></option>
   </arg>
   <arg choice="opt">
    <option>-m <replaceable>machine</replaceable></option>
   </arg>
   <arg choice="plain">
    <option><replaceable>...</replaceable></option>
   </arg>
  </cmdsynopsis>
 </refsynopsisdiv>

 <refsect1>
  <title>Description</title>

  <para>The standard Yasm distribution includes a number of modules
   for different target architectures.  Each target architecture can
   support one or more machine architectures.</para>

  <para>The architecture and machine are selected on the
   <citerefentry>
    <refentrytitle>yasm</refentrytitle>
    <manvolnum>1</manvolnum>
   </citerefentry>
   command line with the <option>-a
    <replaceable>arch</replaceable></option> and <option>-m
    <replaceable>machine</replaceable></option> options,
   respectively.</para>

  <para>The machine architecture may also be selected automatically by
   certain object formats.  For example, the <quote>elf32</quote>
   object format selects the <quote>x86</quote> machine architecture
   by default, while the <quote>elf64</quote> object format selects
   the <quote>amd64</quote> machine architecture by default.</para>
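
  <para>As a minimal sketch of these options (the file names
   <filename>prog.asm</filename> and <filename>prog.o</filename> are
   hypothetical), the first invocation below selects the architecture
   and machine explicitly, the second relies on the
   <quote>elf64</quote> object format to select the
   <quote>amd64</quote> machine, and the third relies on
   <quote>elf32</quote> to select the <quote>x86</quote>
   machine:</para>

  <screen>yasm -a x86 -m amd64 -f elf64 -o prog.o prog.asm</screen>
  <screen>yasm -f elf64 -o prog.o prog.asm</screen>
  <screen>yasm -f elf32 -o prog.o prog.asm</screen>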
 </refsect1>

 <refsect1>
  <title>x86 Architecture</title>

  <para>The <quote>x86</quote> architecture supports the IA-32
   instruction set and derivatives and the AMD64 instruction set.  It
   consists of two machines: <quote>x86</quote> (for IA-32 and
   derivatives) and <quote>amd64</quote> (for AMD64 and
   derivatives).  The default machine for the <quote>x86</quote>
   architecture is the <quote>x86</quote> machine.</para>

  <refsect2>
   <title>BITS Setting</title>

   <para>The x86 architecture BITS setting tells Yasm the
    processor mode in which the generated code is intended to execute.
    x86 processors can run in three different major execution modes:
    16-bit, 32-bit, and on AMD64-supporting processors, 64-bit.  As
    the x86 instruction set contains portions whose function is
    execution-mode dependent (such as operand-size and address-size
    override prefixes), Yasm cannot assemble x86 instructions
    correctly unless the user tells it in what processor mode the
    code will execute.</para>

   <para>The BITS setting can be changed in a variety of ways.  When
    using the NASM-compatible parser, the BITS setting can be changed
    directly with the <userinput>BITS xx</userinput>
    assembler directive.  The default BITS setting is determined by
    the object format in use.</para>
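
   <para>A minimal sketch of why the setting matters, in NASM syntax:
    the same register-to-register move needs an operand-size override
    prefix (0x66) in one mode but not in the other, so Yasm must know
    the mode to choose the right encoding.</para>

   <screen>bits 16</screen>
   <screen>mov ax, bx    ; no prefix: 16-bit is the default operand size</screen>
   <screen>mov eax, ebx  ; needs the 0x66 operand-size prefix</screen>
   <screen>bits 32</screen>
   <screen>mov ax, bx    ; needs the 0x66 operand-size prefix</screen>
   <screen>mov eax, ebx  ; no prefix: 32-bit is the default operand size</screen>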
  </refsect2>

  <refsect2>
   <title>BITS 64 Extensions</title>

   <para>The AMD64 architecture is a 64-bit architecture developed
    by AMD, based on the 32-bit x86 architecture. It extends the
    original x86 architecture by doubling the number of general
    purpose and SIMD registers and by extending the arithmetic
    operations and address space to 64 bits, among other
    features.</para>

   <para>Intel later introduced an essentially identical
    version of AMD64, originally called EM64T and now known as
    Intel 64.</para>

   <para>When an AMD64-supporting processor is executing in 64-bit
    mode, a number of additional extensions are available, including
    extra general purpose registers, extra SSE2 registers, and
    RIP-relative addressing.</para>

   <para>Yasm extends the base NASM syntax to support AMD64 as
    follows.  To enable assembly of instructions for the 64-bit mode
    of AMD64 processors, use the directive <userinput>BITS
     64</userinput>. As with NASM's BITS directive, this does not
    change the format of the output object file to 64 bits; it only
    changes the assembler mode to assume that the instructions being
    assembled will be run in 64-bit mode.  To specify an AMD64 object
    file, use <option>-m amd64</option> on the Yasm command line, or
    explicitly target a 64-bit object format such as <option>-f
     win64</option> or <option>-f elf64</option>.</para>
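
   <para>As an illustrative sketch (the file name
    <filename>func.asm</filename> and the label
    <userinput>func</userinput> are hypothetical), a source file can
    state the assembler mode itself while the command line, shown here
    in a comment, selects the 64-bit object format:</para>

   <screen>; assemble with: yasm -f elf64 -o func.o func.asm</screen>
   <screen>bits 64</screen>
   <screen>section .text</screen>
   <screen>global func</screen>
   <screen>func:</screen>
   <screen>    mov rax, r8  ; 64-bit registers are accepted in BITS 64</screen>
   <screen>    ret</screen>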

   <refsect3>
    <title>Register Changes</title>

    <para>The additional 64-bit general purpose registers are named
     r8-r15.  There are also 8-bit (rXb), 16-bit (rXw), and 32-bit
     (rXd) subregisters that map to the least significant 8, 16, or 32
     bits of the 64-bit register.  The original 8 general purpose
     registers have also been extended to 64 bits: eax, edx, ecx, ebx,
     esi, edi, esp, and ebp have new 64-bit versions called rax, rdx,
     rcx, rbx, rsi, rdi, rsp, and rbp respectively.  The old 32-bit
     registers map to the least significant 32 bits of the new 64-bit
     registers.</para>

    <para>New 8-bit registers are also available that map to the 8
     least significant bits of rsi, rdi, rsp, and rbp.  These are
     called sil, dil, spl, and bpl respectively.  Unfortunately, due
     to the way instructions are encoded, these new 8-bit registers
     are encoded the same as the old 8-bit registers ah, dh, ch, and
     bh.  The processor distinguishes the two sets by the presence of
     the new REX prefix, which is also used to specify the other
     extended registers.  This means it is illegal to use ah, dh,
     ch, or bh in an instruction that requires the REX prefix for
     other reasons.  For instance:</para>

    <screen>add ah, [r10]</screen>

    <para>(NASM syntax) is not a legal instruction because the use of
     r10 requires a REX prefix, making it impossible to use ah.</para>

    <para>In 64-bit mode, an additional 8 SSE2 registers are also
     available.  These are named xmm8-xmm15.</para>
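
    <para>The register names above can be sketched as follows (NASM
     syntax, 64-bit mode).  The immediate values are arbitrary and
     chosen only for illustration:</para>

    <screen>mov r8b, 1          ; low 8 bits of r8</screen>
    <screen>mov r9w, 2          ; low 16 bits of r9</screen>
    <screen>mov r10d, 3         ; low 32 bits of r10</screen>
    <screen>mov r11, 4          ; full 64-bit register</screen>
    <screen>mov sil, al         ; new 8-bit register (requires a REX prefix)</screen>
    <screen>mov ah, bl          ; legal: no REX prefix is needed</screen>
    <screen>mov ah, sil         ; illegal: ah cannot be combined with a REX prefix</screen>
    <screen>movaps xmm8, xmm15  ; additional SSE registers</screen>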
   </refsect3>

   <refsect3>
    <title>64 Bit Instructions</title>

    <para>By default, most operations in 64-bit mode remain 32-bit;
     operations that are 64-bit usually require a REX prefix (one bit
     in the REX prefix determines whether an operation is 64-bit or
     32-bit).  Thus, essentially all 32-bit instructions have a 64-bit
     version, and the 64-bit versions of instructions can use extended
     registers <quote>for free</quote> (as the REX prefix is already
     present).  Examples in NASM syntax:</para>

    <screen>mov eax, 1  ; 32-bit instruction</screen>
    <screen>mov rcx, 1  ; 64-bit instruction</screen>

    <para>Instructions that modify the stack (push, pop, call, ret,
     enter, and leave) are implicitly 64-bit.  Their 32-bit
     counterparts are not available, but their 16-bit counterparts
     are.  Examples in NASM syntax:</para>

    <screen>push eax  ; illegal instruction</screen>
    <screen>push rbx  ; 1-byte instruction</screen>
    <screen>push r11  ; 2-byte instruction with REX prefix</screen>
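
    <para>A small sketch of the size cost of the REX prefix, assuming
     the usual two-byte register-to-register encoding of add.  It also
     shows that a 32-bit operation on an extended register still needs
     the prefix, and that a 16-bit push remains available:</para>

    <screen>add eax, ebx  ; 2 bytes, no REX prefix</screen>
    <screen>add rax, rbx  ; 3 bytes, REX prefix selects 64-bit operand size</screen>
    <screen>add r8, r9    ; 3 bytes, the same REX prefix also selects r8 and r9</screen>
    <screen>add r8d, r9d  ; 3 bytes, 32-bit operation but REX is needed for r8d and r9d</screen>
    <screen>push ax       ; legal: 16-bit push</screen>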
   </refsect3>

   <refsect3>
    <title>Implicit Zero Extension</title>

    <para>Results of 32-bit operations are implicitly zero-extended to
     the upper 32 bits of the corresponding 64-bit register.  16-bit
     and 8-bit operations, on the other hand, do not affect upper bits
     of the register (just as in 32-bit and 16-bit modes).  This can be
     used to generate smaller code in some instances.  Examples in
     NASM syntax:</para>

    <screen>mov ecx, 1  ; 1 byte shorter than mov rcx, 1</screen>
    <screen>and edx, 3  ; equivalent to and rdx, 3</screen>
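
    <para>As a worked sketch of the rule, the comments below give the
     full 64-bit register contents after each instruction, starting
     from a register with all bits set:</para>

    <screen>mov rax, -1   ; rax = 0xffffffffffffffff</screen>
    <screen>mov eax, 2    ; rax = 0x0000000000000002 (upper 32 bits cleared)</screen>
    <screen>mov rbx, -1   ; rbx = 0xffffffffffffffff</screen>
    <screen>mov bx, 2     ; rbx = 0xffffffffffff0002 (upper 48 bits preserved)</screen>
    <screen>mov rcx, -1   ; rcx = 0xffffffffffffffff</screen>
    <screen>mov cl, 2     ; rcx = 0xffffffffffffff02 (upper 56 bits preserved)</screen>
    <screen>xor edx, edx  ; clears all 64 bits of rdx</screen>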
   </refsect3>

   <refsect3>
    <title>Immediates</title>

    <para>For most instructions in 64-bit mode, immediate values
     remain 32 bits; their values are sign-extended to 64 bits
     before being used.  The exception
     is the mov instruction, which can take a 64-bit immediate when
     the destination is a 64-bit register.  Examples in NASM
     syntax:</para>

    <screen>add rax, 1           ; optimized down to signed 8-bit</screen>
    <screen>add rax, dword 1     ; force size to 32-bit</screen>
    <screen>add rax, 0xffffffff  ; sign-extended 32-bit</screen>
    <screen>add rax, -1          ; same as above</screen>
    <screen>add rax, 0xffffffffffffffff ; truncated to 32-bit (warning)</screen>
    <screen>mov eax, 1           ; 5 byte</screen>
    <screen>mov rax, 1           ; 5 byte (optimized to signed 32-bit)</screen>
    <screen>mov rax, qword 1     ; 10 byte (forced 64-bit)</screen>
    <screen>mov rbx, 0x1234567890abcdef ; 10 byte</screen>
    <screen>mov rcx, 0xffffffff  ; 10 byte (does not fit in signed 32-bit)</screen>
    <screen>mov ecx, -1          ; 5 byte, equivalent to above</screen>
    <screen>mov rcx, sym         ; 5 byte, 32-bit size default for symbols</screen>
    <screen>mov rcx, qword sym   ; 10 byte, override default size</screen>

    <para>Yasm and NASM 2.x handle mov reg64 with an unsized immediate
     differently; Yasm follows the behavior shown above, while
     NASM 2.x does the following:</para>

    <screen>add rax, 0xffffffff  ; sign-extended 32-bit immediate</screen>
    <screen>add rax, -1          ; same as above</screen>
    <screen>add rax, 0xffffffffffffffff ; truncated 32-bit (warning)</screen>
    <screen>add rax, sym         ; sign-extended 32-bit immediate</screen>
    <screen>mov eax, 1           ; 5 byte (32-bit immediate)</screen>
    <screen>mov rax, 1           ; 10 byte (64-bit immediate)</screen>
    <screen>mov rbx, 0x1234567890abcdef ; 10 byte instruction</screen>
    <screen>mov rcx, 0xffffffff  ; 10 byte instruction</screen>
    <screen>mov ecx, -1          ; 5 byte, equivalent to above</screen>
    <screen>mov ecx, sym         ; 5 byte (32-bit immediate)</screen>
    <screen>mov rcx, sym         ; 10 byte instruction</screen>
    <screen>mov rcx, qword sym   ; 10 byte (64-bit immediate)</screen>
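
    <para>Going by the two listings above, source that must assemble
     to the same size under both assemblers can avoid unsized
     immediates with 64-bit destinations, either by naming a 32-bit
     destination or by giving an explicit size.  A sketch, where
     <userinput>sym</userinput> stands for any symbol:</para>

    <screen>mov eax, 1                  ; 5 byte in both listings</screen>
    <screen>mov ecx, sym                ; 5 byte (32-bit immediate)</screen>
    <screen>mov rcx, qword sym          ; 10 byte in both listings</screen>
    <screen>mov rbx, 0x1234567890abcdef ; 10 byte in both listings</screen>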
   </refsect3>

   <refsect3>
    <title>Displacements</title>

    <para>Just like immediates, displacements, for the most part,
     remain 32 bits and are sign-extended prior to use.  Again, the
     exception is one restricted form of the mov instruction: between
     the al/ax/eax/rax register and a 64-bit absolute address (no
     registers allowed in the effective address).  In NASM syntax, use
     of the 64-bit absolute form requires
     <userinput>[qword]</userinput>.  Examples in NASM syntax:</para>

    <screen>mov eax, [1]    ; 32 bit, with sign extension</screen>
    <screen>mov al, [rax-1] ; 32 bit, with sign extension</screen>
    <screen>mov al, [qword 0x1122334455667788] ; 64-bit absolute</screen>
    <screen>mov al, [0x1122334455667788] ; truncated to 32-bit (warning)</screen>
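
    <para>A short sketch of working within this restriction: the
     64-bit absolute form works in both directions for the
     accumulator, while any other register can reach the same
     location by first loading the address into a register.  The
     address is an arbitrary example value:</para>

    <screen>mov rax, [qword 0x1122334455667788] ; 64-bit absolute load</screen>
    <screen>mov [qword 0x1122334455667788], eax ; 64-bit absolute store</screen>
    <screen>mov rbx, 0x1122334455667788         ; for other registers, load the address</screen>
    <screen>mov cl, [rbx]                       ; then use a register-based reference</screen>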
   </refsect3>

   <refsect3>
    <title>RIP Relative Addressing</title>

    <para>In 64-bit mode, a new form of effective addressing is
     available to make it easier to write position-independent code.
     Any memory reference may be made RIP relative (RIP is the
     instruction pointer register, which contains the address of the
     location immediately following the current instruction).</para>

    <para>In NASM syntax, there are two ways to specify RIP-relative
     addressing:</para>

    <screen>mov dword [rip+10], 1</screen>

    <para>stores the value 1 ten bytes after the end of the
     instruction.  <userinput>10</userinput> can also be a symbolic
     constant, and will be treated the same way.  On the other
     hand,</para>

    <screen>mov dword [symb wrt rip], 1</screen>

    <para>stores the value 1 at the address of symbol
     <userinput>symb</userinput>.  This is distinctly different from
     the behavior of:</para>

    <screen>mov dword [symb+rip], 1</screen>

    <para>which takes the address of the end of the instruction, adds
     the address of <userinput>symb</userinput> to it, then stores the
     value 1 there.  If <userinput>symb</userinput> is a variable,
     this will <emphasis>not</emphasis> store the value 1 into the
     <userinput>symb</userinput> variable!</para>

    <para>Yasm also supports the following syntax for RIP-relative
     addressing:</para>

    <screen>mov [rel sym], rax  ; RIP-relative</screen>
    <screen>mov [abs sym], rax  ; not RIP-relative</screen>

    <para>The behavior of:</para>

    <screen>mov [sym], rax</screen>

    <para>depends on a mode set by the DEFAULT directive, as follows.
     The default mode is always <quote>abs</quote>.  In
     <quote>rel</quote> mode, use of registers, an fs or gs segment
     override, or an explicit <quote>abs</quote> override results in
     a non-RIP-relative effective address.</para>

    <screen>default rel</screen>
    <screen>mov [sym], rbx      ; RIP-relative</screen>
    <screen>mov [abs sym], rbx  ; not RIP-relative (explicit override)</screen>
    <screen>mov [rbx+1], rbx    ; not RIP-relative (register use)</screen>
    <screen>mov [fs:sym], rbx   ; not RIP-relative (fs or gs use)</screen>
    <screen>mov [ds:sym], rbx   ; RIP-relative (segment, but not fs or gs)</screen>
    <screen>mov [rel sym], rbx  ; RIP-relative (redundant override)</screen>

    <screen>default abs</screen>
    <screen>mov [sym], rbx      ; not RIP-relative</screen>
    <screen>mov [abs sym], rbx  ; not RIP-relative</screen>
    <screen>mov [rbx+1], rbx    ; not RIP-relative</screen>
    <screen>mov [fs:sym], rbx   ; not RIP-relative</screen>
    <screen>mov [ds:sym], rbx   ; not RIP-relative</screen>
    <screen>mov [rel sym], rbx  ; RIP-relative (explicit override)</screen>
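
    <para>Putting this together, a minimal sketch of
     position-independent access to a data item; the label
     <userinput>counter</userinput> and the section layout are
     hypothetical:</para>

    <screen>default rel</screen>
    <screen>section .data</screen>
    <screen>counter: dq 0</screen>
    <screen>section .text</screen>
    <screen>inc qword [counter]  ; RIP-relative because of default rel</screen>
    <screen>lea rsi, [counter]   ; RIP-relative address calculation</screen>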
   </refsect3>

   <refsect3>
    <title>Memory references</title>

    <para>Usually the size of a memory reference can be deduced from
     the registers involved; for example, <quote>mov [rax],ecx</quote>
     is a 32-bit move, because ecx is 32 bits.  Yasm currently gives
     the non-obvious <quote>invalid combination of opcode and
     operands</quote> error if it cannot determine how much memory is
     being moved.  The fix in this case is to add a memory size
     specifier: qword, dword, word, or byte.</para>

    <para>Here's a 64-bit memory move, which sets 8 bytes starting at
     rax:</para>

    <screen>mov qword [rax], 1</screen>

    <para>Here's a 32-bit memory move, which sets 4 bytes:</para>

    <screen>mov dword [rax], 1</screen>

    <para>Here's a 16-bit memory move, which sets 2 bytes:</para>

    <screen>mov word [rax], 1</screen>

    <para>Here's an 8-bit memory move, which sets 1 byte:</para>

    <screen>mov byte [rax], 1</screen>
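
    <para>As a sketch of the failing and working forms side by side
     (the exact diagnostic may vary by version), a size specifier is
     needed only when no register operand fixes the size:</para>

    <screen>mov [rax], 1        ; error: size cannot be deduced</screen>
    <screen>mov dword [rax], 1  ; ok: sets 4 bytes</screen>
    <screen>mov [rax], cl       ; ok: 8-bit move, size deduced from cl</screen>
    <screen>inc word [rbx]      ; ok: single-operand instructions need a size too</screen>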
   </refsect3>
  </refsect2>
 </refsect1>

 <refsect1>
  <title>lc3b Architecture</title>

  <para>The <quote>lc3b</quote> architecture supports the LC-3b ISA as
   used in the ECE 312 (now ECE 411) course at the University of
   Illinois at Urbana-Champaign, as well as other university courses.
   See <ulink url="http://courses.ece.uiuc.edu/ece411/"/> for more
   details and example code.  The <quote>lc3b</quote> architecture
   consists of only one machine: <quote>lc3b</quote>.</para>
 </refsect1>

 <refsect1>
  <title>See Also</title>

  <para><citerefentry>
   <refentrytitle>yasm</refentrytitle>
   <manvolnum>1</manvolnum>
  </citerefentry></para>
 </refsect1>

 <refsect1>
  <title>Bugs</title>

  <para>When using the <quote>x86</quote> architecture, it is all too
   easy to generate AMD64 code (using the <userinput>BITS
    64</userinput> directive) and yet produce a 32-bit object file (by
   neither specifying <option>-m amd64</option> on the command line
   nor selecting a 64-bit object format).  Similarly, specifying
   <option>-m amd64</option> does not default the BITS setting to
   64.  An easy way to avoid both problems is to directly specify
   a 64-bit object format such as <option>-f elf64</option>.</para>
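
  <para>A hedged sketch of the pitfall and the workaround, assuming a
   hypothetical source file <filename>prog.asm</filename> that
   contains <userinput>BITS 64</userinput>: the first invocation
   places 64-bit code in a 32-bit object file, while the second
   produces a 64-bit object file.</para>

  <screen>yasm -f elf32 -o prog.o prog.asm</screen>
  <screen>yasm -f elf64 -o prog.o prog.asm</screen>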
 </refsect1>
</refentry>