1<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
2
3<html>
4
5<head>
6<title>Dalvik VM Instruction Formats</title>
<link rel="stylesheet" href="instruction-formats.css" />
8</head>
9
10<body>
11
12<h1>Dalvik VM Instruction Formats</h1>
<p>Copyright &copy; 2007 The Android Open Source Project</p>
14
15<h2>Introduction and Overview</h2>
16
17<p>This document lists the instruction formats used by Dalvik bytecode
18and is meant to be used in conjunction with the
19<a href="dalvik-bytecode.html">bytecode reference document</a>.</p>
20
21<h3>Bitwise descriptions</h3>
22
23<p>The first column in the format table lists the bitwise layout of
24the format. It consists of one or more space-separated "words" each of
25which describes a 16-bit code unit. Each character in a word
26represents four bits, read from high bits to low, with vertical bars
27("<code>|</code>") interspersed to aid in reading. Uppercase letters
28in sequence from "<code>A</code>" are used to indicate fields within
29the format (which then get defined further by the syntax column). The term
30"<code>op</code>" is used to indicate the position of an eight-bit
31opcode within the format, and similarly "<code>exop</code>" is used
32to indicate an extended sixteen-bit opcode. A slashed zero
33("<code>&Oslash;</code>") is used to indicate that all bits must be
34zero in the indicated position.</p>
35
<p>For the most part, lettering proceeds from earlier code units to
later code units, and from low-order to high-order within a code unit.
However, there are a few exceptions to this general rule, made so that
parts with similar meanings have the same names across different
instruction formats. These cases are noted explicitly in the format
descriptions.</p>
42
43<p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates
44that the format consists of two 16-bit code units. The first word
45consists of the opcode in the low eight bits and a pair of four-bit
46values in the high eight bits; and the second word consists of a single
4716-bit value.</p>
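<p>As an illustration of the layout just described, the following is a
minimal sketch of how a decoder might split the two code units of
"<code>B|A|<i>op</i> CCCC</code>" into fields. The function name and the
sample code unit values are hypothetical and chosen only for the
example; they are not taken from any Dalvik implementation.</p>

```python
def decode_b_a_op_cccc(units):
    """Split two 16-bit code units laid out as "B|A|op CCCC" into fields.

    `units` is a sequence of two integers, one per 16-bit code unit.
    This helper is illustrative only (hypothetical name).
    """
    first, second = units
    op = first & 0xFF         # opcode: low eight bits of the first unit
    a = (first >> 8) & 0xF    # A: low nibble of the high eight bits
    b = (first >> 12) & 0xF   # B: high nibble of the high eight bits
    cccc = second & 0xFFFF    # CCCC: the entire second code unit
    return {"op": op, "A": a, "B": b, "CCCC": cccc}


# Arbitrary sample values: first unit 0x2132, second unit 0x0005.
fields = decode_b_a_op_cccc([0x2132, 0x0005])
print(fields)  # {'op': 50, 'A': 1, 'B': 2, 'CCCC': 5}
```

<p>Because each character of a word stands for four bits read from high
to low, "<code>B</code>" occupies the topmost nibble of the first code
unit even though it is named after "<code>A</code>".</p>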
48
49<h3>Format IDs</h3>
50
51<p>The second column in the format table indicates the short identifier
52for the format, which is used in other documents and in code to identify
53the format.</p>
54
55<p>Most format IDs consist of three characters, two digits followed by a
56letter. The first digit indicates the number of 16-bit code units in the
57format. The second digit indicates the maximum number of registers that the
format contains (maximum, since some formats can accommodate a variable
59number of registers), with the special designation "<code>r</code>" indicating
60that a range of registers is encoded. The final letter semi-mnemonically
61indicates the type of any extra data encoded by the format. For example,
62format "<code>21t</code>" is of length two, contains one register reference,
63and additionally contains a branch target.</p>
64
65<p>Suggested static linking formats have an additional
66"<code>s</code>" suffix, making them four characters total. Similarly,
67suggested "inline" linking formats have an additional "<code>i</code>"
68suffix. (In this context, inline linking is like static linking,
69except with more direct ties into a virtual machine's implementation.) 
Finally, a couple of oddball suggested formats (e.g.,
"<code>20bc</code>") include two pieces of data, both of which are
represented in the format ID.</p>
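<p>The naming scheme can be read mechanically. The sketch below splits a
format ID into its three parts; the function name and the choice to
return the trailing letters uninterpreted (so that suffixes such as
"<code>s</code>" and "<code>i</code>" are left to the caller) are
assumptions made for the example.</p>

```python
def parse_format_id(fmt):
    """Split a format ID such as "21t" into (code units, registers, letters).

    Illustrative helper (hypothetical name). The trailing letters are
    returned as-is, including any linking suffix.
    """
    if len(fmt) < 3:
        raise ValueError("format IDs have at least three characters")
    units = int(fmt[0])  # number of 16-bit code units
    # "r" marks a register range; otherwise a maximum register count
    registers = "range" if fmt[1] == "r" else int(fmt[1])
    letters = fmt[2:]    # typecode letter(s), plus any suffix
    return units, registers, letters


print(parse_format_id("21t"))   # (2, 1, 't')
print(parse_format_id("3rc"))   # (3, 'range', 'c')
print(parse_format_id("35ms"))  # (3, 5, 'ms')
```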
73
<p>The full list of typecode letters is as follows. Note that some
forms have different sizes, depending on the format:</p>
76
77<table class="letters">
78<thead>
79<tr>
80  <th>Mnemonic</th>
81  <th>Bit Sizes</th>
82  <th>Meaning</th>
83</tr>
84</thead>
85<tbody>
86<tr>
87  <td>b</td>
88  <td>8</td>
89  <td>immediate signed <b>b</b>yte</td>
90</tr>
91<tr>
92  <td>c</td>
93  <td>16, 32</td>
94  <td><b>c</b>onstant pool index</td>
95</tr>
96<tr>
97  <td>f</td>
98  <td>16</td>
99  <td>inter<b>f</b>ace constants (only used in statically linked formats)
100  </td>
101</tr>
102<tr>
103  <td>h</td>
104  <td>16</td>
105  <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit
106    value; low-order bits are all <code>0</code>)
107  </td>
108</tr>
109<tr>
110  <td>i</td>
111  <td>32</td>
112  <td>immediate signed <b>i</b>nt, or 32-bit float</td>
113</tr>
114<tr>
115  <td>l</td>
116  <td>64</td>
117  <td>immediate signed <b>l</b>ong, or 64-bit double</td>
118</tr>
119<tr>
120  <td>m</td>
121  <td>16</td>
122  <td><b>m</b>ethod constants (only used in statically linked formats)</td>
123</tr>
124<tr>
125  <td>n</td>
126  <td>4</td>
127  <td>immediate signed <b>n</b>ibble</td>
128</tr>
129<tr>
130  <td>s</td>
131  <td>16</td>
132  <td>immediate signed <b>s</b>hort</td>
133</tr>
134<tr>
135  <td>t</td>
136  <td>8, 16, 32</td>
137  <td>branch <b>t</b>arget</td>
138</tr>
139<tr>
140  <td>x</td>
141  <td>0</td>
142  <td>no additional data</td>
143</tr>
144</tbody>
145</table>
146
147<h3>Syntax</h3>
148
149<p>The third column of the format table indicates the human-oriented
150syntax for instructions which use the indicated format. Each instruction
151starts with the named opcode and is optionally followed by one or
152more arguments, themselves separated with commas.</p>
153
154<p>Wherever an argument refers to a field from the first column, the
155letter for that field is indicated in the syntax, repeated once for
156each four bits of the field. For example, an eight-bit field labeled
157"<code>BB</code>" in the first column would also be labeled
158"<code>BB</code>" in the syntax column.</p>
159
<p>Arguments which name a register have the form "<code>v<i>X</i></code>".
The prefix "<code>v</code>" was chosen instead of the more common
"<code>r</code>" to avoid conflicting with the (non-virtual) architectures
on which a Dalvik virtual machine might be implemented, some of which
use the prefix "<code>r</code>" for their own registers. (That is, this
decision makes it possible to talk about both virtual and real registers
together without the need for circumlocution.)</p>
167
168<p>Arguments which indicate a literal value have the form
169"<code>#+<i>X</i></code>". Some formats indicate literals that only
170have non-zero bits in their high-order bits; for these, the zeroes
171are represented explicitly in the syntax, even though they do not
172appear in the bitwise representation.</p>
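<p>For instance, a format such as "<code>21h</code>" encodes only the
high-order 16 bits of a 32- or 64-bit literal. A minimal sketch of the
expansion follows; the function name is hypothetical, and the result is
interpreted as a two's-complement signed value on the assumption that
the "immediate signed" wording in the typecode table applies to the
expanded literal.</p>

```python
def expand_hat(bbbb, width):
    """Expand a 16-bit high-order field into a signed `width`-bit literal.

    The low-order (width - 16) bits of the result are all zero.
    Illustrative helper (hypothetical name).
    """
    if width not in (32, 64):
        raise ValueError("width must be 32 or 64")
    raw = (bbbb & 0xFFFF) << (width - 16)  # shift into the top 16 bits
    if raw >= 1 << (width - 1):            # sign bit set: reinterpret
        raw -= 1 << width                  # as two's complement
    return raw


print(hex(expand_hat(0x3F80, 32)))  # 0x3f800000
print(expand_hat(0x8000, 32))       # -2147483648
print(hex(expand_hat(0x0001, 64)))  # 0x1000000000000
```

<p>This mirrors the syntax column, where the same field is written as
"<code>#+BBBB0000</code>" or "<code>#+BBBB000000000000</code>".</p>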
173
174<p>Arguments which indicate a relative instruction address offset have the
175form "<code>+<i>X</i></code>".</p>
176
177<p>Arguments which indicate a literal constant pool index have the form
178"<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>"
179indicates which constant pool is being referred to. Each opcode that
180uses such a format explicitly allows only one kind of constant; see
181the opcode reference to figure out the correspondence. The four
182kinds of constant pool are "<code>string</code>" (string pool index),
183"<code>type</code>" (type pool index), "<code>field</code>" (field
184pool index), and "<code>meth</code>" (method pool index).</p>
185
186<p>Similar to the representation of constant pool indices, there are
187also suggested (optional) forms that indicate prelinked offsets or
188indices. There are two types of suggested prelinked value: vtable offsets
189(indicated as "<code>vtaboff</code>") and field offsets (indicated as
190"<code>fieldoff</code>").</p>
191
<p>In the cases where a format value isn't explicitly part of the syntax
193but instead picks a variant, each variant is listed with the prefix
194"<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[A=2]</code>") to indicate
195the correspondence.</p>
196
197<h2>The Formats</h2>
198
199<table class="format">
200<thead>
201<tr>
202  <th>Format</th>
203  <th>ID</th>
204  <th>Syntax</th>
205  <th>Notable Opcodes Covered</th>
206</tr>
207</thead>
208<tbody>
209<tr>
210  <td><i>N/A</i></td>
211  <td>00x</td>
212  <td><i><code>N/A</code></i></td>
213  <td><i>pseudo-format used for unused opcodes; suggested for use as the
214    nominal format for a breakpoint opcode</i></td>
215</tr>
216<tr>
217  <td>&Oslash;&Oslash;|<i>op</i></td>
218  <td>10x</td>
219  <td><i><code>op</code></i></td>
220  <td>&nbsp;</td>
221</tr>
222<tr>
223  <td rowspan="2">B|A|<i>op</i></td>
224  <td>12x</td>
225  <td><i><code>op</code></i> vA, vB</td>
226  <td>&nbsp;</td>
227</tr>
228<tr>
229  <td>11n</td>
230  <td><i><code>op</code></i> vA, #+B</td>
231  <td>&nbsp;</td>
232</tr>
233<tr>
234  <td rowspan="2">AA|<i>op</i></td>
235  <td>11x</td>
236  <td><i><code>op</code></i> vAA</td>
237  <td>&nbsp;</td>
238</tr>
239<tr>
240  <td>10t</td>
241  <td><i><code>op</code></i> +AA</td>
242  <td>goto</td>
243</tr>
244<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA</td>
246  <td>20t</td>
247  <td><i><code>op</code></i> +AAAA</td>
248  <td>goto/16</td>
249</tr>
250<tr>
  <td>AA|<i>op</i> BBBB</td>
252  <td>20bc</td>
253  <td><i><code>op</code></i> AA, kind@BBBB</td>
254  <td><i>suggested format for statically determined verification errors;
255    A is the type of error and B is an index into a type-appropriate
256    table (e.g. method references for a no-such-method error)</i></td>
257</tr>
258<tr>
259  <td rowspan="5">AA|<i>op</i> BBBB</td>
260  <td>22x</td>
261  <td><i><code>op</code></i> vAA, vBBBB</td>
262  <td>&nbsp;</td>
263</tr>
264<tr>
265  <td>21t</td>
266  <td><i><code>op</code></i> vAA, +BBBB</td>
267  <td>&nbsp;</td>
268</tr>
269<tr>
270  <td>21s</td>
271  <td><i><code>op</code></i> vAA, #+BBBB</td>
272  <td>&nbsp;</td>
273</tr>
274<tr>
275  <td>21h</td>
276  <td><i><code>op</code></i> vAA, #+BBBB0000<br/>
277    <i><code>op</code></i> vAA, #+BBBB000000000000
278  </td>
279  <td>&nbsp;</td>
280</tr>
281<tr>
282  <td>21c</td>
283  <td><i><code>op</code></i> vAA, type@BBBB<br/>
284    <i><code>op</code></i> vAA, field@BBBB<br/>
285    <i><code>op</code></i> vAA, string@BBBB
286  </td>
287  <td>check-cast<br/>
288    const-class<br/>
289    const-string
290  </td>
291</tr>
292<tr>
293  <td rowspan="2">AA|<i>op</i> CC|BB</td>
294  <td>23x</td>
295  <td><i><code>op</code></i> vAA, vBB, vCC</td>
296  <td>&nbsp;</td>
297</tr>
298<tr>
299  <td>22b</td>
300  <td><i><code>op</code></i> vAA, vBB, #+CC</td>
301  <td>&nbsp;</td>
302</tr>
303<tr>
304  <td rowspan="4">B|A|<i>op</i> CCCC</td>
305  <td>22t</td>
306  <td><i><code>op</code></i> vA, vB, +CCCC</td>
307  <td>&nbsp;</td>
308</tr>
309<tr>
310  <td>22s</td>
311  <td><i><code>op</code></i> vA, vB, #+CCCC</td>
312  <td>&nbsp;</td>
313</tr>
314<tr>
315  <td>22c</td>
316  <td><i><code>op</code></i> vA, vB, type@CCCC<br/>
317    <i><code>op</code></i> vA, vB, field@CCCC
318  </td>
319  <td>instance-of</td>
320</tr>
321<tr>
322  <td>22cs</td>
323  <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td>
324  <td><i>suggested format for statically linked field access instructions of
325    format 22c</i>
326  </td>
327</tr>
328<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td>
330  <td>30t</td>
331  <td><i><code>op</code></i> +AAAAAAAA</td>
332  <td>goto/32</td>
333</tr>
334<tr>
335  <td>&Oslash;&Oslash;|<i>op</i> AAAA BBBB</td>
336  <td>32x</td>
337  <td><i><code>op</code></i> vAAAA, vBBBB</td>
338  <td>&nbsp;</td>
339</tr>
340<tr>
341  <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td>
342  <td>31i</td>
343  <td><i><code>op</code></i> vAA, #+BBBBBBBB</td>
344  <td>&nbsp;</td>
345</tr>
346<tr>
347  <td>31t</td>
348  <td><i><code>op</code></i> vAA, +BBBBBBBB</td>
349  <td>&nbsp;</td>
350</tr>
351<tr>
352  <td>31c</td>
353  <td><i><code>op</code></i> vAA, string@BBBBBBBB</td>
354  <td>const-string/jumbo</td>
355</tr>
356<tr>
357  <td rowspan="3">A|G|<i>op</i> BBBB F|E|D|C</td>
358  <td>35c</td>
359  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
360    meth@BBBB<br/>
361    <i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
362    type@BBBB<br/>
363    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
364    <i><code>kind</code></i>@BBBB<br/>
365    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
366    <i><code>kind</code></i>@BBBB<br/>
367    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
368    <i><code>kind</code></i>@BBBB<br/>
369    <i>[<code>A=1</code>] <code>op</code></i> {vC},
370    <i><code>kind</code></i>@BBBB<br/>
371    <i>[<code>A=0</code>] <code>op</code></i> {},
372    <i><code>kind</code></i>@BBBB<br/>
373    <p><i>The unusual choice in lettering here reflects a desire to make
374    the count and the reference index have the same label as in format
375    3rc.</i></p>
376  </td>
377  <td>&nbsp;</td>
378</tr>
379<tr>
380  <td>35ms</td>
381  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
382    vtaboff@BBBB<br/>
383    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
384    vtaboff@BBBB<br/>
385    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
386    vtaboff@BBBB<br/>
387    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
388    vtaboff@BBBB<br/>
389    <i>[<code>A=1</code>] <code>op</code></i> {vC},
390    vtaboff@BBBB<br/>
391    <p><i>The unusual choice in lettering here reflects a desire to make
392    the count and the reference index have the same label as in format
393    3rms.</i></p>
394  </td>
395  <td><i>suggested format for statically linked <code>invoke-virtual</code>
396    and <code>invoke-super</code> instructions of format 35c</i>
397  </td>
398</tr>
399<tr>
400  <td>35mi</td>
401  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
402    inline@BBBB<br/>
403    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
404    inline@BBBB<br/>
405    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
406    inline@BBBB<br/>
407    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
408    inline@BBBB<br/>
409    <i>[<code>A=1</code>] <code>op</code></i> {vC},
410    inline@BBBB<br/>
411    <p><i>The unusual choice in lettering here reflects a desire to make
412    the count and the reference index have the same label as in format
413    3rmi.</i></p>
414  </td>
415  <td><i>suggested format for inline linked <code>invoke-static</code>
416    and <code>invoke-virtual</code> instructions of format 35c</i>
417  </td>
418</tr>
419<tr>
420  <td rowspan="3">AA|<i>op</i> BBBB CCCC</td>
421  <td>3rc</td>
422  <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/>
423    <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/>
424    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
425    determines the count <code>0..255</code>, and <code>C</code>
426    determines the first register</i></p>
427  </td>
428  <td>&nbsp;</td>
429</tr>
430<tr>
431  <td>3rms</td>
432  <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/>
433    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
434    determines the count <code>0..255</code>, and <code>C</code>
435    determines the first register</i></p>
436  </td>
437  <td><i>suggested format for statically linked <code>invoke-virtual</code>
438    and <code>invoke-super</code> instructions of format <code>3rc</code></i>
439  </td>
440</tr>
441<tr>
442  <td>3rmi</td>
443  <td><i><code>op</code></i> {vCCCC .. vNNNN}, inline@BBBB<br/>
444    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
445    determines the count <code>0..255</code>, and <code>C</code>
446    determines the first register</i></p>
447  </td>
448  <td><i>suggested format for inline linked <code>invoke-static</code>
449    and <code>invoke-virtual</code> instructions of format 3rc</i>
450  </td>
451</tr>
452<tr>
453  <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td>
454  <td>51l</td>
455  <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td>
456  <td>const-wide</td>
457</tr>
458<tr>
459  <td rowspan="2"><i>exop</i> BB|AA CCCC</td>
460  <td>33x</td>
461  <td><i><code>exop</code></i> vAA, vBB, vCCCC</td>
462  <td>&nbsp;</td>
463</tr>
464<tr>
465  <td>32s</td>
466  <td><i><code>exop</code></i> vAA, vBB, #+CCCC</td>
467  <td>&nbsp;</td>
468</tr>
469<tr>
  <td><i>exop</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub> AAAA</td>
471  <td>40sc</td>
472  <td><i><code>exop</code></i> AAAA, kind@BBBBBBBB</td>
473  <td><i>suggested format for statically determined verification errors;
474    see <code>20bc</code>, above</i></td>
475</tr>
476<tr>
  <td><i>exop</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub> AAAA</td>
478  <td>41c</td>
479  <td><i><code>exop</code></i> vAAAA, field@BBBBBBBB<br/>
480    <i><code>exop</code></i> vAAAA, type@BBBBBBBB
481    <p><i>The unusual choice in lettering here reflects a desire to make
482    the letters match their use in related formats 21c and 31c.</i></p>
483  </td>
484  <td>&nbsp;</td>
485</tr>
486<tr>
487  <td><i>exop</i> CCCC<sub>lo</sub> CCCC<sub>hi</sub>
488    AAAA BBBB</td>
489  <td>52c</td>
490  <td><i><code>exop</code></i> vAAAA, vBBBB, field@CCCCCCCC<br/>
491    <i><code>exop</code></i> vAAAA, vBBBB, type@CCCCCCCC
492    <p><i>The unusual choice in lettering here reflects a desire to make
493    the letters match their use in related formats 22c and 22cs.</i></p>
494  </td>
495  <td>&nbsp;</td>
496</tr>
497<tr>
498  <td><i>exop</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub>
499    AAAA CCCC</td>
500  <td>5rc</td>
501  <td><i><code>exop</code></i> {vCCCC .. vNNNN}, meth@BBBBBBBB<br/>
502    <i><code>exop</code></i> {vCCCC .. vNNNN}, type@BBBBBBBB<br/>
503    <p><i>where <code>NNNN = CCCC+AAAA-1</code>, that is <code>A</code>
504    determines the count <code>0..65535</code>, and <code>C</code>
505    determines the first register</i></p>
506    <p><i>The unusual choice in lettering here reflects a desire to make
507    the letters match their use in related formats 3rc, 3rms, and 3rmi.</i></p>
508  </td>
509  <td>&nbsp;</td>
510</tr>
511</tbody>
512</table>
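<p>To make a few of the less obvious rows concrete, the following sketch
decodes the register lists of formats "<code>35c</code>" and
"<code>3rc</code>" and joins a "<code>lo</code>"/"<code>hi</code>" pair
of code units as used by formats such as "<code>31i</code>". All function
names and sample code unit values (including the opcode bytes) are
hypothetical placeholders, not values from the bytecode reference.</p>

```python
def decode_35c(units):
    """Decode "A|G|op BBBB F|E|D|C" into (op, register list, index)."""
    first, bbbb, third = units
    op = first & 0xFF
    g = (first >> 8) & 0xF
    count = (first >> 12) & 0xF  # A selects the variant, [A=0]..[A=5]
    if count > 5:
        raise ValueError("format 35c lists at most five registers")
    c = third & 0xF
    d = (third >> 4) & 0xF
    e = (third >> 8) & 0xF
    f = (third >> 12) & 0xF
    # Registers appear in the order vC, vD, vE, vF, vG; keep the first A.
    return op, [c, d, e, f, g][:count], bbbb


def decode_3rc(units):
    """Decode "AA|op BBBB CCCC" into (op, register list, index)."""
    first, bbbb, cccc = units
    op = first & 0xFF
    count = (first >> 8) & 0xFF  # AA: register count, 0..255
    # Registers run from vCCCC to vNNNN, where NNNN = CCCC + AA - 1.
    return op, list(range(cccc, cccc + count)), bbbb


def join_lo_hi(lo, hi):
    """Combine the low and high 16-bit code units of a 32-bit field."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


print(decode_35c([0x206E, 0x0010, 0x0031]))  # (110, [1, 3], 16)
print(decode_3rc([0x0374, 0x0010, 0x0005]))  # (116, [5, 6, 7], 16)
print(hex(join_lo_hi(0x5678, 0x1234)))       # 0x12345678
```

<p>Note that the low-order code unit comes first in the instruction
stream, so the units must be swapped when assembling the full value.</p>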
513
514</body>
515</html>
516