README.txt

Dalvik "mterp" README

NOTE: Find rebuilding instructions at the bottom of this file.


==== Overview ====

This is the source code for the Dalvik interpreter.  The core of the
original version was implemented as a single C function, but to improve
performance we rewrote it in assembly.  To make this and future assembly
ports easier and less error-prone, we used a modular approach that allows
development of platform-specific code one opcode at a time.

The original all-in-one-function C version still exists as the "portable"
interpreter, and is generated using the same sources and tools that
generate the platform-specific versions.

Every configuration has a "config-*" file that controls how the sources
are generated.  The sources are written into the "out" directory, where
they are picked up by the Android build system.

The best way to become familiar with the interpreter is to look at the
generated files in the "out" directory, such as out/InterpC-portstd.c,
rather than trying to look at the various component pieces in (say)
armv5te.


==== Platform-specific source generation ====

The architecture-specific config files determine what goes into two
generated output files (InterpC-<arch>.c, InterpAsm-<arch>.S).  The goal is
to make it easy to swap C and assembly sources during initial development
and testing, and to provide a way to use architecture-specific versions of
some operations (e.g. making use of PLD instructions on ARMv6 or avoiding
CLZ on ARMv4T).

Depending on architecture, instruction-to-instruction transitions may
be done as either computed goto or jump table.  In the computed goto
variant, each instruction handler is allocated a fixed-size area (e.g. 64
bytes).  "Overflow" code is tacked on to the end.  In the jump table variant,
all of the instruction handlers are contiguous and may be of any size.
The interpreter style is selected via the "handler-style" command (see below).

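As a rough illustration of the two transition styles, the Python sketch
below models how a handler address could be found from an opcode.  The base
address, the handler lengths and the function names are made up for the
example; the real interpreters do this arithmetic in assembly.

```python
# Sketch only: models the address arithmetic, not the real assembly.
HANDLER_SIZE = 64          # bytes per handler slot (the "e.g. 64 bytes" above)


def computed_goto_target(table_base, opcode, handler_size=HANDLER_SIZE):
    """Computed goto: every handler starts at a fixed offset from the base."""
    return table_base + opcode * handler_size


def build_jump_table(table_base, handler_lengths):
    """Lay out contiguous, variable-sized handlers and record each start."""
    table, addr = [], table_base
    for length in handler_lengths:
        table.append(addr)
        addr += length
    return table


def jump_table_target(jump_table, opcode):
    """Jump table: handlers may be any size, so the address is looked up."""
    return jump_table[opcode]


base = 0x1000                                      # hypothetical load address
print(hex(computed_goto_target(base, 3)))          # 0x10c0
table = build_jump_table(base, [12, 40, 8, 100])   # hypothetical lengths
print(hex(jump_table_target(table, 3)))            # 0x103c
```
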
When a C implementation for an instruction is desired, the assembly
version packs all local state into the Thread structure and passes
that to the C function.  Updates to the state are pulled out of
"Thread" on return.

The "arch" value should indicate an architecture family with common
programming characteristics, so "armv5te" would work for all ARMv5TE CPUs,
but might not be backward- or forward-compatible.  (We *might* want to
specify the ABI model as well, e.g. "armv5te-eabi", but currently that adds
verbosity without value.)


==== Config file format ====

The config files are parsed from top to bottom.  Each line in the file
may be blank, hold a comment (line starts with '#'), or be a command.

The commands are:

  handler-style <computed-goto|jump-table|all-c>

    Specify which style of interpreter to generate.  In computed-goto,
    each handler is allocated a fixed region, allowing transitions to
    be done via table-start-address + (opcode * handler-size). With
    jump-table style, handlers may be of any length, and the generated
    table is an array of pointers to the handlers. The "all-c" style is
    for the portable interpreter (which is implemented completely in C).
    [Note: all-c is distinct from an "allstubs" configuration.  In both
    configurations, all handlers are the C versions, but the allstubs
    configuration uses the assembly outer loop and assembly stubs to
    transition to the handlers].  This command is required, and must be
    the first command in the config file.

  handler-size <bytes>

    Specify the size of the fixed region, in bytes.  On most platforms
    this will need to be a power of 2.  For jump-table and all-c
    implementations, this command is ignored.

  import <filename>

    The specified file is included immediately, in its entirety.  No
    substitutions are performed.  ".cpp" and ".h" files are copied to the
    C output, ".S" files are copied to the asm output.

  asm-stub <filename>

    The named file will be included whenever an assembly "stub" is needed
    to transfer control to a handler written in C.  Text substitution is
    performed on the opcode name.  This command is not applicable to
    "all-c" configurations.

  asm-alt-stub <filename>

    When present, this command will cause the generation of an alternate
    set of entry points (for computed-goto interpreters) or an alternate
    jump table (for jump-table interpreters).

  op-start <directory>

    Indicates the start of the opcode list.  Must precede any "op"
    commands.  The specified directory is the default location to pull
    instruction files from.

  op <opcode> <directory>

    Can only appear after "op-start" and before "op-end".  Overrides the
    default source file location of the specified opcode.  The opcode
    definition will come from the specified file, e.g. "op OP_NOP armv5te"
    will load from "armv5te/OP_NOP.S".  A substitution dictionary will be
    applied (see below).

  alt <opcode> <directory>

    Can only appear after "op-start" and before "op-end".  Similar to the
    "op" command above, but denotes a source file to override the entry
    in the alternate handler table.  The opcode definition will come from
    the specified file, e.g. "alt OP_NOP armv5te" will load from
    "armv5te/ALT_OP_NOP.S".  A substitution dictionary will be applied
    (see below).

  op-end

    Indicates the end of the opcode list.  All kNumPackedOpcodes
    opcodes are emitted when this is seen, followed by any code that
    didn't fit inside the fixed-size instruction handler space.

The order of "op" and "alt" directives is not significant; the generation
tool will extract ordering info from the VM sources.

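To make the line format concrete, here is a small Python sketch that reads
a made-up config and applies two of the rules above (handler-style comes
first, "op" lives between "op-start" and "op-end").  It is not the real
gen-mterp.py; the sample config and the function names are assumptions
chosen for illustration.

```python
# Sketch only: a tiny reader for the line format described above.
# It is not the real generator and performs only a few of its checks.
SAMPLE_CONFIG = """\
# hypothetical config for a new port
handler-style computed-goto
handler-size 64
op-start c
    op OP_NOP armv5te
op-end
"""


def parse_config(text):
    """Return a list of (command, args) tuples, skipping blanks and comments."""
    commands = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        if not commands and name != "handler-style":
            raise ValueError("handler-style must be the first command")
        commands.append((name, args))
    return commands


def opcode_sources(commands):
    """Return (default directory, {opcode: override directory})."""
    default_dir, overrides, in_ops = None, {}, False
    for name, args in commands:
        if name == "op-start":
            default_dir, in_ops = args[0], True
        elif name == "op":
            if not in_ops:
                raise ValueError('"op" must appear between op-start and op-end')
            overrides[args[0]] = args[1]
        elif name == "op-end":
            in_ops = False
    return default_dir, overrides


cmds = parse_config(SAMPLE_CONFIG)
print(opcode_sources(cmds))   # ('c', {'OP_NOP': 'armv5te'})
```
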
Typically the form in which most opcodes currently exist is used in
the "op-start" directive.  For a new port you would start with "c",
and add architecture-specific "op" entries as you write instructions.
When complete it will default to the target architecture, and you insert
"c" ops to stub out platform-specific code.

For the <directory> specified in the "op" command, the "c" directory
is special in two ways: (1) the sources are assumed to be C code, and
will be inserted into the generated C file; (2) when a C implementation
is emitted, a "glue stub" is emitted in the assembly source file.
(The generator script always emits kNumPackedOpcodes assembly
instructions, unless "asm-stub" was left blank, in which case it only
emits some labels.)


==== Instruction file format ====

The assembly instruction files are simply fragments of assembly sources.
The starting label will be provided by the generation tool, as will
declarations for the segment type and alignment.  The expected target
assembler is GNU "as", but others will work (may require fiddling with
some of the pseudo-ops emitted by the generation tool).

The C files do a bunch of fancy things with macros in an attempt to share
code with the portable interpreter.  (This is expected to be reduced in
the future.)

A substitution dictionary is applied to all opcode fragments as they are
appended to the output.  Substitutions can look like "$value" or "${value}".

The dictionary always includes:

  $opcode - opcode name, e.g. "OP_NOP"
  $opnum - opcode number, e.g. 0 for OP_NOP
  $handler_size_bytes - max size of an instruction handler, in bytes
  $handler_size_bits - max size of an instruction handler, log 2

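The substitution step can be pictured with Python's string.Template, which
accepts the same "$value" and "${value}" forms.  This is only a sketch: the
fragment text and the 64-byte handler size are invented for the example.

```python
# Sketch only: string.Template is used here because it accepts the same
# "$value" and "${value}" forms; the sample values are made up.
from string import Template

handler_size_bytes = 64
subst = {
    "opcode": "OP_NOP",
    "opnum": 0,
    "handler_size_bytes": handler_size_bytes,
    # log 2 of the byte size: 64 -> 6
    "handler_size_bits": handler_size_bytes.bit_length() - 1,
}

fragment = "${opcode}_start:  /* opcode $opnum, slot is 1 << $handler_size_bits bytes */"
print(Template(fragment).substitute(subst))
# OP_NOP_start:  /* opcode 0, slot is 1 << 6 bytes */
```

The braces in "${opcode}_start" matter: without them the name would be read
as "opcode_start", which is not in the dictionary.
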
Both C and assembly sources will be passed through the C pre-processor,
so you can take advantage of C-style comments and preprocessor directives
like "#define".

Some generator operations are available.

  %include "filename" [subst-dict]

    Includes the file, which should look like "armv5te/OP_NOP.S".  You can
    specify values for the substitution dictionary, using standard Python
    syntax.  For example, this:
      %include "armv5te/unop.S" {"result":"r1"}
    would insert "armv5te/unop.S" at the current file position, replacing
    occurrences of "$result" with "r1".

  %default <subst-dict>

    Specify default substitution dictionary values, using standard Python
    syntax.  Useful if you want to have a "base" version and variants.

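One way to picture how %default and %include interact is that the values
given on the %include line are layered over the defaults.  The sketch below
assumes that behaviour; the helper name, the default values and the use of
ast.literal_eval are illustrative choices, not a description of the
generator script.

```python
# Sketch only: layers a per-include dictionary over defaults.  The helper
# name, the defaults and ast.literal_eval are assumptions for illustration.
import ast
from string import Template


def parse_include(line):
    """Split '%include "file" {dict}' into (filename, dict)."""
    rest = line[len("%include"):].strip()
    end = rest.index('"', 1)              # closing quote of the filename
    filename = rest[1:end]
    dict_text = rest[end + 1:].strip()
    overrides = ast.literal_eval(dict_text) if dict_text else {}
    return filename, overrides


# As if the included file began with: %default {"result":"r0", "instr":"mov"}
defaults = {"result": "r0", "instr": "mov"}

filename, overrides = parse_include('%include "armv5te/unop.S" {"result":"r1"}')
merged = {**defaults, **overrides}        # include-line values win
print(filename)                           # armv5te/unop.S
print(Template("$instr $result, r0").substitute(merged))   # mov r1, r0
```
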
  %break

    Identifies the split between the main portion of the instruction
    handler (which must fit in "handler-size" bytes) and the "sister"
    code, which is appended to the end of the instruction handler block.
    In jump table implementations, %break is ignored.

  %verify "message"

    Leave a note to yourself about what needs to be tested.  (This may
    turn into something more interesting someday; for now, it just gets
    stripped out before the output is generated.)

The generation tool does *not* print a warning if your instructions
exceed "handler-size", but the VM will abort on startup if it detects an
oversized handler.  On architectures with fixed-width instructions this
is easy to work with; on others you will need to count bytes.


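For fixed-width instruction sets the size check is simple arithmetic.  The
sketch below assumes 4-byte instructions (as in ARM mode) and a 64-byte
handler-size; both numbers and the function name are only examples.

```python
# Sketch only: assumes every instruction is 4 bytes wide (as in ARM mode)
# and a hypothetical 64-byte handler-size.
INSN_BYTES = 4
HANDLER_SIZE = 64


def fits_in_handler(num_instructions, handler_size=HANDLER_SIZE):
    """True if a handler of this many instructions fits in its fixed slot."""
    return num_instructions * INSN_BYTES <= handler_size


print(HANDLER_SIZE // INSN_BYTES)   # 16 instructions at most
print(fits_in_handler(16))          # True
print(fits_in_handler(17))          # False
```
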
==== Using C constants from assembly sources ====

The file "common/asm-constants.h" has some definitions for constant
values, structure sizes, and struct member offsets.  The format is fairly
restricted, as simple macros are used to massage it for use with both C
(where it is verified) and assembly (where the definitions are used).

If a constant in the file becomes out of sync, the VM will log an error
message and abort during startup.


==== Development tips ====

If you need to debug the initial piece of an opcode handler, and your
debug code expands it beyond the handler size limit, you can insert a
generic header at the top:

    b       ${opcode}_start
%break
${opcode}_start:

If you already have a %break, it's okay to leave it in place -- the second
%break is ignored.


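The effect of the header shown above can be sketched in Python: everything
after the first %break becomes sister code, and a later %break is simply
dropped.  The function and the sample fragment are invented for the
example and are not taken from the generator script.

```python
# Sketch only: how a fragment might be split at the first %break, with any
# later %break lines dropped.  Not taken from the generator script.
def split_at_break(lines, computed_goto=True):
    """Return (main, sister) line lists for one opcode fragment."""
    main, sister, in_sister = [], [], False
    for line in lines:
        if line.startswith("%break"):
            # Only the first %break matters, and only for computed-goto.
            if computed_goto:
                in_sister = True
            continue
        (sister if in_sister else main).append(line)
    return main, sister


fragment = [
    "    b       OP_NOP_start",
    "%break",
    "OP_NOP_start:",
    "    @ debug code and the original handler body go here",
    "%break",
    "    @ original sister code",
]
main, sister = split_at_break(fragment)
print(main)          # ['    b       OP_NOP_start']
print(len(sister))   # 3
```

With the header in place, only the single branch instruction has to fit in
the fixed-size slot.
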
==== Rebuilding ====

If you change any of the source file fragments, you need to rebuild the
combined source files in the "out" directory.  Make sure the files in
"out" are editable, then:

    $ cd mterp
    $ ./rebuild.sh

As of this writing, this requires Python 2.5. You may see inscrutable
error messages or just general failure if you have a different version
of Python installed.

The ultimate goal is to have the build system generate the necessary
output files without requiring this separate step, but we're not yet
ready to require Python in the build.


==== Interpreter Control ====

The central mechanism for interpreter control is the InterpBreak structure
that is found in each thread's Thread struct (see vm/Thread.h).  There
is one mandatory field, and two optional fields:

    subMode - required, describes debug/profile/special operation
    breakFlags & curHandlerTable - optional, used to lower subMode polling costs

The subMode field is a bitmask which records all currently active
special modes of operation.  For example, when Traceview profiling
is active, kSubModeMethodTrace is set.  This bit informs the interpreter
that it must notify the profiling subsystem on each method entry and
return.  There are similar bits for an active debugging session,
instruction count profiling, pending thread suspension request, etc.

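The bitmask behaviour can be sketched in a few lines of Python.  Only
kSubModeMethodTrace is named in the text above; the other constant names
and all of the bit values are placeholders, so see vm/Thread.h for the real
definitions.

```python
# Sketch only: kSubModeMethodTrace is named in the text; the other names and
# all bit values are placeholders.
K_SUB_MODE_NORMAL = 0x00
K_SUB_MODE_METHOD_TRACE = 0x01      # stands in for kSubModeMethodTrace
K_SUB_MODE_DEBUGGER = 0x02          # hypothetical
K_SUB_MODE_SUSPEND_PENDING = 0x04   # hypothetical

sub_mode = K_SUB_MODE_NORMAL
sub_mode |= K_SUB_MODE_METHOD_TRACE       # profiling starts
sub_mode |= K_SUB_MODE_SUSPEND_PENDING    # a suspend request arrives

print(bool(sub_mode & K_SUB_MODE_METHOD_TRACE))   # True
sub_mode &= ~K_SUB_MODE_METHOD_TRACE              # profiling stops
print(bool(sub_mode & K_SUB_MODE_METHOD_TRACE))   # False
print(sub_mode == K_SUB_MODE_NORMAL)              # False: suspend still pending
```
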
To support special subMode operation the simplest mechanism for the
interpreter is to poll the subMode field before interpreting each Dalvik
bytecode and take any required action.  In fact, this is precisely
what the portable interpreter does.  The "FINISH" macro expands to
include a test of subMode and a subsequent call to dvmCheckBefore().

Per-instruction polling, however, is expensive and subMode operation is
relatively rare.  For normal operation we'd like to avoid having to perform
any checks unless a special subMode is actually in effect.  This is
where curHandlerTable and breakFlags come into play.

The mterp fast interpreter achieves much of its performance advantage
over the portable interpreter through its efficient mechanism of
transitioning from one Dalvik bytecode to the next.  Mterp for ARM targets
uses a computed-goto mechanism, in which the handler entrypoints are
located at the base of the handler table + (opcode * 64).  Mterp for x86
targets instead uses a jump table of handler entry points indexed
by the Dalvik opcode.  To support efficient handling of special subModes,
mterp supports two sets of handler entries (for ARM) or two jump
tables (for x86).  One handler set is optimized for speed and performs no
inter-instruction checks (mainHandlerTable in the Thread structure), while
the other includes a test of the subMode field (altHandlerTable).

In normal operation (i.e. subMode == 0), the dedicated register rIBASE
(r8 for ARM, edx for x86) holds mainHandlerTable.  If we need to switch
to a subMode that requires inter-instruction checking, rIBASE is changed
to altHandlerTable.  Note that this change is not immediate.  What is
actually changed is the value of curHandlerTable, which is part of the
InterpBreak structure.  Rather than explicitly check for changes, each
thread will blindly refresh rIBASE at backward branches, exception throws
and returns.

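The delayed switch can be modelled with a toy Python class.  This is a
simplification: it treats every subMode as needing inter-instruction
checks and leaves breakFlags out.  The class and method names are invented;
only the field names come from the text above.

```python
# Sketch only: a toy model of the lazy handler-table switch.  The class and
# method names are invented; only the field names come from the text.
class ToyThread:
    def __init__(self):
        self.main_handler_table = "mainHandlerTable"
        self.alt_handler_table = "altHandlerTable"
        self.sub_mode = 0
        self.cur_handler_table = self.main_handler_table   # in InterpBreak
        self.r_ibase = self.cur_handler_table              # cached in a register

    def enable_sub_mode(self, bits):
        """Changes curHandlerTable only; rIBASE is never touched directly."""
        self.sub_mode |= bits
        self.cur_handler_table = self.alt_handler_table

    def disable_sub_mode(self, bits):
        self.sub_mode &= ~bits
        if self.sub_mode == 0:
            self.cur_handler_table = self.main_handler_table

    def refresh_ibase(self):
        """Done blindly at backward branches, exception throws and returns."""
        self.r_ibase = self.cur_handler_table


t = ToyThread()
t.enable_sub_mode(0x01)
print(t.r_ibase)      # mainHandlerTable: the switch has not been seen yet
t.refresh_ibase()     # e.g. the thread takes a backward branch
print(t.r_ibase)      # altHandlerTable
```
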
The breakFlags field tells the interpreter control mechanism whether
curHandlerTable should hold the real or alternate handler base.  If
it is non-zero, we use altHandlerTable.  The bits within breakFlags
tell dvmCheckBefore() which set of subModes needs to be checked.

See dvmCheckBefore() for subMode handling, and dvmEnableSubMode(),
dvmDisableSubMode() for switching on and off.
