History log of /dalvik/vm/interp/Jit.h
Revision Date Author Comments
375fb116bcb817b37509ab579dbd55cdbb765cbf 15-Jun-2011 Carl Shapiro <cshapiro@google.com> Normalize the include guard style.

An leading underscore followed by a capital letter is a reserved
name space in C and C++.

This change also moves any #include directives within the include
guard in some of the compiler/codegen/arm header files.

Change-Id: I9715e2c5301699d31886e61d0fe6e29483555a2a
d862faa2ceae186da5518607505eb942d634ced9 28-Apr-2011 Carl Shapiro <cshapiro@google.com> Get rid of uneeded extern, enum, typedef and struct qualifiers.

Change-Id: I236c5a1553a51f82c9bc3eaaab042046c854d3b4
1813ab265f691e93401c7307c0b34247842ab35e 16-Apr-2011 Carl Shapiro <cshapiro@google.com> Move the verifier and parts of the interpreter into C++.

Change-Id: I8ce5fb558871d9709b251512dd01206be5ca8497
ae188c676c681e47a93ade7fdf0144099b470e03 08-Apr-2011 Carl Shapiro <cshapiro@google.com> Compile the garbage collector and heap profiler as C++.

Change-Id: I25d8fa821987a3dd6d7109d07fd42dbf2fe0e589
99e3e6e72e3471eb85fc2e405866392b01c080fe 29-Mar-2011 buzbee <buzbee@google.com> Fix interpreter debug attach

Fix a few miscellaneous bugs from the interpreter restructuring that were
causing a segfault on debugger attach.

Added a sanity checking routine for debugging.

Fixed a problem in which the JIT's threshold and on/off switch
wouldn't get initialized properly on thread creation.

Renamed dvmCompilerStateRefresh() to dvmCompilerUpdateGlobalState() to
better reflect its function.

Change-Id: I5b8af1ce2175e3c6f53cda19dd8e052a5f355587
9a3147c7412f4794434b4c2604aa2ba784867774 03-Mar-2011 buzbee <buzbee@google.com> Interpreter restructuring

This is a restructuring of the Dalvik ARM and x86 interpreters:

o Combine the old portstd and portdbg interpreters into a single
portable interpreter.
o Add debug/profiling support to the fast (mterp) interpreters.
o Delete old mechansim of switching between interpreters. Now, once
you choose an interpreter at startup, you stick with it.
o Allow JIT to co-exist with profiling & debugging (necessary for
first-class support of debugging with the JIT active).
o Adds single-step capability to the fast assembly interpreters without
slowing them down (and, in fact, measurably improves their performance).
o Remove old "polling for safe point" mechanism. Breakouts now achieved
via modifying base of interpreter handler table.
o Simplify interpeter control mechanism.
o Allow thread-granularity control for profiling & debugging

The primary motivation behind this change was to improve the responsiveness
of debugging and profiling and to make it easier to add new debugging and
profiling capabilities in the future. Instead of always bailing out to the
slow debug portable interpreter, we can now stay in the fast interpreter.

A nice side effect of the change is that the fast interpreters
got a healthy speed boost because we were able to replace the
polling safepoint check that involved a dozen or so instructions
with a single table-base reload. When combined with the two earlier CLs
related to this restructuring, we show a 5.6% performance improvement
using libdvm_interp.so on the Checkers benchmark relative to Honeycomb.

Change-Id: I8d37e866b3618def4e582fc73f1cf69ffe428f3c
d82cebc13cf8ff110f2ad1a299782c9d1e8aada6 14-Mar-2011 buzbee <buzbee@google.com> Add trace description dump routine for debugging

New dvmJitDumpTraceDescription prints out a trace descriptor.

Change-Id: I613fe17cf75e4ee26ad976b15c2cf89ce4552192
385828e36ea70effe9aa18a954d008b1f7dc1d63 05-Mar-2011 Ben Cheng <bccheng@android.com> Handle relocatable class objects in JIT'ed code.

1) Split the original literal pool into class object literals and
constants. Elements in the class object pool have to match the specicial
values perfectly (ie no +delta space optimizations) since they might be
relocated.

2) Implement dvmJitScanAllClassPointers(void (*callback)(void *))
which is the entry routine to report all memory locations in the code cache
that contain class objects (ie class object pool and predicted chaining
cells for virtual calls).

3) Major codegen changes on how/when the class object pool are populated
and how predicted chains are patched. Before this change the compiler
thread is always in the VM_WAIT state, which won't prevent GC from
running. Since the class object pointers captured by a worker thread
are no longer guaranteed to be stable at JIT time, change various
internal data structures to capture the class descriptor/loader
tuple instead. The conversion from descriptor/loader tuple to actual
class object pointers are only performed when the thread state is
RUNNING or at GC safe point.

4) Separate the class object installation phase out of the main
dvmCompilerAssembleLIR routine so that the impact to blocking GC
requests is minimal. Add new stats to report the potential block time.
For example:
Potential GC blocked by compiler: max 46 us / avg 25 us

5) Various cleanup in the trace structure walkup code. Modified the
verbose print routine to show the class descriptor in the class literal
pool. For example:

D/dalvikvm( 1450): -------- end of chaining cells (0x007c)
D/dalvikvm( 1450): 0x44020628 (00b4): .class
(Lcom/android/unit_tests/PerformanceTests$EmptyClass;)
D/dalvikvm( 1450): 0x4402062c (00b8): .word (0xaca8d1a5)
D/dalvikvm( 1450): 0x44020630 (00bc): .word (0x401abc02)
D/dalvikvm( 1450): End

Bug: 3482956

Change-Id: I2e736b00d63adc255c33067544606b8b96b72ffc
9f601a917c8878204482c37aec7005054b6776fa 12-Feb-2011 buzbee <buzbee@google.com> Interpreter restructuring: eliminate InterpState

The key datastructure for the interpreter is InterpState.
This change eliminates it, merging its data with the Thread structure.

Here's why:

In principio creavit Fadden Thread et InterpState. And it was good.

Thread holds thread-private state, while InterpState captures data
associated with a Dalvik interpreter activation. Because JNI calls
can result in nested interpreter invocations, we can have more than one
InterpState for each actual thread. InterpState was relatively small,
and it all worked well. It was used enough that in the Arm version
a register (rGLUE) was dedicated to it.

Then, along came the JIT guys, who saw InterpState as a convenient place
to dump all sorts of useful data that they wanted quick access to through
that dedicated register. InterpState grew and grew. In terms of
space, this wasn't a big problem - but it did mean that the initialization
cost of each interpreter activation grew as well. For applications
that do a lot of callbacks from native code into Dalvik, this is
measurable. It's also mostly useless cost because much of the JIT-related
InterpState initialization was setting up useful constants - things that
don't need to be saved and restored all the time.

The biggest problem, though, deals with thread control. When something
interesting is happening that needs all threads to be stopped (such as
GC and debugger attach), we have access to all of the Thread structures,
but we don't have access to all of the InterpState structures (which
may be buried/nested on the native stack). As a result, polling for
thread suspension is done via a one-indirection pointer chase. InterpState
itself can't hold the stop bits because we can't always find it, so
instead it holds a pointer to the global or thread-specific stop control.

Yuck.

With this change, we eliminate InterpState and merge all needed data
into Thread. Further, we replace the decidated rGLUE register with a
pointer to the Thread structure (rSELF). The small subset of state
data that needs to be saved and restored across nested interpreter
activations is collected into a record that is saved to the interpreter
frame, and restored on exit. Further, these small records are linked
together to allow tracebacks to show nested activations. Old InterpState
variables that simply contain useful constants are initialized once at
thread creation time.

This CL is large enough by itself that the new ability to streamline
suspend checks is not done here - that will happen in a future CL. Here
we just focus on consolidation.

Change-Id: Ide6b2fb85716fea454ac113f5611263a96687356
1b3da59fff0c63770e10684e243a36f3d0218637 03-Feb-2011 Bill Buzbee <buzbee@google.com> [JIT] Fix for 3385583: Performance variance

Closes a window in which the "interpret-only" templace could get chained
to an existing trace while the intended translation was under construction.
Note that this CL also introduces some small, but fundamental changes in trace
formation:

1. Previouosly, when an exception or other trace terminating event
occurred during trace formation, the entire trace was abandoned. With this
change, we instead end the trace at the last successful instruction.

2. We previously allowed multiple attempts (perhaps by multiple threads)
to form a trace compilation request for a dalvik PC. This was done in an
attempt to allow recovery from compiler failures. Now we enforce a new rule:
only the thread that wins the race to allocate an entry in the JitTable will
form the trace request.

3. In a (probably misguided) attempt avoid unnecessary contention, we
previously allowed work order enqueue requests to be dropped if a requester
did not aquire TableLock on first attempt (assuming that if the trace were
hot, it would be requested again). Now we block on enqueue.

Change-Id: I40ea4f1b012250219ca37d5c40c5f22cae2092f1
cfdeca37fcaa27c37bad5077223e4d1e87f1182e 14-Jan-2011 Ben Cheng <bccheng@android.com> Add runtime support for method based compilation.

Enhanced code cache management to accommodate both trace and method
compilations. Also implemented a hacky dispatch routine for virtual
leaf methods.

Microbenchmark showed 3x speedup in leaf method invocation.

Change-Id: I79d95b7300ba993667b3aa221c1df9c7b0583521
2e152baec01433de9c63633ebc6f4adf1cea3a87 16-Dec-2010 buzbee <buzbee@google.com> [JIT] Trace profiling support

In preparation for method compilation, this CL causes all traces to
include two entry points: profiling and non-profiling. For now, the
profiling entry will only be used if dalvik is run with -Xjitprofile,
and largely works like it did before. The difference is that profiling
support no longer requires the "assert" build - it's always there now.

This will enable us to do a form of sampling profiling of
traces in order to identify hot methods or hot trace groups,
while keeping the overhead low by only switching profiling on periodically.

To turn the periodic profiling on and off, we simply unchain all existing
translations and set the appropriate global profile state. The underlying
translation lookup and chaining utilties will examine the profile state to
determine which entry point to use (i.e. - profiling or non-profiling) while
the traces naturally rechain during further execution.

Change-Id: I9ee33e69e33869b9fab3a57e88f9bc524175172b
7a2697d327936e20ef5484f7819e2e4bf91c891f 07-Jun-2010 Ben Cheng <bccheng@android.com> Implement method inlining for getters/setters

Changes include:
1) Force the trace that ends with an invoke instruction to include
the next instruction if it is a move-result (because both need
to be turned into no-ops if callee is inlined).
2) Interpreter entry point/trace builder changes so that return
target won't automatically be considered as trace starting points
(to avoid duplicate traces that include the move result
instructions).
3) Codegen changes to handle getters/setters invoked from both
monomorphic and polymorphic callsites.
4) Extend/fix self-verification to form identical trace regions and
handle traces with inlined callees.
5) Apply touchups to the method based parsing - still not in use.

Change-Id: I116b934df01bf9ada6d5a25187510e352bccd13c
d5adae17d71e86a1a5f3ae7825054e3249fb7879 27-Mar-2010 Ben Cheng <bccheng@android.com> Improve JIT self verifier test coverage to follow single-step instructions.

Bug: 2549326
Change-Id: I01412d4aac1379b61c90fe6e59c534b33be93f66
94e79ebaa340e8ba3bb4d13f5e908fef6d9d5eed 05-Feb-2010 Ben Cheng <bccheng@android.com> Enable JIT parameters to be initialized in an architecture dependent way.

The search for optimial value is still ongoing. The current settings are:

v5 v7
JIT profile table 512 2048
JIT code cache 512K 1M
JIT threshold 200 40
5e7bfdfc9598b00af4ced9d43fc797af6036b708 09-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Startup/Shutdown cleanup

A legacy of early parallel Jit development was separate Startup & Shutdown
code for the interpreter half of the jit (dvmJitStartup/dvmJitShutdown)
and the compiler half (dvmCompilerStartup/dvmCompilerShutdown). This cl
eliminates the dvmJit pair. Additionally, guard coded added to the
framework callback to return immediately if the Jit isn't active.
96cfe6c39b91dabc78182e2f7676b27b4012886a 09-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Startup/Shutdown cleanup

A legacy of early parallel Jit development was separate Startup & Shutdown
code for the interpreter half of the jit (dvmJitStartup/dvmJitShutdown)
and the compiler half (dvmCompilerStartup/dvmCompilerShutdown). This cl
eliminates the dvmJit pair. Additionally, guard coded added to the
framework callback to return immediately if the Jit isn't active.
7b133ef7c84e68c3c4042176d830ea5b52e84139 05-Feb-2010 Ben Cheng <bccheng@android.com> Enable JIT parameters to be initialized in an architecture dependent way.

The search for optimial value is still ongoing. The current settings are:

v5 v7
JIT profile table 512 2048
JIT code cache 512K 1M
JIT threshold 200 40
964a7b06a9134947b5985c7f712d18d57ed665d2 28-Jan-2010 Bill Buzbee <buzbee@google.com> Jit: Rework delayed start plus misc. cleanup

Defer initialization of jit to support upcoming feature to wait until
first screen is painted to start in order to avoid wasting effort on
jit'ng initialization code. Timed delay in place for the moment.
To change the on/off state, call dvmSuspendAllThreads(), update the
value of gDvmJit.pJitTable and then dvmResumeAllThreads().
Each time a thread goes through the heavyweight check suspend path, returns
from a monitor lock/unlock or returns from a JNI call, it will refresh
its on/off state.

Also:
Recognize and handle failure to increase size of JitTable.
Avoid repeated lock/unlock of JitTable modification mutex during resize
Make all work order enqueue actions non-blocking, which includes adding
a non-blocking mutex lock: dvmTryLockMutex().
Fix bug Jeff noticed where we were using a half-word form of a Thumb2
instruction rather than the byte form.
Minor comment changes.
9797a237b48e880c33e2a2f497f48fb6f67c7a16 12-Jan-2010 Bill Buzbee <buzbee@google.com> Performance tweak for Jit lookup & adjust table sizes for better performance

Also, move setting of Jit table parameters to architecture-specific init
funciton.
60c24f436d603c564d5351a6f81821f12635733c 04-Jan-2010 Ben Cheng <bccheng@google.com> Tear down the code cache when it is full and restart from scratch.

Because the code cache may be wiped out after safe points now the patching of
inline cache for predicted chains is done through the compiler thread's work
queue.
72e93344b4d1ffc71e9c832ec23de0657e5b04a5 13-Nov-2009 Jean-Baptiste Queru <jbq@google.com> eclair snapshot
d726991ba52466cde88e37aba4de2395b62477fa 10-Nov-2009 Bill Buzbee <buzbee@google.com> Jit stress mode: translate everything we can and self verify.

This represents a general clean-up of some existing command-line
options: -Xjitthreshold:num and -Xjitblocking. The jit threshold
controls how quickly we treat a Dalvik address as a potential trace
head. Normally this is set around 200 (and the range is 0..255, where
0 is in effect 256 and 1 means begin trace selection on first visit).

-Xjitblocking forces the system to pause execution whenever a translation
request is made and resume when that translation is complete. Normally
the system make a request but continues execution (to avoid jitter).

Additionally, if WITH_SELF_VERIFICATION is defined, we force blocking
to be true, and set the threshold to 1. And finally, we treat
threshold==1 as a special case and disable the 2nd-level trace-building
filter - which causes the system to immediately start trace selection.
9a8c75adb2abf551d06dbf757bff558c1feded08 08-Nov-2009 Bill Buzbee <buzbee@google.com> Introduce "just interpret" chainable pseudo-translation.

This is the first step towards enabling translation & self-cosim stress modes.
When trace selection begins, the trace head address is pinned and
remains in a limbo state until the translation is complete. Previously,
if the trace selected aborted for any reason, the trace head would remain
forever in limbo. This was not a correctness problem, but caused some
small performance anomolies and made life more difficult for self-cosimulation
mode.

This CL introduces a pseudo-translation that simply routes control to
the interpreter. When we detect that a trace selection attempt has
failed, the trace head is associated with this fully-chainable
pseudo-translation. This also has the benefit for self-cosimulation that
we are guaranteed forward progress.
ccd6c0102d1f898aaea1c94761167fdd083b5275 15-Oct-2009 Ben Cheng <bccheng@google.com> Make the traige process for self-verification found divergence easier.

1. Automatically replay the code compilation with verbose mode turned on for
the offending compilation.
3. Mark the registers with divergence explicitly.
2. Print accurate operand names using the dataflow attributes. Constant values
are still printed for reference only.
3. Fixed a few correctness/style issues in self-verification code.
bcdc1ded149faa8dec51778db4047cae0862027f 22-Aug-2009 Ben Cheng <bccheng@google.com> Updated expected outputs in dalvik benchmarks. Improved debugging output and added spin loop on detection of divergence in self verification tool.
97319a8a234e9fe1cf90ca39aa6eca37d729afd5 13-Aug-2009 Jeff Hao <jeffhao@google.com> New changes to enable self verification mode.
716f120d7f33ca18a5dcbef811399df0cbefe5d0 23-Jul-2009 Bill Buzbee <buzbee@google.com> First phase of restructuring to support THUMB2 & ARM traces
Store some useful info about traces in JitTable entry; some general cleanup
50a6bf2f01efba0acbff9bb03e7ee09688553e08 08-Jul-2009 Bill Buzbee <buzbee@google.com> Inline-execute for Java.Lang.Math routines, jit codegen restructure, various bug fixes.
48f1824fd36241067e7bed2302cc00b2d880be7f 20-Jun-2009 Bill Buzbee <buzbee@google.com> New threshold mechanism for trace selections. Intended to reduce number of junk traces.
6e963e1cfbaeac377fed3ba8d5715c1dccfc1a57 18-Jun-2009 Bill Buzbee <buzbee@google.com> Trace profiling support for the jit
2717622484eb0f7ad537275f7260b2f93324eda2 09-Jun-2009 Bill Buzbee <buzbee@google.com> Makes the primary Jit table growable. Also includes a change suggested earlier by Dan to use a pre-defined mask in the hash function. Reduce the default JitTable size from 2048 entries to 512 entries.
Update per Ben's comments.
1efc9c5e4c5c4c2fccde18e5771c68d064c33bd3 09-Jun-2009 Ben Cheng <bccheng@android.com> Fix two codegen problems: out-of-bound PC-relative addresses and missing branch to the chaining cell at the end of non-branch-ending basic blocks.
ba4fc8bfc1bccae048403bd1cea3b869dca61dd7 01-Jun-2009 Ben Cheng <bccheng@android.com> Initial port of the Dalvik JIT enging to the internal repository.
Fixed files with trailing spaces.
Addressed review comments from Dan.
Addressed review comments from fadden.
Addressed review comments from Dan x 2.
Addressed review comments from Dan x 3.