• Home
  • History
  • Annotate
  • only in /dalvik/vm/compiler/template/armv5te-vfp/
History log of /dalvik/vm/compiler/template/armv5te-vfp/
Revision Date Author Comments (<<< Hide modified files) (Show modified files >>>)
9f601a917c8878204482c37aec7005054b6776fa 12-Feb-2011 buzbee <buzbee@google.com> Interpreter restructuring: eliminate InterpState

The key datastructure for the interpreter is InterpState.
This change eliminates it, merging its data with the Thread structure.

Here's why:

In principio creavit Fadden Thread et InterpState. And it was good.

Thread holds thread-private state, while InterpState captures data
associated with a Dalvik interpreter activation. Because JNI calls
can result in nested interpreter invocations, we can have more than one
InterpState for each actual thread. InterpState was relatively small,
and it all worked well. It was used enough that in the Arm version
a register (rGLUE) was dedicated to it.

Then, along came the JIT guys, who saw InterpState as a convenient place
to dump all sorts of useful data that they wanted quick access to through
that dedicated register. InterpState grew and grew. In terms of
space, this wasn't a big problem - but it did mean that the initialization
cost of each interpreter activation grew as well. For applications
that do a lot of callbacks from native code into Dalvik, this is
measurable. It's also mostly useless cost because much of the JIT-related
InterpState initialization was setting up useful constants - things that
don't need to be saved and restored all the time.

The biggest problem, though, deals with thread control. When something
interesting is happening that needs all threads to be stopped (such as
GC and debugger attach), we have access to all of the Thread structures,
but we don't have access to all of the InterpState structures (which
may be buried/nested on the native stack). As a result, polling for
thread suspension is done via a one-indirection pointer chase. InterpState
itself can't hold the stop bits because we can't always find it, so
instead it holds a pointer to the global or thread-specific stop control.

Yuck.

With this change, we eliminate InterpState and merge all needed data
into Thread. Further, we replace the decidated rGLUE register with a
pointer to the Thread structure (rSELF). The small subset of state
data that needs to be saved and restored across nested interpreter
activations is collected into a record that is saved to the interpreter
frame, and restored on exit. Further, these small records are linked
together to allow tracebacks to show nested activations. Old InterpState
variables that simply contain useful constants are initialized once at
thread creation time.

This CL is large enough by itself that the new ability to streamline
suspend checks is not done here - that will happen in a future CL. Here
we just focus on consolidation.

Change-Id: Ide6b2fb85716fea454ac113f5611263a96687356
EMPLATE_RESTORE_STATE.S
EMPLATE_SAVE_STATE.S
d72564ca7aa66c6d95b6ca34299258b65ecfd1cb 09-Feb-2011 Ben Cheng <bccheng@android.com> Misc goodies in the JIT in preparation for more aggressive code motion.

- Set up resource masks correctly for Thumb push/pop when LR/PC are involved.
- Preserve LR around simulated heap references under self-verification mode.
- Compact a few simple flags in ArmLIR into bit fields.
- Minor performance tuning in TEMPLATE_MEM_OP_DECODE

Change-Id: Id73edac837c5bb37dfd21f372d6fa21c238cf42a
EMPLATE_MEM_OP_DECODE.S
18fba346582c08d81aa96d9508c0e935bad5f36f 20-Jan-2011 buzbee <buzbee@google.com> Support traceview-style profiling in all builds

This change builds on an earlier bccheng change that allowed JIT'd code
to avoid reverting to the debug portable interpeter when doing traceview-style
method profiling. That CL introduced a new traceview build (libdvm_traceview)
because the performance delta was too great to enable the capability for
all builds.

In this CL, we remove the libdvm_traceview build and provide full-speed
method tracing in all builds. This is done by introducing "_PROF"
versions of invoke and return templates used by the JIT. Normally, these
templates are not used, and performace in unaffected. However, when method
profiling is enabled, all existing translation are purged and new translations
are created using the _PROF templates. These templates introduce a
smallish performance penalty above and beyond the actual tracing cost, but
again are only used when tracing has been enabled.

Strictly speaking, there is a slight burden that is placed on invokes and
returns in the non-tracing case - on the order of an additional 3 or 4
cycles per invoke/return. Those operations are already heavyweight enough
that I was unable to measure the added cost in benchmarks.

Change-Id: Ic09baf4249f1e716e136a65458f4e06cea35fc18
emplateOpList.h
2e152baec01433de9c63633ebc6f4adf1cea3a87 16-Dec-2010 buzbee <buzbee@google.com> [JIT] Trace profiling support

In preparation for method compilation, this CL causes all traces to
include two entry points: profiling and non-profiling. For now, the
profiling entry will only be used if dalvik is run with -Xjitprofile,
and largely works like it did before. The difference is that profiling
support no longer requires the "assert" build - it's always there now.

This will enable us to do a form of sampling profiling of
traces in order to identify hot methods or hot trace groups,
while keeping the overhead low by only switching profiling on periodically.

To turn the periodic profiling on and off, we simply unchain all existing
translations and set the appropriate global profile state. The underlying
translation lookup and chaining utilties will examine the profile state to
determine which entry point to use (i.e. - profiling or non-profiling) while
the traces naturally rechain during further execution.

Change-Id: I9ee33e69e33869b9fab3a57e88f9bc524175172b
emplateOpList.h
d88756df5b4dbc6fd450afd0019a5f64ebe4432d 22-Oct-2010 Elliott Hughes <enh@google.com> Remove junk from platform.S now armv4t is gone.

Change-Id: I30079aacc753c89cfc3a3f64bd900a0cc858d65f
latform.S
7365493ad8d360c1dcf9cd8b6eee62747af01cae 09-Jun-2010 Carl Shapiro <cshapiro@google.com> Remove repeated newlines at the end of files.

Change-Id: I1e3d103a7b932ef21acedb6438c0f26b315df28f
EMPLATE_CMPG_DOUBLE_VFP.S
88dc28740f628a8d0d8fe05af0e11443f8793aa1 03-Feb-2010 jeffhao <jeffhao@google.com> Made Self Verification mode's memory interface less intrusive.
EMPLATE_MEM_OP_DECODE.S
emplateOpList.h
f5ceaebfe5633a16b11a7073d2bf36b5bb0c9945 02-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Rework monitor enter/exit to simplify thread suspension

The Jit must stop all threads in order to flush the translation cache (and
other tables). Threads which are blocked in a monitor wait cause some
headache here because they effectively hold a references to the translation
cache (though the return address on the native stack). The new model
introduced in this CL is that for the fast path of monitor enter, control
is allowed to resume in the translation cache. However, if we need to do a
heavyweight lock (which may cause us to block) control does not return to the
translation cache but instead bails out to the interpreter. This allows us to
safely clear the code cache even if some threads are in THREAD_MONITOR state.
emplateOpList.h
24ac537cf8d214f7f1bcb07aace429521247d1eb 16-Dec-2009 Ben Cheng <bccheng@google.com> Move VFP register save/restore routines from template to codegen.

Code in the template directory will occupy space in the code cache and is
invoked from JIT'ed code. Since these routines are only invoked from statically
compiled functions we can move them to the codegen directory which also has
arch-variant configurations.
latform.S
342806dae77556290dfe0760e6fe3117d812c7ba 08-Dec-2009 Bill Buzbee <buzbee@google.com> Jit: Save/restore callee-save floating point registers at interpreter entry/exit
latform.S
9a8c75adb2abf551d06dbf757bff558c1feded08 08-Nov-2009 Bill Buzbee <buzbee@google.com> Introduce "just interpret" chainable pseudo-translation.

This is the first step towards enabling translation & self-cosim stress modes.
When trace selection begins, the trace head address is pinned and
remains in a limbo state until the translation is complete. Previously,
if the trace selected aborted for any reason, the trace head would remain
forever in limbo. This was not a correctness problem, but caused some
small performance anomolies and made life more difficult for self-cosimulation
mode.

This CL introduces a pseudo-translation that simply routes control to
the interpreter. When we detect that a trace selection attempt has
failed, the trace head is associated with this fully-chainable
pseudo-translation. This also has the benefit for self-cosimulation that
we are guaranteed forward progress.
emplateOpList.h
fd023aaec5f2b0df61d1702ea2f29a70abe90158 02-Nov-2009 Bill Buzbee <buzbee@google.com> Jit - optimized inline string compareto, indexof; fill_array_data bug fix

Added flushAllRegs() prior to C handlers in preparation for upcoming support
for holding live/dirty values in physical registers.
emplateOpList.h
1465db5ee2d3c4c4dcc8e017a294172e858765cb 24-Sep-2009 Bill Buzbee <buzbee@google.com> Major registor allocation rework - stage 1.

Direct usage of registers abstracted out.
Live values tracked locally. Redundant loads and stores suppressed.
Address of registers and register pairs unified w/ single "location" mechanism
Register types inferred using existing dataflow analysis pass.
Interim (i.e. Hack) mechanism for storing register liveness info. Rewrite TBD.
Stubbed-out code for linear scan allocation (for loop and long traces)
Moved optimistic lock check for monitor-enter/exit inline for Thumb2
Minor restructuring, renaming and general cleanup of codegen
Renaming of enums to follow coding convention
Formatting fixes introduced by the enum renaming

Rewrite of RallocUtil.c and addition of linear scan to come in stage 2.
EMPLATE_RESTORE_STATE.S
EMPLATE_SAVE_STATE.S
emplateOpList.h
4f48917c0741e4d9b15ca7c45956aea05fea103f 28-Sep-2009 Ben Cheng <bccheng@google.com> Fixed OOM exception handling in JIT'ed code and added a new unit test.
emplateOpList.h
7fb2edd2f69d11435da8dc0f1c251349238863b3 31-Aug-2009 Bill Buzbee <buzbee@google.com> Inline Sqrt bug fix; add support for fp/gen register copies
EMPLATE_CMPG_DOUBLE_VFP.S
EMPLATE_CMPG_FLOAT_VFP.S
EMPLATE_CMPL_FLOAT_VFP.S
50a6bf2f01efba0acbff9bb03e7ee09688553e08 08-Jul-2009 Bill Buzbee <buzbee@google.com> Inline-execute for Java.Lang.Math routines, jit codegen restructure, various bug fixes.
EMPLATE_ADD_DOUBLE_VFP.S
EMPLATE_ADD_FLOAT_VFP.S
EMPLATE_CMPG_DOUBLE_VFP.S
EMPLATE_CMPG_FLOAT_VFP.S
EMPLATE_CMPL_DOUBLE_VFP.S
EMPLATE_CMPL_FLOAT_VFP.S
EMPLATE_DIV_DOUBLE_VFP.S
EMPLATE_DIV_FLOAT_VFP.S
EMPLATE_DOUBLE_TO_FLOAT_VFP.S
EMPLATE_DOUBLE_TO_INT_VFP.S
EMPLATE_FLOAT_TO_DOUBLE_VFP.S
EMPLATE_FLOAT_TO_INT_VFP.S
EMPLATE_INT_TO_DOUBLE_VFP.S
EMPLATE_INT_TO_FLOAT_VFP.S
EMPLATE_MUL_DOUBLE_VFP.S
EMPLATE_MUL_FLOAT_VFP.S
EMPLATE_SQRT_DOUBLE_VFP.S
EMPLATE_SUB_DOUBLE_VFP.S
EMPLATE_SUB_FLOAT_VFP.S
emplateOpList.h
binop.S
binopWide.S
unop.S
unopNarrower.S
unopWider.S