History log of /dalvik/vm/compiler/template/out/CompilerTemplateAsm-armv5te-vfp.S
Revision Date Author Comments
7a2697d327936e20ef5484f7819e2e4bf91c891f 07-Jun-2010 Ben Cheng <bccheng@android.com> Implement method inlining for getters/setters

Changes include:
1) Force the trace that ends with an invoke instruction to include
the next instruction if it is a move-result (because both need
to be turned into no-ops if callee is inlined).
2) Interpreter entry point/trace builder changes so that return
target won't automatically be considered as trace starting points
(to avoid duplicate traces that include the move result
instructions).
3) Codegen changes to handle getters/setters invoked from both
monomorphic and polymorphic callsites.
4) Extend/fix self-verification to form identical trace regions and
handle traces with inlined callees.
5) Apply touchups to the method based parsing - still not in use.

Change-Id: I116b934df01bf9ada6d5a25187510e352bccd13c
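
Item 1 above can be illustrated with a minimal sketch. It assumes a trace is a list of opcode-name strings and that a trace ends at the first invoke; the real trace builder has other termination conditions that are not modeled here. The function name `select_trace` is hypothetical.

```python
def select_trace(insns, start):
    """Return the instructions forming one trace that begins at `start`.

    Simplified assumption: a trace ends at the first invoke, except that
    an immediately following move-result is pulled into the same trace.
    """
    end = start
    while end < len(insns):
        op = insns[end]
        end += 1
        if op.startswith("invoke"):
            # Keep the move-result with its invoke so that both can be
            # turned into no-ops together if the callee is inlined.
            if end < len(insns) and insns[end].startswith("move-result"):
                end += 1
            break
    return insns[start:end]


code = ["const/4", "invoke-virtual", "move-result", "add-int", "return"]
trace = select_trace(code, 0)
# trace is ["const/4", "invoke-virtual", "move-result"]
```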
7365493ad8d360c1dcf9cd8b6eee62747af01cae 09-Jun-2010 Carl Shapiro <cshapiro@google.com> Remove repeated newlines at the end of files.

Change-Id: I1e3d103a7b932ef21acedb6438c0f26b315df28f
fbdcfb9ea9e2a78f295834424c3f24986ea45dac 29-May-2010 Brian Carlstrom <bdc@google.com> Merge remote branch 'goog/dalvik-dev' into dalvik-dev-to-master

Change-Id: I0c0edb3ebf0d5e040d6bbbf60269fab0deb70ef9
b88ec3cbb419b5eac23508dc6b73de2620d7521a 17-May-2010 Ben Cheng <bccheng@android.com> Remove the write permission for the JIT code cache when not needed

To support the feature, redesigned the predicted chaining mechanism so that the
profile count is shared globally in InterpState.

Bug: 2690371
Change-Id: Ifed427e8b1fa4f6c670f19e0761e45e2d4afdbb6
bd0472480c6e876198fe19c4ffa22350c0ce57da 13-May-2010 Bill Buzbee <buzbee@google.com> JIT: Fix for [Issue 2675245] FRF40 monkey crash in jit-cache

The JIT's chaining mechanism suffered from a narrow window that
could result in i-cache inconsistency. One form of chaining
cell consisted of a sequence of two 16-bit Thumb instructions. If a thread were
interrupted between the execution of those two instructions *and*
another thread picked that moment to convert that cell's
chained/unchained state, then bad things happen.

This CL alters the chain/unchain model somewhat to avoid this case.
Chainable chaining cells grow by 4 bytes each, and instead of rewriting
a 32-bit cell to chain/unchain, we switch between chained and unchained
state by [re]writing the first 16-bits of the cell as either a 16-bit
Thumb unconditional branch (unchained mode) or the first half of a
32-bit Thumb branch. The 2nd 16-bits of the cell will never change once
the cell moves from its initial state - thus avoiding the possibility of it
becoming inconsistent.

This adds a trivial execution penalty on the slow path, but will add
about a kByte of memory usage to a typical process.

Change-Id: Id8b99802e11386cfbab23da6abae10e2d9fc4065
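
The chain/unchain model described above can be sketched as follows. The cell layout, the class name `ChainingCell`, and the choice of the classic two-halfword Thumb BL as the 32-bit branch are assumptions made for illustration; the commit message does not say which 32-bit branch form is emitted. The point of the sketch is that only the first halfword is rewritten when switching state.

```python
def encode_b16(from_addr, to_addr):
    """16-bit Thumb unconditional branch: 11100 followed by imm11."""
    offset = to_addr - (from_addr + 4)  # Thumb PC reads as address + 4
    if offset % 2 or not -2048 <= offset <= 2046:
        raise ValueError("target out of range for a 16-bit branch")
    return 0xE000 | ((offset >> 1) & 0x7FF)


class ChainingCell:
    """Hypothetical model of a cell whose state lives in halfword 0."""

    def __init__(self, addr, unchained_target):
        self.addr = addr
        self.unchained_target = unchained_target
        self.first = encode_b16(addr, unchained_target)
        self.second = 0x0000        # initial state, not yet chained
        self.second_fixed = False

    def chain(self, target):
        offset = target - (self.addr + 4)
        if not self.second_fixed:
            # Written once, when the cell leaves its initial state.
            self.second = 0xF800 | ((offset >> 1) & 0x7FF)
            self.second_fixed = True
        # Only the first 16 bits change: BL prefix with the high offset bits.
        self.first = 0xF000 | ((offset >> 12) & 0x7FF)

    def unchain(self):
        # Only the first 16 bits change: back to the short branch.
        self.first = encode_b16(self.addr, self.unchained_target)


cell = ChainingCell(0x1000, 0x1008)
cell.chain(0x3004)
second_when_chained = cell.second
cell.unchain()
```

Because the second halfword is the same in both states, a thread interrupted between the two halfwords cannot observe a mix of old and new values for it.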
978738d2cbf9d08fa78c65762eaac3351ab76b9a 13-May-2010 Ben Cheng <bccheng@android.com> Add counters to track JIT inline cache hit rate and code cache patch counts.

Also did some WITH_JIT_TUNING cleanup.

Change-Id: I8bb2d681a06b0f2af1f976a007326825a88cea38
a62475ecfcc80c58add8f153c9605762dafb8227 30-Apr-2010 Ben Cheng <bccheng@android.com> Use unsigned comparison for stack pointers.

Bug: 2613607
Change-Id: I6a8abd69fbf9cb9f8ec9d9febf1ea42fd631fe9c
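
The commit message does not describe the failure it fixes. One plausible case, assumed here, is a stack address with the top bit set being compared against a limit below 0x80000000. The sketch emulates 32-bit signed and unsigned comparisons; the addresses are hypothetical.

```python
def as_signed32(x):
    """Reinterpret a 32-bit value as a signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def below_signed(a, b):
    return as_signed32(a) < as_signed32(b)


def below_unsigned(a, b):
    return (a & 0xFFFFFFFF) < (b & 0xFFFFFFFF)


stack_limit = 0x7FFFF000   # hypothetical lowest usable stack address
stack_ptr = 0x80000100     # above the limit, but with the top bit set

# A signed comparison sees stack_ptr as negative and reports an overflow
# that has not happened; the unsigned comparison gives the right answer.
signed_says_overflow = below_signed(stack_ptr, stack_limit)
unsigned_says_overflow = below_unsigned(stack_ptr, stack_limit)
```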
11d8f14eef83d1b7bfa8f116de56a92d5ba9e71e 24-Mar-2010 Ben Cheng <bccheng@android.com> Fix for the JIT blocking mode plus some code cleanup.

Bug: 2517606
Change-Id: I2b5aa92ceaf23d484329330ae20de5966704280b
86717f79d9b018f4d69cc991075fa36611f234e5 06-Mar-2010 Ben Cheng <bccheng@android.com> Collect more JIT stats in the assert build.

New stuff includes breakdown of callsite types (i.e., monomorphic vs. polymorphic
vs. monomorphic resolved to native), total time spent in JIT'ing, and average
JIT time per compilation.

Example output:
D/dalvikvm( 840): 4042 compilations using 1976 + 329108 bytes
D/dalvikvm( 840): Compiler arena uses 10 blocks (8100 bytes each)
D/dalvikvm( 840): Compiler work queue length is 0/36
D/dalvikvm( 840): size if 8192, entries used is 4137
D/dalvikvm( 840): JIT: 4137 traces, 8192 slots, 1099 chains, 40 thresh, Non-blocking
D/dalvikvm( 840): JIT: Lookups: 1128780 hits, 168564 misses; 179520 normal, 6 punt
D/dalvikvm( 840): JIT: noChainExit: 528464 IC miss, 194708 interp callsite, 0 switch overflow
D/dalvikvm( 840): JIT: Invoke: 507 mono, 988 poly, 72 native, 1038 return
D/dalvikvm( 840): JIT: Total compilation time: 2342 ms
D/dalvikvm( 840): JIT: Avg unit compilation time: 579 us
D/dalvikvm( 840): JIT: 3357 Translation chains, 97 interp stubs
D/dalvikvm( 840): dalvik.vm.jit.op = 0-2,4-5,7-8,a-c,e-16,19-1a,1c-23,26,28-29,2b-2f,31-3d,44-4b,4d-51,60,62-63,68-69,70-72,76-78,7b,81-82,84,87,89,8d-93,95-98,a1,a3,a6,a8-a9,b0-b3,b5-b6,bb-bf,c6-c8,d0,d2-d6,d8,da-e2,ee-f0,f2-fb,
D/dalvikvm( 840): Code size stats: 50666/105126 (compiled/total Dalvik), 329108 (native)
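
The figures in the example output can be cross-checked with a little arithmetic. This assumes the average is simply the total compilation time divided by the number of compilations, which the log does not state explicitly.

```python
compilations = 4042          # "4042 compilations"
total_ms = 2342              # "Total compilation time: 2342 ms"
avg_us = total_ms * 1000 // compilations   # 579, matching the log

compiled, total = 50666, 105126            # "compiled/total Dalvik"
compiled_fraction = compiled / total       # roughly 0.48

traces, slots = 4137, 8192                 # "4137 traces, 8192 slots"
table_occupancy = traces / slots           # roughly 0.505
```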
40094c16d9727cc1e047a7d4bddffe04dd566211 25-Feb-2010 Ben Cheng <bccheng@android.com> Tweak the interpreter entries and 2nd level trace filter to capture more traces.

Real changes:
1) Add a new entry point from JIT to the interpreter to request hot traces w/o
doing chaining.
2) Increase the granularity of the secondary profile filter to match 64-byte
chunks using 64 entries.

The remaining are just cosmetic changes.
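
A minimal sketch of a secondary filter with 64 entries keyed on 64-byte chunks follows. The admission policy (compile only when a chunk is requested a second time) and the round-robin replacement are assumptions for illustration; the commit message specifies only the granularity and the entry count.

```python
FILTER_ENTRIES = 64
CHUNK_SHIFT = 6   # 2**6 = 64-byte chunks


class TraceFilter:
    """Hypothetical second-level filter over recently requested chunks."""

    def __init__(self):
        self.entries = [None] * FILTER_ENTRIES
        self.next_victim = 0

    def should_compile(self, pc):
        key = pc >> CHUNK_SHIFT
        if key in self.entries:
            return True           # chunk seen before: let the request through
        # First sighting: remember the chunk and reject this request.
        self.entries[self.next_victim] = key
        self.next_victim = (self.next_victim + 1) % FILTER_ENTRIES
        return False


f = TraceFilter()
first = f.should_compile(0x1000)    # new chunk
second = f.should_compile(0x103F)   # same 64-byte chunk as 0x1000
third = f.should_compile(0x1040)    # next chunk
```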
e6af13cf607de870de51ffe00f48552252946a00 06-Feb-2010 Bill Buzbee <buzbee@google.com> JIT: Replace missing ending comment marker MONITOR_ENTER template

...which, luckily, was followed by a debug version of the same handler
so everything magically worked anyway. I should buy a lottery ticket today.
fccb31dd58e5cb9f7a3f6e128d481f0ff35a51f0 05-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Start the Jit when framework signals on first screen draw

Cleanup of delayed start - introduce dvmRelativeCondWait in Sync.c.
Additionally, support for deadman timer to start Jit when no screen draws
happen, and to start immediately when running stand-alone.

Fixed bug in assert variant of libdvm - recent MONITOR change had neglected
to add a new type of exit to the exit stats.
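
The three start conditions described above (first screen draw, deadman timeout, stand-alone) can be sketched with a timed wait. This uses Python's `threading.Event` as a stand-in; it is not the interface of dvmRelativeCondWait, whose signature is not shown in this log, and `wait_for_start` is a hypothetical name.

```python
import threading


def wait_for_start(first_draw, deadman_seconds, standalone=False):
    """Return the reason the JIT would be started."""
    if standalone:
        return "immediate"               # no framework to wait for
    # Event.wait returns True if the event was set, False on timeout.
    signaled = first_draw.wait(timeout=deadman_seconds)
    return "first-draw" if signaled else "deadman"


first_draw = threading.Event()
reason_without_draw = wait_for_start(first_draw, 0.01)
first_draw.set()
reason_with_draw = wait_for_start(first_draw, 0.01)
```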
79842ac67e2a23cb544bfe1ee3961d325a2552e7 06-Feb-2010 Bill Buzbee <buzbee@google.com> JIT: Replace missing ending comment marker MONITOR_ENTER template

...which, luckily, was followed by a debug version of the same handler
so everything magically worked anyway. I should buy a lottery ticket today.
eb695c6f814f6b0bdbba0e837555d3fe5ad23104 05-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Start the Jit when framework signals on first screen draw

Cleanup of delayed start - introduce dvmRelativeCondWait in Sync.c.
Additionally, support for deadman timer to start Jit when no screen draws
happen, and to start immediately when running stand-alone.

Fixed bug in assert variant of libdvm - recent MONITOR change had neglected
to add a new type of exit to the exit stats.
88dc28740f628a8d0d8fe05af0e11443f8793aa1 03-Feb-2010 jeffhao <jeffhao@google.com> Made Self Verification mode's memory interface less intrusive.
9e45c0b968d63ea38353c99252d233879c2efdaf 03-Feb-2010 jeffhao <jeffhao@google.com> Made Self Verification mode's memory interface less intrusive.
f5ceaebfe5633a16b11a7073d2bf36b5bb0c9945 02-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Rework monitor enter/exit to simplify thread suspension

The Jit must stop all threads in order to flush the translation cache (and
other tables). Threads which are blocked in a monitor wait cause some
headache here because they effectively hold a reference to the translation
cache (through the return address on the native stack). The new model
introduced in this CL is that for the fast path of monitor enter, control
is allowed to resume in the translation cache. However, if we need to do a
heavyweight lock (which may cause us to block) control does not return to the
translation cache but instead bails out to the interpreter. This allows us to
safely clear the code cache even if some threads are in THREAD_MONITOR state.
c1d9ed490a7bd6caab51df41f3c9e590fcecb727 02-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Rework monitor enter/exit to simplify thread suspension

The Jit must stop all threads in order to flush the translation cache (and
other tables). Threads which are blocked in a monitor wait cause some
headache here because they effectively hold a reference to the translation
cache (through the return address on the native stack). The new model
introduced in this CL is that for the fast path of monitor enter, control
is allowed to resume in the translation cache. However, if we need to do a
heavyweight lock (which may cause us to block) control does not return to the
translation cache but instead bails out to the interpreter. This allows us to
safely clear the code cache even if some threads are in THREAD_MONITOR state.
964a7b06a9134947b5985c7f712d18d57ed665d2 28-Jan-2010 Bill Buzbee <buzbee@google.com> Jit: Rework delayed start plus misc. cleanup

Defer initialization of jit to support upcoming feature to wait until
first screen is painted to start in order to avoid wasting effort on
jit'ing initialization code. Timed delay in place for the moment.
To change the on/off state, call dvmSuspendAllThreads(), update the
value of gDvmJit.pJitTable and then dvmResumeAllThreads().
Each time a thread goes through the heavyweight check suspend path, returns
from a monitor lock/unlock or returns from a JNI call, it will refresh
its on/off state.

Also:
Recognize and handle failure to increase size of JitTable.
Avoid repeated lock/unlock of JitTable modification mutex during resize
Make all work order enqueue actions non-blocking, which includes adding
a non-blocking mutex lock: dvmTryLockMutex().
Fix bug Jeff noticed where we were using a half-word form of a Thumb2
instruction rather than the byte form.
Minor comment changes.
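
The non-blocking enqueue idea can be sketched with a try-lock. Python's `Lock.acquire(blocking=False)` stands in for dvmTryLockMutex() here; the `WorkQueue` class and its capacity handling are hypothetical.

```python
import threading
from collections import deque


class WorkQueue:
    """Hypothetical compiler work queue that never blocks the caller."""

    def __init__(self, capacity):
        self.lock = threading.Lock()
        self.items = deque()
        self.capacity = capacity

    def try_enqueue(self, order):
        # Give up at once if another thread holds the lock.
        if not self.lock.acquire(blocking=False):
            return False
        try:
            if len(self.items) >= self.capacity:
                return False      # queue full: drop the request
            self.items.append(order)
            return True
        finally:
            self.lock.release()


q = WorkQueue(capacity=1)
accepted = q.try_enqueue("trace@0x1000")
rejected_when_full = q.try_enqueue("trace@0x2000")
```

Dropping a work order is acceptable in this model because the requesting thread can simply keep interpreting and ask again later.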
7a0bcd0de6c4da6499a088a18d1750e51204c2a6 23-Jan-2010 Ben Cheng <bccheng@android.com> Tighten the safe points for code cache resets to happen.

Add a new flag in the Thread struct to track the whereabouts of the top frame
in each Java thread. It is not safe to blow away the code cache if any thread
is in the JIT'ed land.
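
The safety condition above amounts to a check over all threads. In this sketch the flag name `in_jit_code` and the `JavaThread` class are hypothetical; the commit message says only that a flag tracks where each thread's top frame is.

```python
class JavaThread:
    def __init__(self, name, in_jit_code=False):
        self.name = name
        self.in_jit_code = in_jit_code   # True while running JIT'ed code


def can_reset_code_cache(threads):
    """The cache may be discarded only if no thread is in JIT'ed code."""
    return not any(t.in_jit_code for t in threads)


threads = [JavaThread("main"), JavaThread("worker", in_jit_code=True)]
safe_now = can_reset_code_cache(threads)
threads[1].in_jit_code = False
safe_later = can_reset_code_cache(threads)
```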
60c24f436d603c564d5351a6f81821f12635733c 04-Jan-2010 Ben Cheng <bccheng@google.com> Tear down the code cache when it is full and restart from scratch.

Because the code cache may be wiped out after safe points now the patching of
inline cache for predicted chains is done through the compiler thread's work
queue.
24ac537cf8d214f7f1bcb07aace429521247d1eb 16-Dec-2009 Ben Cheng <bccheng@google.com> Move VFP register save/restore routines from template to codegen.

Code in the template directory will occupy space in the code cache and is
invoked from JIT'ed code. Since these routines are only invoked from statically
compiled functions we can move them to the codegen directory which also has
arch-variant configurations.
342806dae77556290dfe0760e6fe3117d812c7ba 08-Dec-2009 Bill Buzbee <buzbee@google.com> Jit: Save/restore callee-save floating point registers at interpreter entry/exit
909b418219f63c0d0b2bde8a0835dbf27d5061b8 03-Dec-2009 Bill Buzbee <buzbee@google.com> Jit: Fix for 2187020, bad exception recovery from native invoke static
ab875c79c56eacc510b09710d38a9b20f7337486 19-Nov-2009 Bill Buzbee <buzbee@google.com> Jit: fix for string/indexOf handler.
4c0dedfd9006daee4f6d96482cc6ac94a1797880 16-Nov-2009 Bill Buzbee <buzbee@google.com> Jit: string's compareTo performance improvement.

Changed compareTo handler to call __memcmp16() for strings >= 32 chars.
However, even for those strings, the first two chars are done in the
handler (to catch early-out cases).

Comparisons were done with micro-benchmarks comparing 10 and 200-char
strings.

The strings were:
equal -> Q
not equal at start -> S
not equal at end -> E

The test configurations were handler (H) [the previous handler], subroutine (S)
[memcmp16()] and blended (B) [this commit]:

        H    S    B
10E    60  138   65
10S    32   70   30
10Q     9    9    9
100E  745  708  716

In short, the small string cases were twice as fast with the existing
handler compared to memcmp16, but memcmp16 was ~5% faster for long
strings.
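
The blended strategy can be sketched as below. Several details are assumptions: that the 32-character threshold applies to the shorter string's length, that the result follows Java's String.compareTo convention, and that a plain loop can stand in for both the handler's inline loop and __memcmp16(). The second tuple element exists only to show which path was taken.

```python
BULK_THRESHOLD = 32   # from the commit message
INLINE_PREFIX = 2     # first two chars are always checked in the handler


def _first_diff(a, b, start, end):
    """Difference of the first mismatching chars in [start, end), else 0."""
    for i in range(start, end):
        d = ord(a[i]) - ord(b[i])
        if d:
            return d
    return 0


def compare_to(a, b):
    n = min(len(a), len(b))
    head = min(n, INLINE_PREFIX)
    d = _first_diff(a, b, 0, head)       # early-out check
    if d:
        return d, "inline"
    # Long strings go to the bulk routine; short ones stay in the handler.
    path = "bulk" if n >= BULK_THRESHOLD else "inline"
    d = _first_diff(a, b, head, n)
    return (d if d else len(a) - len(b)), path


short_result = compare_to("abc", "abd")
long_result = compare_to("x" * 40, "x" * 39 + "y")
early_out = compare_to("b" + "x" * 40, "a" + "x" * 40)
```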
5965d47b624798343b6a53afd384f2cf88d091de 15-Nov-2009 Bill Buzbee <buzbee@google.com> Jit: fix for compareTo handler.

Note to self: Unit tests are much more effective when the test main actually
calls them.
72e93344b4d1ffc71e9c832ec23de0657e5b04a5 13-Nov-2009 Jean-Baptiste Queru <jbq@google.com> eclair snapshot
9a8c75adb2abf551d06dbf757bff558c1feded08 08-Nov-2009 Bill Buzbee <buzbee@google.com> Introduce "just interpret" chainable pseudo-translation.

This is the first step towards enabling translation & self-cosim stress modes.
When trace selection begins, the trace head address is pinned and
remains in a limbo state until the translation is complete. Previously,
if the trace selection aborted for any reason, the trace head would remain
forever in limbo. This was not a correctness problem, but caused some
small performance anomalies and made life more difficult for self-cosimulation
mode.

This CL introduces a pseudo-translation that simply routes control to
the interpreter. When we detect that a trace selection attempt has
failed, the trace head is associated with this fully-chainable
pseudo-translation. This also has the benefit for self-cosimulation that
we are guaranteed forward progress.
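
The limbo problem and its fix can be sketched with a small table. The class `TraceTable`, its method names, and the string states are hypothetical; the sketch shows only that a failed selection ends in a definite "interpret" entry rather than staying pinned.

```python
IN_PROGRESS = "in-progress"        # trace head pinned during selection
INTERPRET_ONLY = "interpret-only"  # stand-in for the pseudo-translation


class TraceTable:
    def __init__(self):
        self.entries = {}

    def begin_selection(self, head):
        self.entries[head] = IN_PROGRESS

    def finish_selection(self, head, translation):
        # On failure (translation is None) install the pseudo-translation
        # instead of leaving the head in limbo forever.
        if translation is None:
            self.entries[head] = INTERPRET_ONLY
        else:
            self.entries[head] = translation

    def in_limbo(self, head):
        return self.entries.get(head) == IN_PROGRESS


table = TraceTable()
table.begin_selection(0x4000)
limbo_during = table.in_limbo(0x4000)
table.finish_selection(0x4000, None)   # selection aborted
limbo_after = table.in_limbo(0x4000)
```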
49024493479b1ab8b5f9b44c24a3b0c33afc796c 04-Nov-2009 Bill Buzbee <buzbee@google.com> Fix for inline string indexof; added regression tests
6c10a977ec892c26c8e306356491833bbb073d40 29-Oct-2009 Ben Cheng <bccheng@google.com> Implement chaining up to the first 64 cases in a switch statement.
fd023aaec5f2b0df61d1702ea2f29a70abe90158 02-Nov-2009 Bill Buzbee <buzbee@google.com> Jit - optimized inline string compareto, indexof; fill_array_data bug fix

Added flushAllRegs() prior to C handlers in preparation for upcoming support
for holding live/dirty values in physical registers.
1465db5ee2d3c4c4dcc8e017a294172e858765cb 24-Sep-2009 Bill Buzbee <buzbee@google.com> Major register allocation rework - stage 1.

Direct usage of registers abstracted out.
Live values tracked locally. Redundant loads and stores suppressed.
Address of registers and register pairs unified w/ single "location" mechanism
Register types inferred using existing dataflow analysis pass.
Interim (i.e. Hack) mechanism for storing register liveness info. Rewrite TBD.
Stubbed-out code for linear scan allocation (for loop and long traces)
Moved optimistic lock check for monitor-enter/exit inline for Thumb2
Minor restructuring, renaming and general cleanup of codegen
Renaming of enums to follow coding convention
Formatting fixes introduced by the enum renaming

Rewrite of RallocUtil.c and addition of linear scan to come in stage 2.
4f48917c0741e4d9b15ca7c45956aea05fea103f 28-Sep-2009 Ben Cheng <bccheng@google.com> Fixed OOM exception handling in JIT'ed code and added a new unit test.
7fb2edd2f69d11435da8dc0f1c251349238863b3 31-Aug-2009 Bill Buzbee <buzbee@google.com> Inline Sqrt bug fix; add support for fp/gen register copies
d5ab726b65d7271be261864c7e224fb90bfe06e0 25-Aug-2009 Andy McFadden <fadden@android.com> Another round of scary indirect ref changes.

This change adds a not-really-working implementation to Jni.c, with
various changes #ifdefed throughout the code. The ifdef is currently
disabled, so the old behavior should continue. Eventually the old
version will be stripped out and the ifdefs removed.

This renames the stack's "localRefTop" field, which nudged a bunch of
code. The name wasn't really right before (it's the *bottom* of the
local references), and it's even less right now. This and one other
mterp-visible constant were changed, which caused some ripples through
mterp and the JIT, but the ifdeffing was limited to one in
asm-constants.h (and the constant is the same both ways, so toggling the
ifdef won't require rebuilding asm sources).

Some comments and arg names in ReferenceTable were updated for the
correct orientation of bottom vs. top.

Some adjustments were made to the JNI code, e.g. dvmCallMethod now needs
to understand if it needs to convert reference arguments from
local/global refs to pointers (it's called from various places
throughout the VM).
97319a8a234e9fe1cf90ca39aa6eca37d729afd5 13-Aug-2009 Jeff Hao <jeffhao@google.com> New changes to enable self verification mode.
89efc3d632adfa076bd622369b1ad8e4b49cf20e 28-Jul-2009 Bill Buzbee <buzbee@google.com> Stage 2 of structural changes for support of THUMB2. No logic changes.
50a6bf2f01efba0acbff9bb03e7ee09688553e08 08-Jul-2009 Bill Buzbee <buzbee@google.com> Inline-execute for java.lang.Math routines, jit codegen restructure, various bug fixes.