7a2697d327936e20ef5484f7819e2e4bf91c891f |
|
07-Jun-2010 |
Ben Cheng <bccheng@android.com> |
Implement method inlining for getters/setters
Changes include:
1) Force the trace that ends with an invoke instruction to include the next instruction if it is a move-result (because both need to be turned into no-ops if the callee is inlined).
2) Interpreter entry point/trace builder changes so that return targets won't automatically be considered as trace starting points (to avoid duplicate traces that include the move-result instructions).
3) Codegen changes to handle getters/setters invoked from both monomorphic and polymorphic callsites.
4) Extend/fix self-verification to form identical trace regions and handle traces with inlined callees.
5) Apply touchups to the method-based parsing - still not in use.
Change-Id: I116b934df01bf9ada6d5a25187510e352bccd13c
|
7365493ad8d360c1dcf9cd8b6eee62747af01cae |
|
09-Jun-2010 |
Carl Shapiro <cshapiro@google.com> |
Remove repeated newlines at the end of files. Change-Id: I1e3d103a7b932ef21acedb6438c0f26b315df28f
|
fbdcfb9ea9e2a78f295834424c3f24986ea45dac |
|
29-May-2010 |
Brian Carlstrom <bdc@google.com> |
Merge remote branch 'goog/dalvik-dev' into dalvik-dev-to-master Change-Id: I0c0edb3ebf0d5e040d6bbbf60269fab0deb70ef9
|
b88ec3cbb419b5eac23508dc6b73de2620d7521a |
|
17-May-2010 |
Ben Cheng <bccheng@android.com> |
Remove the write permission for the JIT code cache when not needed
To support the feature, redesigned the predicted chaining mechanism so that the profile count is shared globally in InterpState.
Bug: 2690371
Change-Id: Ifed427e8b1fa4f6c670f19e0761e45e2d4afdbb6
|
bd0472480c6e876198fe19c4ffa22350c0ce57da |
|
13-May-2010 |
Bill Buzbee <buzbee@google.com> |
JIT: Fix for [Issue 2675245] FRF40 monkey crash in jit-cache
The JIT's chaining mechanism suffered from a narrow window that could result in i-cache inconsistency. One of the forms of chaining cell consisted of a sequence of two 16-bit Thumb instructions. If a thread were interrupted between the execution of those two instructions *and* another thread picked that moment to convert that cell's chained/unchained state, then bad things would happen.
This CL alters the chain/unchain model somewhat to avoid this case. Chainable chaining cells grow by 4 bytes each, and instead of rewriting a 32-bit cell to chain/unchain, we switch between chained and unchained state by [re]writing the first 16 bits of the cell as either a 16-bit Thumb unconditional branch (unchained mode) or the first half of a 32-bit Thumb branch. The second 16 bits of the cell will never change once the cell moves from its initial state, thus avoiding the possibility of it becoming inconsistent.
This adds a trivial execution penalty on the slow path, but will add about a kByte of memory usage to a typical process.
Change-Id: Id8b99802e11386cfbab23da6abae10e2d9fc4065
|
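The chain/unchain scheme described in "JIT: Fix for [Issue 2675245]" can be modelled roughly as follows. This is a minimal sketch, not the Dalvik source: the `ChainCell` struct, the function names and the two encoding constants are hypothetical placeholders. It only illustrates the invariant from the commit message: once the cell leaves its initial state the second halfword is never rewritten, and every state change is a single 16-bit store to the first halfword.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder encodings, chosen only for illustration (assumptions). */
#define UNCHAINED_FIRST_HALF 0xE000u /* stands in for a 16-bit Thumb branch */
#define WIDE_BRANCH_TAG      0xF000u /* stands in for the 1st half of a 32-bit branch */

/* Hypothetical model of the first 4 bytes of a chainable chaining cell. */
typedef struct {
    volatile uint16_t first; /* rewritten on every chain/unchain */
    uint16_t second;         /* written once, then left alone */
    bool secondIsSet;        /* bookkeeping for this sketch only */
} ChainCell;

static void cellInit(ChainCell *cell) {
    cell->first = UNCHAINED_FIRST_HALF;
    cell->second = 0;
    cell->secondIsSet = false;
}

/* Chain: fix the second halfword on first use, then flip the first one. */
static void cellChain(ChainCell *cell, uint16_t branchHigh, uint16_t branchLow) {
    if (!cell->secondIsSet) {
        cell->second = branchLow; /* leaving the initial state */
        cell->secondIsSet = true;
    }
    assert(cell->second == branchLow); /* 2nd half must never change again */
    cell->first = (uint16_t)(WIDE_BRANCH_TAG | (branchHigh & 0x07FFu));
}

/* Unchain: a single 16-bit store back to the short branch. */
static void cellUnchain(ChainCell *cell) {
    cell->first = UNCHAINED_FIRST_HALF;
}

static bool cellIsChained(const ChainCell *cell) {
    return cell->first != UNCHAINED_FIRST_HALF;
}
```

Because a thread can only ever observe one of two values in the first halfword, and the second halfword is stable, there is no intermediate state in which the two halves disagree.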
978738d2cbf9d08fa78c65762eaac3351ab76b9a |
|
13-May-2010 |
Ben Cheng <bccheng@android.com> |
Add counters to track JIT inline cache hit rate and code cache patch counts. Also did some WITH_JIT_TUNING cleanup. Change-Id: I8bb2d681a06b0f2af1f976a007326825a88cea38
|
a62475ecfcc80c58add8f153c9605762dafb8227 |
|
30-Apr-2010 |
Ben Cheng <bccheng@android.com> |
Use unsigned comparison for stack pointers. Bug: 2613607 Change-Id: I6a8abd69fbf9cb9f8ec9d9febf1ea42fd631fe9c
|
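The commit message for "Use unsigned comparison for stack pointers" does not say where the comparison lives, so the following is only a generic illustration of why addresses should be compared as unsigned values. The helper names are hypothetical, and 32-bit addresses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Buggy variant: reinterprets 32-bit addresses as signed values.
 * (Converting an out-of-range value to int32_t is implementation-defined;
 * on common two's-complement targets it wraps to a negative number.)
 */
static bool isBelowSigned(uint32_t sp, uint32_t limit) {
    return (int32_t)sp < (int32_t)limit;
}

/* Correct variant: addresses are ordered as unsigned quantities. */
static bool isBelowUnsigned(uint32_t sp, uint32_t limit) {
    return sp < limit;
}
```

With `sp = 0x7FFFFFF0` and `limit = 0x80000010`, the unsigned version reports that `sp` is below the limit, while the signed version sees the limit as negative and gives the opposite answer.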
11d8f14eef83d1b7bfa8f116de56a92d5ba9e71e |
|
24-Mar-2010 |
Ben Cheng <bccheng@android.com> |
Fix for the JIT blocking mode plus some code cleanup. Bug: 2517606 Change-Id: I2b5aa92ceaf23d484329330ae20de5966704280b
|
86717f79d9b018f4d69cc991075fa36611f234e5 |
|
06-Mar-2010 |
Ben Cheng <bccheng@android.com> |
Collect more JIT stats in the assert build.
New stuff includes breakdown of callsite types (i.e. monomorphic vs polymorphic vs monomorphic resolved to native), total time spent in JIT'ing, and average JIT time per compilation.
Example output:
D/dalvikvm( 840): 4042 compilations using 1976 + 329108 bytes
D/dalvikvm( 840): Compiler arena uses 10 blocks (8100 bytes each)
D/dalvikvm( 840): Compiler work queue length is 0/36
D/dalvikvm( 840): size if 8192, entries used is 4137
D/dalvikvm( 840): JIT: 4137 traces, 8192 slots, 1099 chains, 40 thresh, Non-blocking
D/dalvikvm( 840): JIT: Lookups: 1128780 hits, 168564 misses; 179520 normal, 6 punt
D/dalvikvm( 840): JIT: noChainExit: 528464 IC miss, 194708 interp callsite, 0 switch overflow
D/dalvikvm( 840): JIT: Invoke: 507 mono, 988 poly, 72 native, 1038 return
D/dalvikvm( 840): JIT: Total compilation time: 2342 ms
D/dalvikvm( 840): JIT: Avg unit compilation time: 579 us
D/dalvikvm( 840): JIT: 3357 Translation chains, 97 interp stubs
D/dalvikvm( 840): dalvik.vm.jit.op = 0-2,4-5,7-8,a-c,e-16,19-1a,1c-23,26,28-29,2b-2f,31-3d,44-4b,4d-51,60,62-63,68-69,70-72,76-78,7b,81-82,84,87,89,8d-93,95-98,a1,a3,a6,a8-a9,b0-b3,b5-b6,bb-bf,c6-c8,d0,d2-d6,d8,da-e2,ee-f0,f2-fb,
D/dalvikvm( 840): Code size stats: 50666/105126 (compiled/total Dalvik), 329108 (native)
|
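The "Avg unit compilation time" in the example output of "Collect more JIT stats in the assert build" is consistent with dividing the total compilation time by the number of compilations: 2342 ms over 4042 compilations is about 579 us. The sketch below reproduces that arithmetic. It assumes the divisor is the "4042 compilations" figure and that the result is truncated; the commit message states neither, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: mean compilation time in microseconds, given the
 * total time in milliseconds and the number of compilations.
 * Returns 0 when nothing has been compiled yet.
 */
static uint64_t avgCompileTimeUs(uint64_t totalMs, uint64_t compilations) {
    if (compilations == 0) {
        return 0;
    }
    return (totalMs * 1000u) / compilations; /* integer division truncates */
}
```

For the logged values, `avgCompileTimeUs(2342, 4042)` evaluates to 579.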
40094c16d9727cc1e047a7d4bddffe04dd566211 |
|
25-Feb-2010 |
Ben Cheng <bccheng@android.com> |
Tweak the interpreter entries and 2nd level trace filter to capture more traces.
Real changes:
1) Add a new entry point from JIT to the interpreter to request hot traces w/o doing chaining.
2) Increase the granularity of the secondary profile filter to match 64-byte chunks using 64 entries.
The remaining changes are just cosmetic.
|
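One way to picture a secondary filter that matches 64-byte chunks using 64 entries, as mentioned in "Tweak the interpreter entries and 2nd level trace filter", is sketched below. Only the two numbers come from the commit message. The struct, the function name, the round-robin replacement policy and the "pass on second sighting" rule are assumptions made for illustration, not the Dalvik implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_SIZE 64u                     /* entries, per the commit message */
#define FILTER_GRAN_MASK (~(uintptr_t)0x3F) /* 64-byte chunks */

/* Hypothetical filter state; a zeroed struct is an empty filter. */
typedef struct {
    uintptr_t keys[FILTER_SIZE]; /* chunk address | 1, or 0 if the slot is empty */
    size_t next;                 /* round-robin replacement slot (assumption) */
} TraceFilter;

/*
 * Returns true if the 64-byte chunk containing addr was seen before
 * (the request passes the filter). Otherwise records the chunk and
 * returns false.
 */
static bool filterPass(TraceFilter *filter, uintptr_t addr) {
    /* Low bit marks the slot as used, so chunk 0 differs from "empty". */
    uintptr_t key = (addr & FILTER_GRAN_MASK) | 1u;
    for (size_t i = 0; i < FILTER_SIZE; i++) {
        if (filter->keys[i] == key) {
            return true;
        }
    }
    filter->keys[filter->next] = key;
    filter->next = (filter->next + 1) % FILTER_SIZE;
    return false;
}
```

Two addresses that differ only in their low six bits share an entry, so a hot region is recognised even when successive requests start at slightly different addresses.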
e6af13cf607de870de51ffe00f48552252946a00 |
|
06-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
JIT: Replace missing ending comment marker MONITOR_ENTER template
...which, luckily, was followed by a debug version of the same handler so everything magically worked anyway. I should buy a lottery ticket today.
|
fccb31dd58e5cb9f7a3f6e128d481f0ff35a51f0 |
|
05-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Start the Jit when framework signals on first screen draw
Cleanup of delayed start - introduce dvmRelativeCondWait in Sync.c. Additionally, support for deadman timer to start Jit when no screen draws happen, and to start immediately when running stand-alone.
Fixed bug in assert variant of libdvm - recent MONITOR change had neglected to add a new type of exit to the exit stats.
|
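A relative condition wait such as the `dvmRelativeCondWait` named in "Jit: Start the Jit when framework signals on first screen draw" has to turn a relative timeout into the absolute deadline that `pthread_cond_timedwait` expects. The commit message does not give the function's signature or body, so the helper below is a hypothetical sketch of just that conversion, assuming a non-negative timeout in milliseconds.

```c
#include <time.h>

/*
 * Hypothetical helper: absolute deadline = now + msec milliseconds,
 * with tv_nsec normalized to the range [0, 1e9). Assumes msec >= 0.
 */
static struct timespec deadlineAfter(struct timespec now, long msec) {
    struct timespec deadline = now;
    deadline.tv_sec += msec / 1000;
    deadline.tv_nsec += (msec % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        /* Carry the overflowed nanoseconds into the seconds field. */
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return deadline;
}
```

A caller would read the current time from the clock the condition variable uses (`CLOCK_REALTIME` by default), compute the deadline, and pass it to `pthread_cond_timedwait`.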
79842ac67e2a23cb544bfe1ee3961d325a2552e7 |
|
06-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
JIT: Replace missing ending comment marker MONITOR_ENTER template
...which, luckily, was followed by a debug version of the same handler so everything magically worked anyway. I should buy a lottery ticket today.
|
eb695c6f814f6b0bdbba0e837555d3fe5ad23104 |
|
05-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Start the Jit when framework signals on first screen draw
Cleanup of delayed start - introduce dvmRelativeCondWait in Sync.c. Additionally, support for deadman timer to start Jit when no screen draws happen, and to start immediately when running stand-alone.
Fixed bug in assert variant of libdvm - recent MONITOR change had neglected to add a new type of exit to the exit stats.
|
88dc28740f628a8d0d8fe05af0e11443f8793aa1 |
|
03-Feb-2010 |
jeffhao <jeffhao@google.com> |
Made Self Verification mode's memory interface less intrusive.
|
9e45c0b968d63ea38353c99252d233879c2efdaf |
|
03-Feb-2010 |
jeffhao <jeffhao@google.com> |
Made Self Verification mode's memory interface less intrusive.
|
f5ceaebfe5633a16b11a7073d2bf36b5bb0c9945 |
|
02-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Rework monitor enter/exit to simplify thread suspension
The Jit must stop all threads in order to flush the translation cache (and other tables). Threads which are blocked in a monitor wait cause some headaches here because they effectively hold a reference to the translation cache (through the return address on the native stack). The new model introduced in this CL is that for the fast path of monitor enter, control is allowed to resume in the translation cache. However, if we need to do a heavyweight lock (which may cause us to block), control does not return to the translation cache but instead bails out to the interpreter. This allows us to safely clear the code cache even if some threads are in THREAD_MONITOR state.
|
c1d9ed490a7bd6caab51df41f3c9e590fcecb727 |
|
02-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Rework monitor enter/exit to simplify thread suspension
The Jit must stop all threads in order to flush the translation cache (and other tables). Threads which are blocked in a monitor wait cause some headaches here because they effectively hold a reference to the translation cache (through the return address on the native stack). The new model introduced in this CL is that for the fast path of monitor enter, control is allowed to resume in the translation cache. However, if we need to do a heavyweight lock (which may cause us to block), control does not return to the translation cache but instead bails out to the interpreter. This allows us to safely clear the code cache even if some threads are in THREAD_MONITOR state.
|
964a7b06a9134947b5985c7f712d18d57ed665d2 |
|
28-Jan-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Rework delayed start plus misc. cleanup
Defer initialization of jit to support upcoming feature to wait until first screen is painted to start in order to avoid wasting effort on jit'ing initialization code. Timed delay in place for the moment.
To change the on/off state, call dvmSuspendAllThreads(), update the value of gDvmJit.pJitTable and then dvmResumeAllThreads(). Each time a thread goes through the heavyweight check suspend path, returns from a monitor lock/unlock or returns from a JNI call, it will refresh its on/off state.
Also:
Recognize and handle failure to increase size of JitTable.
Avoid repeated lock/unlock of JitTable modification mutex during resize.
Make all work order enqueue actions non-blocking, which includes adding a non-blocking mutex lock: dvmTryLockMutex().
Fix bug Jeff noticed where we were using a half-word form of a Thumb2 instruction rather than the byte form.
Minor comment changes.
|
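The non-blocking work order enqueue mentioned in "Jit: Rework delayed start plus misc. cleanup" can be sketched with the POSIX `pthread_mutex_trylock` call. The `WorkQueue` type, its capacity and the function names below are hypothetical, and the real `dvmTryLockMutex()` may differ. The point is only that the caller gives up at once, instead of waiting, when the lock is busy or the queue is full.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4u /* arbitrary size for this sketch */

/* Hypothetical work queue guarded by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    int orders[QUEUE_CAPACITY];
    size_t count;
} WorkQueue;

static void queueInit(WorkQueue *queue) {
    pthread_mutex_init(&queue->lock, NULL);
    queue->count = 0;
}

/*
 * Returns true if the order was enqueued. Returns false, without ever
 * blocking, if the lock is held elsewhere or the queue is full.
 */
static bool tryEnqueue(WorkQueue *queue, int order) {
    if (pthread_mutex_trylock(&queue->lock) != 0) {
        return false; /* lock busy: drop the request rather than wait */
    }
    bool hasRoom = queue->count < QUEUE_CAPACITY;
    if (hasRoom) {
        queue->orders[queue->count++] = order;
    }
    pthread_mutex_unlock(&queue->lock);
    return hasRoom;
}
```

Dropping a request is acceptable in this kind of design because a hot trace that fails to enqueue is likely to be requested again later.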
7a0bcd0de6c4da6499a088a18d1750e51204c2a6 |
|
23-Jan-2010 |
Ben Cheng <bccheng@android.com> |
Tighten the safe points for code cache resets to happen. Add a new flag in the Thread struct to track the whereabouts of the top frame in each Java thread. It is not safe to blow away the code cache if any thread is in the JIT'ed land.
|
60c24f436d603c564d5351a6f81821f12635733c |
|
04-Jan-2010 |
Ben Cheng <bccheng@google.com> |
Tear down the code cache when it is full and restart from scratch. Because the code cache may now be wiped out after safe points, the patching of inline cache for predicted chains is done through the compiler thread's work queue.
|
24ac537cf8d214f7f1bcb07aace429521247d1eb |
|
16-Dec-2009 |
Ben Cheng <bccheng@google.com> |
Move VFP register save/restore routines from template to codegen. Code in the template directory will occupy space in the code cache and is invoked from JIT'ed code. Since these routines are only invoked from statically compiled functions we can move them to the codegen directory which also has arch-variant configurations.
|
342806dae77556290dfe0760e6fe3117d812c7ba |
|
08-Dec-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: Save/restore callee-save floating point registers at interpreter entry/exit
|
909b418219f63c0d0b2bde8a0835dbf27d5061b8 |
|
03-Dec-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: Fix for 2187020, bad exception recovery from native invoke static
|
ab875c79c56eacc510b09710d38a9b20f7337486 |
|
19-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: fix for string/indexOf handler.
|
4c0dedfd9006daee4f6d96482cc6ac94a1797880 |
|
16-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: string's compareTo performance improvement.
Changed compareTo handler to call __memcmp16() for strings >= 32 chars. However, even for those strings, the first two chars are done in the handler (to catch early-out cases).
Comparisons were done with micro-benchmarks comparing 10 and 200-char strings. The strings were:
equal -> Q
not equal at start -> S
not equal at end -> E
The test configurations were handler (H) [the previous handler], subroutine (S) [memcmp16()] and blended (B) [this commit].
        H    S    B
10E    60  138   65
10S    32   70   30
10Q     9    9    9
100E  745  708  716
In short, the small string cases were twice as fast with the existing handler compared to memcmp16, but memcmp16 was ~5% faster for long strings.
|
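The blended strategy from "Jit: string's compareTo performance improvement" can be sketched in portable C as follows. The real handler is an assembly template, so this is only an illustration. `memcmp16Sketch` is a local stand-in for `__memcmp16()`, all function names are hypothetical, and applying the 32-char threshold to the shorter of the two strings is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define LONG_STRING_THRESHOLD 32u /* ">= 32 chars", per the commit message */

/* Stand-in for __memcmp16(): difference of the first mismatching units. */
static int memcmp16Sketch(const uint16_t *a, const uint16_t *b, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (a[i] != b[i]) {
            return (int)a[i] - (int)b[i];
        }
    }
    return 0;
}

/* compareTo-style result: char difference, else length difference. */
static int compareBlended(const uint16_t *a, size_t aLen,
                          const uint16_t *b, size_t bLen) {
    size_t minLen = aLen < bLen ? aLen : bLen;
    size_t i = 0;

    /* First two chars are always compared inline (early-out cases). */
    for (; i < minLen && i < 2; i++) {
        if (a[i] != b[i]) {
            return (int)a[i] - (int)b[i];
        }
    }
    if (minLen >= LONG_STRING_THRESHOLD) {
        /* Long strings: hand the remainder to the subroutine. */
        int diff = memcmp16Sketch(a + i, b + i, minLen - i);
        if (diff != 0) {
            return diff;
        }
    } else {
        /* Short strings: stay in the inline loop. */
        for (; i < minLen; i++) {
            if (a[i] != b[i]) {
                return (int)a[i] - (int)b[i];
            }
        }
    }
    return (int)aLen - (int)bLen;
}
```

The inline prefix keeps the "not equal at start" case cheap for long strings, which is the case where calling the subroutine alone was slowest in the measurements.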
5965d47b624798343b6a53afd384f2cf88d091de |
|
15-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: fix for compareTo handler. Note to self: Unit tests are much more effective when the test main actually calls them.
|
72e93344b4d1ffc71e9c832ec23de0657e5b04a5 |
|
13-Nov-2009 |
Jean-Baptiste Queru <jbq@google.com> |
eclair snapshot
|
9a8c75adb2abf551d06dbf757bff558c1feded08 |
|
08-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Introduce "just interpret" chainable pseudo-translation.
This is the first step towards enabling translation & self-cosim stress modes. When trace selection begins, the trace head address is pinned and remains in a limbo state until the translation is complete. Previously, if the trace selection aborted for any reason, the trace head would remain forever in limbo. This was not a correctness problem, but caused some small performance anomalies and made life more difficult for self-cosimulation mode.
This CL introduces a pseudo-translation that simply routes control to the interpreter. When we detect that a trace selection attempt has failed, the trace head is associated with this fully-chainable pseudo-translation. This also has the benefit for self-cosimulation that we are guaranteed forward progress.
|
49024493479b1ab8b5f9b44c24a3b0c33afc796c |
|
04-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Fix for inline string indexof; added regression tests
|
6c10a977ec892c26c8e306356491833bbb073d40 |
|
29-Oct-2009 |
Ben Cheng <bccheng@google.com> |
Implement chaining up to the first 64 cases in a switch statement.
|
fd023aaec5f2b0df61d1702ea2f29a70abe90158 |
|
02-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Jit - optimized inline string compareto, indexof; fill_array_data bug fix
Added flushAllRegs() prior to C handlers in preparation for upcoming support for holding live/dirty values in physical registers.
|
1465db5ee2d3c4c4dcc8e017a294172e858765cb |
|
24-Sep-2009 |
Bill Buzbee <buzbee@google.com> |
Major register allocation rework - stage 1.
Direct usage of registers abstracted out.
Live values tracked locally. Redundant loads and stores suppressed.
Address of registers and register pairs unified w/ single "location" mechanism.
Register types inferred using existing dataflow analysis pass.
Interim (i.e. Hack) mechanism for storing register liveness info. Rewrite TBD.
Stubbed-out code for linear scan allocation (for loop and long traces).
Moved optimistic lock check for monitor-enter/exit inline for Thumb2.
Minor restructuring, renaming and general cleanup of codegen.
Renaming of enums to follow coding convention.
Formatting fixes introduced by the enum renaming.
Rewrite of RallocUtil.c and addition of linear scan to come in stage 2.
|
4f48917c0741e4d9b15ca7c45956aea05fea103f |
|
28-Sep-2009 |
Ben Cheng <bccheng@google.com> |
Fixed OOM exception handling in JIT'ed code and added a new unit test.
|
7fb2edd2f69d11435da8dc0f1c251349238863b3 |
|
31-Aug-2009 |
Bill Buzbee <buzbee@google.com> |
Inline Sqrt bug fix; add support for fp/gen register copies
|
d5ab726b65d7271be261864c7e224fb90bfe06e0 |
|
25-Aug-2009 |
Andy McFadden <fadden@android.com> |
Another round of scary indirect ref changes. This change adds a not-really-working implementation to Jni.c, with various changes #ifdefed throughout the code. The ifdef is currently disabled, so the old behavior should continue. Eventually the old version will be stripped out and the ifdefs removed. This renames the stack's "localRefTop" field, which nudged a bunch of code. The name wasn't really right before (it's the *bottom* of the local references), and it's even less right now. This and one other mterp-visible constant were changed, which caused some ripples through mterp and the JIT, but the ifdeffing was limited to one in asm-constants.h (and the constant is the same both ways, so toggling the ifdef won't require rebuilding asm sources). Some comments and arg names in ReferenceTable were updated for the correct orientation of bottom vs. top. Some adjustments were made to the JNI code, e.g. dvmCallMethod now needs to understand if it needs to convert reference arguments from local/global refs to pointers (it's called from various places throughout the VM).
|
97319a8a234e9fe1cf90ca39aa6eca37d729afd5 |
|
13-Aug-2009 |
Jeff Hao <jeffhao@google.com> |
New changes to enable self verification mode.
|
89efc3d632adfa076bd622369b1ad8e4b49cf20e |
|
28-Jul-2009 |
Bill Buzbee <buzbee@google.com> |
Stage 2 of structural changes for support of THUMB2. No logic changes.
|
50a6bf2f01efba0acbff9bb03e7ee09688553e08 |
|
08-Jul-2009 |
Bill Buzbee <buzbee@google.com> |
Inline-execute for Java.Lang.Math routines, jit codegen restructure, various bug fixes.
|