1daf86bdb630efa96147220019e1a97c853ed3d2 |
|
07-Jul-2010 |
Bill Buzbee <buzbee@google.com> |
JIT: Fix for 2813841, use core regs for sub-word data In an attempt to avoid unnecessary register copies, the JIT allows data items to live in either floating point or core registers until an instruction is used which requires one or the other. The bug here was that sub-word data was allowed to live in floating point registers at the point of a load or store. This CL forces the use of core registers in those cases. Change-Id: Iaee57545c6a62990186a5d0ab5bb22728d75dd60
|
09e50c941a51abb2f1e750fbdeaa3e2c29d1d093 |
|
02-May-2010 |
Ben Cheng <bccheng@android.com> |
Throw an exception in JIT'ed code if dvmFindInterfaceMethodInCache returns NULL Bug: 2642019 Change-Id: Iec2be8f61388d99b1500bb144e56b86febe76c0b
|
bd1326d0e6b82a24ee80d50921e62152ea919151 |
|
03-Apr-2010 |
Ben Cheng <bccheng@android.com> |
Clean up the codegen for invoking helper callout functions. All invoked functions are documented in compiler/codegen/arm/CalloutHelper.h Bug: 2567981 Change-Id: Ia7cd4107272df1b0b5588fbcc0aafcc6d0723d60
|
a497359afa1abe4c5780c8799c6fe0edab551c2d |
|
31-Mar-2010 |
Ben Cheng <bccheng@android.com> |
Fix a race condition in JIT state refresh under debugging / misc code cleanup. Bug: 2561283 Change-Id: I9fd94928f3e661de97098808340ea92b28cafa07
|
d5adae17d71e86a1a5f3ae7825054e3249fb7879 |
|
27-Mar-2010 |
Ben Cheng <bccheng@android.com> |
Improve JIT self verifier test coverage to follow single-step instructions. Bug: 2549326 Change-Id: I01412d4aac1379b61c90fe6e59c534b33be93f66
|
80cef8675b2ce54faa31e837b79db9f66d8e652c |
|
25-Mar-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Fix for 2542488 JIT codegen bug with overlapping wide operands Change-Id: I7b922e223fe1f5242d1f3db1fa18f54aaed725af
|
11d8f14eef83d1b7bfa8f116de56a92d5ba9e71e |
|
24-Mar-2010 |
Ben Cheng <bccheng@android.com> |
Fix for the JIT blocking mode plus some code cleanup. Bug: 2517606 Change-Id: I2b5aa92ceaf23d484329330ae20de5966704280b
|
900a3afd0e8e0d88426b21447d601ee67e17b642 |
|
16-Mar-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Fix register usage bug - Issue 2518825 native crash running ARMv5te JIT Change I8ca61804 added a call to dvmCanPutArrayElement for APUT_OBJECT, but did so in a way that violated register usage restrictions. This change tells the register allocation system what registers we expect to remain live across the call to dvmCanPutArrayElement. Change-Id: Icd83b888ba60768a196070d62d07d12c7a3c73c6
|
be6534f384529e51dfba5c3f1b7eb90c86b66e77 |
|
13-Mar-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Fix for [Issue 2487514] Dropped exception The jit was failing to call dvmCanPutArrayElement for aput-object. Change-Id: I8ca618048dc4d1be5b1f1ed85078759041883b09
|
4527387dd3b5c4dce7300c764805ffd0f3d22649 |
|
11-Mar-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Make debugging mode aware of inlineExecute/moveResult optimization The Jit has a mode in which selected opcodes can be handled normally or single-stepped in the interpreter. This was broken for cases in which the Jit applied an optimization to fold inlineExecute/moveResult instruction pairs into a single operation and the debug mode was set to handle the two opcodes differently. Change-Id: Ifa436d4ba66ba0c13ea366c0956e6cf92ce9cdfd
|
fc519dc8f4444f6d93806ec15ce7445b322070fd |
|
07-Mar-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Make most Jit compile failures non-fatal; just abort offending translation Issue 2175597 Jit compile failures should abort translation, but not the VM Added new dvmCompileAbort() to replace uses of dvmAbort() when something goes wrong during the compilation of a trace. In that case, we'll abort the translation and set its head to the interpret-only "translation".
|
86717f79d9b018f4d69cc991075fa36611f234e5 |
|
06-Mar-2010 |
Ben Cheng <bccheng@android.com> |
Collect more JIT stats in the assert build. New stuff includes breakdown of callsite types (i.e. monomorphic vs. polymorphic vs. monomorphic resolved to native), total time spent in JIT'ing, and average JIT time per compilation. Example output:
D/dalvikvm( 840): 4042 compilations using 1976 + 329108 bytes
D/dalvikvm( 840): Compiler arena uses 10 blocks (8100 bytes each)
D/dalvikvm( 840): Compiler work queue length is 0/36
D/dalvikvm( 840): size if 8192, entries used is 4137
D/dalvikvm( 840): JIT: 4137 traces, 8192 slots, 1099 chains, 40 thresh, Non-blocking
D/dalvikvm( 840): JIT: Lookups: 1128780 hits, 168564 misses; 179520 normal, 6 punt
D/dalvikvm( 840): JIT: noChainExit: 528464 IC miss, 194708 interp callsite, 0 switch overflow
D/dalvikvm( 840): JIT: Invoke: 507 mono, 988 poly, 72 native, 1038 return
D/dalvikvm( 840): JIT: Total compilation time: 2342 ms
D/dalvikvm( 840): JIT: Avg unit compilation time: 579 us
D/dalvikvm( 840): JIT: 3357 Translation chains, 97 interp stubs
D/dalvikvm( 840): dalvik.vm.jit.op = 0-2,4-5,7-8,a-c,e-16,19-1a,1c-23,26,28-29,2b-2f,31-3d,44-4b,4d-51,60,62-63,68-69,70-72,76-78,7b,81-82,84,87,89,8d-93,95-98,a1,a3,a6,a8-a9,b0-b3,b5-b6,bb-bf,c6-c8,d0,d2-d6,d8,da-e2,ee-f0,f2-fb,
D/dalvikvm( 840): Code size stats: 50666/105126 (compiled/total Dalvik), 329108 (native)
|
1f74863d3e0f19930818398f375ebf1cf2d78969 |
|
03-Mar-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Sapphire tuning - mostly scheduling. Re-enabled load/store motion that had inadvertently been turned off for non-armv7 targets. Tagged memory references with the kind of memory they touch (Dalvik frame, literal pool, heap) to enable more aggressive load hoisting. Eliminated some largely duplicate code in the target specific files. Reworked temp register allocation code to allocate next temp round-robin (to improve scheduling opportunities). Overall, nice gain for Sapphire. Shows 5% to 15% on some benchmarks, and measurable improvements for Passion.
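The round-robin temp allocation mentioned above can be illustrated with a minimal sketch. It is not the Dalvik code: TempPool, NUM_TEMPS, alloc_temp_round_robin and free_temp are hypothetical names invented for this example. The idea is that the search for a free temp starts just past the last one handed out, so a register that was just freed is not reused right away, which tends to leave the scheduler more freedom to reorder instructions.

```c
#include <stdbool.h>

#define NUM_TEMPS 4 /* hypothetical pool size, chosen for illustration */

typedef struct {
    bool in_use[NUM_TEMPS]; /* true while a temp is allocated */
    int next;               /* index where the next search starts */
} TempPool;

/* Allocate a temp, searching round-robin from pool->next.
 * Returns the temp index, or -1 if every temp is in use. */
int alloc_temp_round_robin(TempPool *pool) {
    for (int i = 0; i < NUM_TEMPS; i++) {
        int idx = (pool->next + i) % NUM_TEMPS;
        if (!pool->in_use[idx]) {
            pool->in_use[idx] = true;
            /* Start the next search after the temp just handed out. */
            pool->next = (idx + 1) % NUM_TEMPS;
            return idx;
        }
    }
    return -1;
}

/* Return a temp to the pool. */
void free_temp(TempPool *pool, int idx) {
    pool->in_use[idx] = false;
}
```

With a first-fit allocator, freeing temp 0 and allocating again would return temp 0 immediately; here the next allocation moves on to temp 2 instead.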
|
40094c16d9727cc1e047a7d4bddffe04dd566211 |
|
25-Feb-2010 |
Ben Cheng <bccheng@android.com> |
Tweak the interpreter entries and 2nd level trace filter to capture more traces. Real changes: 1) Add a new entry point from JIT to the interpreter to request hot traces w/o doing chaining. 2) Increase the granularity of the secondary profile filter to match 64-byte chunks using 64 entries. The remaining are just cosmetic changes.
|
6a55513b0d268bc0721834050a3698316854fa0a |
|
26-Feb-2010 |
Elliott Hughes <enh@google.com> |
Fix a couple of typos in JIT function names. (I saw these the other day, but preferred a separate patch.)
|
b4c05977c28c38d2f81b48d0cb15559dc3d05564 |
|
25-Feb-2010 |
Elliott Hughes <enh@google.com> |
Optimize more easy multiplications by constants. Rather than make these changes in the libraries (*10 being a common case), let's do them once and for all in the JIT. The 2^n-1 case could be better if we generated RSB instructions, but the current "fake" RSB is still better than a full multiply. Thumb doesn't support reg/reg/reg/shift instructions, so we can't optimize the "population count <= 2" cases (such as *10) there. Tested on sholes, passion, and passion-running-sapphire (and visually inspected to check we weren't trying to generate Thumb2 instructions there). Also tested with the self-verifier.
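As a rough illustration of the cases named above (a sketch only, not the JIT's codegen; mul_by_const and its helpers are hypothetical names), the same strength reduction can be written in plain C: a power of two becomes one shift, a constant with two set bits (such as 10 = 8 + 2) becomes two shifts and an add, and 2^n - 1 becomes a shift and a subtract. Anything else falls back to a full multiply. Unsigned arithmetic is used so that shifts of negative values are well defined.

```c
#include <stdint.h>

/* Count set bits. */
static int popcount32(uint32_t x) {
    int n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Index of the lowest set bit; x must be nonzero. */
static int lowest_set_bit(uint32_t x) {
    int n = 0;
    while ((x & 1u) == 0) { x >>= 1; n++; }
    return n;
}

/* Multiply src by the constant lit, using shifts and adds for the
 * "easy" constants and a real multiply otherwise. */
int32_t mul_by_const(int32_t src, uint32_t lit) {
    uint32_t s = (uint32_t)src;
    if (lit == 0) return 0;
    if (popcount32(lit) == 1) {
        /* 2^n: a single shift. */
        return (int32_t)(s << lowest_set_bit(lit));
    }
    if (popcount32(lit) == 2) {
        /* 2^a + 2^b: two shifts and an add. */
        int lo = lowest_set_bit(lit);
        int hi = lowest_set_bit(lit & (lit - 1));
        return (int32_t)((s << hi) + (s << lo));
    }
    if (popcount32(lit + 1) == 1) {
        /* 2^n - 1: shift then subtract (what an RSB would do). */
        return (int32_t)((s << lowest_set_bit(lit + 1)) - s);
    }
    return (int32_t)(s * lit); /* general case: full multiply */
}
```

For example, mul_by_const(7, 10) takes the two-set-bits path and computes (7 << 3) + (7 << 1).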
|
6bbdd6b005ec5cb567ec9576190a7cd784248c5c |
|
16-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Monitor exit, possible fix for Issue 2396073 Two problems with monitor-exit: 1. The Jit code wasn't checking for exceptions thrown following unlocks of fat locks using dvmUnlockObject(). 2. The mterp interpreter unlock code branched to handle exceptions thrown during dvmUnlockObject() with the wrong dalvik PC (the dPC of the unlock, rather than the instruction following the unlock). Similar issue with the x86 interpreter fixed. Also, deleted armv7-a MONITOR_ENTER template, which turned out to be identical to the armv5te one.
|
78cb0e2c6e118c647915c3f8a72f1564cccb521a |
|
11-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Minor codegen tuning.
|
c6f1066fd2dd761349128a9f422bc1ce3c3de595 |
|
09-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Phase 1 of register utility cleanup/rewrite - the great renaming Renaming of all of those register utilities which used to be local because of our include mechanism to the standard dvmCompiler prefix scheme.
|
9e45c0b968d63ea38353c99252d233879c2efdaf |
|
03-Feb-2010 |
jeffhao <jeffhao@google.com> |
Made Self Verification mode's memory interface less intrusive.
|
6999d84e2c55dc4a46a6c311b55bd5811336d9c4 |
|
27-Jan-2010 |
Ben Cheng <bccheng@android.com> |
Fix performance issues related to chaining and unchaining.
1) Patching requests for predicted chaining cells (used by virtual/interface methods) are now batched in a queue and processed when the VM is paused for GC.
2) When the code cache is full the reset operation is also conducted at the end of GC pauses so this totally eliminates the need for the compiler thread to issue suspend-all requests. This is a very rare event and when happening it takes less than 5ms to finish.
3) Change the initial value of the branch in a predicted chaining cell from 0 (ie lsl r0, r0, #0) to 0xe7fe (ie branch to self) so that initializing a predicted chaining cell doesn't need to suspend all threads. Together with 1) seeing 20% speedup on some benchmarks.
4) Add TestCompability.c where defining "TEST_VM_IN_ECLAIR := true" in buildspec.mk will activate dummy symbols needed to run libdvm.so in older releases.
Bug: 2397689 Bug: 2396513 Bug: 2331313
|
c1d9ed490a7bd6caab51df41f3c9e590fcecb727 |
|
02-Feb-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Rework monitor enter/exit to simplify thread suspension The Jit must stop all threads in order to flush the translation cache (and other tables). Threads which are blocked in a monitor wait cause some headache here because they effectively hold a reference to the translation cache (through the return address on the native stack). The new model introduced in this CL is that for the fast path of monitor enter, control is allowed to resume in the translation cache. However, if we need to do a heavyweight lock (which may cause us to block) control does not return to the translation cache but instead bails out to the interpreter. This allows us to safely clear the code cache even if some threads are in THREAD_MONITOR state.
|
964a7b06a9134947b5985c7f712d18d57ed665d2 |
|
28-Jan-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Rework delayed start plus misc. cleanup
Defer initialization of jit to support upcoming feature to wait until first screen is painted to start in order to avoid wasting effort on jit'ing initialization code. Timed delay in place for the moment.
To change the on/off state, call dvmSuspendAllThreads(), update the value of gDvmJit.pJitTable and then dvmResumeAllThreads(). Each time a thread goes through the heavyweight check suspend path, returns from a monitor lock/unlock or returns from a JNI call, it will refresh its on/off state.
Also:
Recognize and handle failure to increase size of JitTable.
Avoid repeated lock/unlock of JitTable modification mutex during resize.
Make all work order enqueue actions non-blocking, which includes adding a non-blocking mutex lock: dvmTryLockMutex().
Fix bug Jeff noticed where we were using a half-word form of a Thumb2 instruction rather than the byte form.
Minor comment changes.
|
480e67866a50c64cecfdd7bdc4aeafe41e12b2b0 |
|
28-Jan-2010 |
Bill Buzbee <buzbee@google.com> |
Jit: Fix INSTANCE_OF corner case.
|
7a0bcd0de6c4da6499a088a18d1750e51204c2a6 |
|
23-Jan-2010 |
Ben Cheng <bccheng@android.com> |
Tighten the safe points for code cache resets to happen. Add a new flag in the Thread struct to track the whereabouts of the top frame in each Java thread. It is not safe to blow away the code cache if any thread is in the JIT'ed land.
|
cec26f6ae3347d5ab3d60de02caca2e47151c6b2 |
|
16-Jan-2010 |
Ben Cheng <bccheng@android.com> |
Fix chaining offset mis-calculation for translations w/ large switch statements. Bug: 2369821 There are 12 bytes of additional code after the 65th chaining cell. So if a switch statement with more than that many cases is translated by the JIT, it will run fine until the next unchaining event, which will patch the wrong code and lead to all kinds of unexpected crashes.
|
51ecf60dca9f98eeda1818814de6a344e197802f |
|
14-Jan-2010 |
Bill Buzbee <buzbee@google.com> |
Fix bad long negate; bug 2373405 - EnumSetTest failure with JIT today
|
60c24f436d603c564d5351a6f81821f12635733c |
|
04-Jan-2010 |
Ben Cheng <bccheng@google.com> |
Tear down the code cache when it is full and restart from scratch. Because the code cache may be wiped out after safe points now the patching of inline cache for predicted chains is done through the compiler thread's work queue.
|
d0937ef76b41a57d25c084e76aed1bb91c6dfde7 |
|
23-Dec-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: Update monitor lock/unlock to reflect thinlock changes (I34b20f49)
|
94338aadf8355b28846f0d21c49142ca29479dc4 |
|
21-Dec-2009 |
Carl Shapiro <cshapiro@google.com> |
Repurpose bits 1 and 2 of the lockword for encoding the hash state of an object. Invert the meaning of the shape bit to match the encoding scheme described in Bacon's paper. Consequently, monitor pointers must have the lower 3 bits stripped before they may be dereferenced.
|
0e605279abe713cb54cac3b8eec90d674b6766ce |
|
01-Dec-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: shift bug fix - 2296099
|
ce46c9456590968db896b5f6e63509a70232044c |
|
21-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: Support for inline-execute/range [issue 2268232]
|
f9f33287693f9f9aa44318036b8aab627bd21a32 |
|
22-Nov-2009 |
Bill Buzbee <buzbee@google.com> |
Jit: Misc fixes, move_exception, blocking mode, self-cosim
OP_MOVE_EXCEPTION handler was neglecting to reset.
Blocking mode was failing to signal empty queue in some cases.
Self-cosim was including operations in traces that can't be done twice.
Added OP_MOVE_EXCEPTION to self cosim's no-replay ops (it has side effects).
Restored threshold of 1 to self-cosim (now able to boot device with self-cosim).
When threshold < 6, disable 2nd-level translation filter.
|
5d90c20bd7903d7bba966b224e576bf137bf8b4b |
|
23-Nov-2009 |
Ben Cheng <bccheng@google.com> |
Restructure the codegen to make architectural dependency explicit.
The original Codegen.c is broken into three components:
- CodegenCommon.c (arch-independent)
- CodegenFactory.c (Thumb1/2 dependent)
- CodegenDriver.c (Dalvik dependent)
For the Thumb/Thumb2 directories, each contains the following three files:
- Factory.c (low-level routines for instruction selections)
- Gen.c (invoke the ISA-specific instruction selection routines)
- Ralloc.c (arch-dependent register pools)
The FP directory contains FP-specific codegen routines depending on Thumb/Thumb2/VFP/PortableFP:
- Thumb2VFP.c
- ThumbVFP.c
- ThumbPortableFP.c
Then the hierarchy is formed by stacking these files in the following top-down order:
1 CodegenCommon.c
2 Thumb[2]/Factory.c
3 CodegenFactory.c
4 Thumb[2]/Gen.c
5 FP stuff
6 Thumb[2]/Ralloc.c
7 CodegenDriver.c
|