History log of /dalvik/vm/compiler/codegen/arm/CodegenDriver.c
Revision Date Author Comments
1daf86bdb630efa96147220019e1a97c853ed3d2 07-Jul-2010 Bill Buzbee <buzbee@google.com> JIT: Fix for 2813841, use core regs for sub-word data

In an attempt to avoid unnecessary register copies, the JIT allows
data items to live in either floating point or core registers until
an instruction is used which requires one or the other. The bug here
was that sub-word data was allowed to live in floating point registers
at the point of a load or store. This cl forces the use of core registers
in those cases.

Change-Id: Iaee57545c6a62990186a5d0ab5bb22728d75dd60
09e50c941a51abb2f1e750fbdeaa3e2c29d1d093 02-May-2010 Ben Cheng <bccheng@android.com> Throw an exception in JIT'ed code if dvmFindInterfaceMethodInCache returns NULL

Bug: 2642019
Change-Id: Iec2be8f61388d99b1500bb144e56b86febe76c0b
bd1326d0e6b82a24ee80d50921e62152ea919151 03-Apr-2010 Ben Cheng <bccheng@android.com> Clean up the codegen for invoking helper callout functions.

All invoked functions are documented in compiler/codegen/arm/CalloutHelper.h
Bug: 2567981

Change-Id: Ia7cd4107272df1b0b5588fbcc0aafcc6d0723d60
a497359afa1abe4c5780c8799c6fe0edab551c2d 31-Mar-2010 Ben Cheng <bccheng@android.com> Fix a race condition in JIT state refresh under debugging / misc code cleanup.

Bug: 2561283
Change-Id: I9fd94928f3e661de97098808340ea92b28cafa07
d5adae17d71e86a1a5f3ae7825054e3249fb7879 27-Mar-2010 Ben Cheng <bccheng@android.com> Improve JIT self verifier test coverage to follow single-step instructions.

Bug: 2549326
Change-Id: I01412d4aac1379b61c90fe6e59c534b33be93f66
80cef8675b2ce54faa31e837b79db9f66d8e652c 25-Mar-2010 Bill Buzbee <buzbee@google.com> Jit: Fix for 2542488 JIT codegen bug with overlapping wide operands

Change-Id: I7b922e223fe1f5242d1f3db1fa18f54aaed725af
11d8f14eef83d1b7bfa8f116de56a92d5ba9e71e 24-Mar-2010 Ben Cheng <bccheng@android.com> Fix for the JIT blocking mode plus some code cleanup.

Bug: 2517606
Change-Id: I2b5aa92ceaf23d484329330ae20de5966704280b
900a3afd0e8e0d88426b21447d601ee67e17b642 16-Mar-2010 Bill Buzbee <buzbee@google.com> Jit: Fix register usage bug - Issue 2518825 native crash running ARMv5te JIT

Change I8ca61804 added a call to dvmCanPutArrayElement for APUT_OBJECT,
but did so in a way that violated register usage restrictions. This change
tells the register allocation system what registers we expect to remain
live across the call to dvmCanPutArrayElement.

Change-Id: Icd83b888ba60768a196070d62d07d12c7a3c73c6
be6534f384529e51dfba5c3f1b7eb90c86b66e77 13-Mar-2010 Bill Buzbee <buzbee@google.com> Jit: Fix for [Issue 2487514] Dropped exception

The jit was failing to call dvmCanPutArrayElement for aput-object.

Change-Id: I8ca618048dc4d1be5b1f1ed85078759041883b09
4527387dd3b5c4dce7300c764805ffd0f3d22649 11-Mar-2010 Bill Buzbee <buzbee@google.com> Jit: Make debugging mode aware of inlineExecute/moveResult optimization

The Jit has a mode in which selected opcodes can be handled normally
or single-stepped in the interpreter. This was broken for cases in
which the Jit applied an optimization to fold inlineExecute/moveResult
instruction pairs into a single operation and the debug mode was set
to handle the two opcodes differently.

Change-Id: Ifa436d4ba66ba0c13ea366c0956e6cf92ce9cdfd
fc519dc8f4444f6d93806ec15ce7445b322070fd 07-Mar-2010 Bill Buzbee <buzbee@google.com> Jit: Make most Jit compile failures non-fatal; just abort offending translation

Issue 2175597 Jit compile failures should abort translation, but not the VM

Added new dvmCompileAbort() to replace uses of dvmAbort() when something goes
wrong during the compilation of a trace. In that case, we'll abort the translation
and set its head to the interpret-only "translation".
86717f79d9b018f4d69cc991075fa36611f234e5 06-Mar-2010 Ben Cheng <bccheng@android.com> Collect more JIT stats in the assert build.

New stuff includes breakdown of callsite types (ie monomorphic vs polymorphic
vs monomorphic resolved to native), total time spent in JIT'ing, and average
JIT time per compilation.

Example output:
D/dalvikvm( 840): 4042 compilations using 1976 + 329108 bytes
D/dalvikvm( 840): Compiler arena uses 10 blocks (8100 bytes each)
D/dalvikvm( 840): Compiler work queue length is 0/36
D/dalvikvm( 840): size if 8192, entries used is 4137
D/dalvikvm( 840): JIT: 4137 traces, 8192 slots, 1099 chains, 40 thresh, Non-blocking
D/dalvikvm( 840): JIT: Lookups: 1128780 hits, 168564 misses; 179520 normal, 6 punt
D/dalvikvm( 840): JIT: noChainExit: 528464 IC miss, 194708 interp callsite, 0 switch overflow
D/dalvikvm( 840): JIT: Invoke: 507 mono, 988 poly, 72 native, 1038 return
D/dalvikvm( 840): JIT: Total compilation time: 2342 ms
D/dalvikvm( 840): JIT: Avg unit compilation time: 579 us
D/dalvikvm( 840): JIT: 3357 Translation chains, 97 interp stubs
D/dalvikvm( 840): dalvik.vm.jit.op = 0-2,4-5,7-8,a-c,e-16,19-1a,1c-23,26,28-29,2b-2f,31-3d,44-4b,4d-51,60,62-63,68-69,70-72,76-78,7b,81-82,84,87,89,8d-93,95-98,a1,a3,a6,a8-a9,b0-b3,b5-b6,bb-bf,c6-c8,d0,d2-d6,d8,da-e2,ee-f0,f2-fb,
D/dalvikvm( 840): Code size stats: 50666/105126 (compiled/total Dalvik), 329108 (native)
1f74863d3e0f19930818398f375ebf1cf2d78969 03-Mar-2010 Bill Buzbee <buzbee@google.com> Jit: Sapphire tuning - mostly scheduling.

Re-enabled load/store motion that had inadvertently been turned off for
non-armv7 targets. Tagged memory references with the kind of memory
they touch (Dalvik frame, literal pool, heap) to enable more aggressive
load hoisting. Eliminated some largely duplicate code in the target
specific files. Reworked temp register allocation code to allocate next
temp round-robin (to improve scheduling opportunities).
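
A minimal sketch of round-robin temp allocation, for illustration only. The
TempPool type, NUM_TEMPS and both function names are hypothetical and are not
taken from the JIT sources. The idea shown is that the search for a free temp
resumes after the most recently allocated one, so a just-freed register is not
immediately reused and nearby instructions are less likely to share a register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_TEMPS 4  /* hypothetical pool size */

typedef struct {
    bool in_use[NUM_TEMPS];
    int next;  /* index at which the next search starts */
} TempPool;

/* Return a free temp index, searching round-robin from pool->next,
 * or -1 if every temp is in use. */
static int alloc_temp(TempPool *pool) {
    assert(pool != NULL);
    for (int i = 0; i < NUM_TEMPS; i++) {
        int r = (pool->next + i) % NUM_TEMPS;
        if (!pool->in_use[r]) {
            pool->in_use[r] = true;
            pool->next = (r + 1) % NUM_TEMPS;
            return r;
        }
    }
    return -1;
}

/* Mark a temp as free again; the search position is left unchanged. */
static void free_temp(TempPool *pool, int r) {
    assert(pool != NULL && r >= 0 && r < NUM_TEMPS);
    pool->in_use[r] = false;
}
```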

Overall, nice gain for Sapphire. Shows 5% to 15% on some benchmarks, and
measurable improvements for Passion.
40094c16d9727cc1e047a7d4bddffe04dd566211 25-Feb-2010 Ben Cheng <bccheng@android.com> Tweak the interpreter entries and 2nd level trace filter to capture more traces.

Real changes:
1) Add a new entry point from JIT to the interpreter to request hot traces w/o
doing chaining.
2) Increase the granularity of the secondary profile filter to match 64-byte
chunks using 64 entries.

The remaining are just cosmetic changes.
6a55513b0d268bc0721834050a3698316854fa0a 26-Feb-2010 Elliott Hughes <enh@google.com> Fix a couple of typos in JIT function names.

(I saw these the other day, but preferred a separate patch.)
b4c05977c28c38d2f81b48d0cb15559dc3d05564 25-Feb-2010 Elliott Hughes <enh@google.com> Optimize more easy multiplications by constants.

Rather than make these changes in the libraries (*10 being a common case),
let's do them once and for all in the JIT.

The 2^n-1 case could be better if we generated RSB instructions, but the
current "fake" RSB is still better than a full multiply.

Thumb doesn't support reg/reg/reg/shift instructions, so we can't optimize
the "population count <= 2" cases (such as *10) there.

Tested on sholes, passion, and passion-running-sapphire (and visually
inspected to check we weren't trying to generate Thumb2 instructions there).
Also tested with the self-verifier.
6bbdd6b005ec5cb567ec9576190a7cd784248c5c 16-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Monitor exit, possible fix for Issue 2396073

Two problems with monitor-exit:
1. The Jit code wasn't checking for exceptions thrown following
unlocks of fat locks using dvmUnlockObject().
2. The mterp interpreter unlock code branched to handle exceptions
thrown during dvmUnlockObject() with the wrong dalvik PC (the
dPC of the unlock, rather than the instruction following the unlock).

Similar issue with the x86 interpreter fixed. Also, deleted armv7-a
MONITOR_ENTER template, which turned out to be identical to the armv5te
one.
78cb0e2c6e118c647915c3f8a72f1564cccb521a 11-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Minor codegen tuning.
c6f1066fd2dd761349128a9f422bc1ce3c3de595 09-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Phase 1 of register utility cleanup/rewrite - the great renaming

Renaming of all of those register utilities which used to be local because
of our include mechanism to the standard dvmCompiler prefix scheme.
9e45c0b968d63ea38353c99252d233879c2efdaf 03-Feb-2010 jeffhao <jeffhao@google.com> Made Self Verification mode's memory interface less intrusive.
6999d84e2c55dc4a46a6c311b55bd5811336d9c4 27-Jan-2010 Ben Cheng <bccheng@android.com> Fix performance issues related to chaining and unchaining.

1) Patching requests for predicted chaining cells (used by virtual/interface
methods) are now batched in a queue and processed when the VM is paused for GC.

2) When the code cache is full the reset operation is also conducted at the
end of GC pauses so this totally eliminates the need for the compiler thread
to issue suspend-all requests. This is a very rare event and when it happens
it takes less than 5ms to finish.

3) Change the initial value of the branch in a predicted chaining cell from 0
(ie lsl r0, r0, #0) to 0xe7fe (ie branch to self) so that initializing a
predicted chaining cell doesn't need to suspend all threads. Together with 1)
seeing 20% speedup on some benchmarks.

4) Add TestCompability.c where defining "TEST_VM_IN_ECLAIR := true" in
buildspec.mk will activate dummy symbols needed to run libdvm.so in older
releases.
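
To illustrate item 3: in the 16-bit Thumb unconditional branch encoding, the
top five bits are 11100 and the low 11 bits are a signed halfword offset
relative to the instruction address plus 4. The sketch below decodes that
encoding and shows that 0xe7fe targets its own address. The function name is
hypothetical and the code is not part of the JIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit Thumb unconditional branch located at address addr.
 * Returns false if insn is not that encoding. */
static bool thumb_b_target(uint16_t insn, uint32_t addr, uint32_t *target) {
    assert(target != NULL);
    if ((insn & 0xF800) != 0xE000) {
        return false;               /* not "B <label>" (bits 15:11 != 11100) */
    }
    int32_t imm11 = insn & 0x7FF;
    if (imm11 & 0x400) {
        imm11 -= 0x800;             /* sign-extend the 11-bit field */
    }
    /* PC reads as addr + 4 in Thumb state; the offset is in halfwords. */
    *target = addr + 4 + (uint32_t)(imm11 * 2);
    return true;
}
```

For 0xe7fe the offset field is 0x7fe, which is -2 halfwords (-4 bytes), so the
target equals the address of the branch itself. The old initial value 0 does
not match the branch encoding at all.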

Bug: 2397689
Bug: 2396513
Bug: 2331313
c1d9ed490a7bd6caab51df41f3c9e590fcecb727 02-Feb-2010 Bill Buzbee <buzbee@google.com> Jit: Rework monitor enter/exit to simplify thread suspension

The Jit must stop all threads in order to flush the translation cache (and
other tables). Threads which are blocked in a monitor wait cause some
headache here because they effectively hold a reference to the translation
cache (through the return address on the native stack). The new model
introduced in this CL is that for the fast path of monitor enter, control
is allowed to resume in the translation cache. However, if we need to do a
heavyweight lock (which may cause us to block) control does not return to the
translation cache but instead bails out to the interpreter. This allows us to
safely clear the code cache even if some threads are in THREAD_MONITOR state.
964a7b06a9134947b5985c7f712d18d57ed665d2 28-Jan-2010 Bill Buzbee <buzbee@google.com> Jit: Rework delayed start plus misc. cleanup

Defer initialization of jit to support upcoming feature to wait until
first screen is painted to start in order to avoid wasting effort on
jit'ing initialization code. Timed delay in place for the moment.
To change the on/off state, call dvmSuspendAllThreads(), update the
value of gDvmJit.pJitTable and then dvmResumeAllThreads().
Each time a thread goes through the heavyweight check suspend path, returns
from a monitor lock/unlock or returns from a JNI call, it will refresh
its on/off state.

Also:
Recognize and handle failure to increase size of JitTable.
Avoid repeated lock/unlock of JitTable modification mutex during resize
Make all work order enqueue actions non-blocking, which includes adding
a non-blocking mutex lock: dvmTryLockMutex().
Fix bug Jeff noticed where we were using a half-word form of a Thumb2
instruction rather than the byte form.
Minor comment changes.
480e67866a50c64cecfdd7bdc4aeafe41e12b2b0 28-Jan-2010 Bill Buzbee <buzbee@google.com> Jit: Fix INSTANCE_OF corner case.
7a0bcd0de6c4da6499a088a18d1750e51204c2a6 23-Jan-2010 Ben Cheng <bccheng@android.com> Tighten the safe points for code cache resets to happen.

Add a new flag in the Thread struct to track the whereabouts of the top frame
in each Java thread. It is not safe to blow away the code cache if any thread
is in the JIT'ed land.
cec26f6ae3347d5ab3d60de02caca2e47151c6b2 16-Jan-2010 Ben Cheng <bccheng@android.com> Fix chaining offset mis-calculation for translations w/ large switch statements.

Bug: 2369821

There are 12 bytes of additional code after the 65th chaining cell. So if a
switch statement with more than that many cases is translated by the JIT, it
will run fine until the next unchaining event, which will patch the wrong code
and lead to all kinds of unexpected crashes.
51ecf60dca9f98eeda1818814de6a344e197802f 14-Jan-2010 Bill Buzbee <buzbee@google.com> Fix bad long negate; bug 2373405 - EnumSetTest failure with JIT today
60c24f436d603c564d5351a6f81821f12635733c 04-Jan-2010 Ben Cheng <bccheng@google.com> Tear down the code cache when it is full and restart from scratch.

Because the code cache may be wiped out after safe points now the patching of
inline cache for predicted chains is done through the compiler thread's work
queue.
d0937ef76b41a57d25c084e76aed1bb91c6dfde7 23-Dec-2009 Bill Buzbee <buzbee@google.com> Jit: Update monitor lock/unlock to reflect thinlock changes (I34b20f49)
94338aadf8355b28846f0d21c49142ca29479dc4 21-Dec-2009 Carl Shapiro <cshapiro@google.com> Repurpose bits 1 and 2 of the lockword for encoding the hash state of
an object. Invert the meaning of the shape bit to match the encoding
scheme described in Bacon's paper. Consequently, monitor pointers
must have the lower 3 bits stripped before they may be dereferenced.
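
A minimal sketch of the masking this implies, assuming the shape bit is bit 0
(which follows from bits 1 and 2 holding the hash state and the lower 3 bits
being stripped). The macro and function names are hypothetical, and the sketch
does not say which shape value means thin or fat.

```c
#include <stdint.h>

/* Hypothetical layout of the low lockword bits, per the description above. */
#define SHAPE_MASK       0x1u  /* bit 0: shape (assumed position) */
#define HASH_STATE_SHIFT 1
#define HASH_STATE_MASK  0x3u  /* bits 1 and 2: hash state */
#define LOW_BITS_MASK \
    ((uintptr_t)((HASH_STATE_MASK << HASH_STATE_SHIFT) | SHAPE_MASK))

/* Extract the 2-bit hash state from a lockword. */
static unsigned lockword_hash_state(uintptr_t lockword) {
    return (unsigned)((lockword >> HASH_STATE_SHIFT) & HASH_STATE_MASK);
}

/* Extract the shape bit from a lockword. */
static unsigned lockword_shape(uintptr_t lockword) {
    return (unsigned)(lockword & SHAPE_MASK);
}

/* Strip the lower 3 bits to recover a pointer that is safe to dereference. */
static void *lockword_monitor(uintptr_t lockword) {
    return (void *)(lockword & ~LOW_BITS_MASK);
}
```

This only works if monitors are allocated with at least 8-byte alignment, so
that the low 3 bits of a real monitor pointer are always zero.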
0e605279abe713cb54cac3b8eec90d674b6766ce 01-Dec-2009 Bill Buzbee <buzbee@google.com> Jit: shift bug fix - 2296099
ce46c9456590968db896b5f6e63509a70232044c 21-Nov-2009 Bill Buzbee <buzbee@google.com> Jit: Support for inline-execute/range [issue 2268232]
f9f33287693f9f9aa44318036b8aab627bd21a32 22-Nov-2009 Bill Buzbee <buzbee@google.com> Jit: Misc fixes, move_exception, blocking mode, self-cosim

OP_MOVE_EXCEPTION handler was neglecting to reset.
Blocking mode was failing to signal empty queue in some cases
Self-cosim was including operations in traces that can't be done twice
Added OP_MOVE_EXCEPTION to self cosim's no-replay ops (it has side effects)
Restored threshold of 1 to self-cosim (now able to boot device with self-cosim)
When threshold < 6, disable 2nd-level translation filter
5d90c20bd7903d7bba966b224e576bf137bf8b4b 23-Nov-2009 Ben Cheng <bccheng@google.com> Restructure the codegen to make architectural dependency explicit.

The original Codegen.c is broken into three components:

- CodegenCommon.c (arch-independent)
- CodegenFactory.c (Thumb1/2 dependent)
- CodegenDriver.c (Dalvik dependent)

For the Thumb/Thumb2 directories, each contains the following three files:

- Factory.c (low-level routines for instruction selections)
- Gen.c (invoke the ISA-specific instruction selection routines)
- Ralloc.c (arch-dependent register pools)

The FP directory contains FP-specific codegen routines depending on
Thumb/Thumb2/VFP/PortableFP:

- Thumb2VFP.c
- ThumbVFP.c
- ThumbPortableFP.c

Then the hierarchy is formed by stacking these files in the following top-down
order:

1 CodegenCommon.c
2 Thumb[2]/Factory.c
3 CodegenFactory.c
4 Thumb[2]/Gen.c
5 FP stuff
6 Thumb[2]/Ralloc.c
7 CodegenDriver.c